Presented By O'Reilly and Cloudera
Make Data Work
March 28–29, 2016: Training
March 29–31, 2016: Conference
San Jose, CA

Strata + Hadoop World 2016 Speakers

New speakers are added regularly. Please check back to see the latest updates to the agenda.

Jose Abelenda (Hotwire)

Jose Abelenda is the director of marketing analytics at Hotwire. Prior to that, he worked as a data scientist at PayPal.

Lior Abraham (Interana)

Lior Abraham is a cofounder of Interana, Inc. Lior was instrumental in scaling Facebook’s infrastructure from a few million users to over a billion, as well as bringing its most technically challenging products to market. This included building many of the backend systems that powered... Read More.

Armando Acosta

Armando Acosta has been in the IT industry for 14 years. With experience ranging from sales to product marketing and management to developing big data solutions, recently Armando has been focusing on the hyperscale market in server product management, hardware design, and big data solutions.... Read More.

Joseph Adler (Facebook), @jadler

Joseph Adler has many years of experience in data mining and data analysis at companies including DoubleClick, Verisign, and LinkedIn. Currently, he is director of product management and data science at Confluent. He is the holder of several patents for computer security and cryptography and... Read More.

Nidhi Aggarwal (Tamr, Inc.)

Nidhi Aggarwal leads strategy and marketing at Tamr. Prior to joining Tamr, Nidhi founded Cloud vLab, makers of qwikLAB, a software-learning platform used to create and deploy on-demand lab environments. In the years before Cloud vLab, Nidhi worked at McKinsey & Company, advising Fortune... Read More.

Sara Ahmadian (Seamless Planet), @saraahmadian

Sara Ahmadian is Seamless Planet’s globetrotting CEO and the relentless catalyst for Seamless Planet’s great journey. Sara has several years of experience developing large-scale business infrastructures at successful B2B startups in Silicon Valley. She was invited by President Obama to participate in the 2015... Read More.

John Akred (Silicon Valley Data Science), @BigDataAnalysis

With over 15 years in advanced analytical applications and architecture, John Akred is dedicated to helping organizations become more data driven. As CTO of Silicon Valley Data Science, John combines deep expertise in analytics and data science with business acumen and dynamic engineering leadership.... Read More.

Brad Allen (Silicon Valley Data Science), @bradaallen

Brad Allen’s career has centered around the development and scaling of technologies that provide broad social benefit—for example, Peloton Technology (automated transportation), Aclima (air pollution sensing), and Embrace (developing-world healthcare). At Embrace, Brad led the late design and early market development trials for the Embrace... Read More.

T.J. Alumbaugh (Continuum Analytics), @talumbau

T.J. Alumbaugh is a developer at Continuum Analytics. He likes array-oriented computing, Python, and C++.

Franz Aman (Informatica), @franzaman

Franz Aman is senior vice president of brand and demand at Informatica, where he is responsible for branding, global demand generation, marketing operations, content, and digital marketing. Previously, Franz held numerous executive positions within industry-leading technology companies, including SAP, BusinessObjects, BEA Systems,... Read More.

Xavier Amatriain

Xavier Amatriain is VP of engineering at Quora, where he leads the team building the best source of knowledge on the Internet. With over 50 publications in different fields, Xavier is best known for his work on machine learning in general and recommender systems in... Read More.

Jeremy Anderson

Jeremy Anderson is a Design Lead at the Spark Technology Center, in San Francisco, focused on designing better experiences for the open source data community. Jeremy’s team has been active in contributing to Apache projects, including Zeppelin and SystemML. Prior to joining IBM, Jeremy... Read More.

Jesse Anderson (Big Data Institute), @jessetanderson

Jesse Anderson is a data engineer, creative engineer, and managing director of the Big Data Institute. Jesse trains employees on big data—including cutting-edge technology like Apache Kafka, Apache Hadoop, and Apache Spark. He’s taught thousands of students at companies ranging from startups to... Read More.

Erik Andrejko (The Climate Corporation)

Erik Andrejko leads the data science and research organization at the Climate Corporation, which applies large-scale statistical machine learning and data science to solve challenging problems in numerous domains such as climatology, agronomic modeling, and geospatial applications. Erik’s contributions to the Climate Corporation include defining... Read More.

Bruce Andrews (US Department of Commerce), @DepSecAndrews

Bruce Andrews was confirmed as the deputy secretary of commerce on July 24, 2014, after being named acting deputy secretary of commerce by President Obama and Secretary Penny Pritzker on June 9, 2014. Previously, Bruce served as chief of staff to the secretary at the... Read More.

Ian Andrews

Ian Andrews is VP of products at Pivotal, where he is responsible for product strategy and marketing for Pivotal Cloud Foundry, Spring, and Big Data Suite. Prior to Pivotal, Ian was involved with market-defining startups such as Netscape, Opsware, and Aster Data.

Robert Bagley (ClickFox), @wrbagley

Robert Bagley is the chief research officer at ClickFox, where he oversees the innovative application of data science and advanced analytics to chart future product features and offerings. Prior to his current role, he held various leadership positions at ClickFox focused on building high-performing engineering... Read More.

Brandon Ballinger

Brandon Ballinger is a cofounder at Cardiogram. Previously a cofounder at Sift Science and an engineer at Google on speech recognition and ads quality, Brandon was also called in by the White House to help fix He graduated from the University of Washington with... Read More.

Vishal Bamba (Transamerica), @vishalbamba

Vishal Bamba is vice president of strategy and architecture at Transamerica Technology, where he leads a team focusing on innovation initiatives within the enterprise. Vishal has over 15 years of experience in distributed systems and has led many innovation projects. He has consulted and worked... Read More.

Nenshad Bardoliwalla

Nenshad Bardoliwalla is the founding vice president of products at Paxata, where he is responsible for product strategy, product management, and product marketing. Nenshad is an executive and thought leader with a proven track record of success leading product strategy, product management, and development in... Read More.

Paul Barth (Podium Data), @PodiumData

Paul Barth is founder and CEO of Podium Data, creator of the industry-leading Podium data lake software platform, which is redefining enterprise data management. He has spent decades developing advanced data and analytics solutions for Fortune 100 companies and is a recognized thought leader... Read More.

Pierre Barthelemy (Coursera)

Pierre Thomas Barthelemy is the engineering lead of the Data Infrastructure team at Coursera. The team is responsible for introducing core data systems (e.g., data warehouse using Redshift, ETL using Data Pipeline and Scalding), while also helping build products that create a developer-friendly ecosystem... Read More.

Joel Baxter (BlueData)

Joel Baxter is an engineer at BlueData, where he focuses on virtualization, containers, and Hadoop-related technologies to build an infrastructure platform for big data analytics. His background is in the provisioning and configuration of virtual compute, storage, and networking to serve the needs of application... Read More.

Maxime Beauchemin

Maxime Beauchemin is a senior software engineer at Lyft, where he develops open source products that reduce friction and help generate insights from data. He’s the creator and a lead maintainer of data pipeline workflow engine Apache Airflow (incubating) and data visualization platform Apache Superset... Read More.

Marie Beaugureau (O'Reilly Media, Inc.)

Marie Beaugureau is the lead data editor for O’Reilly Media.

Alexander Behm (Cloudera)

Alex Behm is a software engineer at Cloudera, working on the Impala team. He holds a PhD in computer science from UC Irvine.

John Belchamber (Telefónica)

John Belchamber is global head of business intelligence for the Advanced Analytics team at Telefónica. John has 10 years of experience in marketing and 10 more in the telecom industry, where he held strategic roles in innovation and business intelligence. Recognized as Data Professional of... Read More.

Tim Berglund (Confluent), @tlberglund

Tim Berglund is the senior director of developer experience with Confluent, where he serves as a teacher, author, and technology leader. Tim can frequently be found speaking at conferences internationally and in the United States. He’s the copresenter of various O’Reilly training videos on topics... Read More.

Kristina Bergman (Ignition Partners)

Lucy Bernholz (Stanford University), @p2173

Lucy Bernholz is a senior research scholar at Stanford University, where she runs the Digital Civil Society Lab. Lucy blogs about philanthropy, nonprofits, and technology at

Christopher Berry (Canadian Broadcasting Corporation)

Christopher Berry is a data scientist at the Canadian Broadcasting Corporation and the founder of Authintic (acquired by 500px). Christopher has implemented breakthrough social-analytics programs for AB-Inbev, Research In Motion, and Coca-Cola. He participated in ecommerce redesigns at Gucci and Dell, mobile integrations for Best... Read More.

John Berryman (Eventbrite), @JnBrymn

John Berryman started out in the field of aerospace engineering, but soon found that he was more interested in math and software than in satellites and aircraft. He made the leap into software development, specializing in search and recommendation technologies. John’s a senior software engineer... Read More.

David Beyer (Amplify Partners), @dbeyer123

David Beyer is currently an investor with Amplify Partners, a $50M VC firm focused exclusively on early-stage IT infrastructure and data companies. David began his career in technology as the cofounder and CEO of, a pioneering provider of cloud-based data visualization and analytics.... Read More.

Milind Bhandarkar (Ampool, Inc.), @techmilind

Milind Bhandarkar was the founding member of the team at Yahoo that took Apache Hadoop from 20-node prototype to data center-scale production system and has been contributing and working with Hadoop since version 0.1.0. Milind started the Yahoo Grid solutions team focused on training, consulting,... Read More.

Anurag Bhardwaj (Quad Analytix)

Anurag Bhardwaj currently leads data science efforts at Quad Analytix, where he focuses on large-scale product classification, large-scale smart extraction, and various other machine-learning techniques. Previously, he worked on image understanding at eBay Research Labs. Anurag received his PhD and MS from the State University... Read More.

Lori Bieda (Bank of Montreal), @loribieda

Lori Bieda is head of the Bank of Montreal’s Analytics Centre of Excellence, where she oversees analytics, including revenue, risk, and price trade-off decisions, product analytics, customer optimization, database marketing, predictive analytics, customer experience, sales, and service optimization. She also leads enterprise customer journey analytics,... Read More.

Keith Bigelow (3D Robotics), @keithbigelow

Keith Bigelow leads the commercial cloud and drone division at 3DR. Prior to 3DR, Keith was the GM and SVP of the Analytics Cloud, the fastest-growing product line in Salesforce history. Previously, Keith held executive positions at SAP, where he served as... Read More.

Sarah Bird (Continuum Analytics)

Sarah Bird is a software engineer at Continuum Analytics. She has been a core Bokeh developer since 2015 and has given numerous talks and tutorials on Bokeh. Previously, she worked at Aptivate as a full stack web developer building IT solutions for the international development... Read More.

Joerg Blumtritt (Datarella), @jbenno

Joerg Blumtritt is the founder and CEO of Datarella, a computational social science startup delivering mobile analytics, self-tracking solutions, and data science consulting. After graduating from university with a thesis on machine learning, Joerg worked as a researcher in behavioral sciences, focused on nonverbal... Read More.

Farrah Bostic (The Difference Engine), @farrahbostic

Farrah Bostic is the founder of the Difference Engine, which she created based on her belief that deep understanding of customer needs is essential to growing businesses through great products and services. Farrah has honed her customer-centric insights as an advisor to some of the... Read More.

Roni Burd (Microsoft)

Yaron “Roni” Burd is a principal program manager on the Big Data team at Microsoft working on Hadoop and Azure Data Lake, where he focuses on making machine learning with big data scalable and easy. Roni has spent eight years helping Microsoft build its internal... Read More.

Mark Burnette (Pentaho, a Hitachi Group Company)

Mark Burnette is a director of sales engineering and major accounts at Pentaho, where he leads teams of engineers across western US and Japan that focus on designing and proving out big data and embedded solutions for Fortune 500 companies, including cybersecurity, telematics, mobile network... Read More.

Matt Butner (Stride Health), @butner

Matt Butner is CTO and cofounder of Stride Health, which delivers intelligent healthcare, coverage, and tax compliance to self-employed and independent working Americans. Stride’s suite of benefits for independents is directly integrated into the largest on-demand marketplaces, including Uber, Postmates, and TaskRabbit. Backed by... Read More.

Mike Cafarella (University of Michigan), @MikeCafarella

Mike Cafarella is one of the cofounders of the Apache Hadoop and Nutch open source projects. Mike is also an assistant professor of computer science and engineering at the University of Michigan. His research interests include databases, information extraction, data integration, and data mining. Recently,... Read More.

JR Cahill (Kellogg)

JR Cahill leads the Enterprise Analytics Architecture team at Kellogg, supporting the Global Analytics team that consists of Data Science, Advanced Analytics, Data Services, Visualizations and Reporting. JR has 20 years of operational development and architecture experience in data warehousing and analytics. He is also... Read More.

Arturo Canales (Telefónica)

Arturo Canales leads the analytics team in the Global BI & Big Data unit at Telefónica. Arturo has been involved in the creation of many data products for internal BI teams across all the countries where Telefónica operates, from a social-network-analysis-approach product to better understand... Read More.

Arno Candel

Arno Candel is the chief architect at H2O, a distributed and scalable open source machine-learning platform. Arno is also the main author of H2O’s Deep Learning. Before joining H2O, Arno was a founding senior MTS at Skytree, where he designed and implemented high-performance machine-learning... Read More.

John Canny (UC Berkeley)

John F. Canny is a computer scientist and the Paul and Stacy Jacobs Distinguished Professor of Engineering in the Computer Science Department of the University of California, Berkeley. John has made significant contributions in various areas of computer science and mathematics, including artificial intelligence, robotics,... Read More.

Matt Cardillo (FINRA)

Matt Cardillo is a senior director of FINRA technology. Matt is an avid Scrum evangelist at FINRA and exercises it in the delivery of highly usable, innovative big data analytic solutions.

Amber Case (MIT Media Lab), @caseorganic

Amber Case studies the interaction between humans and computers and how our relationship with information is changing the way cultures think, act, and understand their worlds. She is currently a fellow at Harvard University’s Berkman Klein Center for Internet and Society and a visiting researcher... Read More.

Michele Chambers (Continuum Analytics), @mcAnalytics

An entrepreneurial executive with over 25 years of industry experience, Michele Chambers is currently CMO of Continuum Analytics. Prior to Continuum Analytics, Michele held executive leadership roles at database and analytic companies Netezza, IBM, Revolution Analytics, MemSQL, and RapidMiner. In her career, Michele... Read More.

Evan Chan (Tuplejump), @evanfchan

Evan Chan is a distinguished software engineer at Tuplejump. Evan loves to design, build, and improve bleeding-edge distributed data and backend systems using the latest open source technologies. He has led the design and implementation of multiple big data platforms based on Storm, Spark, Kafka,... Read More.

Vinoth Chandar (Apache Hudi), @byte_array

Vinoth Chandar is the cocreator of the Hudi project at Uber and also PMC and lead of Apache Hudi (Incubating). Previously, he was a senior staff engineer at Uber, where he led projects across various technology areas like data infrastructure, data architecture, and mobile... Read More.

Manjeet Chayel (Amazon Web Services)

Manjeet Chayel is a specialist SA for AWS working on big data technology solutions. Manjeet focuses on Amazon EMR and helps customers solve their big data problems using the right techniques and tools for the job.

Ewen Cheslack-Postava

Ewen Cheslack-Postava is an engineer at Confluent building a stream data platform based on Apache Kafka to help organizations reliably and robustly capture and leverage all their real-time data. Ewen received his PhD from Stanford University, where he developed Sirikata, an open source system for... Read More.

Adam Cheyer (Samsung), @acheyer

Adam Cheyer is a vice president of R&D at Samsung. Previously, he was cofounder and vice president of engineering at artificial intelligence company Viv (acquired by Samsung in 2016); was cofounder and vice president of engineering at Siri (acquired by Apple in 2010); cofounded Sentient... Read More.

Trina Chiasson (Tableau Software), @trinachi

Trina Chiasson lives at the intersection of data, design, and code. Trina is a senior product manager at Tableau Software, where she enjoys helping people see and understand data. Previously, she was the cofounder and CEO of Infoactive, a web app for turning live... Read More.

Mok Choe (TD Bank Group)

Mok Choe is an accomplished technologist whose career spans a diverse group of financial services businesses and successful Internet companies. Mok is a proven transformational leader, with extensive experience leading enterprise architecture at firms including TD Bank Group, Commonwealth Bank of Australia, Union Bank of... Read More.

Kelvin Chu (Uber)

Kelvin Chu is a founding member of the Hadoop team at Uber, where he creates tools and services on top of Spark to support multitenancy and large-scale computation-intensive applications. Kelvin is the creator and lead engineer of the Spark Uber development kit, Paricon, SparkPlug, and... Read More.

Brian Clapper (Databricks)

Brian Clapper is a senior instructor and curriculum developer at Databricks. Brian has more than 32 years’ experience as a software developer. Brian has worked for a stock exchange, the US Navy, a large software company, several startups, and small companies and, most recently, as... Read More.

Brian Clark (Objectivity)

Brian Clark is VP of product management at Objectivity. Brian has nearly 30 years of software and technology experience and was one of the early architects of Objectivity/DB. Before joining Objectivity, Brian worked at Automation Technology Products, providing leading tools in the MCAD market.... Read More.

Christopher Colburn

Christopher Colburn is just another data scientist at Netflix.

Eric Colson (Stitch Fix), @ericcolson

Eric Colson is chief algorithms officer at Stitch Fix, where he leads a team of 100+ data scientists and is responsible for the multitude of algorithms that are pervasive to nearly every function of the company, from merchandise, inventory, and marketing to forecasting and demand,... Read More.

Michael Conover

Mike Conover builds machine-learning technologies that leverage the behavior and relationships of hundreds of millions of people. A staff data scientist at LinkedIn, Mike has a PhD in complex systems analysis with a focus on information propagation in large-scale social networks. His work has appeared... Read More.

James Crawford (Orbital Insight), @orbital_insight

James Crawford is the founder and CEO of Orbital Insight, where he leads the company’s efforts to leverage artificial intelligence to create geospatial analytics for an interconnected world. Previously, Jimi was the SVP of science and engineering at the Climate Corporation; CTO... Read More.

Charlie Crocker (Autodesk)

Charlie Crocker is a data geek with 20 years of experience bringing data out of the shadows to drive business value and optimize operational costs. At Autodesk, he is currently working across divisions to identify and validate potential reliable data sources and access mechanisms, while... Read More.

Alistair Croll (Solve For Interesting), @acroll

Alistair Croll is an entrepreneur with a background in web performance, analytics, cloud computing, and business strategy. In 2001, he cofounded Coradiant (acquired by BMC in 2011) and has since helped launch Rednod, CloudOps, Bitcurrent, Year One Labs, and several other early-stage companies. He works... Read More.

Nick Curcuru (Mastercard)

Nick Curcuru is vice president of enterprise information management at Mastercard, where he’s responsible for leading a team that works with organizations to generate revenue through smart data, architect next-generation technology platforms, and protect data assets from cyberattacks by leveraging Mastercard’s information technology and information... Read More.

Doug Cutting (Cloudera), @cutting

Doug Cutting is the chief architect at Cloudera and the founder of numerous successful open source projects, including Lucene, Nutch, Avro, and Hadoop. Doug joined Cloudera from Yahoo, where he was a key member of the team that built and deployed a production Hadoop storage-and-analysis... Read More.

Michelangelo D'Agostino

Michelangelo D’Agostino is the vice president of data science and engineering at ShopRunner, where he leads a team that develops statistical models and writes software that leverages their unique cross-retailer ecommerce dataset. Previously, Michelangelo led the data science R&D team at Civis Analytics, a Chicago-based... Read More.

Timothy Danford (Tamr, Inc.)

Timothy Danford is a computer scientist working on advanced automation approaches to big data variety in the pharmaceutical and healthcare industries. Previously, Timothy was a software architect, engineer, and founding team member for Genome Bridge LLC, a Broad Institute subsidiary organized to develop... Read More.

Tathagata Das (Databricks)

Tathagata Das is an Apache Spark committer and a member of the PMC. He is the lead developer behind Spark Streaming, which he started while a PhD student in the UC Berkeley AMPLab, and is currently employed at Databricks. Prior to Databricks, Tathagata worked... Read More.

Sudipto Dasgupta (Infosys Limited)

Sudipto Shankar Dasgupta is an AVP and head of engineering for the Platforms group at Infosys Ltd., where he works on big data and analytics platform development for large enterprises. Prior to that he was chief architect with SAP, working on SAP... Read More.

Michael Dauber (Amplify Partners), @dauber

Michael Dauber is a general partner at Amplify Partners. Previously, Mike spent over six years at Battery Ventures, where he led early-stage enterprise investments on the West Coast, including Battery’s investment in a stealth security company that is also in Amplify’s portfolio. Mike has served... Read More.

Michael Armbrust

Michael Armbrust is the lead developer of the Spark SQL and Structured Streaming projects at Databricks. Michael’s interests broadly include distributed systems, large-scale structured storage, and query optimization. Michael holds a PhD from UC Berkeley, where his thesis focused on building systems that allow... Read More.

Bolke de Bruin

Bolke de Bruin is putting advanced analytics at the heart of the wholesale business line of European bank ING.

Donna Denio (Team Dynamics Boston)

Donna Denio is a communications and business development specialist who is passionate about teamwork and generating productive relationships. Donna has over 20 years’ experience helping leaders of multinational companies identify and secure new business opportunities in design and construction. Ten years ago, Donna’s search for... Read More.

Anthony Dina (Dell EMC)

Anthony Dina serves as the director of enterprise technologists at Dell, Inc., where he leads a team of solutions architects with expertise in big data and application acceleration to work with customers on how to transform IT into better business outcomes. Anthony has 17 years... Read More.

Renee DiResta (New Knowledge), @noupside

Renee DiResta is Director of Research at cybersecurity company New Knowledge and a Mozilla Fellow in Media, Misinformation, and Trust. She investigates the spread of disinformation and malign narratives across social networks, and has advised the Congress, the State Department, and senior executives on how... Read More.

Scott Donaldson

Scott Donaldson is senior director for Market Regulation Technology at FINRA. Scott leads the data and analytics teams responsible for the surveillance of US equities and fixed-income markets.

Mark Donsky (Okera)

Mark Donsky is a director of product management at Okera, a software provider that provides discovery, access control, and governance at scale for today’s modern heterogenous data environments, where he leads product management. Previously, Mark led data management and governance solutions at Cloudera, and he’s... Read More.

Scott Draves (Two Sigma Open Source), @BeakerNotebook

Scott Draves is an award-winning software artist, VJ, and pioneer of the open source movement. His clients and exhibitions range from the likes of, LACMA, Google, and the Adler Planetarium to Skrillex. He has a PhD in computer science from Carnegie Mellon University... Read More.

Chris DuBois

Chris DuBois is a data scientist focused on building tools for other data scientists. At Dato, Chris has helped design and implement tools for creating recommendation systems and for large-scale text analysis. His current work makes it simpler to train models that generalize well. After... Read More.

Ted Dunning (MapR, now part of HPE), @ted_dunning

Ted Dunning is the chief technology officer at MapR, an HPE company. He’s also a board member for the Apache Software Foundation, a PMC member, and committer on a number of projects. Ted has years of experience with machine learning and other big... Read More.

Don Bosco Durai (Privacera)

Don Bosco Durai (Bosco) is a thought leader in enterprise security and is a committer in open source projects like Apache Ranger, Apache Ambari, and Apache HAWQ. He has also contributed towards the security for most of the Hadoop components. Bosco was the co-founder... Read More.

Glynn Durham (Cloudera)

Glynn Durham is a senior instructor at Cloudera. Previously, he worked for Oracle, Forté Software, MySQL, and Cloudera, spending five or more years at each.

Gary Dusbabek (Silicon Valley Data Science)

An Apache Cassandra committer and PMC member, Gary Dusbabek specializes in building distributed systems. His recent experience includes creating an open source high-volume metrics processing pipeline and building out several geographically distributed API services in the cloud.

Joey Echeverria

Joey Echeverria is the director of engineering at Rocana, where he builds applications for scaling IT operations built on the Apache Hadoop platform. Joey is a committer on the Kite SDK, an Apache-licensed data API for the Hadoop ecosystem. Joey was previously a... Read More.

Helena Edelson

Helena Edelson is a committer to several open source projects, including the Spark Cassandra Connector and Cassandra Kafka Connector, and a previous contributor to Akka (two new features in Akka Cluster), Spring Integration, and several others. She is also a speaker at international big data and Scala conferences: Kafka Summit, Spark... Read More.

Alyosha Efros (UC Berkeley), @UCBerkeley

Alexei (Alyosha) Efros is an associate professor of electrical engineering and computer science at UC Berkeley. Previously, Alyosha spent nine years on the faculty of Carnegie Mellon University and has also been affiliated with École Normale Supérieure/Inria and the University of Oxford. Alyosha’s research is... Read More.

Jana Eggers (Nara Logics), @jeggers

Jana Eggers is CEO of Nara Logics, a neuroscience-inspired artificial intelligence company providing a platform for recommendations and decision support. A math and computer nerd who took the business path, Jana has had a career that’s taken her from a three-person business to fifty-thousand-plus-person... Read More.

Stephen Elston (Quantia Analytics, LLC)
R Day (Full Day) Tutorial

Stephen Elston is an experienced big data geek, data scientist, and software business leader. Steve is principal consultant at Quantia Analytics, LLC, where he leads the building of new business lines, manages P&L, and takes software products from concept and financing through development, intellectual... Read More.

Bin Fan (Alluxio)

Bin Fan is a software engineer at Alluxio and a PMC member of the Alluxio project. Previously, Bin worked at Google, building next-generation storage infrastructure, where he won Google’s technical infrastructure award. He holds a PhD in computer science from Carnegie Mellon University.

Moty Fania (Intel)

Moty Fania is a principal engineer for big data analytics at Intel IT, where he drives the overall technology and architectural roadmap and owns development and architecture. Moty has over 13 years of experience in BI, data warehousing, and decision-support solutions. He holds a bachelor’s... Read More.

Faisal Farooq (IBM Watson Health)

Faisal Farooq is currently the principal scientist in the Watson Health group of IBM Watson, where he works on next-generation healthcare software to improve patient care. Faisal is an expert in applying machine learning in the healthcare domain, and his general areas of interest... Read More.

Sameer Farooqui

Sameer Farooqui is a client services engineer at Databricks, where he works with customers on Apache Spark deployments. Sameer works with the Hadoop ecosystem, Cassandra, Couchbase, and general NoSQL domain. Prior to Databricks, he worked as a freelance big data consultant and trainer globally and... Read More.

Camille Fournier
Camille Fournier (Independent), @skamille

Camille Fournier is the former head of engineering at Rent the Runway. She was previously a vice president at Goldman Sachs. Camille is an Apache ZooKeeper committer and PMC member and a Dropwizard framework PMC member.

Michael Franklin
Michael Franklin (AMPLab/UC Berkeley), @franklinmj

Michael Franklin is the Thomas M. Siebel Professor of Computer Science at UC Berkeley and the director of the AMPLab. The AMPLab, which received an NSF CISE Expeditions in Computing award announced as part of the White House Big Data Research Initiative in... Read More.

Eric Frenkiel

Eric Frenkiel is the cofounder and CEO of MemSQL, an in-memory distributed database that combines real-time and historical big data analytics. MemSQL is a Y Combinator company that has raised more than $45M in venture capital. Prior to MemSQL, Eric worked at Facebook on... Read More.

Julia Galef
Julia Galef (Center for Applied Rationality), @juliagalef

Julia Galef cofounded the Center for Applied Rationality (CFAR), a nonprofit devoted to developing cognitive science-based strategies for reasoning and decision making. In addition to research, CFAR runs workshops for companies and talented individuals who want to use rationality to address global problems.... Read More.

Tanya Gallagher
Tanya Gallagher (DataStax)

Tanya Gallagher is a veteran technical instructor with thousands of hours of classroom experience across a 20-year career. Tanya has spent the past two years at DataStax writing curriculum and leading the curriculum development team. Prior to DataStax, she was a curriculum developer and technical... Read More.

Ilya Ganelin
Ilya Ganelin (Capital One Data Innovation Lab)

Ilya Ganelin is a roboticist turned data engineer. After a few years building self-discovering robots at the University of Michigan and another few years working on embedded DSP software with cell phones and radios at Boeing, he landed in the world of big data... Read More.

Siddha Ganju

Siddha Ganju is a self-driving architect at NVIDIA. She was featured on the Forbes 30 under 30 list, and she guides teams at NASA as an AI domain expert and is a featured jury member in several informational tech competitions. Previously, she developed... Read More.

Yael Garten
Yael Garten (LinkedIn)

Yael Garten is director of data science at LinkedIn, where she leads a team that focuses on understanding and increasing growth and engagement of LinkedIn’s 400 million members across mobile and desktop consumer products. Yael is an expert at converting data into actionable product and... Read More.

Deepak Gattala is a big data architect in IT project management at Dell.

Matthew Gee
Matthew Gee (Impact Lab/University of Chicago)

Matthew Gee is cofounder and principal at the Impact Lab, a data-analytics company focused exclusively on developing scalable data science solutions to social-sector problems. He is also a senior research scientist at the University of Chicago’s Center for Data Science and Public Policy and a... Read More.

Lise Getoor
Lise Getoor (University of California, Santa Cruz)

Lise Getoor is a professor in the Computer Science Department at the University of California, Santa Cruz, and director of the UCSC Data Science Research Center. Her research areas include machine learning, data integration, and reasoning under uncertainty, with an emphasis on graph and... Read More.

Charles Givre
Charles Givre (Deutsche Bank), @cgivre

Charles Givre is an unapologetic data geek who is passionate about helping others learn about data science and become passionate about it themselves. For the last five years, Charles has worked as a data scientist at Booz Allen Hamilton for various government clients and... Read More.

Colette Glaeser
Colette Glaeser (Silicon Valley Data Science), @ColetteGlaeser

Colette Glaeser is a principal data strategist at Silicon Valley Data Science. With a proven track record in applying analytics to provide a competitive advantage, Colette brings over 20 years of experience in driving business development, customer insight, operational analysis, and continuous process improvement across... Read More.

Dennis Gleeson
Dennis Gleeson (1010data)

Dennis Gleeson is the chief evangelist at 1010data. Prior to joining 1010data, Dennis was a director of strategy in the Central Intelligence Agency (CIA)’s Directorate of Analysis. He began his career with the CIA in 2002 as a political analyst. In 2009, he... Read More.

Scott Gnau
Scott Gnau (Hortonworks)

Scott Gnau is the CTO of Hortonworks, a company at the forefront of emerging connected data platforms, where he works intimately with leaders in the Fortune 1000 undergoing business transformation through real-time data. Scott has spent his entire career in the data industry; previously,... Read More.

Joe Goldberg
Joe Goldberg (BMC Software), @GoldbergJoe

Joe Goldberg is the lead solutions marketing manager at BMC Software, where he helps BMC products leverage new technology to deliver market-leading solutions with a focus on workload automation and big data. Joe has more than 35 years of experience in the design,... Read More.

Kevin Goode
Kevin Goode (Inmar)

Kevin Goode is the director of platform engineering at Inmar. Kevin has 20 years of IT experience, 19 of which have been focused on SQL Server, starting with version 6.5. For the past four years, he has been focused on big data, Hadoop, and NoSQL.... Read More.

Alex Gorelik
Alex Gorelik (Waterline Data), @gorelikalex

Alex Gorelik is the founder and CEO of Waterline Data, a startup focused on enhancing the value of Hadoop through data self-service and governance. Alex is a serial entrepreneur and innovator who has spent over 25 years inventing and bringing to market cutting-edge data-oriented... Read More.

Jonathan Gosier
Jonathan Gosier (AuDigent), @jongos

Jon Gosier is a serial tech entrepreneur and venture capitalist working at the intersection of data science and design. Based in Philadelphia, Jon is also the cofounder of Predictive Pop (aka PredPop), a data company changing the way the music industry monitors and monetizes music. Prior... Read More.

Alexander Gray
Alexander Gray (Skytree, Inc.), @skytreeHQ

Alexander Gray is an associate professor at Georgia Tech and the CEO of Skytree, Inc. His research focuses on scaling up all of the major practical methods of machine learning (ML) to massive datasets. Alex began working on this problem at NASA in... Read More.

Dave Gray
Dave Gray (XPLANE), @davegray

Dave Gray is the founder and chairman of XPLANE, the visual thinking company. Founded in 1993, XPLANE has grown to be the world’s leading consulting and design firm focused on information-driven communications. Dave spends his time researching and writing about visual business, as... Read More.

Garrett Grolemund
Garrett Grolemund (RStudio)
R Day (Full Day) Tutorial

Garrett Grolemund is a data scientist and chief instructor for RStudio, Inc. Garrett is a longtime user and advocate of R; he wrote the popular lubridate package for working with dates and times in R. Garrett designed and delivered the highly rated O’Reilly video series... Read More.

Robert Grossman
Robert Grossman (University of Chicago)

Robert Grossman is a faculty member and the chief research informatics officer in the Biological Sciences Division of the University of Chicago. Robert is the director of the Center for Data Intensive Science (CDIS) and a senior fellow at both the Computation Institute... Read More.

Mark Grover

Mark Grover is a product manager at Lyft. Mark’s a committer on Apache Bigtop, a committer and PPMC member on Apache Spot (incubating), and a committer and PMC member on Apache Sentry. He’s also contributed to a number of open source projects, including... Read More.

Carlos Guestrin
Carlos Guestrin (Dato Inc.), @guestrin

Carlos Guestrin is the Amazon Professor of Machine Learning in Computer Science & Engineering at the University of Washington and the cofounder and CEO of Dato. Carlos also coteaches the Machine Learning Specialization through UW and Coursera. His previous positions include the Finmeccanica Associate... Read More.

Kanu Gulati
Kanu Gulati (Zetta Venture Partners)

Kanu Gulati is a senior associate at Zetta Venture Partners. Kanu has over 10 years of operating experience as an engineer, scientist, and strategist. She owned Intel’s multicore CAD algorithms research roadmap, developed advanced parallel CAD solutions, and pioneered metrics-driven methodology improvements for... Read More.

Sijie Guo
Sijie Guo (StreamNative), @sijieg

Sijie Guo is the founder and CEO of StreamNative, a data infrastructure startup offering a cloud-native event streaming platform based on Apache Pulsar for enterprises. Previously, he was the tech lead for the Messaging Group at Twitter, and worked on push notification... Read More.

Vida Ha
Vida Ha (Databricks), @femineer

Vida Ha is currently a solutions engineer at Databricks. Previously, she worked on scaling Square’s reporting analytics system. Vida first began working with distributed computing at Google, where she improved search rankings of mobile-specific web content and built and tuned language models for speech recognition... Read More.

Patrick Hall
Patrick Hall (SAS)

Patrick Hall is a senior staff scientist at SAS and an adjunct professor in the Department of Decision Sciences at George Washington University. Patrick designs new data-mining and machine-learning technologies. He is the 11th person worldwide to become a Cloudera certified data scientist. Patrick... Read More.

Jordan Hambleton
Jordan Hambleton (Cloudera)

Jordan Hambleton is a Consulting Manager and Senior Architect at Cloudera, where he partners with customers to build and manage scalable enterprise products on the Hadoop stack. Previously, Jordan was a member of technical staff at NetApp, where he designed and implemented the NRT... Read More.

Bob Hansen
Bob Hansen (HPE)

Bob Hansen, the engineer in charge of making Vertica a vibrant part of the greater Hadoop ecosystem, turns customers’ needs into new features, making Vertica a peaceful island floating in the center of your data lake. Over his entire career, Bob has been dedicated to... Read More.

Moritz Hardt
Moritz Hardt (Google), @mrtz

Moritz Hardt is a senior research scientist at Google Research, where his mission is to build the theory and tools that make machine learning more reliable. After obtaining a PhD in computer science from Princeton University, Moritz spent three years at IBM Research Almaden... Read More.

Todd Harple

Todd Harple is an experience engineer at Intel, where he has worked since 2005. Todd has conducted global ethnographic and design research and presently he leads strategic innovation and pathfinding activities within Intel’s New Devices Group. Over the past three years, his focus has increasingly... Read More.

Derrick Harris

Derrick Harris works for datacenter software startup Mesosphere. He was previously a technology journalist, most notably covering cloud computing, big data, and other emerging IT trends for Gigaom since 2009. There’s a strong possibility that Derrick has written the words “cloud” and “Hadoop” more than... Read More.

Kate Heddleston
Kate Heddleston (Kate Heddleston LLC), @heddle317

Kate Heddleston is a software engineer who focuses on using open source tools to build web applications, with a particular interest in the portions of the product that interface with the user. When she’s not programming, Kate is involved with organizations like Hackbright Academy, PyLadies,... Read More.

Joe Hellerstein

Joseph M. Hellerstein is the Jim Gray Chair of Computer Science at UC Berkeley and cofounder and CSO at Trifacta. Joe’s work focuses on data-centric systems and the way they drive computing. He is an ACM fellow, an Alfred P. Sloan fellow, and... Read More.

Hylke Hendriksen

Hylke Hendriksen is a data scientist at ING. Hylke studied computer science at Delft University of Technology. After demonstrating his graduate thesis project to the ING Wholesale Banking Advanced Analytics team on real-time anomalous click path detection, Hylke is now implementing this in... Read More.

Bill Hinderman

Bill Hinderman is the engineering manager for air site optimization at Expedia, and was the senior site optimization UI engineer at Orbitz Worldwide. In human terms: he built the A/B testing development practice from the ground up. He and his team focus on experimenting and... Read More.

Allen Hoem
Allen Hoem (Teradata)

Throughout his eight-year tenure in the advanced electronics industry, Allen Hoem has focused on process optimization and product development. At Roku Inc., Allen streamlined the firmware deployment model for the New Products, Television division. Prior to that, Allen was the development lead and process coach for developing... Read More.

Josh Hoffman
Josh Hoffman (Zymergen)

Joshua Hoffman is the CEO of Zymergen. Prior to Zymergen, Josh was a partner at Norcob Capital and before that a managing director in merchant banking at Rothschild, where he was a member of the Management Committee. He began his career at McKinsey &... Read More.

Jeff Holoman

Jeff Holoman is a systems engineer at Cloudera. Jeff is a Kafka contributor and has focused on helping customers with large-scale Hadoop deployments, primarily in financial services. Prior to his time at Cloudera, Jeff worked as an application developer, system administrator, and Oracle technology specialist.

Jeremy Howard
Jeremy Howard (fast.ai | USF), @jeremyphoward

Jeremy Howard is an entrepreneur, business strategist, developer, and educator. Jeremy is a founding researcher at fast.ai, a research institute dedicated to making deep learning more accessible. He is also a Distinguished Research Scientist at the University of San Francisco, a faculty member at Singularity... Read More.

Johnson Hsieh
Johnson Hsieh (Cardiogram)

Johnson Hsieh is a cofounder at Cardiogram, where he is applying deep learning to medicine. Previously, he was a software engineer at Google, building user models (e.g., user interests) to improve cross-product personalization and recommendation using various ML techniques. He also worked on the Google Voice Assistant (a.k.a. “Ok... Read More.

John Hugg
John Hugg (VoltDB), @johnhugg

John Hugg has spent his entire career working with databases and information management. In 2008, John was lured away from a PhD program by Mike Stonebraker to work on what became VoltDB. As the first engineer on the product, he liaised with a team of... Read More.

Leah Hunter
Leah Hunter (Tech Journalist), @leahthehunter

Leah Hunter writes about the human side of tech for Fast Company, the Guardian, and O’Reilly. She is authoring two upcoming books—one on augmented reality from O’Reilly and the other on the future in five years. Leah speaks about both topics, as well as fashion... Read More.

Alysa Z. Hutnik
Alysa Z. Hutnik (Kelley Drye & Warren LLP)

Alysa Z. Hutnik is a partner at Kelley Drye & Warren LLP in Washington, DC, where she delivers comprehensive expertise in all areas of privacy, data security, and advertising law. Alysa’s experience ranges from counseling to defending clients in FTC and state attorneys... Read More.

Tim Hwang
Tim Hwang (ROFLCon / The Web Ecology Project), @timhwang

Tim Hwang is a lawyer and researcher focusing on the intersection of intelligent agents and society, currently at the Intelligence and Autonomy project at Data & Society in New York. He has formerly served in research roles with the Stanford Center for Legal Informatics, the... Read More.

Noah Iliinsky
Noah Iliinsky (Amazon Web Services), @noahi

Noah Iliinsky is a senior UX architect with Amazon Web Services. Noah strongly believes in the power of intentionally crafted communication and has spent the last decade researching, writing, and speaking about best practices for designing visualizations, informed by his graduate work in user experience... Read More.

Mario Inchiosa
Mario Inchiosa (Microsoft)

Mario Inchiosa is a principal software engineer at Microsoft, where he focuses on scalable machine learning and AI. Previously, Mario served as Revolution Analytics’s chief scientist; analytics architect in IBM’s Big Data organization, where he worked on advanced analytics in Hadoop, Teradata, and R; US... Read More.

Alex Ingerman
Alex Ingerman (Amazon Web Services)

Alex Ingerman leads the product management team for Amazon Machine Learning. He joined Amazon in 2012 after working on products including web-scale search, content recommendation systems, immersive data-exploration environments, and enterprise email and content servers. Alex holds a BS in computer science and an MS... Read More.

Marco Ippolito
Marco Ippolito (CGG GeoSoftware)

Marco M. Ippolito is the data model architect for the France-based geophysical services company CGG, Inc., a fully integrated geoscience company providing leading geological, geophysical, and reservoir capabilities to a broad base of customers primarily from the global oil and gas industry. Since joining... Read More.

Sreeni Iyer
Sreeni Iyer (quadanalytix), @av0gadr0

Sreeni Iyer is CTO, CIO, and cofounder of Quad Analytix, a big data company in the ecommerce vertical. Sreeni is focused on machine learning, big data in batch and quasi real time, and insightful visualizations. Sreeni’s previous positions include director of architecture for... Read More.

Mridul Jain
Mridul Jain (Yahoo)

Mridul Jain is a senior principal architect for Yahoo’s monitoring platform. He has been using Storm and Kafka to solve various real-time problems at Yahoo for almost three years. Mridul is also the author of Pig on Storm. His interests are mostly in the area... Read More.

Rohit Jain
Rohit Jain (Esgyn)

Rohit Jain is the CTO at Esgyn for Trafodion, a transactional SQL-on-HBase RDBMS. Rohit worked for Hewlett-Packard for 28 years on applications and databases, undertaking such roles as solutions architect, consultant, software engineer, architect, development and QA manager, product manager, and chief... Read More.

Jeroen Janssens
Jeroen Janssens (Data Science Workshops), @jeroenhjanssens

Jeroen Janssens is the founder, CEO, and an instructor of Data Science Workshops, which provides on-the-job training and coaching in data visualization, machine learning, and programming. Previously, he was an assistant professor at Jheronimus Academy of Data Science and a data scientist at... Read More.

Calvin Jia
Calvin Jia (Alluxio), @JiaCalvin

Calvin Jia is the release manager for Alluxio and is a core maintainer of the project. He is also the top contributor to the Alluxio project and one of its earliest contributors. Calvin holds a BS from the University of California, Berkeley.

Aaron Kalb
Aaron Kalb (Alation)

Aaron Kalb has spent his career crafting and empowering delightful human-computer interactions, especially through natural language interfaces. Aaron currently leads the design team and guides the product vision at Alation, after leaving Stanford with a BS and an MS in symbolic systems and working at... Read More.

Dave Kale
Dave Kale (Skymind)

David Kale is a deep learning engineer at Skymind and a PhD candidate in computer science at the University of Southern California, where he is advised by Greg Ver Steeg of the USC Information Sciences Institute. His research uses machine learning to extract... Read More.

Holden Karau
Holden Karau (Independent), @holdenkarau

Holden Karau is a transgender Canadian software engineer working in the Bay Area. Previously, she worked at IBM, Alpine, Databricks, Google (twice), Foursquare, and Amazon. Holden is the coauthor of Learning Spark, High Performance Spark, and another Spark book that’s a bit more out... Read More.

Aneesh Karve

Aneesh Karve is the CTO of Quilt Data, a Y Combinator company advancing an open source standard for versioned data. Previously, Aneesh was a product manager, lead designer, and software engineer at companies including Microsoft, NVIDIA, and Matterport and the general manager and... Read More.

Mubashir Kazia
Mubashir Kazia (Cloudera)

Mubashir Kazia is a principal solutions architect at Cloudera and an SME in Apache Hadoop security in Cloudera’s Professional Services practice, where he helps customers secure their Hadoop clusters and comply with internal security policies. He also helps new customers transition to the Hadoop platform... Read More.

Brian Kent
Brian Kent (Dato)

Brian Kent is a machine-learning engineer at Dato. His passion is developing statistical and machine learning tools and using these tools to help people solve problems with data. Brian holds a PhD in statistics from Carnegie Mellon University. His research focused on clustering methods and... Read More.

Paul Kent

Paul Kent is vice president of big data initiatives at SAS, where he divides his time between customers, partners, and the research and development teams discussing, evangelizing, and developing software at the confluence of big data and high-performance computing. Previously, Paul was vice president... Read More.

Grega Kespret
Grega Kespret (Celtra Inc.), @gregakespret

Grega Kešpret is the director of engineering for analytics at Celtra, where he builds analytics pipeline and optimization systems. Grega also leads teams of engineers and data scientists in San Francisco and Ljubljana working on Celtra’s analytics platform. Prior to Celtra, Grega worked at... Read More.

Amandeep Khurana
Amandeep Khurana (Cloudera)

Amandeep Khurana is a solutions architect at Cloudera, where he’s involved in the entire lifecycle of Hadoop adoption for customers from use-case discovery to taking systems to production. Amandeep is also a coauthor of HBase In Action, a book geared toward building applications using HBase.... Read More.

Spencer Kimball
Spencer Kimball (Cockroach Labs), @cockroachdb

Spencer Kimball is the cofounder and CEO of Cockroach Labs, where he maintains a delicate balance between a love for programming distributed systems and the excitement of helping the company grow smoothly. He cut his teeth on databases during the dot-com heyday and had... Read More.

Jonathan King
Jonathan King (Ericsson), @jhk24

Jonathan H. King is the Head of Cloud Strategy for Ericsson. He was previously Head of Cloud Strategy and business development for CenturyLink Technology Solutions. Prior to that, Jonathan was SVP of WW business development at Joyent, an innovative cloud-computing company based in San... Read More.

Adam Kocoloski

Adam is an IBM Distinguished Engineer and CTO of the Cloud Data Services group. He joined IBM in 2014 via the acquisition of Cloudant, where he built a highly available, scalable database and drove the development of the systems required to offer... Read More.

Benedikt Koehler

Benedikt Koehler studied sociology, anthropology, and psychology in Munich, where he received his PhD in 2006. After founding a mobile-Web startup in the late 1990s, he worked as a consultant for Internet and media companies. In 2008, Benedikt cofounded the Social Media Association, the first... Read More.

Marcel Kornacker
Marcel Kornacker (Cloudera)

Marcel Kornacker is a tech lead at Cloudera and the architect of Apache Impala (incubating). Marcel has held engineering jobs at a few database-related startup companies and at Google, where he worked on several ad-serving and storage infrastructure projects. His last engagement was as the... Read More.

Jay Kreps
Jay Kreps (Confluent)

Jay Kreps is the cofounder and CEO of Confluent, a company focused on Apache Kafka. Previously, Jay was one of the primary architects for LinkedIn, where he focused on data infrastructure and data-driven products. He was among the original authors of a number of... Read More.

Balaji Krishna has been with SAP for over 16 years, with customer-facing experience as support consultant, RIG, solution management, and currently product management. He has been a trusted advisor to customers in architecting and implementing the best end-to-end EDW and analytics solutions.... Read More.

Balaji Krishnapuram (IBM Watson Health)

Balaji Krishnapuram is responsible for analytics at IBM Watson Health, where he currently leads the development of two products and a cloud-based analytics platform for healthcare. Previously, Balaji led teams that launched seven commercially successful products using machine learning over the last 10 years... Read More.

Chi-Yi Kuan
Chi-Yi Kuan (LinkedIn)

Chi-Yi Kuan is director of data science at LinkedIn. He has over 15 years of extensive experience applying big data analytics, business intelligence, risk and fraud management, data science, and marketing mix modeling across various business domains (social network, ecommerce, SaaS, and consulting) at both... Read More.

Scott Kurth
Scott Kurth (Silicon Valley Data Science)

Scott Kurth is the vice president of client solutions at Silicon Valley Data Science, where he helps clients define and execute the strategies and data architectures that enable differentiated business growth. Building on 20 years of experience making emerging technologies relevant to enterprises, he has... Read More.

Yann Landrin-Schweitzer

Yann Landrin is a data scientist and data engineer with over 15 years of personalization and big data experience. He has worked on all aspects of big data, from large-scale machine learning to infrastructure optimization. At Autodesk, he is working on the next-generation data platform,... Read More.

Costin Leau
Costin Leau (Elastic), @costinl

Costin Leau is an engineer at Elasticsearch, where he leads big data efforts. An open source veteran, Costin led various Spring projects (Spring OSGi, GemFire, Redis, Hadoop) and authored an OSGi spec. He has spoken about Java, big data, and Elasticsearch-related topics at a number... Read More.

Alex Leblang
Alex Leblang (Cloudera)

Alex Leblang is an engineer at Cloudera on the RecordService team. Previously, Alex was an Apache Impala (incubating) engineer and interned at Vertica. He holds a bachelor’s degree from Brown University with concentrations in computer science and Latin American studies.

Erin Ledell
Erin Ledell (H2O.ai), @ledell

Erin Ledell is the chief machine learning scientist at H2O.ai, the company that created the open source distributed machine learning platform H2O. Previously, she was the principal data scientist at Wise.io (acquired by GE Digital in 2016) and Marvin Mobile Security (acquired by Veracode in... Read More.

Mike Lee Williams
Mike Lee Williams (Cloudera Fast Forward Labs), @mikepqr

Mike Lee Williams is a research engineer at Cloudera Fast Forward Labs, where he builds prototypes that bring the latest ideas in machine learning and AI to life and helps Cloudera’s customers understand how to make use of these new technologies. Mike holds a PhD... Read More.

Bob Levy
Bob Levy (Virtual Cove, Inc.), @VirtualCove

Bob Levy is CEO of Virtual Cove, Inc., commercializing new uses of virtual and augmented reality for making sense of data at scale. He brings over two decades’ industry and product leadership experience with firms including IBM & MathWorks. Mr. Levy was founding... Read More.

Linus Liang
Linus Liang (Embrace)

Linus Liang is a serial entrepreneur with expertise in technology, medical devices, and social enterprises. He most recently cofounded Embrace, a social enterprise that develops and distributes low-cost infant incubators to developing countries. Unlike traditional incubators that cost up to $20,000, the Embrace Infant... Read More.

Todd Lipcon
Todd Lipcon (Cloudera), @tlipcon

Todd Lipcon is an engineer at Cloudera, where he primarily contributes to open source distributed systems in the Apache Hadoop ecosystem. Previously, he focused on Apache HBase, HDFS, and MapReduce, where he designed and implemented redundant metadata storage for the NameNode (QuorumJournalManager), ZooKeeper-based automatic... Read More.

Zachary Lipton
Zachary Lipton (University of California, San Diego)

Zack Lipton is a graduate student in the Artificial Intelligence group at the University of California, San Diego. He works on the theory and application of machine learning, particularly deep learning and multilabel classification, and develops algorithms to exploit sparsity, enabling the efficient training of... Read More.

Darren Lo
Darren Lo (Cloudera)

Darren Lo is currently a lead engineer on Cloudera Manager. He previously worked on the Model Repository Server at Informatica.

Bill Loconzolo

Bill Loconzolo is the vice president of Intuit’s Data Engineering and Analytics team, where he leads the development of Intuit’s central big data platform, which leverages the power of the collective data of 45 million Intuit customers. The platform creates unique data-driven insights and product... Read More.

David Loftesness
David Loftesness (On sabbatical, most recently at Twitter), @dloft
Scaling teams Cultivate

David Loftesness has been a software engineer and manager at a range of tech companies, including Amazon, Twitter, Xmarks, and Geoworks, each with its own unique strengths and challenges. David is currently taking time to share what he’s learned through talks and blog posts before... Read More.

Michael Lopp
Michael Lopp (Rands), @rands

Michael Lopp is a Silicon Valley-based engineering leader who builds both teams and software at companies such as Borland, Netscape, Palantir, and Apple. Michael has written two books. His first book, Managing Humans, a popular guide to the art of engineering leadership, clearly explains that... Read More.

Ben Lorica
Ben Lorica (O'Reilly), @bigdata

Ben Lorica is the chief data scientist at O’Reilly. Ben has applied business intelligence, data mining, machine learning, and statistical analysis in a variety of settings, including direct marketing, consumer and market research, targeted advertising, text mining, and financial engineering. His background includes stints with... Read More.

Michael Ludden is an IBMer in developer relations at Watson. Previously, Michael was developer marketing manager lead at Google, head of developer marketing at Samsung, a developer evangelist at HTC, and global director of developer relations at startups Quixey and Nexmo and was involved... Read More.

Roger Magoulas
Roger Magoulas (O'Reilly Media), @rogerm

Roger Magoulas is the vice president of O’Reilly Radar. Previously, Roger was the research director at O’Reilly, where he and his team built the company’s analysis infrastructure and provided analytic services and insights on technology-adoption trends to business decision makers at O’Reilly and beyond. He... Read More.

Seshadri Mahalingam

Seshadri Mahalingam is a software engineer at Trifacta, where, in addition to building out Wrangle, Trifacta’s domain-specific language for expressing data transformation, he develops the low-latency compute framework that powers Trifacta’s fluid and immersive data wrangling experience. Seshadri holds a BS in EECS from... Read More.

Ted Malaska
Ted Malaska (Capital One), @TedMalaska

Ted Malaska is a director of enterprise architecture at Capital One. Previously, he was the director of engineering in the Global Insight Department at Blizzard; principal solutions architect at Cloudera, helping clients find success with the Hadoop ecosystem; and a lead architect at the Financial... Read More.

James Malone
James Malone (Google)

James Malone is a product manager for Google Cloud Platform and manages Cloud Dataproc and Apache Beam (incubating). Previously, James worked at Disney and Amazon. He’s a big fan of open source software because it shows what’s possible when people come together to solve common... Read More.

Vikash Mansinghka

Vikash Mansinghka is a research scientist at MIT, where he leads the Probabilistic Computing Project, and a cofounder of Empirical Systems, a new venture-backed AI startup aimed at improving the credibility and transparency of statistical inference. Previously, Vikash cofounded a venture-backed startup based on... Read More.

Keith Manthey

Keith is the CTO focused on analytics for Dell EMC. He brings more than 24 years of experience in identity fraud analytics, alternative and traditional data architectures, and financial systems and analytics. Keith is an advisory board member of the University of... Read More.

Sanjay Mathur
Sanjay Mathur (Silicon Valley Data Science)

As the CEO and cofounder of Silicon Valley Data Science, Sanjay Mathur has brought together a team of world-class data scientists and engineers to help companies become more data driven. Previously, Sanjay was a partner in Accenture’s R&D organization, Accenture Technology Labs, where he... Read More.

Drew Mattison
Drew Mattison (XPLANE)

Drew Mattison is a connector, facilitator, communicator, rationalizer, strategist, and advocate who helps clients get things done. For the last 20 years, he has worked where business and strategy intersect with design and communications. Drew is responsible for ensuring XPLANE teams exceed expectations and... Read More.

Patrick McFadin

Patrick McFadin is the vice president of developer relations at DataStax, where he leads a team devoted to making users of DataStax products successful. Previously, he was chief evangelist for Apache Cassandra and a consultant for DataStax, where he helped build some of the largest... Read More.

Pat McGarry
Pat McGarry (Ryft)

Pat McGarry brings extensive technology and leadership experience in hardware and software engineering to his role as vice president of engineering at Ryft. Pat joined Ryft from Ixia Communications, where he was responsible for federal security systems engineering programs. During his tenure at Ixia and... Read More.

Emma McGrattan
Emma McGrattan (Actian)

Emma McGrattan is SVP of engineering at Actian, where she leads the Actian Vector, Actian Vector Hadoop Edition, and Actian Matrix development teams. A leading authority in DBMS technologies, Emma has over 20 years’ experience managing, supporting, and developing a variety of databases,... Read More.

Denise McInerney

Denise McInerney is a data professional with over 16 years of experience. Denise began her career as a database administrator, managing and developing databases for online transactional systems. She now works as a data architect at Intuit, where she designs and implements BI and analytics... Read More.

Wes McKinney
Wes McKinney (Two Sigma Investments), @wesmckinn

Wes McKinney is a software architect at Two Sigma Investments. He is the creator of Python’s pandas library and a PMC member for Apache Arrow and Apache Parquet. He wrote the book Python for Data Analysis. Previously, Wes worked for Cloudera and was the... Read More.

Eric McNulty
Eric McNulty, @richerearth

Eric McNulty helps leaders and organizations create long-term value and increase their positive impact on the full range of stakeholders. Eric is a writer, speaker and conversation catalyst, teacher, and advisor and holds an appointment as director of research and professional programs at the National... Read More.

Stephen Merity
Stephen Merity (Salesforce Research), @smerity

Stephen Merity is a senior research scientist at Salesforce Research (formerly MetaMind), where he works on researching and implementing deep learning models for vision and text, with a focus on memory networks and neural attention mechanisms for computer vision and natural language processing tasks.... Read More.

Leo Meyerovich
Leo Meyerovich (Graphistry)

Leo Meyerovich cofounded Graphistry, Inc. to help enterprise and federal teams easily scale visual investigations of their event and graph data. Graphistry’s original approach of connecting GPUs in browsers to GPUs in datacenters builds upon the founding team’s work at UC Berkeley on the first... Read More.

Claire Mitchell

Claire Mitchell is a product experience designer at Temboo in NYC. With experience ranging from creative strategy to design, Claire has designed interfaces that show the potential for the future, developed conceptual pitches for award-winning commercials, and built and managed teams that can effectively... Read More.

Kai Miller
Kai Miller (Stanford University), @kaijoshuamiller

I’m a neurosurgery resident at Stanford. I have doctorates in physics, medicine, and neuroscience. Surgically, I am focused on epilepsy, brain tumors, and deep brain stimulation. My research focuses on neural engineering, human electrophysiology, and imaging in neurosurgery. I’m enthusiastic about how machine learning and... Read More.

Donald Miner
Donald Miner (Miner & Kasch)

Donald Miner is the founder of the data science consulting firm Miner & Kasch and specializes in large-scale data analysis and applying machine learning to real-world problems. Donald is author of the O’Reilly book MapReduce Design Patterns and multiple industry reports. He’s architected and implemented... Read More.

Sophie-Charlotte Moatti
Sophie-Charlotte Moatti (Products That Count), @scmoatti

As an executive at mobile pioneers such as Facebook, Trulia, and Nokia, SC Moatti has launched and monetized mobile products that are used by billions of people and have received prestigious awards, including an Emmy nomination. Currently, SC runs Products That Count, a company that... Read More.

Prat Moghe

Prat Moghe is the founder and CEO of Cazena. Prat is a successful big data entrepreneur with nearly 20 years of experience inventing next-generation products and building strong teams in the technology sector. Prior to founding Cazena, as SVP of strategy, products, and... Read More.

Rajat Monga
Rajat Monga (Google)

Rajat Monga leads TensorFlow, an open source machine learning library and the center of Google’s efforts at scaling up deep learning. He is one of the founding members of the Google Brain team and is interested in pushing machine learning research forward toward general AI.... Read More.

Aurelia Moser
Aurelia Moser (Mozilla Science), @auremoser

Aurelia Moser is a developer and curious cartographer building communities around code at Mozilla Open Science. Recent projects include mapping sensor data to support agricultural security and sustainable apis ecosystems in the Global South. She’s been working in the open tech and nonprofit science space... Read More.

John Mount
John Mount (Win-Vector LLC)
R Day (Full Day) Tutorial

John Mount is a principal consultant at Win-Vector LLC, a San Francisco data science consultancy. John has worked as a computational scientist in biotechnology and a stock-trading algorithm designer and has managed a research team for (now an eBay company). He... Read More.

Conrad Mulcahy
Conrad Mulcahy (K2 Intelligence)

Conrad Mulcahy is an associate managing director and director of data analytics in K2 Intelligence’s New York Office. In his time at K2 Intelligence, Conrad has conducted numerous investigations targeting risk, fraud, corruption, anti-money laundering, and bankruptcy for clients such as law firms, government agencies,... Read More.

Sean Murphy
Sean Murphy (PingThings), @sayhitosean

Sean Patrick Murphy serves as the chief data scientist for PingThings, an Industrial Internet of Things (IIoT) startup bringing advanced data science and machine learning to the nation’s electric grid. He also advises several startups and provides learning-analytics consulting for EverFi. Previously, he served as... Read More.

Justin Murray
Justin Murray (VMware)

Justin Murray is a technical product marketing manager in big data at VMware, where he works with VMware’s customers and field engineering to create guidelines and best practices for using virtualization technology for big data. He has spoken at a variety of conferences on these... Read More.

Jacques Nadeau
Jacques Nadeau (Dremio)

Jacques Nadeau is the cofounder and CTO of Dremio. Previously, he ran MapR’s distributed systems team; was CTO and cofounder of YapMap, an enterprise search startup; and held engineering leadership roles at Quigo, Offermatica, and aQuantive. Jacques is cocreator and PMC chair... Read More.

Nina Narelle
Nina Narelle (XPLANE)

As a catalyst of systems change, Nina Narelle brings over 15 years of experience in organizational design and systems thinking to inform her work leading organizational transformation. Nina helps groups dream big about their future state and emerge with stronger relationships and clear agreements for... Read More.

Neha Narkhede

Neha Narkhede is the cofounder and CTO at Confluent, a company backing the popular Apache Kafka messaging system. Previously, Neha led streams infrastructure at LinkedIn, where she was responsible for LinkedIn’s petabyte-scale streaming infrastructure built on top of Apache Kafka and Apache Samza. Neha... Read More.

Tony Ng
Tony Ng (WeWork), @tony_ng

Tony Ng is a Sr. Director of Engineering at WeWork, where he is responsible for WeWork’s Data Platform.

Christopher Nguyen

Christopher Nguyen is president and CEO of Arimo, a Panasonic company in Silicon Valley, where he leads the development of AI platforms and solutions for the enterprise. Previously, he was engineering director of Google Apps and cofounded two other successful startups. As a professor,... Read More.

Robert Nishihara
Robert Nishihara (University of California, Berkeley), @robertnishihara

Robert Nishihara is a fourth-year PhD student working in the University of California, Berkeley, RISELab with Michael Jordan. He works on machine learning, optimization, and artificial intelligence.

Alex Nisnevich
Alex Nisnevich (Bayes Impact), @AlexNisnevich

Alex Nisnevich is a data scientist at Bayes Impact. Previously, he worked on machine-learning pipelines at Workday and built natural language interfaces for databases at UPSHOT. He received his MS in NLP at UC Berkeley.

Jack Norris
Jack Norris (MapR Technologies), @Norrisjack

Jack Norris is the senior vice president of data and applications at MapR Technologies, where he works with leading customers and partners worldwide to drive the understanding and adoption of new applications enabled by data and analytics. With over 25 years of enterprise software experience,... Read More.

Amy O'Connor
Amy O'Connor (Cloudera), @imamyo

Amy O’Connor is a big data evangelist and telecommunications specialist at Cloudera, the leading big data vendor. She advises customers globally as they introduce big data solutions and adopt enterprise-wide big data delivery capabilities. Amy was recently named one of Information Management’s 10 Big Data... Read More.

Kevin O'Dell

Kevin O’Dell currently works as a field engineer for Rocana, helping companies take IT operations to the next level, and has been an HBase contributor since 2012. Kevin regularly works to architect, size, and deploy big data applications across a wide variety of verticals in... Read More.

Stephen O'Sullivan
Stephen O'Sullivan (Data Whisperers), @steveos

A leading expert on big data architectures, Stephen O’Sullivan has 25 years of experience creating scalable, high-availability data and applications solutions. A veteran of Silicon Valley Data Science, @WalmartLabs, Sun, and Yahoo, Stephen is an independent adviser to enterprises on all things data.

Travis Oliphant
Travis Oliphant (Continuum Analytics), @continuumIO

As CEO of Continuum Analytics, Travis Oliphant engages customers in all industries, develops business strategy, and helps guide the technical direction of the company. Travis actively contributes to software development and engages with the wider open source community in the Python ecosystem. He has... Read More.

Silvia Oliveros
Silvia Oliveros (Silicon Valley Data Science), @soliverost

Silvia Oliveros is a data engineer at Silicon Valley Data Science, where she helps clients explore and analyze their data. Silvia has a background in computer engineering and visual analytics and is interested in building and optimizing the infrastructure and data pipelines used to gather... Read More.

Dan Olsen
Dan Olsen (The Lean Product Playbook), @danolsen

Dan Olsen is a product management consultant, speaker, and author. At Olsen Solutions, he works with CEOs and product leaders to build great products and strong product teams, often as interim VP of product. He has helped product teams at Facebook, Box, Microsoft, Medallia,... Read More.

Matt Olson
Matt Olson (CenturyLink)

Matt Olson is a principal network architect at CenturyLink. Matt’s current focus is on big data analytics for SDN/NFV performance management with the aim of building automated feedback loops for adaptive intelligent network services. Matt has many years of experience leveraging data analytics,... Read More.

John Omernik
John Omernik (MapR Technologies), @mandoskippy

As Distinguished Technologist at MapR, John Omernik brings an analytical approach to big data, utilizing modern tools to identify patterns to facilitate security program improvements and reduce risk to organizations. Prior to MapR, John was SVP Security Innovations at Bank of America. Previously, he... Read More.

Jerry Overton

Jerry Overton is a data scientist and distinguished technologist in DXC’s Analytics Group, where he is the principal data scientist for industrial machine learning, a strategic alliance between DXC and Microsoft comprising enterprise-scale applications across six different industries: banking and capital markets, energy... Read More.

Todd Palino
Todd Palino (LinkedIn), @bonkoif

Todd Palino is a site reliability engineer at LinkedIn tasked with keeping Zookeeper, Kafka, and Samza deployments fed and watered. His days are spent, in part, developing monitoring systems and tools to make that job a breeze. Previously, Todd was a systems engineer at Verisign,... Read More.

Ganesan Pandurangan
Ganesan Pandurangan (Infosys Limited)

Ganesan Pandurangan is a principal technology architect at Infosys Ltd. He has around 20 years of experience in building large-scale online, batch, and data warehouse systems. Ganesan is currently leading the development and implementation of Infosys’s big data platform, the Infosys Information Platform, and has... Read More.

Josh Patterson
Josh Patterson (Patterson Consulting), @jpatanooga

Josh Patterson is CEO of Patterson Consulting, a solution integrator at the intersection of big data and applied machine learning. In this role, he brings his unique perspective blending a decade of big data experience and wide-ranging deep learning experience to Fortune 500 projects.... Read More.

Joshua Patterson

Joshua Patterson is a director of AI infrastructure at NVIDIA leading engineering for RAPIDS.AI. Previously, Josh was a White House Presidential Innovation Fellow and worked with leading experts across public sector, private sector, and academia to build a next-generation cyberdefense platform. His current... Read More.

Rob Peglar
Rob Peglar (Micron Technology, Inc)

Robert Peglar is vice president of advanced storage solutions at Micron Technology. A 38-year industry veteran and published author, Robert leads efforts in advanced storage systems strategy, leads executive-level planning with key customers and partners worldwide for Micron’s Storage Business Unit, and defines future storage... Read More.

Paulo Pereira

Paulo Pereira is the GE executive responsible for the technical aspects of data security and governance for GE Digital. In this capacity, Paulo leads the efforts around big data cloud infrastructure, governance, and security. Working with heavily regulated data from several GE businesses and clients, Paulo... Read More.

Don Perigo
Don Perigo (GE Power)

Don Perigo is the IT chief enterprise architect of GE Power Services, a $15B organization within GE Power. Power Services is a combination of two of the best service teams in the power industry—GE’s Power Generation Services and the former Alstom Thermal Services (acquired Q4... Read More.

Daniella Perlroth (Lyra Health)

Daniella Perlroth is chief data scientist at Lyra Health, a technology company transforming behavioral health with data and a human touch, where she is developing treatment and provider recommendations to help mental health patients get access to the best quality care. Prior to Lyra, Daniella... Read More.

Thomas Phelan
Thomas Phelan (HPE BlueData), @tapbluedata

Thomas Phelan is cofounder and chief architect of BlueData. Previously, he was a member of the original team at Silicon Graphics that designed and implemented XFS, the first commercially available 64-bit file system; and an early employee at VMware, a senior staff engineer and a key... Read More.

Sebastien Pierre

Sébastien Pierre is the director of FFunction, an award-winning data visualization studio. He has worked with clients such as HP, National Geographic, the Bill & Melinda Gates Foundation, Edelman, and many other high-profile organizations. Trained both as a software engineer and a designer, Sébastien regularly... Read More.

Jeff Pohlmann
Jeff Pohlmann (Oracle)

Jeff Pohlmann is the vice president of NA big data at Oracle. Jeff has more than 30 years of leadership and management experience with over 15 years of it managing and consulting with Fortune 500 companies deploying analytical information solutions. Prior to joining Oracle, Jeff... Read More.

Jake Porway
Jake Porway (DataKind)

Jake Porway is the founder and executive director of DataKind, a nonprofit that harnesses the power of data science in the service of humanity. He is an alum of the New York Times R&D Lab and has worked at Google and Bell Labs. A recognized... Read More.

TJ Potter (Lucidworks), @thelabdude

Timothy Potter is a senior member of the engineering team at Lucidworks, a committer on the Apache Solr project, and the coauthor of Solr In Action, a comprehensive guide to using Solr 4. Tim focuses on scalability and hardening the distributed features in Solr. Previously,... Read More.

Paula Poundstone
Paula Poundstone (Star of NPR's #1 radio show, "Wait Wait...Don't Tell Me"), @paulapoundstone

Thirty-two years ago, Paula Poundstone climbed on a Greyhound bus and traveled across the country—stopping in at open mic nights at comedy clubs as she went. Today, she is one of our country’s foremost humorists. You can hear her through your laughter as a regular... Read More.

James Powell

James Powell is a NYC-based Python programmer and master trainer with experience in quantitative finance and data science. James is very active in the Python community in NYC, where he organizes NYC Python (the world’s largest and most active Python meetup group).... Read More.

Mr Prabhat
Mr Prabhat (Berkeley Lab)

Prabhat leads the Data and Analytics Services team at NERSC. His current research interests include scientific data management, parallel I/O, high-performance computing, and scientific visualization. He is also interested in applied statistics, machine learning, computer graphics, and computer vision. Prabhat received an ScM in... Read More.

Nitin Prabhu (Transamerica)

Nitin Prabhu has been with Transamerica for over a decade in IT roles. He now serves as manager for strategy and architecture.

Peter Prettenhofer (DataRobot)

Peter Prettenhofer is a data scientist and software engineer at DataRobot. He is a contributor to scikit-learn, where he coauthored a number of modules such as Gradient Boosted Regression Trees, Stochastic Gradient Descent, and Decision Trees. Peter studied computer science at Graz University of Technology,... Read More.

Megan Price
Megan Price (Human Rights Data Analysis Group), @hrdag

As the executive director at the Human Rights Data Analysis Group, Megan Price designs strategies and methods for statistical analysis of human rights data for projects in a variety of locations including Guatemala, Colombia, and Syria. Megan’s work in Guatemala includes serving as the lead... Read More.

Richard Probst

Richard Probst is VP of infrastructure technology strategy at SAP and is currently focused on working with SAP partners on innovative cloud architectures for SAP application landscapes to help SAP customers become more agile.

Erin Ptacek

Erin Ptacek is a cofounder and VP of engineering at Starfighter, a startup that is changing the way technical recruiting is done by building CTFs (games you play by programming). During her long tenure in the security industry, Erin has helped turn green recruits... Read More.

Yvonne Quacken
Yvonne Quacken (Siemens)

Yvonne Quacken is a senior big data architect and engineer at Siemens. In her role as BI and big data technology lead, Yvonne is responsible for platform and solution architecture, cloud integration, and DevOps for BI and big data. Yvonne has been working in this... Read More.

Rachel Quint
Rachel Quint (Hewlett Foundation), @RMQuint

Rachel Quint is a fellow in the Global Development and Population Program at the Hewlett Foundation. Before joining the Hewlett Foundation, Rachel lived in Addis Ababa, Ethiopia, where she worked in the UN World Food Programme’s Africa office, serving as a liaison to the African... Read More.

Mohammad Quraishi

Mohammad Quraishi has worked in the Healthcare industry for 23 years. He is a Senior Principal Technologist at Cigna Corporation within the Data & Analytics organization. He graduated with a BS in Computer Science & Engineering from the University of Connecticut at Storrs. ... Read More.

Phillip Radley

Phillip Radley is chief data architect on the core enterprise architecture team at BT, where he’s responsible for data architecture across the company. Based at BT’s Adastral Park campus in the UK, Phill leads BT’s MDM and big data initiatives, driving associated strategic architecture... Read More.

Siva Raghupathy
Siva Raghupathy (Amazon Web Services)

Siva Raghupathy leads the Americas Big Data Solutions Architecture team at AWS, where he guides developers and architects in building successful big data solutions on AWS. Previously, as a principal technical program manager for AWS Database Service, Siva gathered emerging NoSQL requirements... Read More.

Karthik Ramasamy

Karthik Ramasamy is the engineering manager and technical lead for real-time analytics at Twitter. Karthik is the cocreator of Heron and has more than two decades of experience working in parallel databases, big data infrastructure, and networking. He cofounded Locomatix, a company that specializes in... Read More.

Jun Rao
Jun Rao (Confluent)

Jun Rao is the cofounder of Confluent, a company that provides a streaming data platform on top of Apache Kafka. Previously, Jun was a senior staff engineer at LinkedIn, where he led the development of Kafka, and a researcher at IBM’s Almaden research data center,... Read More.

Naveen Rao
Naveen Rao (Intel)

Naveen Rao is the vice president and general manager of artificial intelligence solutions at Intel. Naveen’s fascination with computation in synthetic and neural systems began around age 9 when he began learning about circuits that store information and encountered the AI themes prevalent in sci-fi... Read More.

Chris Rawles
Chris Rawles (Pivotal)

Chris Rawles is a data scientist at Pivotal, where he works with customers across a variety of domains, building models to derive insight and business value from their data. Prior to joining Pivotal, Chris worked in both the oil and gas and alternative energy industries,... Read More.

Atish Ray
Atish Ray (Accenture)

Atish Ray has more than 15 years of technology and management experience in application architecture and delivery. He has broad experience in planning, estimation, design, integration and implementation of tiered web-centric applications. Based in the Washington DC metro area, he is the Data Engineering Lead... Read More.

Tom Reilly
Tom Reilly (Cloudera), @cloudera

Tom Reilly is the CEO of Cloudera. Tom has had a distinguished 30-year career in the enterprise software market. Previously, Tom was vice president and general manager of enterprise security at HP; CEO of enterprise security company ArcSight, where he led the company... Read More.

Dieter Reuther
Dieter Reuther (Team Dynamics Boston), @DieterReuther

Dieter Reuther is a leadership consultant who focuses on people, process, and technology. He helps organizations balance creative chaos with structure to bring out the best in teams and individuals. As a strong believer in the power positive leadership can have on people’s motivation, performance,... Read More.

Evan Richards (Uber)

Evan Richards is a member of the Hadoop Compute Platform team at Uber, where he works as the tech lead for the schemas and schema management projects. Previously, Evan interned with Zappos, helping monitor the migration of their catalog from their proprietary format and local... Read More.

Katrina Riehl
Katrina Riehl (Continuum Analytics)

Katrina Riehl is a senior data scientist at Continuum Analytics, where she leads the Memex team. Over the last decade, Katrina has worked extensively in the fields of scientific computing, machine learning, data mining, and visualization. Most notably, she worked at Enthought, the signal and... Read More.

Travis Ringger

Travis Ringger is a manager in PwC’s Risk & Compliance Systems and Analytics practice. Travis specializes in designing and delivering analytical solutions that provide better awareness and understanding of compliance and business risk, particularly solutions incorporating unstructured data and natural language processing. He has deep... Read More.

Cody Rioux
Cody Rioux (Netflix (Real-time Analytics))

Cody Rioux is a senior analytics engineer at Netflix working in the real-time analytics space to design fully autonomous systems that support availability and reliability in the Netflix cloud environment on Amazon Web Services. Cody is passionate about using stream processing, functional programming, and Bayesian... Read More.

Julia Rodriguez
Julia Rodriguez (Eagle Investment Systems), @juliargentinag

Julie Rodriguez is vice president of product management and user experience at Eagle Investment Systems. An experience designer focusing on user research, analysis, and design for complex systems, Julie has patented her work in data visualizations for MATLAB and publishes industry articles on user... Read More.

Monica Rogati
Monica Rogati (Data Natives), @mrogati

Monica Rogati is an independent data science executive and advisor who has built key data products and teams at Jawbone and LinkedIn; she now helps startups make the most out of their data. As the VP of data at Jawbone, Monica built Jawbone’s data science... Read More.

Bob Rogers

Bob Rogers is chief data scientist for big data solutions at Intel, where he applies his experience solving problems with big data and analytics to help Intel build world-class customer solutions. Prior to joining Intel, Bob was cofounder and chief scientist at Apixio, a big... Read More.

Brandon Rohrer
Brandon Rohrer (Microsoft)

Brandon Rohrer is a data scientist in Microsoft’s Azure Machine Learning group. He creates end-to-end data science solutions for external customers and supports the development of core algorithms and functionality in Azure ML. Brandon obtained his data science skills working in a variety of applications,... Read More.

Irene Ros
Irene Ros (Bocoup), @ireneros

Irene Ros is the director of data visualization at Bocoup and the program chair of OpenVis Conf, a two-day conference on data visualization on the open web. Irene is an information visualization researcher and developer, making engaging, informative, and interactive data-driven stories,... Read More.

Alan Ross
Alan Ross (Intel), @intel

Alan Ross is a senior principal engineer and chief cloud security architect at Intel. Alan has more than 20 years of information security experience in various capacities, from policy and awareness and security/risk analysis to engineering and architecture. Previously, Alan worked as a security administrator... Read More.

Laurel Ruma
Laurel Ruma (O'Reilly Media), @laurelatoreilly
Closing remarks Cultivate
Welcome Cultivate

Laurel Ruma is a content director at O’Reilly Media. Laurel has chaired a number of O’Reilly conferences and workshops, including Next:Economy, Cultivate, Where 2.0, OSCON Java, and Gov 2.0 Expo.

Sandy Ryza
Sandy Ryza (Clover Health), @s_ryz

Sandy Ryza is a senior data scientist at Clover Health. He was previously at Cloudera doing engineering and data science. He is an author of O’Reilly’s Advanced Analytics with Spark, as well as a Spark committer and member of the Hadoop project management committee. He... Read More.

Mohan Sadashiva
Mohan Sadashiva (Waterline Data)

Mohan Sadashiva is VP of products at Waterline Data, where he leverages his extensive experience in managing large-scale software products and cloud services to drive new innovations in big data. Previously, Mohan was the SVP of products and business development at Narus, a cybersecurity... Read More.

Neelesh Salian
Neelesh Salian (Stitch Fix), @NeelS7

Neelesh Srinivas Salian is a software engineer on the data platform team at Stitch Fix, where he works on the compute infrastructure used by the company’s data scientists. Previously, he worked at Cloudera, where he worked with Apache projects like YARN, Spark, and Kafka. ... Read More.

Chris Sanden

Chris Sanden is a senior analytics engineer at Netflix with a focus on real-time analytics and machine learning. He is part of the Insight Engineering team responsible for building systems that allow everyone at Netflix visibility into the state of the cloud environment. Chris is... Read More.

Majken Sander
Majken Sander (Majken Sander), @majsander

Majken Sander is a data nerd, business analyst, and solution architect. Previously, Majken worked with IT, management information, analytics, BI, and DW for 20+ years. Armed with strong analytical expertise, she’s keen on “data driven” as a business principle, data science, the IoT, and all... Read More.

Krishna Sankar
Krishna Sankar (U.S. Bank), @ksankar

Krishna Sankar is a distinguished engineer for artificial intelligence and machine learning at U.S. Bank, focusing on augmented intelligence, digital humans, and areas like AI explainability. Earlier stints include Senior Data Scientist with Volvo Cars, Chief Data Scientist at, Data Scientist/Tata America... Read More.

Kaz Sato

Kaz Sato is a staff developer advocate on the cloud platform team at Google, where he leads the developer advocacy team for machine learning and data analytics products such as TensorFlow, the Vision API, and BigQuery. Kaz has been leading and supporting developer communities... Read More.

Bill Schmarzo

William Schmarzo is the CTO of Dell EMC’s Big Data practice, where he is responsible for setting the strategy and defining the service line offerings and capabilities for the EMC Consulting Enterprise Information Management and Analytics service line. Bill has more than two... Read More.

Andreas Schmidt
Andreas Schmidt (Blue Yonder), @aschmidt_42

Andreas Schmidt is a product manager at Blue Yonder, a leading European company for predictive applications in retail. Previously, he was a senior data scientist there for several years, designing and implementing applications such as replenishment optimization for fresh and perishable goods. During his PhD... Read More.

Jim Scott
Jim Scott (NVIDIA), @kingmesal

Jim Scott is the head of developer relations, data science, at NVIDIA. He’s passionate about building combined big data and blockchain solutions. Over his career, Jim has held positions running operations, engineering, architecture, and QA teams in the financial services, regulatory, digital advertising, IoT,... Read More.

Kim Scott
Kim Scott (Radical Candor, Inc.), @kimballscott

Kim Malone Scott is an advisor at Dropbox, Kurbo, Qualtrics, Rolltape, Shyp, Twitter, and several Silicon Valley startups. Kim was a member of the faculty at Apple University and before that led AdSense, YouTube, and Doubleclick online sales and operations at Google. Known for her... Read More.

Partha Seetala (Robin Systems), @robinsystems

Currently the CTO of Robin Systems, Partha Seetala has more than 16 years of technology and product expertise. Previously, Partha was a distinguished engineer and senior director of engineering at Veritas, Symantec’s information management business, where he conceived, architected, and led engineering teams to... Read More.

Jonathan Seidman

Jonathan Seidman is a software engineer on the cloud team at Cloudera. Previously, he was a lead engineer on the big data team at Orbitz, helping to build out the Hadoop clusters supporting the data storage and analysis needs of one of the most heavily... Read More.

Debora Seys
Debora Seys (eBay)

Debora Seys works on delivering a trusted self-service data experience at eBay. She’s been helping users help themselves to find, use, and collaborate with information and data for 15+ years. Prior to her current role, Deb drove search and taxonomy technology capabilities at Kaiser Permanente... Read More.

Hiren Shah (Microsoft), @HirenShahTW

Hiren Shah is currently a principal program manager in Microsoft’s Cortana Analytics group, where he focuses on big data analytics and data science. Over the last seven years, Hiren has worked on a variety of big data technologies in Bing and Azure. Hiren has a... Read More.

Abin Shahab
Abin Shahab (Altiscale)

Abin Shahab is a senior software engineer at Altiscale as well as a contributor to Docker and LXC. Abin’s work at Altiscale is focused on multitenant Hadoop clusters using Docker containers. Prior to joining Altiscale, Abin worked on graph databases and search engines at Guidewire, Symantec, and Vivisimo... Read More.

Gwen Shapira
Gwen Shapira (Confluent), @gwenshap

Gwen Shapira is a system architect at Confluent, where she helps customers achieve success with their Apache Kafka implementations. She has 15 years of experience working with code and customers to build scalable data architectures, integrating relational and big data technologies. Gwen currently specializes in... Read More.

Ben Sharma

Ben Sharma is founder and CEO of Zaloni. Ben is a passionate technologist with experience in solutions architecture and service delivery of big data, analytics, and enterprise infrastructure solutions and expertise ranging from development to production deployment in a wide array of technologies, including... Read More.

Chang She
Chang She (Cloudera)

Chang She is a software engineer at Cloudera currently working on metadata management tools for Hadoop. Prior to joining Cloudera, Chang was cofounder and CTO of DataPad, a next-gen BI/analytics company. An early core contributor to the pandas library, Chang’s passion is creating data... Read More.

Jayant Shekhar
Jayant Shekhar (Sparkflows Inc.), @jshekhar

Jayant Shekhar is the founder of Sparkflows Inc., which enables machine learning on large datasets using Spark ML and intelligent workflows. Jayant focuses on Spark, streaming, and machine learning and is a contributor to Spark. Previously, Jayant was a principal solutions architect at Cloudera working... Read More.

Jeffrey Shmain
Jeffrey Shmain (Cloudera)

Jeff Shmain is a principal solutions architect at Cloudera. He has 16+ years of financial industry experience with a strong understanding of security trading, risk, and regulations. Over the last few years, Jeff has worked on various use-case implementations at 8 out of 10 of... Read More.

Alex Silva
Alex Silva (Pluralsight)

Alex Silva is a chief data architect at Pluralsight, where he leads the development of the company’s data infrastructure and services. He’s been instrumental in establishing Pluralsight’s data initiative by architecting a platform to capture valuable insights on real-time video analytics while integrating several data... Read More.

Jiri Simsa
Jiri Simsa (Alluxio), @jsimsa

Jiri Simsa is a software engineer at Alluxio, Inc., where he is one of the maintainers and top contributors of the Alluxio open source project. Before joining Alluxio, Inc., Jiri was a software engineer at Google, working on yet another distributed applications framework. He earned... Read More.

Sumeet Singh

Sumeet Singh is a senior director of product management for cloud and big data platforms at Yahoo. In his current role, he leads the Hadoop products team responsible for both Apache open source contributions and Yahoo projects. Sumeet is responsible for introducing several new multitenant... Read More.

Vartika Singh
Vartika Singh (Cloudera)

Vartika Singh is a solutions architect at Cloudera with over 12 years of experience applying machine learning techniques to big data problems.

Joseph Sirosh

Joseph Sirosh is the chief technology officer at Compass. Previously, he was the corporate vice president of the Cloud AI Platform at Microsoft, where he led the company’s enterprise AI strategy and products such as Azure Machine Learning, Azure Cognitive Services, Azure Search, and Bot... Read More.

Ram Shankar Siva Kumar
Ram Shankar Siva Kumar (Microsoft (Azure Security Data Science))

Ram Shankar is a security data wrangler in Azure Security Data Science, where he works on the intersection of ML and security. Ram’s work at Microsoft includes a slew of patents in the large intrusion detection space (called “fundamental and groundbreaking” by evaluators). In addition,... Read More.

Paul Soldera
Paul Soldera (Equation Research)

Paul Soldera is currently head of strategy at Equation Research, a full-service market-research company focused on helping clients design, execute, and internalize data and insights from customer- and consumer-focused surveys. Paul also acts as an advisor to other companies looking to grow internal research and... Read More.

Jean-Marc Spaggiari

Jean-Marc Spaggiari is a Cloudera senior solution architect with many years’ experience as a big data architect, specializing in HBase solutions. An active HBase contributor, Jean-Marc has contributed more than 50 patches to the community and participates in all release testing. Prior to Cloudera, Jean-Marc... Read More.

Benjamin Spivey
Benjamin Spivey (Cloudera)

Ben Spivey is a principal solutions architect at Cloudera providing consulting services for large financial-services customers. Ben specializes in Hadoop security and operations. He is the coauthor of Hadoop Security from O’Reilly Media (2015).

Vikram Sreekanti
Vikram Sreekanti (Berkeley AMP Lab), @viksree

Vikram Sreekanti is a software engineer working on research in the AMPLab at UC Berkeley. A graduate of Berkeley’s computer science department, he has served as a teaching assistant at Berkeley and an intern at Cloudera and Yammer.

Srikrishna Sridhar

Krishna Sridhar is a data scientist at Dato. He holds a PhD in computer science from the University of Wisconsin-Madison, where he worked on high-performance software for large-scale problems in mathematical optimization and data analysis. Krishna’s work has been used in applications such as healthcare,... Read More.

Vikram Srivastava
Vikram Srivastava (Cloudera)

Vikram Srivastava is the technical lead for Altus Workload Analytics and Cloudera Manager monitoring. Vikram holds an MS in electrical engineering from Stanford University and a bachelor’s degree in electronics and communications engineering from IIT Roorkee, India.

Jeremy Stanley

Jeremy is currently the VP of data science at Instacart, where he works closely with data scientists who are integrated into product teams to drive growth and profitability through logistics, catalog, search, consumer, shopper, and partner applications. Previously, Jeremy was chief data scientist and... Read More.

Sandy Steier
Sandy Steier (1010data)

Sandy Steier is the cofounder and CEO of 1010data. With more than a quarter century of industry experience, Sandy is recognized as an innovator behind the adoption of advanced analytic technologies by financial services institutions. Before cofounding 1010data, Sandy was a vice president and... Read More.

Louis Suarez-Potts
Louis Suarez-Potts (Age of Peers, Inc.), @luispo

As community strategist at Age of Peers, Louis Suarez-Potts strategizes the formation of and manages productive commons-based peer networks (open source communities). Louis helps communities consolidate good work, good connections, and good intentions into a force held in common, producing something all can look at... Read More.

Anand Subbaraj (Microsoft)

Anand Subbaraj is a principal program manager in the Microsoft Information Management & Machine Learning division. Anand has over 12 years of experience in the IT industry delivering products and services that solve challenging business problems and delight customers. Anand currently specializes in big data... Read More.

Brian Suda
Brian Suda, @briansuda

Brian Suda is a master informatician currently residing in Reykjavík, Iceland. Since first logging on in the mid-’90s, he has spent a good portion of each day connected to the internet. When he is not hacking on microformats or writing about web technologies, he enjoys... Read More.

Adam Sugano
Adam Sugano (Autodesk)

Adam Sugano serves as the head of predictive modeling and advanced analytics at Autodesk, where he leads a team of both internal and external data scientists charged with delivering innovative, actionable data-driven solutions that help empower Autodesk’s customer-retention and engagement-optimization efforts across the customer lifecycle.

Roshan Sumbaly

Roshan Sumbaly leads various computer vision efforts at Facebook AI. Previously, he led various teams at Coursera and LinkedIn, working on data products and infrastructure.

Chao Sun (Cloudera)

Chao Sun is currently a software engineer at Cloudera working on the RecordService project. Before that, Chao worked on the Hive on Spark project. He holds a PhD in computer science from the University of Wisconsin-Milwaukee, where he focused on type systems and programming languages.

Jagane Sundar
Jagane Sundar (WANdisco)

Jagane Sundar is the CTO at WANdisco. Jagane has extensive big data, cloud, virtualization, and networking experience. He joined WANdisco through its acquisition of AltoStor, a Hadoop-as-a-service platform company. Previously, Jagane was founder and CEO of AltoScale, a Hadoop- and HBase-as-a-platform company acquired... Read More.

David Taieb

David Taieb is the STSM for the Cloud Data Services Developer Advocacy team at IBM, where he leads a team of avid technologists with the mission of educating developers on the art of the possible with cloud technologies. He’s passionate about building open source... Read More.

Roopa Tangirala

Roopa Tangirala is an experienced engineering leader with an extensive background in databases, be they distributed or relational. She manages the database engineering team at Netflix responsible for operating cloud persistent and semipersistent run-time stores for Netflix, which includes Cassandra, Elasticsearch, and MySQL databases, by ensuring... Read More.

Piotr Teterwak

Piotr Teterwak works on the toolkit development team at Dato. He received a BA in computer science from Dartmouth College, where he conducted work exploring the learning of convolutional deep neural nets with applications in computer vision.

Arun Thangamani

Arun Thangamani is a software architect for CDK Global (formerly ADP Dealer Services), where he helped lay the foundation for the Open BI Platform (a big-data initiative), which provides integrated value to CDK Global customers. Before CDK, Arun spent about a... Read More.

Robin Thottungal
Robin Thottungal (US Environmental Protection Agency), @rathottungal

Robin Thottungal is the EPA’s first chief data scientist, focused on creating and implementing an agency-wide vision on analytics for effective data-driven decision making. Previously, at Deloitte Consulting, Robin provided clients with strategic advising on different aspects of creating a culture of using data within their... Read More.

Richard Tibbetts

Richard Tibbetts is currently a principal product manager at Tableau. He was founder and CEO of Empirical Systems (acquired by Tableau in 2018), an MIT spinout building an AI-based data platform that provided decision support to organizations that use structured data. Prior to Empirical,... Read More.

Kathleen Ting
Kathleen Ting (Cloudera)

Kathleen Ting is currently a technical account manager at Cloudera, where she helps strategic customers deploy and use the Hadoop ecosystem in production. Kathleen has spoken on Hadoop, ZooKeeper, and Sqoop at many big data conferences, including Hadoop World, ApacheCon, and OSCON. She’s contributed... Read More.

Sravya Tirukkovalur

Sravya Tirukkovalur is a software engineer at Cloudera focusing on Hadoop security, specifically working on authorization. Sravya is one of the core contributors of Apache Sentry. She is also a committer and a PPMC member of the project driving the Apache community. Sravya has... Read More.

Steven Totman
Steven Totman (Cloudera)

Steven Totman is the financial services industry lead for Cloudera’s Field Technology Office, where he helps companies monetize their big data assets using Cloudera’s Enterprise Data Hub. Prior to Cloudera, Steve ran strategy for a mainframe-to-Hadoop company and drove product strategy at IBM for... Read More.

Anh Trinh
Anh Trinh (Arimo, Inc.), @chickamade

Anh Trinh is a software architect at Arimo (née Adatao), where he coauthored three patent-pending inventions: the Distributed Data Framework for Data Analytics, Collaboration using Shared Documents for Processing Distributed Data, and Multi-language Support for Interfacing with Distributed Data. He is also a coauthor of... Read More.

Eric Tschetter

Eric Tschetter is the creator and one of the main contributors to Druid, an open source, real-time analytical data store. Eric is currently a distinguished engineer at Yahoo, where he works on speeding up analytics with a mix of data science and traditional BI. Eric... Read More.

Daniel Tunkelang

Daniel Tunkelang is a data science and engineering executive who has built and led some of the strongest teams in the software industry. He was a founding employee and chief scientist of Endeca, a search pioneer that Oracle acquired for $1.1B. He led a local... Read More.

Joseph Turian
Joseph Turian (Workday), @turian

Joseph Turian is currently a principal engineer at Workday. He headed the machine-learning consultancy MetaOptimize LLC and founded the startup UPSHOT (acquired by Workday), which allowed users to query enterprise data from a mobile device using natural language.

Joseph holds a PhD in... Read More.

Nick Turner
Nick Turner (Markerstudy)

Nick Turner has made a career in data that spans more than 25 years. Since 2013, Nick has led the Enterprise Data team at Markerstudy, where he oversees the award-winning Big Data Insights project and is responsible for the collection, analysis, and visualization of hundreds... Read More.

Kostas Tzoumas
Kostas Tzoumas (data Artisans), @kostas_tzoumas

Kostas Tzoumas is a PMC member of the Apache Flink project and cofounder of data Artisans, the company founded by the original development team that created Flink. Kostas has spoken extensively about Flink, including at Hadoop Summit San Jose 2015.

Alexander Ulanov
Alexander Ulanov (Hewlett Packard Labs)

Alexander Ulanov is a senior researcher at Hewlett Packard Labs, where he focuses his research on machine learning on a large scale. Currently, Alexander works on deep learning and graphical models. He has made several contributions to Apache Spark; in particular, he implemented the multilayer... Read More.

Amy Unruh
Amy Unruh (Google), @amygdala

Amy Unruh is a developer programs engineer for the Google Cloud Platform, where she focuses on machine learning and data analytics, as well as other Cloud Platform technologies. Amy has an academic background in CS/AI, and she’s worked at several startups as well as... Read More.

Matthew Van Adelsberg

Matt van Adelsberg is chief data scientist at CACI, where he is responsible for managing the development of advanced, scalable solutions to complex data-analytics problems from small to big data regimes. Matt’s data science team provides end-to-end solutions to support customers throughout the commercial... Read More.

Bryan Van de Ven
Bryan Van de Ven (Continuum Analytics), @ContinuumIO

Bryan Van de Ven is a software engineer at Continuum Analytics. Previously, Bryan worked at the Applied Research Labs, developing software for sonar feature detection and classification systems on US Naval submarine platforms, and Enthought, where he worked on problems in financial risk modeling and... Read More.

Jake VanderPlas
Jake VanderPlas (eScience Institute, University of Washington)

Jake VanderPlas is the director of research in the physical sciences at the University of Washington’s eScience Institute, where his research is primarily in the area of data-driven astronomy and astrophysics. In addition, Jake is a maintainer and/or frequent contributor to many open source Python... Read More.

Krishnan Venkata
Krishnan Venkata (LatentView Analytics), @latentview

Krishnan Venkata is the director for the US West Coast at LatentView Analytics, where he’s responsible for sales leadership and relationship management for LatentView’s clients, especially in the technology sector. Krishnan has over 11 years of experience in global IT services delivery in the US,... Read More.

Mythili Venkatakrishnan

Mythili Venkatakrishnan is an IBM senior technical staff member and is the z Systems architecture and technology lead. Mythili has been with IBM for 25 years, all in the mainframe environment working with clients in various capacities. Her focus areas have been diverse... Read More.

Pratik Verma
Pratik Verma (BlueTalon), @pratverm

Pratik Verma is the founder and chief product officer at BlueTalon. Pratik founded BlueTalon to accelerate big data deployments and remove security as a barrier to adoption. Previously, he led AgeTak, a healthcare startup built on technologies created by Rakesh Verma. He is an angel... Read More.

Amit Walia
Amit Walia (Informatica)

Amit Walia is the executive vice president and chief product officer at Informatica, where he is responsible for product development, product management, product marketing, and engineering. Previously, Amit was the senior vice president and general manager for Informatica’s Data Integration and Data Security business unit.... Read More.

Laura Waller
Laura Waller (UC Berkeley), @optrickster

Laura Waller is an assistant professor at UC Berkeley in the Department of Electrical Engineering and Computer Sciences (EECS) and a senior fellow at the Berkeley Institute of Data Science (BIDS), with affiliations in Bioengineering and Applied Sciences & Technology. Previously, Laura was... Read More.

Dean Wampler

Dean Wampler is an expert in streaming data systems, focusing on applications of machine learning and artificial intelligence (ML/AI). He’s head of developer relations at Anyscale, which is developing Ray for distributed Python, primarily for ML/AI. Previously, he was an engineering VP at... Read More.

Guozhang Wang

Guozhang is an engineer at Confluent, building a stream data platform on top of Apache Kafka. Prior to Confluent, Guozhang was a senior software engineer at LinkedIn, developing and maintaining its backbone streaming infrastructure on Apache Kafka and Apache Samza. He holds a PhD... Read More.

Haojun Wang
Haojun Wang (Baidu)

Haojun Wang is a tech lead on Baidu’s US autonomous driving car team. Currently, Haojun is driving the in-car computing platform and offline data platform. Prior to Baidu, he worked at the IBM Silicon Valley Lab, focusing on database core development and big data... Read More.

Wei Wang
Wei Wang (Hortonworks)

Wei Wang is the senior director of product marketing at Hortonworks, where she serves as the primary leadership force behind strategic marketing execution, with a focus on boosting Hortonworks Data Platform market expansion and revenue generation globally. Wei is an accomplished international marketing executive with... Read More.

Daniel Weeks
Daniel Weeks (Netflix)

Daniel Weeks manages the big data compute team at Netflix and is a Parquet committer. Previously, Daniel focused on research in big data solutions and distributed systems.

Director managing development of global integrated marketing solutions, processes, and technologies for Dell marketing units. Marketing thought leader for researching emerging technology, solutions, and business development opportunities across worldwide groups.

Dave Wells
Dave Wells (Paxata)

Dave Wells is actively involved in information management and business management, especially at their intersection. Dave is a consultant and educator dedicated to building meaningful connections throughout the path from data to business value. Knowledge sharing and skills development are Dave’s passions, carried out through... Read More.

Mike Wendt

Mike Wendt is an engineering manager in the AI Infrastructure Group at NVIDIA. His research work has focused on leveraging GPUs for big data analytics, data visualizations, and stream processing. Previously, Mike led engineering work on big data technologies like Hadoop, Datastax Cassandra, Storm,... Read More.

Timoni West
Timoni West (Unity Labs), @timoni

Timoni West leads design for Unity Labs, focusing on new game development and creation tools in VR. Previously, Timoni was SVP of design at Alphaworks, a new startup helping to democratize small business funding, and cofounder and creative director of Recollect, a social backup... Read More.

Edd Wilder-James

Edd Wilder-James is a strategist at Google, where he is helping build a strong and vital open source community around TensorFlow. A technology analyst, writer, and entrepreneur based in California, Edd previously helped transform businesses with data as vice president of strategy for... Read More.

Cack Wilhelm
Cack Wilhelm (Scale Venture Partners)

Cack Wilhelm is a principal at Scale Venture Partners, where she focuses on investments in early-stage software companies, with an eye toward those helping businesses better utilize data, automate workflows, incorporate AI, and build more resilient software. Looking further ahead, Cack is watching closely as... Read More.

Roseanne Wincek
Roseanne Wincek (Institutional Venture Partners)

Roseanne Wincek joined IVP in March 2015. She focuses on investing in later-stage, high-growth consumer and enterprise companies. Roseanne actively works with IVP portfolio companies Compass and PopSugar. Roseanne was previously a principal at Canaan Partners, a leading early-stage venture firm, where she... Read More.

Christina Wodtke
Christina Wodtke (Wodtke Consulting), @cwodtke

Christina Wodtke has led redesigns and initial product offerings for such companies as LinkedIn, Myspace, Zynga, Yahoo, Hot Studio, and eGreetings. Christina has founded two consulting startups, a product startup, and Boxes and Arrows, an online magazine of design. She also cofounded the Information Architecture... Read More.

Kristi Wolff
Kristi Wolff (Kelley Drye & Warren LLP)

Kristi Wolff is special counsel in Kelley Drye’s Washington, DC, office. Kristi’s practice focuses on food, dietary supplements, medical devices, and emerging health/wearable technology and privacy issues. Kristi has extensive experience advising clients whose products are within the overlapping jurisdictions of the Food and Drug... Read More.

Steve Wooledge (MapR Technologies)

Steve Wooledge is vice president of product marketing at MapR, where he is responsible for communicating the business value and technical advantages of MapR innovations and solutions for Hadoop. Steve was previously vice president of marketing for Teradata Unified Data Architecture, where he drove big... Read More.

Kristi Woolsey

Kristine Woolsey is the practice lead for creative environments at MAYA, a design and technology innovation consultancy. Kristi is well known as a behavioral strategist with years of speaking and research on the impact that the physical environment has on human behavior. She joined... Read More.

Ian Wrigley
Ian Wrigley (StreamSets), @iwrigley

Ian Wrigley is a Technical Director at StreamSets, the company behind the industry’s first data operations platform. Over his 25-year career, Ian has taught tens of thousands of students subjects ranging from C programming to Hadoop development and administration.

Jennifer Wu
Jennifer Wu (Cloudera)

Jennifer Wu is director of product management for cloud at Cloudera, where she focuses on cloud services and data engineering. Previously, Jennifer worked as a product line manager at VMware, working on the vSphere and Photon system management platforms.

Yinglian Xie
Yinglian Xie (DataVisor), @datavisor

Yinglian Xie is the CEO and cofounder of DataVisor, a startup in the area of big data analytics for security. Yinglian has been working in the area of internet security and privacy for over 10 years and has helped improve the security of billions... Read More.

Reynold Xin
Reynold Xin (Databricks)

Reynold Xin is a cofounder and chief architect at Databricks as well as an Apache Spark PMC member and release manager for Spark’s 2.0 release. Prior to Databricks, Reynold was pursuing a PhD at the UC Berkeley AMPLab, where he worked on large-scale data... Read More.

Caiming Xiong
Caiming Xiong (Metamind)

Caiming Xiong is a senior researcher at Metamind. Before that, he was a postdoctoral researcher in the Department of Statistics at the University of California, Los Angeles. Caiming holds a PhD in computer science and engineering from SUNY Buffalo and a BS and MS... Read More.

Fangjin Yang
Fangjin Yang (Imply)

Fangjin Yang is a coauthor of the open source Druid project and a cofounder of Imply, a data analytics startup based in San Francisco. Previously, Fangjin held senior engineering positions at Metamarkets and Cisco Systems. Fangjin has a BASc in electrical engineering and an MASc... Read More.

Chuck Yarbrough
Chuck Yarbrough (Pentaho)

Chuck Yarbrough is the senior director of solutions marketing and management at Pentaho, a leading big data analytics company that helps organizations engineer big data connections, blend data, and report and visualize all of their data. Chuck is responsible for creating and driving Pentaho solutions... Read More.

Martin Yip
Martin Yip (VMware)

Martin Yip is a product line marketing manager for VMware’s Cloud Platform business unit, where he oversees product marketing for a portfolio of products including vSphere, vSphere with Operations Management, and Big Data. Martin has been in the high technology industry for over 10 years... Read More.

Michael Yoder
Michael Yoder (Cloudera)

Mike Yoder is a software engineer at Cloudera who has worked on a variety of Hadoop security features and internal security initiatives. Most recently, he implemented log redaction and the encryption of sensitive configuration values in Cloudera Manager. Prior to Cloudera, he was a security... Read More.

Jin Zhang
Jin Zhang (CA Technologies), @jinz1

Jin Zhang is a passionate technology leader who is currently leading analytics at CA Technologies. Previously, Jin led Apigee to its IPO as their VP of engineering and was an engineering executive with IBM, where she was responsible for managing large teams as... Read More.

Owen Zhang
Owen Zhang (DataRobot)

Owen Zhang is the chief product officer at DataRobot. Owen spent most of his career in the property and casualty insurance industry. Most recently Owen served as vice president of modeling of the newly formed AIG Science team.

After spending several years in IT... Read More.

Tiger Zhang
Tiger Zhang (LinkedIn)

Yongzheng Zhang is a senior manager of data mining at LinkedIn and an active researcher and practitioner of text mining and machine learning. He’s developed many practical and scalable solutions for utilizing unstructured data for ecommerce and social networking applications, including search, merchandising, social commerce,... Read More.

Weidong Zhang
Weidong Zhang (LinkedIn)

Weidong Zhang is an engineering manager on the Data Analytics Infrastructure team at LinkedIn and leads the marketing and customer-service data warehouse vertical. Weidong has a passion for analytics, research, and data-driven decision making. He spent 10+ years in the data warehouse ETL and... Read More.

Alice Zheng

Alice Zheng is a senior manager of applied science on the machine learning optimization team on Amazon’s advertising platform. She specializes in research and development of machine learning methods, tools, and applications. She’s the author of Feature Engineering for Machine Learning. Previously, Alice has worked... Read More.

Wei Zheng
Wei Zheng (Trifacta)

As vice president of products at Trifacta, Wei Zheng combines her passion for technology with experience in enterprise software to define and shape Trifacta’s product offerings. Having founded several startups of her own, Wei believes strongly in innovative technology that solves real-world business problems. Most... Read More.

Shivon Zilis
Shivon Zilis (Bloomberg Beta), @shivon

Shivon Zilis is a venture capitalist and founding member of Bloomberg Beta, where she focuses on early-stage data and machine-intelligence investments. Shivon has led 12 investments since launch. One, Newsle, was acquired by LinkedIn; others include Context Relevant, Alation, and InfluxDB. She recently released a... Read More.

Nina Zumel
Nina Zumel (Win-Vector LLC)
R Day (Full Day) Tutorial

Nina Zumel is cofounder and principal at Win-Vector LLC, a data science consultancy based in San Francisco. She frequently writes and speaks on statistics and machine learning. She is also the coauthor of the popular book Practical Data Science with R (Manning 2014).