Presented By O'Reilly and Cloudera
Make Data Work
March 28–29, 2016: Training
March 29–31, 2016: Conference
San Jose, CA

Strata + Hadoop World 2016 Speakers

New speakers are added regularly. Please check back to see the latest updates to the agenda.

Jose Abelenda (Hotwire)

Jose Abelenda is the director of marketing analytics at Hotwire. Prior to that, he worked as a data scientist at PayPal.

Lior Abraham (Interana)

Lior Abraham is a cofounder of Interana, Inc. Lior was instrumental in scaling Facebook’s infrastructure from a few million users to over a billion, as well as bringing its most technically challenging products to market. This included building many of the backend systems that powered... Read More.

Armando Acosta

Armando Acosta has been in the IT industry for 14 years. With experience ranging from sales to product marketing and management to developing big data solutions, recently Armando has been focusing on the hyperscale market in server product management, hardware design, and big data solutions.... Read More.

Joseph Adler (Confluent), @jadler

Joseph Adler has many years of experience in data mining and data analysis at companies including DoubleClick, Verisign, and LinkedIn. Currently, he is director of product management and data science at Confluent. He is the holder of several patents for computer security and cryptography and... Read More.

Nidhi Aggarwal (Tamr, Inc.)

Nidhi Aggarwal leads strategy and marketing at Tamr. Prior to joining Tamr, Nidhi founded Cloud vLab, makers of qwikLAB, a software-learning platform used to create and deploy on-demand lab environments. In the years before Cloud vLab, Nidhi worked at McKinsey & Company, advising Fortune... Read More.

Sara Ahmadian (Seamless Planet), @saraahmadian

Sara Ahmadian is Seamless Planet’s globetrotting CEO and the relentless catalyst for Seamless Planet’s great journey. Sara has several years of experience developing large-scale business infrastructures at successful B2B startups in Silicon Valley. She was invited by President Obama to participate in the 2015... Read More.

John Akred (Silicon Valley Data Science), @BigDataAnalysis

With over 15 years in advanced analytical applications and architecture, John Akred is dedicated to helping organizations become more data driven. As CTO of Silicon Valley Data Science, John combines deep expertise in analytics and data science with business acumen and dynamic engineering leadership.... Read More.

T.J. Alumbaugh (Continuum Analytics), @talumbau

T.J. Alumbaugh is a developer at Continuum Analytics. He likes array-oriented computing, Python, and C++.

Franz Aman (Informatica), @franzaman

Franz Aman is senior vice president of brand and demand at Informatica, where he is responsible for branding, global demand generation, marketing operations, content, and digital marketing. Previously, Franz held numerous executive positions within industry-leading technology companies, including SAP, BusinessObjects, BEA Systems,... Read More.

Xavier Amatriain

Xavier Amatriain is VP of engineering at Quora, where he leads the team building the best source of knowledge on the Internet. With over 50 publications in different fields, Xavier is best known for his work on machine learning in general and recommender systems in... Read More.

Jeremy Anderson is a UX design lead at the Spark Technology Center in San Francisco, where he focuses on designing better experiences for the data community. Previously, Jeremy ran a small design consultancy in San Francisco. He has worked with both seasoned industry leaders as... Read More.

Jesse Anderson

Jesse Anderson is a data engineer, creative engineer, and CEO of Smoking Hand.

He trains companies ranging from startups to the Fortune 100 on big data. This includes training on cutting-edge technology like Apache Kafka, Apache Hadoop, and Apache Spark. He... Read More.

Erik Andrejko (The Climate Corporation)

Erik Andrejko leads the data science and research organization at the Climate Corporation, which applies large-scale statistical machine learning and data science to solve challenging problems in numerous domains such as climatology, agronomic modeling, and geospatial applications. Erik’s contributions to the Climate Corporation include defining... Read More.

Bruce Andrews (US Department of Commerce), @DepSecAndrews

Bruce Andrews was confirmed as the deputy secretary of commerce on July 24, 2014, after being named acting deputy secretary of commerce by President Obama and Secretary Penny Pritzker on June 9, 2014. Previously, Bruce served as chief of staff to the secretary at the... Read More.

Ian Andrews

Ian Andrews is VP of products at Pivotal, where he is responsible for product strategy and marketing for Pivotal Cloud Foundry, Spring, and Big Data Suite. Prior to Pivotal, Ian was involved with market-defining startups such as Netscape, Opsware, and Aster Data.

Michael Armbrust

Michael Armbrust is the lead developer of the Spark SQL project at Databricks. Michael’s interests broadly include distributed systems, large-scale structured storage, and query optimization. Michael has a PhD from UC Berkeley. His thesis focused on building systems that allow developers to rapidly build... Read More.

Robert Bagley (ClickFox), @ClickFox

Robert Bagley is the vice president of analytics science at ClickFox, where he oversees the innovative application of data science practices for client engagements and product features, charting future analytic and technology strategies. Prior to his current role, he held various leadership positions at ClickFox... Read More.

Brandon Ballinger

Brandon Ballinger is a cofounder at Cardiogram. Previously a cofounder at Sift Science and an engineer at Google on speech recognition and ads quality, Brandon was also called in by the White House to help fix Healthcare.gov. He graduated from the University of Washington with... Read More.

Vishal Bamba (Transamerica), @vishalbamba

Vishal Bamba is vice president of strategy and architecture at Transamerica Technology, where he leads a team focusing on innovation initiatives within the enterprise. He has over 15 years of experience in distributed systems and has led many innovation projects. He has consulted and worked... Read More.

Nenshad Bardoliwalla

Nenshad Bardoliwalla is an executive and thought leader with a proven track record of success leading product strategy, product management, and development in business analytics. Nenshad is the founding vice president of products at Paxata, where he is responsible for product strategy, product management, and... Read More.

Paul Barth (Podium Data), @PodiumData

Paul Barth is founder and CEO of Podium Data, creator of the industry-leading Podium data lake software platform, which is redefining enterprise data management. He has spent decades developing advanced data and analytics solutions for Fortune 100 companies and is a recognized thought leader... Read More.

Pierre Barthelemy (Coursera)

Pierre Thomas Barthelemy is the engineering lead of the Data Infrastructure team at Coursera. The team is responsible for introducing core data systems (e.g., data warehouse using Redshift, ETL using Data Pipeline and Scalding), while also helping build products that create a developer-friendly ecosystem... Read More.

Joel Baxter (BlueData)

Joel Baxter is an engineer at BlueData, where he focuses on virtualization, containers, and Hadoop-related technologies to build an infrastructure platform for big data analytics. His background is in the provisioning and configuration of virtual compute, storage, and networking to serve the needs of application... Read More.

Maxime Beauchemin

Maxime Beauchemin recently joined Airbnb as a data engineer developing tools to help streamline and automate data-engineering processes. He mastered his data-warehousing fundamentals at Ubisoft and was an early adopter of Hadoop/Pig while at Yahoo in 2007. More recently, at Facebook, he developed analytics-as-a-service... Read More.

Marie Beaugureau (O'Reilly Media, Inc.)

Marie Beaugureau is the lead data editor for O’Reilly Media.

Alexander Behm (Cloudera)

Alex Behm is a software engineer at Cloudera, working on the Impala team. He holds a PhD in computer science from UC Irvine.

John Belchamber (Telefónica)

John Belchamber is global head of business intelligence for the Advanced Analytics team at Telefónica. John has 10 years of experience in marketing and 10 more in the telecom industry, where he held strategic roles in innovation and business intelligence. Recognized as Data Professional of... Read More.

Tim Berglund

Tim Berglund is a teacher, author, and technology leader with DataStax. He has spoken at numerous conferences internationally and in the United States and contributes to the Denver tech community as president of the Denver Open Source User Group. He is the copresenter of various... Read More.

Kristina Bergman (Ignition Partners)

Lucy Bernholz (Stanford University), @p2173

Lucy Bernholz is a senior research scholar at Stanford University, where she runs the Digital Civil Society Lab. Lucy blogs about philanthropy, nonprofits, and technology at Philanthropy2173.com.

Christopher Berry (Canadian Broadcasting Corporation)

Christopher Berry is a data scientist at the Canadian Broadcasting Corporation and the founder of Authintic (acquired by 500px). Christopher has implemented breakthrough social-analytics programs for AB-Inbev, Research In Motion, and Coca-Cola. He participated in ecommerce redesigns at Gucci and Dell, mobile integrations for Best... Read More.

John Berryman (Eventbrite), @JnBrymn

John Berryman’s first career was as an aerospace engineer, but after several years in aerospace, he found that he most loved his job when he was either programming or working on a good math problem. Eventually, John cut out the aircraft and satellites and started... Read More.

David Beyer (Amplify Partners), @dbeyer123

David Beyer is currently an investor with Amplify Partners, a $50M VC firm focused exclusively on early-stage IT infrastructure and data companies. David began his career in technology as the cofounder and CEO of Chartio.com, a pioneering provider of cloud-based data visualization and analytics.... Read More.

Milind Bhandarkar (Ampool, Inc.), @techmilind

Milind Bhandarkar was the founding member of the team at Yahoo that took Apache Hadoop from a 20-node prototype to a data center-scale production system and has been contributing to and working with Hadoop since version 0.1.0. Milind started the Yahoo Grid solutions team focused on training, consulting,... Read More.

Anurag Bhardwaj (Quad Analytix)

Anurag Bhardwaj currently leads data science efforts at Quad Analytix, where he focuses on large-scale product classification, large-scale smart extraction, and various other machine-learning techniques. Previously, he worked on image understanding at eBay Research Labs. Anurag received his PhD and MS from the State University... Read More.

Lori Bieda (Bank of Montreal), @loribieda

Lori Bieda is VP of business analytics and insights at the Bank of Montreal. Lori has 20 years of analytics, marketing, technology, and leadership experience across the financial services, insurance, telecommunications, technology, retail, publishing, and marketing service provider sectors. Lori’s proven successes include leading all... Read More.

Keith Bigelow (3D Robotics), @keithbigelow

Keith Bigelow leads the commercial cloud and drone division at 3DR. Prior to 3DR, Keith was the GM and SVP of the Analytics Cloud, the fastest-growing product line in Salesforce history. Previously, Keith held executive positions at SAP, where he served as... Read More.

Sarah Bird (Continuum Analytics)

Joerg Blumtritt (Datarella), @jbenno

Joerg Blumtritt is the founder and CEO of Datarella, a computational social science startup delivering mobile analytics, self-tracking solutions, and data science consulting. After graduating from university with a thesis on machine learning, Joerg worked as a researcher in behavioral sciences, focused on nonverbal... Read More.

Farrah Bostic (The Difference Engine), @farrahbostic

Farrah Bostic created the Difference Engine based on her belief that deep understanding of customer needs is essential to growing businesses through great products and services. Farrah has honed her customer-centric insights as an advisor to some of the world’s most respected brands, including Apple,... Read More.

Roni Burd (Microsoft)

Yaron “Roni” Burd is a principal program manager on the Big Data team at Microsoft working on Hadoop and Azure Data Lake, where he focuses on making machine learning with big data scalable and easy. Roni has spent eight years helping Microsoft build its internal... Read More.

Mark Burnette (Pentaho)

Mark Burnette is an enterprise sales engineer at Pentaho, where he partners with large organizations to develop successful pilots and proofs of concept for big data solutions and embedded analytics using Pentaho, Hadoop, and other technologies. Prior to Pentaho, Mark ran an IT consulting practice... Read More.

Matt Butner (Stride Health), @butner

Matt Butner is CTO and cofounder of Stride Health, which delivers intelligent healthcare, coverage, and tax compliance to self-employed and independent working Americans. Stride’s suite of benefits for independents is directly integrated into the largest on-demand marketplaces, including Uber, Postmates, and TaskRabbit. Backed by... Read More.

Mike Cafarella (University of Michigan), @MikeCafarella

Mike Cafarella is one of the cofounders of the Apache Hadoop and Nutch open source projects. Mike is also an assistant professor of computer science and engineering at the University of Michigan. His research interests include databases, information extraction, data integration, and data mining. Recently,... Read More.

JR Cahill (Kellogg)

JR Cahill leads the Enterprise Analytics Architecture team at Kellogg, supporting the Global Analytics team that consists of Data Science, Advanced Analytics, Data Services, Visualizations and Reporting. JR has 20 years of operational development and architecture experience in data warehousing and analytics. He is also... Read More.

Arturo Canales (Telefónica)

Arturo Canales leads the analytics team in the Global BI & Big Data unit at Telefónica. Arturo has been involved in the creation of many data products for internal BI teams across all the countries where Telefónica operates, from a social-network-analysis-approach product to better understand... Read More.

Arno Candel

Arno Candel is the chief architect at H2O, a distributed and scalable open source machine-learning platform. Arno is also the main author of H2O’s Deep Learning. Before joining H2O, Arno was a founding senior MTS at Skytree, where he designed and implemented high-performance machine-learning... Read More.

John Canny (UC Berkeley)

John F. Canny is a computer scientist and the Paul and Stacy Jacobs Distinguished Professor of Engineering in the Computer Science Department of the University of California, Berkeley. John has made significant contributions in various areas of computer science and mathematics, including artificial intelligence, robotics,... Read More.

Matt Cardillo (FINRA)

Matt Cardillo is a senior director of FINRA technology. Matt is an avid Scrum evangelist at FINRA and exercises it in the delivery of highly usable, innovative big data analytic solutions.

Amber Case

Amber Case is the director of Esri’s R&D Center in Portland, where she works on open source developer tools and next-generation location-based technology. Previously, Amber was the CEO and cofounder of Geoloqi, a location-based software company acquired by Esri in 2012. She is... Read More.

Michele Chambers (Continuum Analytics), @mcAnalytics

An entrepreneurial executive with over 25 years of industry experience, Michele Chambers is currently CMO of Continuum Analytics. Prior to Continuum Analytics, Michele held executive leadership roles at database and analytic companies Netezza, IBM, Revolution Analytics, MemSQL, and RapidMiner. In her career, Michele... Read More.

Evan Chan (Tuplejump), @evanfchan

Evan Chan is a distinguished software engineer at Tuplejump. Evan loves to design, build, and improve bleeding-edge distributed data and backend systems using the latest open source technologies. He has led the design and implementation of multiple big data platforms based on Storm, Spark, Kafka,... Read More.

Vinoth Chandar

Vinoth Chandar is currently working on bringing Hadoop and Spark to Uber. In the past, Vinoth was the LinkedIn lead on Voldemort. He has worked on Oracle server’s replication engine, HPC, and stream processing.

Manjeet Chayel (Amazon Web Services)

Manjeet Chayel is a specialist SA for AWS working on big data technology solutions. Manjeet focuses on Amazon EMR and helps customers solve their big data problems using the right techniques and tools for the job.

Ewen Cheslack-Postava

Ewen Cheslack-Postava is an engineer at Confluent building a stream data platform based on Apache Kafka to help organizations reliably and robustly capture and leverage all their real-time data. Ewen received his PhD from Stanford University, where he developed Sirikata, an open source system for... Read More.

Adam Cheyer

Adam Cheyer is cofounder and VP of engineering at Viv. Previously, Adam was cofounder and VP of engineering at Siri. He’s also served as program director in SRI’s Artificial Intelligence Center and chief architect of the CALO/PAL project. A pioneer in the areas... Read More.

Trina Chiasson (Tableau Software), @trinachi

Trina Chiasson lives at the intersection of data, design, and code. Trina is a senior product manager at Tableau Software, where she enjoys helping people see and understand data. Previously, she was the cofounder and CEO of Infoactive, a web app for turning live... Read More.

Mok Choe (TD Bank Group)

Mok Choe is an accomplished technologist whose career spans a diverse group of financial services businesses and successful Internet companies. Mok is a proven transformational leader, with extensive experience leading enterprise architecture at firms including TD Bank Group, Commonwealth Bank of Australia, Union Bank of... Read More.

Kelvin Chu (Uber)

Kelvin Chu is a founding member of the Hadoop team at Uber, where he creates tools and services on top of Spark to support multitenancy and large-scale computation-intensive applications. Kelvin is the creator and lead engineer of the Spark Uber development kit, Paricon, SparkPlug, and... Read More.

Brian Clapper (Databricks)

Brian Clapper is a senior instructor and curriculum developer at Databricks. Brian has more than 32 years’ experience as a software developer. Brian has worked for a stock exchange, the US Navy, a large software company, several startups and small companies, and, most recently, as... Read More.

Brian Clark (Objectivity)

Brian Clark is VP of product management at Objectivity. Brian has nearly 30 years of software and technology experience and was one of the early architects of Objectivity/DB. Before joining Objectivity, Brian worked at Automation Technology Products, providing leading tools in the MCAD market.... Read More.

Christopher Colburn

Christopher Colburn is just another data scientist at Netflix.

Eric Colson (Stitch Fix)

Eric Colson is chief algorithms officer at Stitch Fix as well as an advisor to several big data startups. Previously, Eric was VP of data science and engineering at Netflix. He has a BA in economics from San Francisco State University, an MS in information... Read More.

Michael Conover

Mike Conover builds machine-learning technologies that leverage the behavior and relationships of hundreds of millions of people. A staff data scientist at LinkedIn, Mike has a PhD in complex systems analysis with a focus on information propagation in large-scale social networks. His work has appeared... Read More.

James Crawford (Orbital Insight), @orbital_insight

James Crawford has two decades of experience leading innovative software projects, including empowering farmers with climate data at the Climate Corporation, working to put a commercial robot on the moon at Moon Express, making the world’s books searchable at Google, and managing robotics at NASA’s... Read More.

Charlie Crocker (Autodesk)

Charlie Crocker is a data geek with 20 years of experience bringing data out of the shadows to drive business value and optimize operational costs. At Autodesk, he is currently working across divisions to identify and validate potential reliable data sources and access mechanisms, while... Read More.

Alistair Croll (Solve For Interesting), @acroll

Alistair Croll is an entrepreneur with a background in web performance, analytics, cloud computing, and business strategy. In 2001, he cofounded Coradiant (acquired by BMC in 2011) and has since helped launch Rednod, CloudOps, Bitcurrent, Year One Labs, and several other early-stage companies. He works... Read More.

Nick Curcuru (MasterCard Advisors)

Nick Curcuru has been delivering analytics solutions for nearly 20 years in operations and consulting. He is currently principal of the big data analytics practice at MasterCard Advisors, where he works with the executive suite cascading to the operational level to enable data-driven strategy for... Read More.

Doug Cutting (Cloudera), @cutting

Doug Cutting is the founder of numerous successful open source projects, including Lucene, Nutch, Avro, and Hadoop. Doug joined Cloudera in 2009 from Yahoo, where he was a key member of the team that built and deployed a production Hadoop storage-and-analysis cluster for mission-critical business... Read More.

Michelangelo D'Agostino

Michelangelo D’Agostino is the director of data science R&D at Civis Analytics, where he leads a team that develops statistical models and writes software to help companies and nonprofits leverage their data. As a reformed particle physicist turned data scientist, Michelangelo loves mungeable datasets, machine... Read More.

Timothy Danford (Tamr, Inc.)

Timothy Danford is a computer scientist working on advanced automation approaches to big data variety in the pharmaceutical and healthcare industries. Previously, Timothy was a software architect, engineer, and founding team member for Genome Bridge LLC, a Broad Institute subsidiary organized to develop... Read More.

Tathagata Das (Databricks)

Tathagata Das is an Apache Spark committer and a member of the PMC. He is the lead developer behind Spark Streaming, which he started while a PhD student in the UC Berkeley AMPLab, and is currently employed at Databricks. Prior to Databricks, Tathagata worked... Read More.

Sudipto Dasgupta (Infosys Limited)

Sudipto Shankar Dasgupta is an AVP and head of engineering for the Platforms group at Infosys Ltd., where he works on big data and analytics platform development for large enterprises. Prior to that, he was chief architect with SAP, working on SAP... Read More.

Michael Dauber (Amplify), @dauber

Prior to joining Amplify as a general partner, Mike Dauber spent over six years at Battery Ventures, where he led early-stage enterprise investments on the West Coast. Most recently, Mike sat on the boards of Continuuity, Duetto, Interana, and Platfora. Mike also led Battery’s investment... Read More.

Bolke de Bruin is putting advanced analytics in the heart of the wholesale business line of European bank ING.

Donna Denio (Team Dynamics Boston)

Donna Denio is a communications and business development specialist who is passionate about teamwork and generating productive relationships. Donna has over 20 years’ experience helping leaders of multinational companies identify and secure new business opportunities in design and construction. Ten years ago, Donna’s search for... Read More.

Anthony Dina (Dell)

With seventeen years in the IT industry, Anthony Dina serves as the director of enterprise technologists at Dell, Inc., where he leads the team of solutions architects with expertise in big data and application acceleration to work with customers on how to transform IT into... Read More.

Renee DiResta

Renee DiResta is the VP of business development at Haven, a private marketplace for booking ocean freight shipments. Prior to Haven, Renee was a principal at seed-stage VC fund O’Reilly AlphaTech Ventures (OATV). She also spent seven years as a trader at Jane Street... Read More.

Scott Donaldson

Scott Donaldson is senior director for Market Regulation Technology at FINRA. Scott leads the data and analytics teams responsible for the surveillance of US equities and fixed-income markets.

Mark Donsky (Cloudera)

Mark Donsky leads data management solutions at Cloudera. Prior to Cloudera, Mark was at Silver Spring Networks, where he managed big data analytics solutions that reduced greenhouse gas emissions by millions of dollars annually. He has a BS with honors in computer science from the... Read More.

Scott Draves (Two Sigma Open Source), @BeakerNotebook

Scott Draves is an award-winning software artist, VJ, and pioneer of the open source movement. His clients and exhibitions range from the likes of MoMA.org, LACMA, Google, and the Adler Planetarium to Skrillex. He has a PhD in computer science from Carnegie Mellon University... Read More.

Chris DuBois

Chris DuBois is a data scientist focused on building tools for other data scientists. At Dato, Chris has helped design and implement tools for creating recommendation systems and for large-scale text analysis. His current work makes it simpler to train models that generalize well. After... Read More.

Ted Dunning (MapR Technologies), @ted_dunning

Ted Dunning has been involved with a number of startups, the latest being MapR Technologies, where he is chief application architect working on advanced Hadoop-related technologies. Ted is also a PMC member for the Apache ZooKeeper and Mahout projects and contributed to the Mahout clustering,... Read More.

Don Bosco Durai (Hortonworks, Inc)

Don Bosco Durai is an Apache committer and currently working as a security architect at Hortonworks, focused on enabling enterprise-grade security within the Hadoop platform. Bosco brings years of experience building and managing enterprise data security products. Before Hortonworks, Bosco was the cofounder and chief... Read More.

Glynn Durham (Cloudera)

Glynn Durham is a senior instructor at Cloudera who has also worked at least five years each at Oracle, Forté Software, and MySQL. Glynn loves data and is fascinated by the notion of search as “the other NoSQL data store.”

Gary Dusbabek (Silicon Valley Data Science)

An Apache Cassandra committer and PMC member, Gary Dusbabek is a lifelong programmer specializing in distributed systems. His past experience includes working with large-scale text and image indexes in the newspaper industry and building a multi-data-center distributed metrics and monitoring system for a large... Read More.

Joey Echeverria

Joey Echeverria is the director of engineering at Rocana, where he builds applications for scaling IT operations built on the Apache Hadoop platform. Joey is a committer on the Kite SDK, an Apache-licensed data API for the Hadoop ecosystem. Joey was previously a... Read More.

Helena Edelson

Helena Edelson has been a software engineer for over 15 years and is currently VP of product engineering at Tuplejump. After a decade in distributed messaging engineering, Helena moved exclusively to working with Scala, first for cloud infrastructure automation then big data, all for large-scale... Read More.

Alyosha Efros (UC Berkeley), @UCBerkeley

Alexei (Alyosha) Efros joined UC Berkeley in 2013 as associate professor of electrical engineering and computer science. Prior to that, Alyosha spent nine years on the faculty of Carnegie Mellon University. He has also been affiliated with École Normale Supérieure/INRIA and the University of... Read More.

Jana Eggers (Nara Logics), @jeggers

Jana Eggers is a tech executive focused on products and the messages surrounding them. Jana has started and grown companies and led large organizations within even bigger companies. She supports, subscribes to, and contributes to customer-inspired innovation, systems thinking, lean analytics, and Autonomy/Mastery/Purpose-style leadership. Jana’s... Read More.

Stephen Elston (Quantia Analytics, LLC)
R Day (Full Day) Tutorial

Stephen Elston is an experienced big data geek, data scientist, and software business leader. Stephen is managing director at Quantia Analytics, LLC, where he leads the building of new business lines, manages P&L, and takes software products from concept and financing through development, intellectual... Read More.

Bin Fan (Alluxio)

Bin Fan is a software engineer at Alluxio. Bin is one of the top committers on the Alluxio project. Prior to Alluxio, Bin worked at Google building next-generation storage infrastructure, where he won Google’s Technical Infrastructure award. Bin has a PhD in computer science from... Read More.

Moty Fania (Intel)

Moty Fania is a principal engineer for big data analytics at Intel IT, where he drives the overall technology and architectural roadmap and owns development and architecture. Moty has over 13 years of experience in BI, data warehousing, and decision-support solutions. He holds a bachelor’s... Read More.

Faisal Farooq (IBM Watson Health)

Faisal Farooq is currently the principal scientist in the Watson Health group of IBM Watson, where he works on next-generation healthcare software to improve patient care. Faisal is an expert in applying machine learning in the healthcare domain, and his general areas of interest... Read More.

Sameer Farooqui

Sameer Farooqui is a client services engineer at Databricks, where he works with customers on Apache Spark deployments. Sameer works with the Hadoop ecosystem, Cassandra, Couchbase, and the general NoSQL domain. Prior to Databricks, he worked as a freelance big data consultant and trainer globally and... Read More.

Camille Fournier (Rent the Runway), @skamille

Camille Fournier is the former head of engineering at Rent the Runway. She was previously a vice president at Goldman Sachs. Camille is an Apache ZooKeeper committer and PMC member and a Dropwizard framework PMC member.

Michael Franklin (AMPLab/UC Berkeley), @franklinmj

Michael Franklin is the Thomas M. Siebel Professor of Computer Science at UC Berkeley and the director of the AMPLab. The AMPLab, which received an NSF CISE Expeditions in Computing award announced as part of the White House Big Data Research Initiative in... Read More.

Eric Frenkiel

Eric Frenkiel is the cofounder and CEO of MemSQL, an in-memory distributed database that combines real-time and historical big data analytics. MemSQL is a Y Combinator company that has raised more than $45M in venture capital. Prior to MemSQL, Eric worked at Facebook on... Read More.

Julia Galef (Center for Applied Rationality), @juliagalef

Julia Galef cofounded the Center for Applied Rationality (CFAR), a nonprofit devoted to developing cognitive science-based strategies for reasoning and decision making. In addition to research, CFAR runs workshops for companies and talented individuals who want to use rationality to address global problems.... Read More.

Tanya Gallagher
Tanya Gallagher (DataStax)

Tanya Gallagher is a veteran technical instructor with thousands of hours of classroom experience across a 20-year career. Tanya has spent the past two years at DataStax writing curriculum and leading the curriculum development team. Prior to DataStax, she was a curriculum developer and technical... Read More.

Ilya Ganelin
Ilya Ganelin (Capital One Data Innovation Lab)

Ilya Ganelin is a roboticist turned data engineer. After a few years building self-discovering robots at the University of Michigan and another few years working on embedded DSP software with cell phones and radios at Boeing, he landed in the world of big data... Read More.

Siddha Ganju
Siddha Ganju (Carnegie Mellon University), @SiddhaGanju

Siddha Ganju is a master’s student of computational data science at Carnegie Mellon University and was a 2015 summer openlab intern at CERN. She has implemented several projects at the junction of machine learning, natural language processing, and information retrieval, and her research also... Read More.

Yael Garten
Yael Garten (LinkedIn)

Yael Garten leads a team of data scientists at LinkedIn that focuses on understanding and increasing growth and engagement of LinkedIn’s 350 million members across mobile and desktop consumer products. Yael is an expert at converting data into actionable product and business insights that impact... Read More.

Deepak Gattala is a big data architect in IT project management at Dell.

Matthew Gee
Matthew Gee (Impact Lab/University of Chicago)

Matthew Gee is cofounder and principal at the Impact Lab, a data-analytics company focused exclusively on developing scalable data science solutions to social-sector problems. He is also a senior research scientist at the University of Chicago’s Center for Data Science and Public Policy and a... Read More.

Lise Getoor
Lise Getoor (University of California, Santa Cruz)

Lise Getoor is a professor in the Computer Science Department at the University of California, Santa Cruz. Her research areas include machine learning, data integration, and reasoning under uncertainty, with an emphasis on graph and network data. Lise was recently recognized as one of the... Read More.

Charles Givre
Charles Givre (Booz | Allen | Hamilton), @cgivre

Charles Givre is an unapologetic data geek who is passionate about helping others learn about data science and become passionate about it themselves. For the last five years, Charles has worked as a data scientist at Booz Allen Hamilton for various government clients and has... Read More.

Colette Glaeser
Colette Glaeser (Silicon Valley Data Science), @ColetteGlaeser

Colette Glaeser is a principal data strategist at Silicon Valley Data Science. With a proven track record in applying analytics to provide a competitive advantage, Colette brings over 20 years of experience in driving business development, customer insight, operational analysis, and continuous process improvement across... Read More.

Dennis Gleeson
Dennis Gleeson (1010data)

Dennis Gleeson is the chief evangelist at 1010data. Prior to joining 1010data, Dennis was a director of strategy in the Central Intelligence Agency (CIA)’s Directorate of Analysis. He began his career with the CIA in 2002 as a political analyst. In 2009, he... Read More.

Scott Gnau
Scott Gnau (Hortonworks)

Scott Gnau has spent his entire career in the data industry, most recently as president of Teradata Labs, where he provided visionary direction for research, development, and sales support activities related to Teradata integrated data warehousing, big data analytics, and associated solutions. He also drove... Read More.

Joe Goldberg
Joe Goldberg (BMC Software Inc.), @GoldbergJoe

Joe Goldberg is lead solutions marketing manager at BMC. An IT professional with more than 35 years of experience in the design, development, implementation, sales, and marketing of enterprise solutions to Global 2000 organizations, Joe has been active in helping BMC products leverage... Read More.

Kevin Goode
Kevin Goode (Inmar)

Kevin Goode is the director of platform engineering at Inmar. Kevin has 20 years of IT experience, 19 of which have been SQL Server focused, starting with version 6.5. For the past four years, he has been focused on big data, Hadoop, and NoSQL.... Read More.

Alex Gorelik
Alex Gorelik (Waterline Data), @gorelikalex

Alex Gorelik is the founder and CEO of Waterline Data, a startup focused on enhancing the value of Hadoop through data self-service and governance. Alex is a serial entrepreneur and innovator who has spent over 25 years inventing and bringing to market cutting-edge data-oriented... Read More.

Jonathan Gosier
Jonathan Gosier (AuDigent), @jongos

Jon Gosier is a serial tech entrepreneur and venture capitalist working at the intersection of data science and design. Based in Philadelphia, Jon is also the cofounder of Predictive Pop (aka PredPop), a data company changing the way the music industry monitors and monetizes music. Prior... Read More.

Alexander Gray
Alexander Gray (Skytree, Inc.), @skytreeHQ

Alexander Gray is an associate professor at Georgia Tech and the CEO of Skytree, Inc. His research focuses on scaling up all of the major practical methods of machine learning (ML) to massive datasets. Alex began working on this problem at NASA in... Read More.

Dave Gray
Dave Gray (XPLANE), @davegray

Dave Gray is the founder and chairman of XPLANE, the visual thinking company. Founded in 1993, XPLANE has grown to be the world’s leading consulting and design firm focused on information-driven communications. Dave spends his time researching and writing about visual business, as... Read More.

Garrett Grolemund
Garrett Grolemund (RStudio, Inc.)
R Day (Full Day) Tutorial

Garrett Grolemund is the editor-in-chief of Shiny.rstudio.com, the development center for the Shiny R package, and a data scientist and chief instructor for RStudio, Inc. Garrett is a longtime user and advocate of R; he wrote the popular lubridate package for working with dates... Read More.

Robert Grossman
Robert Grossman (University of Chicago)

Robert Grossman is a faculty member and the chief research informatics officer in the Biological Sciences Division of the University of Chicago. Robert is the director of the Center for Data Intensive Science (CDIS) and a senior fellow at both the Computation Institute... Read More.

Mark Grover

Mark Grover is a software engineer working on Apache Spark at Cloudera. Mark is a committer on Apache Bigtop and a committer and PMC member on Apache Sentry. He has contributed to a number of open source projects including Apache Hadoop, Apache Hive, Apache... Read More.

Carlos Guestrin
Carlos Guestrin (Dato Inc.), @guestrin

Carlos Guestrin is the Amazon Professor of Machine Learning in Computer Science & Engineering at the University of Washington and the cofounder and CEO of Dato. Carlos also coteaches the Machine Learning Specialization through UW and Coursera. His previous positions include the Finmeccanica Associate... Read More.

Kanu Gulati
Kanu Gulati (Zetta Venture Partners)

Kanu Gulati is a senior associate at Zetta Venture Partners. Kanu has over 10 years of operating experience as an engineer, scientist, and strategist. She owned Intel’s multicore CAD algorithms research roadmap, developed advanced parallel CAD solutions, and pioneered metrics-driven methodology improvements for... Read More.

Sijie Guo
Sijie Guo (Twitter), @sijieg

Sijie Guo is a staff software engineer at Twitter, where he is tech lead of the DistributedLog project and the PMC chair of Apache BookKeeper.

Vida Ha
Vida Ha (Databricks), @femineer

Vida Ha is currently a solutions engineer at Databricks. Previously, she worked on scaling Square’s reporting analytics system. Vida first began working with distributed computing at Google, where she improved search rankings of mobile-specific web content and built and tuned language models for speech recognition... Read More.

Patrick Hall
Patrick Hall (SAS)

Patrick Hall is a senior staff scientist at SAS and an adjunct professor in the Department of Decision Sciences at George Washington University. Patrick designs new data-mining and machine-learning technologies. He is the 11th person worldwide to become a Cloudera certified data scientist. Patrick... Read More.

Jordan Hambleton
Jordan Hambleton (Cloudera, Inc.)

Jordan Hambleton is a solutions architect for Cloudera, based in the San Francisco office. While at Cloudera, his focus has been partnering with customers to build and manage scalable enterprise products on the Hadoop stack. Prior to Cloudera, Jordan was a member of technical staff... Read More.

Bob Hansen
Bob Hansen (HPE)

Bob Hansen, the engineer in charge of making Vertica a vibrant part of the greater Hadoop ecosystem, turns customers’ needs into new features, making Vertica a peaceful island floating in the center of your data lake. Over his entire career, Bob has been dedicated to... Read More.

Moritz Hardt
Moritz Hardt (Google), @mrtz

Moritz Hardt is a senior research scientist at Google Research, where his mission is to build the theory and tools that make machine learning more reliable. After obtaining a PhD in computer science from Princeton University, Moritz spent three years at IBM Research Almaden... Read More.

Todd Harple

Todd Harple is an experience engineer at Intel, where he has worked since 2005. Todd has conducted global ethnographic and design research and presently he leads strategic innovation and pathfinding activities within Intel’s New Devices Group. Over the past three years, his focus has increasingly... Read More.

Derrick Harris

Derrick Harris works for datacenter software startup Mesosphere. He was previously a technology journalist, most notably covering cloud computing, big data, and other emerging IT trends for Gigaom since 2009. There’s a strong possibility that Derrick has written the words “cloud” and “Hadoop” more than... Read More.

Kate Heddleston
Kate Heddleston (Kate Heddleston LLC), @heddle317

Kate Heddleston is a software engineer who focuses on using open source tools to build web applications, with a particular interest in the portions of the product that interface with the user. When she’s not programming, Kate is involved with organizations like Hackbright Academy, PyLadies,... Read More.

Joe Hellerstein

Joseph M. Hellerstein is the Jim Gray Chair of Computer Science at UC Berkeley and cofounder and CSO at Trifacta. Joe’s work focuses on data-centric systems and the way they drive computing. He is an ACM fellow, an Alfred P. Sloan fellow, and... Read More.

Hylke Hendriksen

Hylke Hendriksen is a data scientist at ING. Hylke studied computer science at Delft University of Technology. After demonstrating his graduate thesis project to the ING Wholesale Banking Advanced Analytics team on real-time anomalous click path detection, Hylke is now implementing this in... Read More.

Bill Hinderman
Bill Hinderman (Expedia, Inc.), @billHinderman

Bill Hinderman is the engineering manager for air site optimization at Expedia, and was the senior site optimization UI engineer at Orbitz Worldwide. In human terms: he built the A/B testing development practice from the ground up. He and his team focus on experimenting and... Read More.

Allen Hoem
Allen Hoem (Teradata)

Throughout his eight-year tenure in the advanced electronics industry, Allen Hoem has focused on process optimization and product development. At Roku Inc., Allen streamlined the firmware deployment model for the New Products, Television division. Prior to that, Allen was the development lead and process coach for developing... Read More.

Joshua Hoffman
Joshua Hoffman (Zymergen)

Joshua Hoffman is the CEO of Zymergen. Prior to Zymergen, Josh was a partner at Norcob Capital and before that a managing director in merchant banking at Rothschild, where he was a member of the Management Committee. He began his career at McKinsey &... Read More.

Jeff Holoman

Jeff Holoman is a systems engineer at Cloudera. Jeff is a Kafka contributor and has focused on helping customers with large-scale Hadoop deployments, primarily in financial services. Prior to his time at Cloudera, Jeff worked as an application developer, system administrator, and Oracle technology specialist.... Read More.

Jeremy Howard

Jeremy Howard is a serial entrepreneur, business strategist, developer, and educator. He is the CEO of Enlitic, a startup he founded to use recent advances in machine learning to transform the practice of medicine and bring modern medical diagnostics to billions of people in... Read More.

Johnson Hsieh
Johnson Hsieh (Cardiogram)

Johnson Hsieh is a cofounder at Cardiogram, where he is applying deep learning to medicine. Previously, he was a software engineer at Google, building user models (e.g., user interests) to improve cross-product personalization and recommendation using various ML techniques. He also worked on the Google Voice Assistant (a.k.a. “Ok... Read More.

John Hugg
John Hugg (VoltDB), @johnhugg

John Hugg has spent his entire career working with databases and information management. In 2008, John was lured away from a PhD program by Mike Stonebraker to work on what became VoltDB. As the first engineer on the product, he liaised with a team of... Read More.

Leah Hunter
Leah Hunter (Tech Journalist), @leahthehunter

Leah Hunter writes about the human side of tech for Fast Company and is also a writer for the German magazine Business Punk. Formerly an editor at MISC magazine and AVP of innovation at Idea Couture, Leah has spent her career exploring the... Read More.

Alysa Z. Hutnik
Alysa Z. Hutnik (Kelley Drye & Warren LLP)

Alysa Z. Hutnik is a partner in the Advertising & Marketing and Privacy & Information Security practices at Kelley Drye & Warren LLP in Washington, DC. Her practice represents clients in all forms of consumer-protection matters, from counseling to defending regulatory investigations and litigation.... Read More.

Tim Hwang
Tim Hwang (ROFLCon / The Web Ecology Project), @timhwang

Tim Hwang is a lawyer and researcher focusing on the intersection of intelligent agents and society, currently at the Intelligence and Autonomy project at Data & Society in New York. He has formerly served in research roles with the Stanford Center for Legal Informatics, the... Read More.

Noah Iliinsky
Noah Iliinsky (Amazon Web Services), @noahi

Noah Iliinsky is a senior UX architect with Amazon Web Services. Noah strongly believes in the power of intentionally crafted communication and has spent the last decade researching, writing, and speaking about best practices for designing visualizations, informed by his graduate work in user experience... Read More.

Mario Inchiosa
Mario Inchiosa (Microsoft)

Mario Inchiosa’s passion for data science and high-performance computing drives his work at Microsoft, where he focuses on delivering parallelized, scalable advanced analytics integrated with the R language. Previously, Mario served as Revolution Analytics’s chief scientist and as analytics architect in IBM’s Big Data organization,... Read More.

Alex Ingerman
Alex Ingerman (Amazon Web Services)

Alex Ingerman leads the product management team for Amazon Machine Learning. He joined Amazon in 2012 after working on products including web-scale search, content recommendation systems, immersive data-exploration environments, and enterprise email and content servers. Alex holds a BS in computer science and an MS... Read More.

Marco Ippolito
Marco Ippolito (CGG GeoSoftware)

Marco M. Ippolito is the data model architect for the France-based geophysical services company CGG, Inc., a fully integrated geoscience company providing leading geological, geophysical, and reservoir capabilities to a broad base of customers primarily from the global oil and gas industry. Since joining... Read More.

Sreeni Iyer
Sreeni Iyer (quadanalytix), @av0gadr0

Sreeni Iyer is CTO, CIO, and cofounder of Quad Analytix, a big data company in the ecommerce vertical. Sreeni is focused on machine learning, big data in batch and quasi real time, and insightful visualizations. Sreeni’s previous positions include director of architecture for... Read More.

Mridul Jain
Mridul Jain (Yahoo)

Mridul Jain is a senior principal architect for Yahoo’s monitoring platform. He has been using Storm and Kafka to solve various real-time problems at Yahoo for almost three years. Mridul is also the author of Pig on Storm. His interests are mostly in the area... Read More.

Rohit Jain
Rohit Jain (Esgyn)

Rohit Jain is the CTO at Esgyn for Trafodion, a transactional SQL-on-HBase RDBMS. Rohit worked for Hewlett-Packard for 28 years on applications and databases, undertaking such roles as solutions architect, consultant, software engineer, architect, development and QA manager, product manager, and chief... Read More.

Jeroen Janssens
Jeroen Janssens (Tilburg University), @jeroenhjanssens

Jeroen Janssens is an assistant professor of data science at Tilburg University. As an independent consultant and trainer, Jeroen helps organizations make sense of their data. Previously, he was a data scientist at Elsevier in Amsterdam and the startups YPlan and Visual Revenue in New... Read More.

Calvin Jia (Alluxio), @JiaCalvin

Calvin Jia is a software engineer at Tachyon Nexus and a top contributor to Tachyon.

Aaron Kalb
Aaron Kalb (Alation)

Aaron Kalb has spent his career crafting and empowering delightful human-computer interactions, especially through natural language interfaces. Aaron currently leads the design team and guides the product vision at Alation, after leaving Stanford with a BS and an MS in symbolic systems and working at... Read More.

David Kale
David Kale (University of Southern California)

Dave Kale is a PhD student in computer science and an Alfred E. Mann Innovation in Engineering fellow at the University of Southern California. His research uses machine learning to extract insight from digital data in high-impact domains, including, but not limited to, health care.... Read More.

Holden Karau

Holden Karau is a software development engineer at IBM and is active in open source. Prior to IBM, she worked on a variety of big data, search, and classification problems at Alpine, Databricks, Google, Foursquare, and Amazon. Holden is the author of Learning... Read More.

Aneesh Karve
Aneesh Karve (Quilt Data, Inc)

Aneesh Karve is the CTO of Quilt Data, a social database platform. Aneesh has shipped products to millions of users around the globe as product manager and interaction designer at companies including Microsoft, NVIDIA, and Matterport. Aneesh’s academic background spans proteomics, machine learning,... Read More.

Mubashir Kazia (Cloudera)

Mubashir Kazia is a solutions architect at Cloudera focusing on security. Mubashir started the initiative integrating Cloudera Manager with Active Directory for kerberizing the cluster and provided sample code. Mubashir has also contributed patches to Apache Hive that fixed security-related issues.

Brian Kent
Brian Kent (Dato)

Brian Kent earned a PhD in statistics from Carnegie Mellon University. His research focused on clustering methods and topological data analysis, with applications in neuroimaging and genetics.

Paul Kent

Paul Kent is vice president of big data initiatives at SAS. He divides his time between customers, partners, and the Research & Development teams discussing, evangelizing, and developing software at the confluence of big data and high performance computing. Paul was previously vice president... Read More.

Grega Kespret
Grega Kespret (Celtra Inc.), @gregakespret

Grega Kešpret is the director of engineering for analytics at Celtra, where he builds analytics pipelines and optimization systems. Grega also leads teams of engineers and data scientists in San Francisco and Ljubljana working on Celtra’s analytics platform. Prior to Celtra, Grega worked at... Read More.

Amandeep Khurana
Amandeep Khurana (Cloudera)

Amandeep Khurana is a solutions architect at Cloudera, where he’s involved in the entire lifecycle of Hadoop adoption for customers from use-case discovery to taking systems to production. Amandeep is also a coauthor of HBase In Action, a book geared toward building applications using HBase.... Read More.

Spencer Kimball
Spencer Kimball (Cockroach Labs), @cockroachdb

Spencer Kimball is the cofounder and CEO of Cockroach Labs, where he maintains a delicate balance between a love for programming distributed systems and the excitement of helping the company grow smoothly. He cut his teeth on databases during the dot-com heyday and had... Read More.

Jonathan King
Jonathan King (Ericsson), @jhk24

Jonathan H. King is the Head of Cloud Strategy for Ericsson. He was previously Head of Cloud Strategy and business development for CenturyLink Technology Solutions. Prior to that, Jonathan was SVP of WW business development at Joyent, an innovative cloud-computing company based in San... Read More.

Adam Kocoloski

Adam is an IBM Distinguished Engineer and CTO of the Cloud Data Services group. He joined IBM in 2014 via the acquisition of Cloudant, where he built a highly available, scalable database and drove the development of the systems required to offer... Read More.

Benedikt Koehler

Benedikt Koehler studied sociology, anthropology, and psychology in Munich, where he received his PhD in 2006. After founding a mobile-Web startup in the late 1990s, he worked as a consultant for Internet and media companies. In 2008, Benedikt cofounded the Social Media Association, the first... Read More.

Marcel Kornacker
Marcel Kornacker (Cloudera)

Marcel Kornacker is a tech lead at Cloudera and the architect of Apache Impala (incubating). Marcel has held engineering jobs at a few database-related startup companies and at Google, where he worked on several ad-serving and storage infrastructure projects. His last engagement was as the... Read More.

Jay Kreps
Jay Kreps (Confluent)

Jay Kreps is the cofounder and CEO of Confluent, a company focused on Apache Kafka. Previously, Jay was one of the primary architects for LinkedIn, where he focused on data infrastructure and data-driven products. He was among the original authors of a number of... Read More.

Balaji Krishna has been with SAP for over 16 years, with customer-facing experience as support consultant, RIG, solution management, and currently product management. He has been a trusted advisor to customers in architecting and implementing the best end-to-end EDW and analytics solutions.... Read More.

Balaji Krishnapuram (IBM Watson Health)

Balaji Krishnapuram is responsible for analytics at IBM Watson Health, where he currently leads the development of two products and a cloud-based analytics platform for healthcare. Previously, Balaji led teams that launched seven commercially successful products using machine learning over the last 10 years... Read More.

Chi-Yi Kuan
Chi-Yi Kuan (LinkedIn)

Chi-Yi Kuan is director of business analytics at LinkedIn. He has over 15 years of extensive experience in applying big data analytics, business intelligence, risk and fraud management, data science, and marketing mix modeling across various business domains (social network, ecommerce, SaaS, and consulting) at... Read More.

Scott Kurth
Scott Kurth (Silicon Valley Data Science)

Scott Kurth is the vice president of advisory services at Silicon Valley Data Science, where he helps clients define and execute the strategies and data architectures that enable differentiated business growth. Building on 20 years of experience making emerging technologies relevant to enterprises, he has... Read More.

Yann Landrin
Yann Landrin (Autodesk)

Yann Landrin is a data scientist and data engineer with over 15 years of personalization and big data experience. He has worked on all aspects of big data, from large-scale machine learning to infrastructure optimization. At Autodesk, he is working on the next-generation data platform,... Read More.

Costin Leau
Costin Leau (Elastic), @costinl

Costin Leau is an engineer at Elasticsearch, where he leads big data efforts. An open source veteran, Costin led various Spring projects (Spring OSGi, GemFire, Redis, Hadoop) and authored an OSGi spec. He has spoken about Java, big data, and Elasticsearch-related topics at a number... Read More.

Alex Leblang
Alex Leblang (Cloudera)

Alex Leblang is currently an engineer at Cloudera on the RecordService team. Previously, Alex was an Apache Impala (incubating) engineer and interned at Vertica. He holds a bachelor’s degree from Brown University with concentrations in computer science and Latin American studies.

Erin Ledell
Erin Ledell (H2O.ai), @ledell

Erin Ledell is a statistician and machine-learning scientist at H2O.ai. Erin is the main author of H2O Ensemble. Before joining H2O, she was the principal data scientist at Wise.io and Marvin Mobile Security (acquired by Veracode in 2012) and the founder of DataScientific, Inc. Erin... Read More.

Bob Levy
Bob Levy (Virtual Cove), @VirtualCove

Bob Levy is CEO of Virtual Cove, focused on emerging technologies for improving human data processing potential. He has over two decades of executive, product, marketing, and R&D experience with firms including IBM, MathWorks, Hancock Software, Harte Hanks, and Rational Software. Bob was... Read More.

Linus Liang
Linus Liang (Embrace)

Linus Liang is a serial entrepreneur with expertise in technology, medical devices, and social enterprises. He most recently cofounded Embrace, a social enterprise that develops and distributes low-cost infant incubators to developing countries. Unlike traditional incubators that cost up to $20,000, the Embrace Infant... Read More.

Todd Lipcon
Todd Lipcon (Cloudera), @tlipcon

Todd Lipcon is an engineer at Cloudera, where he primarily contributes to open source distributed systems in the Apache Hadoop ecosystem. He is a committer and a PMC member on the Apache Hadoop, HBase, and Thrift projects. Prior to Cloudera, Todd worked on web... Read More.

Zachary Lipton
Zachary Lipton (University of California, San Diego)

Zack Lipton is a graduate student in the Artificial Intelligence Group at the University of California, San Diego. He works on the theory and application of machine learning, particularly deep learning and multilabel classification, and develops algorithms to exploit sparsity, enabling the efficient training of... Read More.

Darren Lo
Darren Lo (Cloudera)

Darren Lo is currently a lead engineer on Cloudera Manager. He previously worked on the Model Repository Server at Informatica.

Bill Loconzolo

Bill Loconzolo is the vice president of Intuit’s Data Engineering and Analytics team, where he leads the development of Intuit’s central big data platform, which leverages the power of the collective data of 45 million Intuit customers. The platform creates unique data-driven insights and product... Read More.

David Loftesness
David Loftesness (On sabbatical, most recently at Twitter), @dloft
Scaling teams Cultivate

David Loftesness has been a software engineer and manager at a range of tech companies, including Amazon, Twitter, Xmarks, and Geoworks, each with its own unique strengths and challenges. David is currently taking time to share what he’s learned through talks and blog posts before... Read More.

Michael Lopp
Michael Lopp (Rands), @rands

Michael Lopp is a Silicon Valley-based engineering leader who builds both teams and software at companies such as Borland, Netscape, Palantir, and Apple. Michael has written two books. His first book, Managing Humans, a popular guide to the art of engineering leadership, clearly explains that... Read More.

Ben Lorica
Ben Lorica (O'Reilly Media), @bigdata

Ben Lorica is the chief data scientist at O’Reilly Media, Inc. Ben has applied business intelligence, data mining, machine learning, and statistical analysis in a variety of settings including direct marketing, consumer and market research, targeted advertising, text mining, and financial engineering. His background includes... Read More.

Michael Ludden is an IBMer in developer relations at Watson. Previously, Michael was developer marketing manager lead at Google, head of developer marketing at Samsung, a developer evangelist at HTC, and global director of developer relations at startups Quixey and Nexmo and was involved... Read More.

Roger Magoulas
Roger Magoulas (O'Reilly Media), @rogerm

Roger Magoulas is the research director at O’Reilly Media and chair of the Strata + Hadoop World conferences. Roger and his team build the analysis infrastructure and provide analytic services and insights on technology-adoption trends to business decision makers at O’Reilly and beyond. He and... Read More.

Seshadri Mahalingam

Seshadri Mahalingam is a software engineer at Trifacta, where, in addition to building out Wrangle, Trifacta’s domain-specific language for expressing data transformation, he develops the low-latency compute framework that powers Trifacta’s fluid and immersive data wrangling experience. Seshadri holds a BS in EECS from... Read More.

Ted Malaska
Ted Malaska (Cloudera)

Ted Malaska is a solutions architect at Cloudera. Ted has 18 years of professional experience working for startups, the US government, some of the world’s largest banks, commercial firms, bio firms, retail firms, hardware appliance firms, and the largest nonprofit financial regulator in the US... Read More.

James Malone
James Malone (Google)

James Malone is a product manager for Google Cloud Platform.

Vikash Mansinghka

Vikash Mansinghka is a research scientist at MIT, where he leads the Probabilistic Computing Project. Vikash holds BS degrees in mathematics and computer science from MIT, as well as an MEng in computer science and a PhD in computation. His PhD dissertation on... Read More.

Keith Manthey

Keith is the CTO of Analytics for EMC’s Emerging Technology Division. He brings more than 24 years of Identity Fraud Analytics experience, alternative and traditional data architectures experience, and Financial Systems and Analytics experience. Keith is an advisory board member of the University of... Read More.

Sanjay Mathur
Sanjay Mathur (Silicon Valley Data Science)

As the CEO of Silicon Valley Data Science, Sanjay Mathur has brought together a team of world-class data scientists and engineers to help companies become more data driven. Previously, Sanjay was SVP of product management for LiveOps, where he was responsible for LiveOps’s... Read More.

Drew Mattison
Drew Mattison (XPLANE)

Drew Mattison is a connector, facilitator, communicator, rationalizer, strategist, and advocate who helps clients get things done. For the last 20 years, he has worked where business and strategy intersect with design and communications. Drew is responsible for ensuring XPLANE teams exceed expectations and... Read More.

Patrick McFadin

Patrick McFadin is one of the leading experts in Apache Cassandra and data-modeling techniques. As a consultant and the chief evangelist for Apache Cassandra at DataStax, Patrick has helped build some of the largest and most exciting deployments in production. Prior to DataStax, he was... Read More.

Pat McGarry
Pat McGarry (Ryft)

Pat McGarry brings extensive technology and leadership experience in hardware and software engineering to his role as vice president of engineering at Ryft. Pat joined Ryft from Ixia Communications, where he was responsible for federal security systems engineering programs. During his tenure at Ixia and... Read More.

Emma McGrattan
Emma McGrattan (Actian)

Emma McGrattan is SVP of engineering at Actian, where she leads the Actian Vector, Actian Vector Hadoop Edition, and Actian Matrix development teams. A leading authority in DBMS technologies, Emma has over 20 years’ experience managing, supporting, and developing a variety of databases,... Read More.

Denise McInerney

Denise McInerney is a data professional with over 16 years of experience. Denise began her career as a database administrator, managing and developing databases for online transactional systems. She now works as a data architect at Intuit, where she designs and implements BI and analytics... Read More.

Wes McKinney
Wes McKinney (Cloudera), @wesmckinn

Wes McKinney is a software engineer at Cloudera. He is the creator of Python’s pandas library and the Ibis project, as well as the author of Python for Data Analysis. Previously, Wes was the founder and CEO of DataPad.

Eric McNulty
Eric McNulty (Eric J McNulty.com), @richerearth

Eric McNulty helps leaders and organizations create long-term value and increase their positive impact on the full range of stakeholders. Eric is a writer, speaker and conversation catalyst, teacher, and consultant. A fox, not a hedgehog, he draws inspiration from nature, the complex environments of... Read More.

Stephen Merity
Stephen Merity (MetaMind), @smerity

Stephen Merity is a senior software engineer at MetaMind, where he works on researching and implementing deep learning models for vision and text, with a focus on memory networks and neural attention mechanisms for computer vision and natural language processing tasks. Prior to joining... Read More.

Leo Meyerovich
Leo Meyerovich (Graphistry)

Leo Meyerovich cofounded Graphistry, Inc. in 2014 to scale visual graph analysis (think exploring security alerts) by connecting browsers to GPU clusters. Graphistry builds upon the founding team’s work at UC Berkeley on the first parallel web browser and Superconductor, a declarative GPU-accelerated... Read More.

Claire Mitchell

Claire Mitchell is a product experience designer at Temboo in NYC. With experience ranging from creative strategy to design, Claire has designed interfaces that show the potential for the future, developed conceptual pitches for award-winning commercials, and built and managed teams that can effectively... Read More.

kai miller
kai miller (Stanford University), @kaijoshuamiller

i’m a neurosurgery resident at stanford. i have doctorates in physics, medicine, and neuroscience. surgically i am focused on epilepsy, brain tumors, and deep brain stimulation. my research focuses on neural engineering, human electrophysiology, and imaging in neurosurgery. i’m enthusiastic about how machine learning and... Read More.

Donald Miner
Donald Miner (Miner & Kasch)

Donald Miner is the founder of the data science firm Miner & Kasch, where he specializes in Hadoop enterprise architecture and applying machine learning to real-world business problems. Donald is author of MapReduce Design Patterns and the forthcoming Enterprise Hadoop, both published by O’Reilly Media.... Read More.

Sophie-Charlotte Moatti
Sophie-Charlotte Moatti (Products That Count), @scmoatti

As an executive at mobile pioneers such as Facebook, Trulia, and Nokia, SC Moatti has launched and monetized mobile products that are used by billions of people and have received prestigious awards, including an Emmy nomination. Currently, SC runs Products That Count, a company that... Read More.

Prat Moghe

Prat Moghe is the founder and CEO of Cazena. Prat is a successful big data entrepreneur with nearly 20 years of experience inventing next-generation products and building strong teams in the technology sector. Prior to founding Cazena, as SVP of strategy, products, and... Read More.

Rajat Monga
Rajat Monga (Google)

Rajat Monga works on the Google Brain team, where he is the technical lead and manager for TensorFlow—an open source machine-learning library and the center of Google’s efforts at scaling up deep learning. Rajat is particularly interested in learning from sequences, including video. Prior to... Read More.

Aurelia Moser
Aurelia Moser (Mozilla Science), @auremoser

Aurelia Moser is a developer and curious cartographer building communities around code at Mozilla Open Science. Previously of Ushahidi, Internews Kenya, and CartoDB, she’s been working in the open tech and nonprofit science space for a few years. Recent projects include mapping sensor data to... Read More.

John Mount
John Mount (Win Vector LLC)
R Day (Full Day) Tutorial

John Mount is a principal consultant at Win-Vector LLC, a San Francisco data science consultancy. John has worked as a computational scientist in biotechnology and a stock-trading algorithm designer and has managed a research team for Shopping.com (now an eBay company). John... Read More.

Conrad Mulcahy
Conrad Mulcahy (K2 Intelligence)

Conrad Mulcahy is an associate managing director and director of data analytics in K2 Intelligence’s New York Office. In his time at K2 Intelligence, Conrad has conducted numerous investigations targeting risk, fraud, corruption, anti-money laundering, and bankruptcy for clients such as law firms, government agencies,... Read More.

Sean Murphy
Sean Murphy (PingThings), @sayhitosean

Sean Patrick Murphy serves as the chief data scientist for PingThings, an Industrial Internet of Things (IIoT) startup bringing advanced data science and machine learning to the nation’s electric grid. He also advises several startups and provides learning-analytics consulting for EverFi. Previously, he served as... Read More.

Justin Murray (VMware)

Justin Murray works in a technical marketing role at VMware. His activities involve talking with customers and partners about deploying Hadoop and YARN clusters on VMware’s vSphere platform and designing the architecture for best possible outcomes on that basis. He has worked at VMware... Read More.

Jacques Nadeau
Jacques Nadeau (Dremio)

Jacques Nadeau is the CTO and cofounder of Dremio. He is also the founding PMC chair of the open source Apache Drill project, spearheading the project’s technology and community. Prior to Dremio, Jacques was the architect and engineering manager for Drill and other... Read More.

Nina Narelle
Nina Narelle (XPLANE)

As a catalyst of systems change, Nina Narelle brings over 15 years of experience in organizational design and systems thinking to inform her work leading organizational transformation. Nina helps groups dream big about their future state and emerge with stronger relationships and clear agreements for... Read More.

Neha Narkhede

Neha Narkhede is the cofounder and head of engineering at Confluent, a company backing the popular Apache Kafka messaging system. Prior to founding Confluent, Neha led streams infrastructure at LinkedIn, where she was responsible for LinkedIn’s petabyte-scale streaming infrastructure built on top of Apache Kafka... Read More.

Tony Ng
Tony Ng (eBay, Inc.), @tony_ng

Tony Ng is a director of engineering at eBay, where he leads the User Behavior Analytics, Experimentation, and Marketing Platform products. Tony is involved in building eBay’s core platforms and services, including cloud, big data analytics, real-time streaming, web services, and messaging systems. Prior to... Read More.

Christopher Nguyen

Christopher Nguyen is CEO and cofounder of Arimo (née Adatao), the leader in collaborative, predictive intelligence for enterprises. Previously, Christopher served as engineering director of Google Apps and cofounded two successful startups. As a professor, he also cofounded the computer engineering program at... Read More.

Robert Nishihara

Robert Nishihara is a second-year PhD student working in the UC Berkeley AMPLab with Michael Jordan. He works on machine learning, optimization, and artificial intelligence.

Alex Nisnevich
Alex Nisnevich (Bayes Impact), @AlexNisnevich

Alex Nisnevich is a data scientist at Bayes Impact. Previously, he worked on machine-learning pipelines at Workday and built natural language interfaces for databases at UPSHOT. He received his MS in NLP at UC Berkeley.

Jack Norris
Jack Norris (MapR Technologies), @Norrisjack

Jack Norris has over 20 years of enterprise software marketing experience. Jack has a wide range of demonstrated successes from defining new markets for small companies to increasing sales of new products for large public companies. Jack’s broad experience includes launching and establishing analytics, virtualization,... Read More.

Kevin O'Dell

Kevin O’Dell currently works as a field engineer for Rocana, helping companies take IT operations to the next level, and has been an HBase contributor since 2012. Kevin regularly works to architect, size, and deploy big data applications across a wide variety of verticals in... Read More.

Stephen O'Sullivan
Stephen O'Sullivan (Silicon Valley Data Science), @steveos

A leading expert on big data architecture and Hadoop, Stephen O’Sullivan has 20 years of experience creating scalable, high-availability data and applications solutions. A veteran of WalmartLabs, Sun, and Yahoo, Stephen leads data architecture and infrastructure at Silicon Valley Data Science.

Amy O'Connor
Amy O'Connor (Cloudera), @imamyo

Amy O’Connor is a big data evangelist and telecommunications specialist at Cloudera, the leading big data vendor. She advises customers globally as they introduce big data solutions and adopt enterprise-wide big data delivery capabilities. Amy was recently named one of Information Management’s 10 Big Data... Read More.

Travis Oliphant
Travis Oliphant (Continuum Analytics), @continuumIO

As CEO of Continuum Analytics, Travis Oliphant engages customers in all industries, develops business strategy, and helps guide the technical direction of the company. Travis actively contributes to software development and engages with the wider open source community in the Python ecosystem. He has... Read More.

Silvia Oliveros
Silvia Oliveros (Silicon Valley Data Science), @soliverost

With a background in computer engineering and visual analytics, Silvia Oliveros has worked on several projects helping clients explore and analyze their data. Silvia is interested in building and optimizing the infrastructure and data pipelines used to gather insights from various datasets.

Dan Olsen
Dan Olsen (The Lean Product Playbook), @danolsen

Dan Olsen is a product management consultant, speaker, and author. At Olsen Solutions, he works with CEOs and product leaders to build great products and strong product teams, often as interim VP of product. He has helped product teams at Facebook, Box, Microsoft, Medallia,... Read More.

Matt Olson
Matt Olson (CenturyLink)

Matt Olson is a principal network architect at CenturyLink. Matt’s current focus is on big data analytics for SDN/NFV performance management with the aim of building automated feedback loops for adaptive intelligent network services. Matt has many years of experience leveraging data analytics,... Read More.

John Omernik
John Omernik (Secureworks)

John Omernik is currently a data architect at Secureworks, where he helps build up systems to bridge the gap between data and security. John has been active in the banking industry; he began in systems architecture before moving to information security and finally to fraud... Read More.

Jerry Overton

Jerry Overton is a data scientist and distinguished engineer at CSC, a global leader of next-generation IT solutions, where he is head of advanced analytics research and founder of CSC’s advanced analytics lab. Jerry shares his experiences leading open research in data science on... Read More.

Todd Palino
Todd Palino (LinkedIn), @bonkoif

Todd Palino is a site reliability engineer at LinkedIn tasked with keeping Zookeeper, Kafka, and Samza deployments fed and watered. His days are spent, in part, developing monitoring systems and tools to make that job a breeze. Previously, Todd was a systems engineer at Verisign,... Read More.

Ganesan Pandurangan
Ganesan Pandurangan (Infosys Limited)

Ganesan Pandurangan is a principal technology architect at Infosys Ltd. He has around 20 years of experience in building large-scale online, batch, and data warehouse systems. Ganesan is currently leading the development and implementation of Infosys’s big data platform, the Infosys Information Platform, and has... Read More.

Josh Patterson
Josh Patterson (Patterson Consulting), @jpatanooga

Josh Patterson currently runs a consultancy in the big data machine-learning space. Previously, Josh worked as a principal solutions architect at Cloudera and an engineer at the Tennessee Valley Authority, where he was responsible for bringing Hadoop into the smart grid during his involvement in... Read More.

Joshua Patterson
Joshua Patterson (Accenture Technology Labs), @datametrician

Joshua Patterson is a principal data scientist at Accenture Technology Labs and a Presidential Innovation Fellow. Josh leads data science research on cybersecurity and risk at Accenture, focusing on big data architecture, analytics, and visualization techniques to accelerate fraud and anomaly detection at scale. For... Read More.

Rob Peglar
Rob Peglar (Micron Technology, Inc)

Robert Peglar is vice president of advanced storage solutions at Micron Technology. A 38-year industry veteran and published author, Robert leads efforts in advanced storage systems strategy, leads executive-level planning with key customers and partners worldwide for Micron’s Storage Business Unit, and defines future storage... Read More.

Paulo Pereira

Paulo Pereira is the GE executive responsible for the technical aspects of data security and governance for GE Digital. In this capacity, Paulo leads the efforts around big data cloud infrastructure, governance, and security. Working with heavily regulated data from several GE businesses and clients, Paulo... Read More.

Don Perigo
Don Perigo (GE Power)

Don Perigo is the IT chief enterprise architect of GE Power Services, a $15B organization within GE Power. Power Services is a combination of two of the best service teams in the power industry—GE’s Power Generation Services and the former Alstom Thermal Services (acquired Q4... Read More.

Daniella Perlroth (Lyra Health)

Daniella Perlroth is chief data scientist at Lyra Health, a technology company transforming behavioral health with data and a human touch, where she is developing treatment and provider recommendations to help mental health patients get access to the best quality care. Prior to Lyra, Daniella... Read More.

Thomas Phelan

Tom Phelan earned his computer science degree from UC Berkeley and then began a long career focused on storage and systems virtualization. After cutting his teeth on UNIX internals at Altos Computer Systems, he went on to develop highly fault-tolerant storage subsystems at Stratus.... Read More.

Sébastien Pierre

Sébastien Pierre is the director of FFunction, an award-winning data visualization studio. He has worked with clients such as HP, National Geographic, the Bill & Melinda Gates Foundation, Edelman, and many other high-profile organizations. Trained both as a software engineer and a designer, Sébastien regularly... Read More.

Jeff Pohlmann
Jeff Pohlmann (Oracle)

Jeff Pohlmann is the vice president of NA big data at Oracle. Jeff has more than 30 years of leadership and management experience with over 15 years of it managing and consulting with Fortune 500 companies deploying analytical information solutions. Prior to joining Oracle, Jeff... Read More.

Jake Porway
Jake Porway (DataKind)

Jake Porway is the founder and executive director of DataKind, a nonprofit that harnesses the power of data science in the service of humanity. He is an alum of the New York Times R&D Lab and has worked at Google and Bell Labs. A recognized... Read More.

Timothy Potter (Lucidworks), @thelabdude

Timothy Potter is a senior member of the engineering team at Lucidworks, a committer on the Apache Solr project, and the coauthor of Solr In Action, a comprehensive guide to using Solr 4. Tim focuses on scalability and hardening the distributed features in Solr. Previously,... Read More.

Paula Poundstone
Paula Poundstone (Star of NPR's #1 radio show, "Wait Wait...Don't Tell Me"), @paulapoundstone

Thirty-two years ago, Paula Poundstone climbed on a Greyhound bus and traveled across the country—stopping in at open mic nights at comedy clubs as she went. Today, she is one of our country’s foremost humorists. You can hear her through your laughter as a regular... Read More.

James Powell
James Powell (NumFOCUS)

James Powell is a NYC-based Python programmer with experience in quantitative finance and data science. James is also very active in the Python community, where he organizes NYC Python, the world’s largest and most active Python meetup group. He also works with... Read More.

Mr Prabhat
Mr Prabhat (Berkeley Lab)

Prabhat leads the Data and Analytics Services team at NERSC. His current research interests include scientific data management, parallel I/O, high-performance computing, and scientific visualization. He is also interested in applied statistics, machine learning, computer graphics, and computer vision. Prabhat received an ScM in... Read More.

Nitin Prabhu (Transamerica)

Nitin Prabhu has been with Transamerica for over a decade in IT roles. He now serves as manager for strategy and architecture.

Peter Prettenhofer (DataRobot)

Peter Prettenhofer is a data scientist and software engineer at DataRobot. He is a contributor to scikit-learn, where he coauthored a number of modules such as Gradient Boosted Regression Trees, Stochastic Gradient Descent, and Decision Trees. Peter studied computer science at Graz University of Technology,... Read More.

Megan Price
Megan Price (Human Rights Data Analysis Group), @hrdag

As the executive director at the Human Rights Data Analysis Group, Megan Price designs strategies and methods for statistical analysis of human rights data for projects in a variety of locations including Guatemala, Colombia, and Syria. Megan’s work in Guatemala includes serving as the lead... Read More.

Richard Probst

Richard Probst is VP of infrastructure technology strategy at SAP and is currently focused on working with SAP partners on innovative cloud architectures for SAP application landscapes to help SAP customers become more agile.

Erin Ptacek
Erin Ptacek (Starfighters.io)

Erin Ptacek is a cofounder and VP of engineering at Starfighter, a startup that is changing the way technical recruiting is done by building CTFs (games you play by programming). During her long tenure in the security industry, Erin has helped turn green recruits... Read More.

Yvonne Quacken
Yvonne Quacken (Siemens)

Yvonne Quacken is a senior big data architect and engineer at Siemens. In her role as BI and big data technology lead, Yvonne is responsible for platform and solution architecture, cloud integration, and DevOps for BI and big data. Yvonne has been working in this... Read More.

Rachel Quint
Rachel Quint (Hewlett Foundation), @RMQuint

Rachel Quint is a fellow in the Global Development and Population Program at the Hewlett Foundation. Before joining the Hewlett Foundation, Rachel lived in Addis Ababa, Ethiopia, where she worked in the UN World Food Programme’s Africa office, serving as a liaison to the African... Read More.

Mohammad Quraishi

Mohammad Quraishi is a senior principal technologist at Cigna with 20 years of experience in application architecture, design, and development. Mohammad has specific experience in mobile native applications, SOA platform implementation, web development, distributed applications, object-oriented analysis and design, requirements analysis, data modeling, and... Read More.

Phillip Radley

Phill Radley is a physics graduate with an MBA who has worked in IT and the communications industry for 30 years, mostly with British Telecommunications Plc. He is currently chief data architect for BT at their Adastral Park campus in the UK. Phill works... Read More.

Siva Raghupathy
Siva Raghupathy (Amazon Web Services)

Siva Raghupathy leads the Americas Big Data Solutions Architecture team at AWS, where he guides developers and architects to build successful big data solutions on AWS. Previously, as a principal technical program manager for AWS database service, Siva gathered emerging NoSQL requirements... Read More.

Karthik Ramasamy

Karthik Ramasamy is the engineering manager and technical lead for real-time analytics at Twitter. Karthik is the cocreator of Heron and has more than two decades of experience working in parallel databases, big data infrastructure, and networking. He cofounded Locomatix, a company that specializes in... Read More.

Jun Rao
Jun Rao (Confluent)

Jun Rao is the cofounder of Confluent, a company that provides a stream data platform on top of Apache Kafka. Before Confluent, Jun was a senior staff engineer at LinkedIn, where he led the development of Kafka. Before LinkedIn, he was a researcher at IBM’s... Read More.

Naveen Rao
Naveen Rao (Nervana)

Naveen Rao is cofounder and CEO of Nervana, where he brings together engineering disciplines and neural computational paradigms to build state-of-the-art technology that makes machines smarter. Naveen’s fascination with computation in synthetic and neural systems began when, at nine years old,... Read More.

Chris Rawles
Chris Rawles (Pivotal)

Chris Rawles is a data scientist at Pivotal, where he works with customers across a variety of domains, building models to derive insight and business value from their data. Prior to joining Pivotal, Chris worked in both the oil and gas and alternative energy industries,... Read More.

Atish Ray
Atish Ray (Accenture)

Atish Ray has more than 15 years of technology and management experience in application architecture and delivery. He has broad experience in planning, estimation, design, integration, and implementation of tiered web-centric applications. Based in the Washington DC metro area, he is the Data Engineering Lead... Read More.

Tom Reilly
Tom Reilly (Cloudera), @cloudera

Tom Reilly is the CEO of Cloudera. Tom has a distinguished 30-year career in the enterprise software market. Prior to Cloudera, his most recent role was as vice president and general manager of enterprise security at HP. Previously, Tom served as CEO of... Read More.

Dieter Reuther
Dieter Reuther (Team Dynamics Boston), @DieterReuther

Dieter Reuther is a leadership consultant who focuses on people, process, and technology. He helps organizations balance creative chaos with structure to bring out the best in teams and individuals. As a strong believer in the power positive leadership can have on people’s motivation, performance,... Read More.

Evan Richards (Uber)

Evan Richards is a member of the Hadoop Compute Platform team at Uber, where he works as the tech lead for the schemas and schema management projects. Previously, Evan interned with Zappos, helping monitor the migration of their catalog from their proprietary format and local... Read More.

Katrina Riehl
Katrina Riehl (Continuum Analytics)

Katrina Riehl is a senior data scientist at Continuum Analytics, where she leads the Memex team. Over the last decade, Katrina has worked extensively in the fields of scientific computing, machine learning, data mining, and visualization. Most notably, she worked at Enthought, the signal and... Read More.

Travis Ringger

Travis Ringger is a manager in PwC’s Risk & Compliance Systems and Analytics practice. Travis specializes in designing and delivering analytical solutions that provide better awareness and understanding of compliance and business risk, particularly solutions incorporating unstructured data and natural language processing. He has deep... Read More.

Cody Rioux
Cody Rioux (Netflix (Real-time Analytics))

Cody Rioux is a senior analytics engineer at Netflix working in the real-time analytics space to design fully autonomous systems that support availability and reliability in the Netflix cloud environment on Amazon Web Services. Cody is passionate about using stream processing, functional programming, and Bayesian... Read More.

Julie Rodriguez
Julie Rodriguez (Sapient Global Markets), @juliargentinag

Julie Rodriguez is an experience designer and focuses on user research, analysis, and design for complex systems. She has patented her work in data visualizations for MATLAB, compiled a data visualization pattern library (www.vizipedia.com), and publishes industry articles on user experience and data analysis... Read More.

Monica Rogati
Monica Rogati (Data Natives), @mrogati

Monica Rogati is an independent data science executive and advisor who has built key data products and teams at Jawbone and LinkedIn; she now helps startups make the most out of their data. As the VP of data at Jawbone, Monica built Jawbone’s data science... Read More.

Bob Rogers

Bob Rogers is chief data scientist for big data solutions at Intel, where he applies his experience solving problems with big data and analytics to help Intel build world-class customer solutions. Prior to joining Intel, Bob was cofounder and chief scientist at Apixio, a big... Read More.

Brandon Rohrer
Brandon Rohrer (Microsoft)

Brandon Rohrer is a data scientist in Microsoft’s Azure Machine Learning group. He creates end-to-end data science solutions for external customers and supports the development of core algorithms and functionality in Azure ML. Brandon obtained his data science skills working in a variety of applications,... Read More.

Irene Ros
Irene Ros (Bocoup), @ireneros

Irene Ros is the director of data visualization at Bocoup and the program chair of OpenVis Conf, a 2-day conference on data visualization on the open Web. Irene is an information visualization researcher and developer, making engaging, informative, and interactive data-driven stories,... Read More.

Alan Ross
Alan Ross (Intel Corporation), @intel

Alan Ross is a senior principal engineer and chief cloud security architect at Intel. Alan has more than 20 years of information security experience in various capacities, from policy and awareness and security/risk analysis to engineering and architecture. Prior to joining Intel, Alan worked as... Read More.

Laurel Ruma
Laurel Ruma (O'Reilly Media, Inc.), @laurelatoreilly
Closing remarks Cultivate
Welcome Cultivate

Laurel Ruma is the director of talent for O’Reilly Media. Most recently, Laurel cochaired Where 2.0, OSCON Java, and Gov 2.0 Expo. She joined O’Reilly in 2005 after working for five years at various IT analyst firms in the Boston area. Laurel is also... Read More.

Sandy Ryza
Sandy Ryza (Clover Health), @s_ryz

Sandy Ryza is a senior data scientist at Clover Health. He was previously at Cloudera doing engineering and data science. He is an author of O’Reilly’s Advanced Analytics with Spark, as well as a Spark committer and member of the Hadoop project management committee. He... Read More.

Mohan Sadashiva
Mohan Sadashiva (Waterline Data)

Mohan Sadashiva is VP of products at Waterline Data, where he leverages his extensive experience in managing large-scale software products and cloud services to drive new innovations in big data. Previously, Mohan was the SVP of products and business development at Narus, a cybersecurity... Read More.

Neelesh Srinivas Salian

Neelesh Srinivas Salian is a dedicated support engineer at Cloudera, supporting all components from Cloudera’s Distribution Including Apache Hadoop (CDH). Neelesh is also a contributor to Apache Software Foundation projects like Spark and Hadoop. He holds a bachelor’s degree in engineering from the University... Read More.

Chris Sanden

Chris Sanden is a senior analytics engineer at Netflix with a focus on real-time analytics and machine learning. He is part of the Insight Engineering team responsible for building systems that allow everyone at Netflix visibility into the state of the cloud environment. Chris is... Read More.

Majken Sander
Majken Sander (BusinessAnalyst.dk), @majsander

Majken Sander is a business analyst and business developer who has worked with IT, management information, analytics, BI, and DW for 20+ years. Armed with strong analytical expertise, Majken is keen on “data driven” as a business principle, data science, the IoT, and all other... Read More.

Krishna Sankar
Krishna Sankar (Volvo Cars), @ksankar

Krishna Sankar is a consulting data scientist working on retail analytics, social media data science, and forays into deep learning, as well as codeveloping the DeepLearnR package interfacing R over TensorFlow/Skflow. Previously, Krishna was a chief data scientist at Blackarrow.tv, where he focused on... Read More.

Kazunori Sato

Kazunori “Kaz” Sato is a staff developer advocate on the Cloud Platform team at Google, where he leads the developer advocacy team for machine-learning and data analytics products such as TensorFlow, Vision API, and BigQuery. Kaz has been leading and supporting developer communities for... Read More.

Bill Schmarzo

Bill Schmarzo is responsible for setting the strategy and defining the service line offerings and capabilities for the EMC Consulting Enterprise Information Management and Analytics service line. Bill has more than two decades of experience in data warehousing, BI, and analytic applications. Bill has... Read More.

Andreas Schmidt
Andreas Schmidt (Blue Yonder), @aschmidt_42

Andreas Schmidt is a product manager at Blue Yonder, a leading European company for predictive applications in retail. Previously, he was a senior data scientist there for several years, designing and implementing applications such as replenishment optimization for fresh and perishable goods. During his PhD... Read More.

Jim Scott
Jim Scott (MapR Technologies, Inc.), @kingmesal

Jim Scott is the director of enterprise strategy and architecture at MapR Technologies, Inc. Across his career, Jim has held positions running operations, engineering, architecture, and QA teams in the consumer packaged goods, digital advertising, digital mapping, chemical, and pharmaceutical industries. Jim has built systems... Read More.

Kim Scott
Kim Scott (Radical Candor, Inc.), @kimballscott

Kim Malone Scott is an advisor at Dropbox, Kurbo, Qualtrics, Rolltape, Shyp, Twitter, and several Silicon Valley startups. Kim was a member of the faculty at Apple University and before that led AdSense, YouTube, and DoubleClick online sales and operations at Google. Known for her... Read More.

Partha Seetala (Robin Systems), @robinsystems

Currently the CTO of Robin Systems, Partha Seetala has more than 16 years of technology and product expertise. Previously, Partha was a distinguished engineer and senior director of engineering at Veritas, Symantec’s information management business, where he conceived, architected, and led engineering teams to... Read More.

Jonathan Seidman

Jonathan Seidman is a solutions architect on the Partner Engineering team at Cloudera. Before joining Cloudera, he was a lead engineer on the Big Data team at Orbitz Worldwide, helping to build out the Hadoop clusters supporting the data storage and analysis needs of one... Read More.

Debora Seys
Debora Seys (eBay)

Debora Seys works on delivering a trusted self-service data experience at eBay. She’s been helping users help themselves to find, use, and collaborate with information and data for 15+ years. Prior to her current role, Deb drove search and taxonomy technology capabilities at Kaiser Permanente... Read More.

Hiren Shah
Hiren Shah (Microsoft), @HirenShahTW

Hiren Shah is currently a principal program manager in Microsoft’s Cortana Analytics group, where he focuses on big data analytics and data science. Over the last seven years, Hiren has worked on a variety of big data technologies in Bing and Azure. Hiren has a... Read More.

Abin Shahab
Abin Shahab (Altiscale)

Abin Shahab is a senior software engineer at Altiscale as well as a contributor to Docker and LXC. Abin’s work at Altiscale is focused on multitenant Hadoop clusters using Docker containers. Prior to joining Altiscale, Abin worked on graph databases and search engines at Guidewire, Symantec, and Vivisimo... Read More.

Gwen Shapira
Gwen Shapira (Confluent), @gwenshap

Gwen Shapira is a system architect at Confluent, where she helps customers achieve success with their Apache Kafka implementation. She has 15 years of experience working with code and customers to build scalable data architectures, integrating relational and big data technologies. Gwen currently specializes in... Read More.

Ben Sharma

Ben Sharma is CEO and cofounder of Zaloni. He is a passionate technologist with experience in solutions architecture and service delivery of big data, analytics, and enterprise infrastructure solutions. With previous experience in technology leadership positions for NetApp, Fujitsu, and others, Ben’s expertise ranges... Read More.

Chang She
Chang She (Cloudera)

Chang She is a software engineer at Cloudera currently working on metadata management tools for Hadoop. Prior to joining Cloudera, Chang was cofounder and CTO of DataPad, a next-gen BI/analytics company. An early core contributor to the pandas library, Chang’s passion is creating data... Read More.

Jayant Shekhar
Jayant Shekhar (Sparkflows Inc.), @jshekhar

Jayant Shekhar is the founder of Sparkflows Inc., which enables machine learning on large datasets using Spark ML and intelligent workflows. Jayant was a principal solutions architect at Cloudera working with companies both large and small in various verticals on big data use cases, architecture,... Read More.

Jeffrey Shmain
Jeffrey Shmain (Cloudera)

Jeff Shmain is a senior solution architect at Cloudera. He has 16+ years of financial industry experience with a strong understanding of security trading, risk, and regulations. Over the last few years, Jeff has worked on various use-case implementations at 8 out of 10 of... Read More.

Alex Silva
Alex Silva (Pluralsight)

Alex Silva is a chief data architect at Pluralsight, where he leads the development of the company’s data infrastructure and services. He has been instrumental in establishing Pluralsight’s data initiative by architecting a platform that is used to capture valuable insights on real-time video analytics... Read More.

Jiri Simsa
Jiri Simsa (Alluxio), @jsimsa

Jiri Simsa is a software engineer at Alluxio, Inc., where he is one of the maintainers and top contributors of the Alluxio open source project. Before joining Alluxio, Inc., Jiri was a software engineer at Google, working on yet another distributed applications framework. He earned... Read More.

Sumeet Singh

Sumeet Singh is a senior director of product management for cloud and big data platforms at Yahoo. In his current role, he leads the Hadoop products team responsible for both Apache open source contributions and Yahoo projects. Sumeet is responsible for introducing several new multitenant... Read More.

Vartika Singh
Vartika Singh (Cloudera)

Vartika Singh is a solutions architect at Cloudera with over ten years of experience in applying machine learning techniques to big data problems.

Joseph Sirosh

Joseph Sirosh is the corporate vice president of the Data group, leading Microsoft’s database, big data, and machine-learning products, as well as a talented team of engineers, data scientists, and product leaders who are developing tools and services to transform data at scale into actionable... Read More.

Ram Shankar Siva Kumar
Ram Shankar Siva Kumar (Microsoft (Azure Security Data Science))

Ram Shankar is a security data wrangler in Azure Security Data Science, where he works on the intersection of ML and security. Ram’s work at Microsoft includes a slew of patents in the large intrusion detection space (called “fundamental and groundbreaking” by evaluators). In addition,... Read More.

Richard Socher

Richard Socher is the founder of MetaMind. Richard is widely cited for his work on machine learning, deep learning, natural language processing, and computer vision. He holds a PhD from Stanford, where he worked with Chris Manning and Andrew Ng. Richard was awarded a 2013... Read More.

Paul Soldera
Paul Soldera (Equation Research)

Paul Soldera is currently head of strategy at Equation Research, a full-service market-research company focused on helping clients design, execute, and internalize data and insights from customer- and consumer-focused surveys. Paul also acts as an advisor to other companies looking to grow internal research and... Read More.

Jean-Marc Spaggiari

Jean-Marc Spaggiari is a Cloudera senior solution architect with many years’ experience as a big data architect, specializing in HBase solutions. An active HBase contributor, Jean-Marc has contributed more than 50 patches to the community and participates in all release testing. Prior to Cloudera, Jean-Marc... Read More.

Ben Spivey
Ben Spivey (Cloudera)

Ben Spivey is a principal solutions architect at Cloudera who provides consulting services for large financial-services customers. Ben specializes in Hadoop security and operations. He is the coauthor of Hadoop Security from O’Reilly Media (2015).

Vikram Sreekanti
Vikram Sreekanti (Berkeley AMP Lab), @viksree

Vikram Sreekanti is a software engineer working on research in the AMPLab at UC Berkeley. A graduate of Berkeley’s computer science department, he has served as a teaching assistant at Berkeley and an intern at Cloudera and Yammer.

Srikrishna Sridhar

Krishna Sridhar is a data scientist at Dato. He has a PhD in computer science from the University of Wisconsin-Madison, where he worked on high-performance software for large-scale problems in mathematical optimization and data analysis. His work has been used in applications such as healthcare, industrial... Read More.

Vikram Srivastava
Vikram Srivastava (Cloudera, Inc.)

Vikram Srivastava is a software engineer at Cloudera.

Jeremy Stanley

Jeremy is currently the VP of data science at Instacart, where he works closely with data scientists who are integrated into product teams to drive growth and profitability through logistics, catalog, search, consumer, shopper, and partner applications. Previously, Jeremy was chief data scientist and... Read More.

Sandy Steier
Sandy Steier (1010data)

Sandy Steier is the cofounder and CEO of 1010data. With more than a quarter century of industry experience, Sandy is recognized as an innovator behind the adoption of advanced analytic technologies by financial services institutions. Before cofounding 1010data, Sandy was a vice president and... Read More.

Louis Suarez-Potts
Louis Suarez-Potts (Age of Peers, Inc.), @luispo

As community strategist at Age of Peers, Louis Suarez-Potts strategizes the formation of and manages productive commons-based peer networks (open source communities). Louis helps communities consolidate good work, good connections, and good intentions into a force held in common, producing something all can look at... Read More.

Anand Subbaraj (Microsoft)

Anand Subbaraj is a principal program manager in the Microsoft Information Management & Machine Learning division. Anand has over 12 years of experience in the IT industry delivering products and services that solve challenging business problems and delight customers. Anand currently specializes in big data... Read More.

Brian Suda
Brian Suda ((optional.is)), @briansuda

Brian Suda is a master informatician currently residing in Reykjavík, Iceland. Since first logging on in the mid-90s, he has spent a good portion of each day connected to the Internet. When he is not hacking on microformats or writing about web technologies, he enjoys... Read More.

Adam Sugano
Adam Sugano (Autodesk)

Adam Sugano serves as the head of predictive modeling and advanced analytics at Autodesk, where he leads a team of both internal and external data scientists charged with delivering innovative, actionable data-driven solutions that help empower Autodesk’s customer-retention and engagement-optimization efforts across the customer lifecycle.

... Read More.

Roshan Sumbaly
Roshan Sumbaly (Coursera Inc), @rsumbaly

Roshan Sumbaly currently leads the Content Experience and Teaching team at Coursera. Prior to that he worked at LinkedIn, where he led the Data Platform team responsible for serving all social feeds and gestures across LinkedIn. He also worked on various data-mining-based products, while also... Read More.

Chao Sun (Cloudera)

Chao Sun is currently a software engineer at Cloudera working on the RecordService project. Before that, Chao worked on the Hive on Spark project. He holds a PhD in computer science from the University of Wisconsin-Milwaukee, where he focused on type systems and programming languages.

... Read More.

Jagane Sundar
Jagane Sundar (WANdisco)

Jagane Sundar is the CTO at WANdisco. Jagane has extensive big data, cloud, virtualization, and networking experience. He joined WANdisco through its acquisition of AltoStor, a Hadoop-as-a-service platform company. Before AltoStor, Jagane was founder and CEO of AltoScale, a Hadoop- and HBase-as-a-platform company... Read More.

David Taieb

David Taieb is the STSM for the Cloud Data Services Developer Advocacy team at IBM, leading a team of avid technologists with the mission of educating developers on the art of the possible with cloud technologies. Previously, David was the lead architect for the... Read More.

Roopa Tangirala

Roopa Tangirala is an experienced engineering leader with extensive background in databases, be they distributed or relational. She manages the database engineering team at Netflix responsible for operating cloud persistent and semipersistent run-time stores for Netflix, which includes Cassandra, Elasticsearch, and MySQL databases, by ensuring... Read More.

Piotr Teterwak

Piotr Teterwak received his BA in computer science at Dartmouth College, where he conducted work exploring the learning of convolutional deep neural nets with applications in computer vision. He currently works on the toolkit development team at Dato.

Arun Thangamani

Arun Thangamani is a software architect for CDK Global (formerly ADP Dealer Services), where he helped lay the foundation for the Open BI Platform (a big-data initiative), which provides integrated value to CDK Global customers. Before CDK, Arun spent about a... Read More.

Robin Thottungal
Robin Thottungal (US Environmental Protection Agency), @rathottungal

Robin Thottungal is the EPA’s first chief data scientist focused on creating and implementing an agency-wide vision on analytics for effective decision making. Prior to joining the EPA, Robin was at Deloitte Consulting, where he focused on selling and delivering large-scale analytics projects for... Read More.

Richard Tibbetts

Richard Tibbetts is a software entrepreneur, a database and programming languages nerd, a visiting scientist at MIT Probabilistic Computing, and a leader of the BayesDB open source project. Prior to MIT, Richard was founder and CTO at StreamBase, a CEP... Read More.

Kathleen Ting
Kathleen Ting (Cloudera)

Kathleen Ting is currently a technical account manager at Cloudera, where she helps strategic customers deploy and use the Hadoop ecosystem in production. Kathleen has spoken on Hadoop, ZooKeeper, and Sqoop at many big data conferences, including Hadoop World, ApacheCon, and OSCON. She’s contributed... Read More.

Sravya Tirukkovalur

Sravya Tirukkovalur is a software engineer at Cloudera focusing on Hadoop security, specifically working on authorization. Sravya is one of the core contributors of Apache Sentry. She is also a committer and a PPMC member of the project driving the Apache community. Sravya has... Read More.

Steven Totman
Steven Totman (Cloudera)

Steven Totman is the financial services industry lead for Cloudera’s Field Technology Office, where he helps companies monetize their big data assets using Cloudera’s Enterprise Data Hub. Prior to Cloudera, Steve ran strategy for a mainframe-to-Hadoop company and drove product strategy at IBM for... Read More.

Anh Trinh
Anh Trinh (Arimo, Inc.), @chickamade

Anh Trinh is a software architect at Arimo (née Adatao), where he coauthored three patent-pending inventions: the Distributed Data Framework for Data Analytics, Collaboration using Shared Documents for Processing Distributed Data, and Multi-language Support for Interfacing with Distributed Data. He is also a coauthor of... Read More.

Eric Tschetter

Eric Tschetter is the creator and one of the main contributors to Druid, an open source, real-time analytical data store. Eric is currently a distinguished engineer at Yahoo, where he works on speeding up analytics with a mix of data science and traditional BI. Eric... Read More.

Daniel Tunkelang

Daniel Tunkelang is a data science and engineering executive who has built and led some of the strongest teams in the software industry. He was a founding employee and chief scientist of Endeca, a search pioneer that Oracle acquired for $1.1B. He led a local... Read More.

Joseph Turian
Joseph Turian (Workday), @turian

Joseph Turian is currently a principal engineer at Workday. He headed the machine-learning consultancy MetaOptimize LLC and founded the startup UPSHOT (acquired by Workday), which allowed users to query enterprise data from a mobile device using natural language.

Joseph holds a PhD in... Read More.

Nick Turner
Nick Turner (Markerstudy)

Nick Turner has made a career in data that spans more than 25 years. Since 2013, Nick has led the Enterprise Data team at Markerstudy, where he oversees the award-winning Big Data Insights project and is responsible for the collection, analysis, and visualization of hundreds... Read More.

Kostas Tzoumas
Kostas Tzoumas (data Artisans), @kostas_tzoumas

Kostas Tzoumas is a PMC member of the Apache Flink project and cofounder of data Artisans, the company founded by the original development team that created Flink. Kostas has spoken extensively about Flink, including at Hadoop Summit San Jose 2015.

Alexander Ulanov
Alexander Ulanov (Hewlett-Packard Labs)

Alexander Ulanov is a senior researcher in HP Labs. His research focuses on application of machine learning on a large scale, in particular, for deep learning. Alexander has made several contributions to Apache Spark. Previously, he worked on text mining, classification and recommender systems, and... Read More.

Amy Unruh
Amy Unruh (Google)

Amy Unruh is a developer programs engineer for the Google Cloud Platform, with a focus on machine learning and data analytics as well as other Cloud Platform technologies. Amy has an academic background in CS/AI and has also worked at several startups, done industrial... Read More.

Matthew Van Adelsberg

Matt van Adelsberg is chief data scientist at CACI, where he is responsible for managing the development of advanced, scalable solutions to complex data-analytics problems from small to big data regimes. Matt’s data science team provides end-to-end solutions to support customers throughout the commercial... Read More.

Bryan Van de Ven
Bryan Van de Ven (Continuum Analytics), @ContinuumIO

Bryan Van de Ven is a software engineer at Continuum Analytics. Previously, Bryan worked at the Applied Research Labs, developing software for sonar feature detection and classification systems on US Naval submarine platforms, and Enthought, where he worked on problems in financial risk modeling and... Read More.

Jake Vanderplas
Jake Vanderplas (eScience Institute, University of Washington)

Jake Vanderplas is the director of research in the physical sciences at the University of Washington’s eScience Institute, where his research is primarily in the area of data-driven astronomy and astrophysics. In addition, Jake is a maintainer and/or frequent contributor to many open source Python... Read More.

Krishnan Venkata
Krishnan Venkata (LatentView Analytics), @latentview

Krishnan Venkata is the director for the US West Coast at LatentView Analytics, where he’s responsible for sales leadership and relationship management for LatentView’s clients, especially in the technology sector. Krishnan has over 11 years of experience in global IT services delivery in the US,... Read More.

Mythili Venkatakrishnan

Mythili Venkatakrishnan is an IBM senior technical staff member and is the z Systems architecture and technology lead. Mythili has been with IBM for 25 years, all in the mainframe environment working with clients in various capacities. Her focus areas have been diverse... Read More.

Pratik Verma
Pratik Verma (BlueTalon), @pratverm

Pratik Verma is the founder and chief product officer at BlueTalon. Pratik founded BlueTalon to accelerate big data deployments and remove security as a barrier to adoption. Previously, he led AgeTak, a healthcare startup built on technologies created by Rakesh Verma. He is an angel... Read More.

Amit Walia
Amit Walia (Informatica)

Amit Walia is the executive vice president and chief product officer at Informatica, where he is responsible for product development, product management, product marketing, and engineering. Previously, Amit was the senior vice president and general manager for Informatica’s Data Integration and Data Security business unit.... Read More.

Laura Waller
Laura Waller (UC Berkeley), @optrickster

Laura Waller is an assistant professor at UC Berkeley in the Department of Electrical Engineering and Computer Sciences (EECS) and a senior fellow at the Berkeley Institute of Data Science (BIDS), with affiliations in Bioengineering and Applied Sciences & Technology. Previously, Laura was... Read More.

Dean Wampler
Dean Wampler (Lightbend), @deanwampler

Dean Wampler is the architect for big data products at Lightbend. He specializes in scalable, distributed big data and streaming systems using tools like Spark, Mesos, and Scala. Dean is a contributor to several open source projects. He’s also the co-organizer of several conferences... Read More.

Guozhang Wang

Guozhang Wang is an engineer at Confluent, building a stream data platform on top of Apache Kafka. Prior to Confluent, Guozhang was a senior software engineer at LinkedIn, developing and maintaining its backbone streaming infrastructure on Apache Kafka and Apache Samza. He holds a PhD... Read More.

Haojun Wang
Haojun Wang (Baidu)

Haojun Wang is a tech lead of Baidu’s US autonomous driving car team. Currently, Haojun is driving the in-car computing platform and offline data platform. Prior to Baidu, he worked at the IBM Silicon Valley Lab, focusing on database core development and big data... Read More.

Wei Wang
Wei Wang (Hortonworks)

Wei Wang is the senior director of product marketing at Hortonworks, where she serves as the primary leadership force behind strategic marketing execution, with a focus on boosting Hortonworks Data Platform market expansion and revenue generation globally. Wei is an accomplished international marketing executive with... Read More.

Daniel Weeks
Daniel Weeks (Netflix)

Daniel Weeks manages the Big Data Compute team at Netflix and is a Parquet committer. Prior to joining Netflix, Daniel focused on research in big data solutions and distributed systems.

Director managing development of global integrated marketing solutions, processes, and technologies for Dell marketing units. Marketing thought leader for researching emerging technology, solutions, and business development opportunities across worldwide groups.

Dave Wells
Dave Wells (Paxata)

Dave Wells is actively involved in information management and business management, especially at their intersection. Dave is a consultant and educator dedicated to building meaningful connections throughout the path from data to business value. Knowledge sharing and skills development are Dave’s passions, carried out through... Read More.

Michael Wendt
Michael Wendt (Accenture Technology Labs)

Mike Wendt is an R&D associate manager at Accenture Technology Labs in San Jose, CA. Since joining Accenture Technology Labs, Mike has worked with Hadoop, Cassandra, Storm, and other big data technologies. His research includes benchmarking bare-metal and cloud-based Hadoop clusters and comparing their price-performance... Read More.

Timoni West
Timoni West (Unity Labs), @timoni

Timoni West leads design for Unity Labs, focusing on new game development and creation tools in VR. Previously, Timoni was SVP of design at Alphaworks, a new startup helping to democratize small business funding, and cofounder and creative director of Recollect, a social backup... Read More.

Edd Wilder-James
Edd Wilder-James (Silicon Valley Data Science), @edd

Edd Wilder-James is a technology analyst, writer, and entrepreneur based in California. He’s helping transform businesses with data as VP of strategy for Silicon Valley Data Science. Formerly Edd Dumbill, Edd was the founding program chair for the O’Reilly Strata conferences... Read More.

Cack Wilhelm
Cack Wilhelm (Scale Venture Partners)

Cack Wilhelm is a principal at Scale Venture Partners, where she focuses on investments in next-generation enterprise-software companies with a particular emphasis on the cloud infrastructure, big data, DevOps, and security sectors. Cack’s recent efforts have led to investments in mobile-database company Realm and cloud-analytics-infrastructure... Read More.

Michael Williams
Michael Williams (Fast Forward Labs), @mikepqr

Mike Williams is a research engineer at Fast Forward Labs. He has a PhD in astrophysics. He tweets @mikepqr.

Roseanne Wincek
Roseanne Wincek (Institutional Venture Partners)

Roseanne Wincek joined IVP in March 2015. She focuses on investing in later-stage, high-growth consumer and enterprise companies. Roseanne actively works with IVP portfolio companies Compass and PopSugar. Roseanne was previously a principal at Canaan Partners, a leading early-stage venture firm, where she... Read More.

Christina Wodtke
Christina Wodtke (Wodtke Consulting), @cwodtke

Christina Wodtke has led redesigns and initial product offerings for such companies as LinkedIn, Myspace, Zynga, Yahoo, Hot Studio, and eGreetings. Christina has founded two consulting startups, a product startup, and Boxes and Arrows, an online magazine of design. She also cofounded the Information Architecture... Read More.

Kristi Wolff
Kristi Wolff (Kelley Drye)

Kristi Wolff is special counsel in Kelley Drye’s Washington, DC, office. Kristi’s practice focuses on food, dietary supplements, medical devices, and emerging health/wearable technology and privacy issues. Kristi has extensive experience advising clients whose products are within the overlapping jurisdictions of the Food and Drug... Read More.

Steve Wooledge (MapR Technologies)

Steve Wooledge is vice president of product marketing at MapR, where he is responsible for communicating the business value and technical advantages of MapR innovations and solutions for Hadoop. Steve was previously vice president of marketing for Teradata Unified Data Architecture, where he drove big... Read More.

Kristi Woolsey

Kristine Woolsey is the practice lead for creative environments at MAYA, a design and technology innovation consultancy. Kristi is well known as a behavioral strategist with years of speaking and research on the impact that the physical environment has on human behavior. She joined... Read More.

Ian Wrigley
Ian Wrigley (Confluent), @iwrigley

Ian Wrigley has taught tens of thousands of students over the last 25 years, in subjects ranging from C programming to Hadoop development and administration. Ian is currently the director of education services at Confluent, where he heads the team building and delivering courses focused... Read More.

Jennifer Wu
Jennifer Wu (Cloudera)

Jennifer Wu is director of product management for cloud at Cloudera, where she focuses on cloud strategy and solutions. Before joining Cloudera, Jennifer worked as a product line manager at VMware, working on the vSphere and Photon system management platforms.

Yinglian Xie
Yinglian Xie (DataVisor), @datavisor

Yinglian Xie is the CEO and cofounder of DataVisor, a startup in the area of big data analytics for security. Yinglian has been working in the area of Internet security and privacy for over 10 years, where she has helped improve the security of... Read More.

Reynold Xin
Reynold Xin (Databricks)

Reynold Xin is a committer on Apache Spark. He is also a cofounder of Databricks. Before founding Databricks, he was pursuing a PhD in the UC Berkeley AMPLab.

Caiming Xiong
Caiming Xiong (MetaMind)

Caiming Xiong is a senior researcher at MetaMind. Before that, he was a postdoctoral researcher in the Department of Statistics at the University of California, Los Angeles. Caiming holds a PhD in computer science and engineering from SUNY Buffalo and a BS and MS... Read More.

Fangjin Yang
Fangjin Yang (Imply)

Fangjin Yang is a coauthor of the open source Druid project and a cofounder of Imply, a data analytics startup based in San Francisco. Previously, Fangjin held senior engineering positions at Metamarkets and Cisco Systems. Fangjin holds a BASc in electrical engineering and an MASc... Read More.

Chuck Yarbrough (Pentaho)

Chuck Yarbrough is the director of big data product marketing at Pentaho, a leading big data analytics company that helps organizations engineer big data connections, blend data, and report and visualize all of their data. Much of Chuck’s focus at Pentaho is in educating the... Read More.

Martin Yip (VMware)

Martin Yip is a product line marketing manager for VMware’s Cloud Platform Business Unit, where he oversees product marketing for a portfolio of products including vSphere, vSphere with Operations Management, and Big Data. Martin has been in the high technology industry for over 10 years... Read More.

Michael Yoder
Michael Yoder (Cloudera)

Mike Yoder is a software engineer at Cloudera who has worked on a variety of Hadoop security features and internal security initiatives. Most recently, he implemented log redaction and the encryption of sensitive configuration values in Cloudera Manager. Prior to Cloudera, he was a security... Read More.

Jin Zhang
Jin Zhang (CA Technologies), @jinz1

Jin Zhang is a passionate technology leader who is currently leading analytics at CA Technologies. Previously, Jin led Apigee to its IPO as their VP of engineering and was an engineering executive with IBM, where she was responsible for managing large teams as... Read More.

Owen Zhang
Owen Zhang (DataRobot)

Owen Zhang is the chief product officer at DataRobot. Owen spent most of his career in the property and casualty insurance industry. Most recently Owen served as vice president of modeling of the newly formed AIG Science team.

After spending several years in IT... Read More.

Weidong Zhang
Weidong Zhang (LinkedIn)

Weidong Zhang is an engineering manager on the Data Analytics Infrastructure team at LinkedIn and leads the marketing and customer-service data warehouse vertical. Weidong has a passion for analytics, research, and data-driven decision making. He spent 10+ years in the data warehouse ETL and... Read More.

Yongzheng Zhang
Yongzheng Zhang (LinkedIn)

Yongzheng Zhang is a business analytics manager at LinkedIn and an active researcher and practitioner of text mining and machine learning. He has developed many practical and scalable solutions for utilizing unstructured data for ecommerce and social-networking applications, including search, merchandising, social commerce, and customer-service... Read More.

Alice Zheng

Alice Zheng is the director of data science at Dato (formerly GraphLab), a Seattle-based startup that offers powerful, large-scale machine-learning and graph-analytics tools. Alice loves playing with data and enabling others to do the same. She is a tool builder and an expert in machine-learning... Read More.

Wei Zheng
Wei Zheng (Trifacta)

As VP of products at Trifacta, Wei Zheng combines her passion for technology with experience in enterprise software to define and shape Trifacta’s product offerings. Having founded several startups of her own, Wei believes strongly in innovative technology that solves real-world business problems. Most recently,... Read More.

Shivon Zilis
Shivon Zilis (Bloomberg Beta), @shivon

Shivon Zilis is a venture capitalist and founding member of Bloomberg Beta, where she focuses on early-stage data and machine-intelligence investments. Shivon has led 12 investments since launch. One, Newsle, was acquired by LinkedIn; others include Context Relevant, Alation, and InfluxDB. She recently released a... Read More.

Nina Zumel
Nina Zumel (Win-Vector LLC)
R Day (Full Day) Tutorial

Nina Zumel is cofounder and principal at Win-Vector LLC, a data science consultancy based in San Francisco. She frequently writes and speaks on statistics and machine learning. She is also the coauthor of the popular book Practical Data Science with R (Manning 2014).