Make Data Work
Oct 15–17, 2014 • New York, NY

Strata + Hadoop World Speakers

New speakers are added continuously. Please check back to see the latest updates to the program.

Carol Mcdonald (MapR Technologies), @caroljmcdonald

Carol Mcdonald is a solutions architect at MapR focusing on big data, Apache HBase, Apache Drill, Apache Spark, and machine learning in healthcare, finance, and telecom. Previously, Carol worked as a Technology Evangelist for Sun,... Read More.

Michael Abbott (Stanford University)

Mike Abbott is a general partner at Kleiner Perkins Caufield & Byers, where he focuses on investments in the firm’s digital practice, helping entrepreneurs in the social, mobile, and cloud computing sectors rapidly scale teams... Read More.

Lior Abraham (Interana Inc), @lemon999

Spent 5+ years scaling Facebook’s infrastructure and building the most popular data analysis platform called Scuba, and 5+ years working on data products and tools at Veritas Software. Founder of a next generation analytics company,... Read More.

Jim Adler (Metanautix), @jim_adler
Wonk, Meet Geek Session

Jim is a business executive, entrepreneur, and thought leader on big data, privacy, security, and voting systems. Currently, Jim is Vice President of Products and Chief Privacy Officer at Metanautix. He also serves on... Read More.

Joseph Adler (Facebook), @jadler

Joseph Adler has many years of experience in data mining and data analysis at companies including LinkedIn, DoubleClick, American Express, and VeriSign. He graduated from MIT with a B.Sc. and M.Eng in Computer Science... Read More.

Sumeet Agrawal (Informatica)

Sumeet Kumar Agrawal is a Principal Product Manager for the Big Data Edition product at Informatica. Based in the Bay Area, Sumeet has over 7 years of experience working in different Informatica technologies. Sumeet is responsible in... Read More.

John Akred (Silicon Valley Data Science), @BigDataAnalysis

With over 15 years in advanced analytical applications and architecture, John is dedicated to helping organizations become more data-driven. He combines deep expertise in analytics and data science with business acumen and dynamic engineering leadership.

... Read More.

Bharath Aleti (Cisco)

Bharath Aleti is a product manager in Cisco’s Data Center Group and is responsible for driving UCS Big Data Solutions. Prior to joining Cisco, Bharath was responsible for bringing leading solutions to market in... Read More.

Joseph Allaire (RStudio, Inc.)
R Day Tutorial

JJ Allaire is a software engineer and entrepreneur who has created a wide variety of products including ColdFusion, Windows Live Writer, Lose It!, and RStudio.

Alasdair Allan (Babilim Light Industries), @aallan

Alasdair Allan is a director at Babilim Light Industries and a scientist, author, hacker, maker, and journalist. An expert on the internet of things and sensor systems, he’s famous for hacking hotel radios, deploying mesh... Read More.

Julia Angwin (ProPublica), @JuliaAngwin

Julia Angwin is an award-winning investigative journalist at the independent news organization ProPublica. Previously, she was a reporter at the Wall Street Journal, where she led a privacy investigative team that was a finalist for... Read More.

Mike Armstrong (ZestFinance)

Mike Armstrong is the Chief Marketing Officer for ZestFinance, a technology startup that is transforming credit underwriting with big data analysis. He has spent more than 14 years working in financial services and has experience... Read More.

Bahman Bahmani (Rakuten)

Bahman did his PhD at Stanford University, supported by a William R. Hewlett Stanford Graduate Fellowship, and focused on the topic of algorithms for big data applications, in which he is a well-published author in some... Read More.

Vishal Bamba (Transamerica), @vishalbamba

Vishal Bamba is vice president of strategy and architecture at Transamerica Technology, where he leads a team focusing on innovation initiatives within the enterprise. Vishal has over 15 years of experience in distributed systems and... Read More.

Julia Bardmesser

Julia Bardmesser is the Global Head of Business Data Management in Citi’s Chief Data Office. She is responsible for defining requirements for the implementation of a consistent and controlled approach to the development and use of... Read More.

Nenshad Bardoliwalla

Nenshad Bardoliwalla is an executive and thought leader with a proven track record of success leading product strategy, product management, and development in business analytics. He is the co-author of Driven to Perform: Risk-Aware Performance... Read More.

Jodok Batlogg (CRATE Technology GmbH), @jodok

Jodok has deep experience and wide recognition for his expertise in open source and big data. As an early innovator, he started with cloud services in 2006 and entered the world of billions of records many... Read More.

Joy Beatty (Seilevel), @joybeatty

Joy Beatty is a Vice President at Seilevel, a business analysis consulting firm whose mission is to redefine the way software requirements are created. Joy implements new methodologies that improve requirements elicitation and modeling. Her... Read More.

David Beaudoin (Transamerica)

DBA team manager and architect in Transamerica’s Employee Services and Pensions division.

Damian Black (SQLstream Inc)

Damian Black has over twenty-five years in the software infrastructure business. Damian started his career at Hewlett-Packard in HPLABS at their European research labs, and went on to run a European-wide middleware solutions and... Read More.

Ron Bodkin

Ron Bodkin is a technical director on the applied artificial intelligence team at Google, where he provides leadership for AI success for customers in Google’s Cloud CTO office. Ron engages deeply with Global F500... Read More.

Farrah Bostic (The Difference Engine), @farrahbostic

Farrah Bostic is the founder of the Difference Engine, which she created based on her belief that deep understanding of customer needs is essential to growing businesses through great products and services. Farrah has honed... Read More.

Courtney Bowman (Palantir Technologies)

Courtney Bowman is one of Palantir’s in-house Privacy and Civil Liberties specialists, with extensive experience working with local government (including law enforcement, criminal justice, health and social services) to develop technology-driven solutions to information sharing... Read More.

Joseph Bradley (Databricks), @jkbatcmu
Spark Camp Tutorial

Joseph Bradley is a software engineer working on machine learning at Databricks. Joseph is an Apache Spark committer and PMC member. Previously, he was a postdoc at UC Berkeley. Joseph holds a PhD in... Read More.

Matthias Braeger

Matthias Braeger is a Software Engineer (staff) at CERN, the European Organization for Nuclear Research. He is responsible for the Technical Infrastructure Monitoring (TIM) system, which is a 24/7 service used by many different... Read More.

Deb Brooks (FICO)

Working as a Database Engineer with many different DBMSs for over 10 years. Started using Cloudera Hadoop in April of 2012 and working with FICO’s Data Scientists to evolve our products to use the Cloudera... Read More.

Jon Bruner (O'Reilly Media), @jonbruner

Jon Bruner is a data journalist who approaches questions that interest him by writing and coding. Jon is cochair of the O’Reilly Solid conference, focused on the intersection between software and the physical world, and... Read More.

Ryan Brush (Cerner Corporation), @ryanbrush

Ryan is a Distinguished Engineer with Cerner Corporation, one of the leading healthcare IT companies worldwide. He has built infrastructure for healthcare systems over the past decade, and currently is leading the design of Cerner’s... Read More.

Doug Bryan (RichRelevance)

Doug Bryan leads the data science services practice at RichRelevance. Prior to joining RichRelevance he was the VP of Analytics at iCrossing/Core Audience, a digital ad agency and DMP owned by Hearst. Earlier roles... Read More.

Andrea Burbank (Pinterest)

Andrea Burbank works as a data scientist at Pinterest, where she has led A/B testing for the past 2 years. Prior to Pinterest, she worked as a software engineer at Bing and as a natural... Read More.

Yu Cao (EMC)

Yu is an engineering manager at EMC, where he works in a combined role of data scientist, solution architect, and technical leader on customer-specific Big Data solution innovation for verticals like social intelligence,... Read More.

Joe Caserta (Caserta Concepts), @CasertaConcepts

Joe Caserta is president of Caserta Concepts, an award-winning New York-based innovation consulting and technology implementation firm specializing in big data analytics, data warehousing, business intelligence solutions, and helping clients maximize data value. A recognized... Read More.

Jaroslav Cecho (Cloudera)

Jarek Jarcec Cecho is a software engineer at Cloudera, where he develops software to help customers better access and integrate with the Hadoop ecosystem. He has led the Sqoop community in the architecture of the... Read More.

Winston Chang (RStudio)
R Day Tutorial

Winston is a software engineer at RStudio, and holds a Ph.D. in Psychology from Northwestern University. He is a developer for the ggplot2, devtools, shiny, and ggvis packages, and is the author of R Graphics... Read More.

Trina Chiasson (Tableau Software), @trinachi

Trina is the co-founder and CEO of Infoactive, a web app that helps people turn live data into interactive infographics and data visualizations. She is also a 2013-2014 Fellow at the Donald W. Reynolds... Read More.

Vishal Chowdhary (Microsoft)

Vishal Chowdhary has been a Principal Development Lead with the MSR – Microsoft Translator (MT) team for the past 4 years. His team is primarily responsible for the data acquisition and training infrastructure for building... Read More.

Cliff Click (0xdata)

Cliff Click is the CTO and Co-Founder of 0xdata, a firm dedicated to creating a new way to think about web-scale math and real-time analytics. I wrote my first compiler when I was 15... Read More.

Eli Collins (Cloudera), @elicollins
Pax Data Keynote

Eli is Cloudera’s Chief Technologist, currently focused on new technology introduction and strategy. He previously led the team responsible for Cloudera’s Hadoop distribution (CDH), and is an Apache Hadoop committer and PMC member.... Read More.

George Corugedo (RedPoint Global), @RedPointCTO

A mathematician and seasoned technology executive, George Corugedo has over 20 years of business and technical expertise. As co-founder and CTO of RedPoint Global, George is responsible for leading the development of the RedPoint... Read More.

Charlie Crocker (Autodesk)

Charlie is a data geek with 20 years of experience bringing data out of the shadows to drive business value and optimize operational costs. For Autodesk, he is currently working across divisions to identify and... Read More.

Alistair Croll (Solve For Interesting), @acroll

Alistair Croll is an entrepreneur with a background in web performance, analytics, cloud computing, and business strategy. In 2001, he cofounded Coradiant (acquired by BMC in 2011) and has since helped launch Rednod, CloudOps, Bitcurrent,... Read More.

Beau Cronin (Embedding.js), @beaucronin

Beau Cronin is the lead developer for Embedding.js, a library for data-driven immersive environments. Beau cofounded two startups based on probabilistic inference; the second was acquired by Salesforce in 2012. Recently, he has become increasingly... Read More.

Ailey Crow (Pivotal)

Ailey is a Senior Data Scientist at Pivotal Inc focusing on life sciences and healthcare. She holds a Ph.D. in Biophysics from UC Berkeley where her research focused on applying novel atomic force microscopy (... Read More.

Jean-Daniel Cryans

Jean-Daniel Cryans works as a software engineer at Cloudera on the Storage team where he spends his days making Apache HBase better. Previous to that, he worked at StumbleUpon where he also worked on HBase... Read More.

Doug Cutting (Cloudera), @cutting

Doug Cutting is the chief architect at Cloudera and the founder of numerous successful open source projects, including Lucene, Nutch, Avro, and Hadoop. Doug joined Cloudera from Yahoo, where he was a key member of... Read More.

Sabrina Dahlgren (Kaiser Permanente)

Sabrina Dahlgren is a director in charge of strategic analysis at Kaiser Permanente. Her expertise ranges from statistics and economics to project management and computer science. Sabrina has 20 years’ total work experience in leadership... Read More.

Brian Dalessandro (Capital One), @delbrians

Brian d’Alessandro is a Sr Director of data science at Capital One (Financial Services). Brian is also an active professor for NYU’s Center for Data Science graduate degree program. Previously, Brian built and led data... Read More.

Ami Daniel (Windward), @amidaniel1

Ami Daniel is the CEO and co-Founder of Windward Ltd., the world leader in predictive maritime analytics. In 2010, following seven years as a Naval Officer in the Israeli Navy, Daniel and fellow Naval... Read More.

Pravin Darbare (Western Union)

Pravin Darbare is Senior Manager for Data Integration and Data Discovery at Western Union. With over 15 years’ enterprise software experience he has led big data projects that include implementing Cloudera, Informatica, Teradata, Tableau, QlikView,... Read More.

Shirshanka Das (LinkedIn)

Shirshanka is the technical lead for Data Analytics Infrastructure at LinkedIn. He was among the original authors of a variety of open and closed source projects built at LinkedIn, including Databus, Apache Helix and Espresso.... Read More.

Tathagata Das (Databricks)
Spark Camp Tutorial

Tathagata Das is an Apache Spark committer and a member of the PMC. He is the lead developer behind Spark Streaming, which he started while a PhD student in the UC Berkeley AMPLab, and... Read More.

Michael Dauber (Amplify Partners), @dauber

Michael Dauber is a general partner at Amplify Partners. Previously, Mike spent over six years at Battery Ventures, where he led early-stage enterprise investments on the West Coast, including Battery’s investment in a stealth security... Read More.

Allen Day (MapR Technologies), @allenday
Just Enough Math Tutorial

Allen is Principal Data Scientist at MapR Technologies, where he leads interdisciplinary teams to deliver results in fast-paced, high-pressure environments across several verticals in industry. Previously, Allen founded TinyTube Networks which provided the first mobile... Read More.

Michael Armbrust
Spark Camp Tutorial

Michael Armbrust is the lead developer of the Spark SQL project at Databricks. He received his PhD from UC Berkeley in 2013, and was advised by Michael Franklin, David Patterson and Armando Fox. His... Read More.

Michelle Dennedy (McAfee, an Intel Company)

Michelle Finneran Dennedy currently serves as VP and Chief Privacy Officer at McAfee. She is responsible for the development and implementation of McAfee data privacy policies and practices, working across business groups to drive data... Read More.

Manish Devgan (Software AG), @mdevgan

Manish Devgan heads Terracotta Product Management and Strategy. He has more than a decade of experience leading Product Management and R&D at companies building enterprise class technology products and solutions.

Prior to Terracotta, Manish led... Read More.

Barry Devlin (9sight Consulting), @BarryDevlin

Dr. Barry Devlin is a founder of the data warehousing industry, defining its first architecture in 1985. A foremost authority on business intelligence (BI), big data and beyond, he is respected worldwide as a visionary... Read More.

Sunil Dhaliwal (Amplify Partners)

Sunil Dhaliwal is a General Partner with Amplify Partners. He has over 16 years of experience as an early-stage investor and mentor to entrepreneurs. His current portfolio includes AppNeta, BlueData, Conjur, Continuuity, Datadog, Fastly, Wibidata,... Read More.

Anubhav Dhoot (Cloudera)

Anubhav is a software engineer working on Resource Management in Hadoop at Cloudera. Previously he worked for 10 years at Microsoft building different distributed system components across different platforms including Bing, Azure, and AppFabric. His... Read More.

Nick Dimiduk (Hortonworks, Inc), @xefyr

Nick stumbled into HBase in 2008 when his nightly ETL jobs started taking 20+ hours to complete. Since then, he has applied Hadoop and HBase to projects in social media, social gaming, click-stream analysis,... Read More.

Renee DiResta (New Knowledge), @noupside

Renee DiResta is Director of Research at cybersecurity company New Knowledge and a Mozilla Fellow in Media, Misinformation, and Trust. She investigates the spread of disinformation and malign narratives across social networks, and has advised... Read More.

James Dixon (Pentaho)

As CTO at Pentaho, James Dixon is responsible for Pentaho’s architecture and technology roadmap. James has over 15 years of professional experience in software architecture, development and systems consulting. Prior to Pentaho, James held... Read More.

Mark Doms (United States Department of Commerce)

Dr. Mark E. Doms was sworn in as the 11th Under Secretary of Commerce for Economic Affairs on January 3, 2013.

Doms succeeds Dr. Rebecca M. Blank who served as the Acting Secretary of Commerce... Read More.

Anna Dorofiyenko (MarketShare)

Anna is a data practitioner with 20 years of experience in Data Warehousing, BI and analytics in banking, web, marketing research and marketing analytics industries. Currently VP, Data at MarketShare, Anna is responsible for integrating... Read More.

Joseph Dossantos (EMC Consulting)

Joe leads the Enterprise Information Management and Analytics practice for EMC Consulting in the Americas, a practice of consultants that focus on helping clients create the next generation of big data capabilities, including... Read More.

Ted Dunning

Ted Dunning is Chief Application Architect at MapR Technologies and committer and PMC member for the Apache Mahout, Drill and Zookeeper projects and mentor for the Storm and Spark projects.

He was the chief... Read More.

Sastry Durvasula (American Express)

Sastry Durvasula is Vice President and Global Technology Head of Information Management and Digital Capabilities at American Express. In this role, Sastry is responsible for leading the IT strategy and transformational development to power the... Read More.

Helena Edelson

Committer to several open source projects including the Spark Cassandra Connector, Cassandra Kafka Connector, a previous contributor to Akka (2 new features in Akka Cluster), Spring Integration and several others. She is also a speaker... Read More.

Jana Eggers (Nara Logics), @jeggers

Jana Eggers is CEO of Nara Logics, a neuroscience-inspired artificial intelligence company providing a platform for recommendations and decision support. A math and computer nerd who took the business path, Jana has had a... Read More.

Rana el Kaliouby

Rana el Kaliouby is cofounder and CEO of Affectiva—a pioneer in emotion AI, the next frontier of artificial intelligence—where she leads the company’s award-winning emotion recognition technology, built on a science platform that uses... Read More.

Igor Elbert, @ielbert

Mr. Elbert has been dealing with big data for over 20 years. From calculating financial risk for Salomon Brothers to tracking movements of millions of items across the supply chain for major brands, Mr. Elbert... Read More.

Allan Enemark (Accenture)

Industrial and UX designer who believes data should be both insightful and beautiful. When not buried under post-it notes or tangled in wireframes, he is probably tinkering with a 3D printer or out wandering with... Read More.

Hossein Falaki (Databricks Inc.)

Hossein Falaki is a software engineer at Databricks working on the next big thing. Prior to that he was a data scientist at Apple’s personal assistant, Siri. He graduated with a Ph.D. in Computer Science from... Read More.

Victor Fang (Pivotal)

Dr. Chunsheng Victor Fang is a Senior Data Scientist at Pivotal. His Big Data expertise covers verticals, e.g., large-scale machine learning, video analytics, social network analysis, FDA-approved medical imaging algorithms, etc. Prior to... Read More.

Donald Farmer

Donald Farmer is an internationally respected speaker and writer, with 30 years’ experience in data management and analysis. Before joining Qlik, Donald was a leader of the Microsoft Business Intelligence team, working on new products... Read More.

Sameer Farooqui
Spark Camp Tutorial

Sameer Farooqui is a client services engineer at Databricks, where he works with customers on Apache Spark deployments. Sameer works with the Hadoop ecosystem, Cassandra, Couchbase, and general NoSQL domain. Prior to Databricks, he worked... Read More.

Rob Fergus (New York University and Facebook)

Rob Fergus is an Associate Professor of Computer Science at the Courant Institute of Mathematical Sciences, New York University. He is also a Research Scientist at Facebook, working in their AI Research Group. He received... Read More.

Adeen Flinker

Adeen is an advisor for and a research scientist at New York University specializing in data analysis and signal processing. His neuroscience research focuses on the neural mechanisms that support speech and the computational... Read More.

Jake Flomenberg (Accel Partners)

Jake joined Accel in 2012. He has over a decade of experience building innovative software products. He focuses on early stage investments in next generation infrastructure and data-driven services and is part of the team... Read More.

Eric Frenkiel

Eric is the cofounder and CEO of MemSQL, the leader in real-time and historical Big Data analytics based on a distributed in-memory database. Before MemSQL, Eric worked at Facebook on partnership development. He has... Read More.

Adam Fuchs

Adam Fuchs is the Chief Technology Officer and co-founder of Sqrrl. Previously at the National Security Agency, Adam was an innovator and technical director for several database projects, handling some of the world’s largest and... Read More.

Eddie Garcia (Cloudera), @edygarcia

Eddie Garcia leads the company’s customer-facing technical resources including the pre-sales technical team, implementation and deployment engineers, security architects, customer support and the DevOps infrastructure used by Gazzang and our customers. Prior to this role,... Read More.

Amy Gaskins

Amy Gaskins is the founder of Panopticon, a management consulting firm based in Cary with clients around the globe. Before venturing out on her own, Amy was the Big Data Project Director and a member... Read More.

Lars George (Cloudera), @larsgeorge

Author of O’Reilly’s “HBase – The Definitive Guide”.

Ari Gesher (Kairos Aerospace), @alephbass

Ari Gesher is a senior engineer and Engineering Ambassador at Palantir Technologies.

At Palantir Technologies, Ari has split his time between working as a backend engineer on Palantir’s analysis platform, thinking and writing about Palantir’s... Read More.

Anna Gilbert (University of Michigan)

Anna Gilbert received an S.B. degree from the University of Chicago and a Ph.D. from Princeton University, both in mathematics. In 1997, she was a postdoctoral fellow at Yale University and AT&T Labs-Research.... Read More.

P. Taylor Goetz (Hortonworks)

P. Taylor Goetz is an Apache Storm committer and release manager and has been involved with the usage and development of Storm since it was first released as open source in October of 2011. As... Read More.

Ryan Goldman (Cloudera)

Ryan Goldman is a senior product marketing manager at Cloudera, where he is focused on vertical marketing solutions, primarily concentrating on big data applications within healthcare and financial services. He will moderate this panel discussion.

... Read More.

Brett Goldstein (University of Chicago), @bjgol

Brett Goldstein is a leader in Enterprise Architecture, Big Data/Analytics, and Government Technology. He has 15 years of experience in operations, management and leadership in technical environments in both the public and private sector.

Brett... Read More.

Vitaly Gordon (LinkedIn)

Vitaly Gordon is a senior data scientist on the LinkedIn Product Data Science team where he develops data products that most of you use every day. Prior to LinkedIn, Vitaly founded the data science team... Read More.

Alex Gorelik (Waterline Data), @gorelikalex

Alex Gorelik is the founder and CEO of Waterline Data, a startup focused on enhancing the value of Hadoop through data self-service and governance. Alex is a serial entrepreneur and innovator who has spent... Read More.

Mark Grabb (General Electric Global Research Center)

Dr. Mark Grabb is the Technology Director for Analytics at the General Electric Global Research Center in New York. The Analytics Technology Organization includes labs with expertise in Applied Statistics, Applied Mathematics, Quantitative Finance, Operations... Read More.

Brian Granger (Cal Poly San Luis Obispo), @ellisonbg
PyData at Strata Tutorial

Brian Granger is an Associate Professor of Physics at Cal Poly State University in San Luis Obispo, CA. He has a background in theoretical atomic, molecular and optical physics, with a Ph.D. from... Read More.

John Grant (Palantir Technologies)

John joined Palantir Technologies in September 2010 as a Civil Liberties Engineer. Previously, John served for nearly a decade as an advisor in the United States Senate. He earned his law degree from Georgetown shortly... Read More.

Olivier Grisel (Inria & scikit-learn), @ogrisel
PyData at Strata Tutorial

Olivier Grisel is a software engineer at Inria Saclay, France.

He works on scikit-learn, an open source project for machine learning in Python. He also contributes occasional bug fixes to upstream projects in the NumPy... Read More.

Garrett Grolemund (RStudio)
R Day Tutorial

Garrett maintains, the development center for the Shiny R package, and is the author of Hands-On Programming with R as well as Data Science with R, a forthcoming book by O’Reilly Media.

In his... Read More.

Mark Grover

Mark Grover is a committer on Apache Bigtop, a committer and PMC member on Apache Sentry (incubating) and a contributor to Apache Hadoop, Apache Spark, Apache Hive, Apache Sqoop, Apache Pig and Apache... Read More.

Tina Groves

As a senior product manager, Tina Groves leverages her 20+ years in analysis, event processing and information-driven applications to drive big data product strategy in IBM’s Analytics and Information Group. An engaging speaker known for... Read More.

Geoff Guerdat (Gilt Groupe)

Geoff is the Director of Data Engineering at Gilt, an e-commerce shopping website offering members a luxury lifestyle through clothing, home decor, and local activities. Gilt’s Data Engineering team is a technical team responsible for... Read More.

Carlos Guestrin (Apple | University of Washington)

Carlos Guestrin is the Amazon Professor of Machine Learning at the Computer Science & Engineering Department of the University of Washington. He is also a co-founder and CEO of GraphLab Inc.,... Read More.

Aashima Gupta (Kaiser Permanente), @aashima1gupta

Accountable for establishing Digital Technology Incubation functions and solutions. This is a new way of delivering applications at Kaiser Permanente – ability to Incubate and Dogfood applications to accelerate maturity before releasing to the end... Read More.

Joel Gurin (Center for Open Data Enterprise), @joelgurin

Joel Gurin is an expert on consumer issues, information policy, and the application of open data. He is currently Senior Advisor at the Governance Lab at New York University, a center dedicated to improving government... Read More.

Sebastian Gutierrez, @dashingd3js

Sebastian Gutierrez is the Co-Founder of LetsWombat, a product sampling startup, and quite the Data Visualization Communicator. At LetsWombat he leads the technical work and business development. Prior to LetsWombat, Gutierrez worked on Wall Street... Read More.

Chris Harland

Chris Harland is a Data Scientist at Microsoft working on problems in Bing search, Windows, and MSN. He holds a PhD in Physics from the University of Oregon and has worked in a wide... Read More.

Rob Harper (Uncharted), @rdharper

Rob Harper is partner, lead product architect at Uncharted, and has been building technical platforms and products in the visualization industry for a decade. Over the past number of years Rob has been focusing on... Read More.

Guy Harrison (Dell Software), @guyharrison

Guy Harrison is Executive Director of Research and Development at Dell Software. Guy is the author of Oracle Performance Survival Guide (Prentice Hall, 2009) and MySQL Stored Procedure Programming (O’Reilly, with Steven Feuerstein) as well... Read More.

Rachel Hawley

Rachel assists customers in defining their business problems and objectives, and using SAS advanced analytics solutions to help them reach their goals. Her main focuses are SAS In-Memory Analytics solutions and SAS... Read More.

Stefan Heeke, @stefan_heeke

Stefan believes that data well applied has transformational power in any environment: personal, organizational, social. Stefan is an analytics professional with an Economics background and an MBA from the French ESCP-Europe. Stefan worked... Read More.

Jeffrey Heer (Trifacta | University of Washington), @jeffrey_heer

Jeff is Trifacta’s Chief Experience Officer and a Professor of Computer Science at the University of Washington, where he directs the Interactive Data Lab. Jeff’s passion is the design of novel user interfaces for exploring,... Read More.

Joe Hellerstein

Joe is Trifacta’s Chief Executive Officer and a Professor of Computer Science at Berkeley. His career in research and industry has focused on data-centric systems and the way they drive computing. In 2010, Fortune Magazine... Read More.

Andrew Hill

Andrew Hill is the senior scientist at Vizzuality where he explores the future of online mapping to help guide innovation at CartoDB. He is a PhD biologist by training but has been working on maps,... Read More.

Mike Hoskins (Actian Corporation), @MikeHSays

Actian CTO Michael Hoskins directs Actian’s technology innovation strategies and evangelizes game-changing trends in big data, analytics, Hadoop and cloud to give insight into Accelerating Big Data 2.0™. Mike, a Distinguished and Centennial Alumnus... Read More.

Jonathan Hsieh (Cloudera, Inc), @jmhsieh

Software Engineer @ Cloudera. Apache HBase Committer, Apache Flume Founder.

Julian Hyde (Hortonworks), @julianhyde

Founder of Apache Optiq and Pentaho Mondrian; Apache Drill committer; Architect at Hortonworks.

I have previously been a database kernel developer at Oracle, SQLstream, and Pentaho.

Joey is currently focused on the architecture and strategy for the deployment of complex analytic technologies as an Enterprise Technologist at Dell, as part of the Office of the CTO. Joey’s area of... Read More.

Rohit Jain
Rohit Jain (Esgyn)

Rohit Jain is the CTO at Esgyn for Trafodion, a transactional SQL-on-HBase RDBMS. Rohit worked for Hewlett-Packard for 28 years on applications and databases, undertaking such roles as solutions architect, consultant, software... Read More.

Jeroen Janssens
Jeroen Janssens (Data Science Workshops), @jeroenhjanssens

Jeroen Janssens is a senior data scientist at YPlan NYC, tonight’s going out app, where he’s responsible for making event recommendations more personal. Jeroen holds an M.Sc. in Artificial Intelligence from Maastricht University, the... Read More.

David Jonker
David Jonker (Uncharted Software Inc.)

David Jonker is EVP and a founder of Oculus. He is a visual analytics designer and technical architect with twenty years of experience. David is interested in the visual elegance of information and the underlying... Read More.

Rachel Kalmar

Rachel is a neuroscientist who is passionate about making sensor data accessible, actionable, and predictive. How do we take sensor data to the next level, from tracking to actions? Rachel is active in the Bay... Read More.

Sean Kandel
Sean Kandel (Trifacta)

Sean is Trifacta’s Chief Technical Officer. He completed his Ph.D. at Stanford University, where his research focused on user interfaces for database systems. At Stanford, Sean led development of new tools for data transformation and... Read More.

Holden Karau
Holden Karau (Independent), @holdenkarau
Spark Camp Tutorial

Holden Karau is a transgender Canadian software engineer working in the Bay Area. Previously, she worked at IBM, Alpine, Databricks, Google (twice), Foursquare, and Amazon. Holden is the coauthor of Learning Spark, High Performance Spark,... Read More.

Ron Kasabian
Ron Kasabian (Intel)

Ron Kasabian
VP and General Manager, Big Data Solutions
Intel Corporation

Micheál Keane
Micheál Keane (Civis Analytics), @aexia

Micheál helped craft the Obama campaign’s 2008 data strategy – from Iowa to Florida, Micheál was the voter file director that helped lead these states to success. After President Obama took office, he joined Changing... Read More.

Altan Khendup @madmongol
Altan Khendup @madmongol (Teradata Corporation), @AltanKTeradata

Technology professional working in Teradata’s Big Data Center of Excellence focused on creating analytical ecosystem data fabrics that integrate Big Data systems and technology

“Working with businesses to imagine what’s possible and helping them get... Read More.

Amandeep Khurana
Amandeep Khurana (Cloudera)

Amandeep Khurana is a solutions architect at Cloudera, where he’s involved in the entire lifecycle of Hadoop adoption for customers from use-case discovery to taking systems to production. Amandeep is also a coauthor of HBase... Read More.

Liza Kindred (Lullabot)

Business Director / Managing Partner at Lullabot, the world’s funnest Drupal Company. We provide high-level consultation, site architecture, and education for all things Drupal.

It’s a possibility that we are the most kick-ass virtual company... Read More.

Liza Kindred
Liza Kindred (Third Wave Fashion), @lizak

Liza Kindred is an expert on the future of commerce and the founder of fashion tech think tank Third Wave Fashion. She is writing a book about the future of commerce (The Third Wave of... Read More.

Jon Kleinberg
Jon Kleinberg (Cornell University)

Jon Kleinberg is the Tisch University Professor of Computer Science and Information Science at Cornell, where his research focuses on the social and information networks that underpin the Web and other on-line media. He is... Read More.

Martin Kleppmann
Martin Kleppmann (University of Cambridge), @martinkl

Martin is a committer on Apache Samza (a distributed stream processing framework), software engineer at LinkedIn, and author at O’Reilly (currently writing a book on designing data-intensive applications). Previously he co-founded and sold two startups, Rapportive... Read More.

Daniel Koffler
Daniel Koffler (Rio Tinto Alcan), @dkoffler

With over 20 years of experience in enterprise and start-up technology companies, Daniel has recently moved into the world of large enterprise heavy industry operational technologies. As information technology (IT) and operations technology (OT) converge... Read More.

Eugene Kolker
Eugene Kolker (Seattle Children's)

Eugene Kolker is Chief Data Officer at Seattle Children’s which consists of the Hospital, Research Institute and Foundation. Dr. Kolker has over 25 years of experience in multi-disciplinary data analysis, predictive analytics and algorithm development.... Read More.

Samuel Kommu
Samuel Kommu (Cisco Systems)

Samuel Kommu currently works at Cisco Systems in the Application Centric Infrastructure group. Samuel’s prime focus areas are application profiles from a network perspective and network programmability. He is one of the co-authors of “Big... Read More.

Marcel Kornacker
Marcel Kornacker (Cloudera)

Tech lead at Cloudera for new products. Graduated in 2000 with a PhD in databases from UC Berkeley, followed by engineering jobs at a few database-related startup companies. Marcel joined Google in 2003, where he... Read More.

Tim Kraska
Tim Kraska (Brown University), @cloudyminds

Tim Kraska is an Assistant Professor in the Computer Science department at Brown University. Currently, his research focuses on Big Data management for machine-learning and hybrid human/machine database systems. Before joining Brown, Tim Kraska spent... Read More.

Adi Krishnan
Adi Krishnan (Amazon Web Services)

Adi Krishnan is the Sr. Product Manager for Amazon Kinesis, a fully managed service for real-time processing of streaming data at massive scale. In this role he works closely with customers and partners, helps define... Read More.

Philip (Flip) Kromer

I’m a Distinguished Engineer at CSC and co-founder, CTO and chief architect of Infochimps, a CSC Big Data Business, the leading big data platform in the cloud. At Infochimps, a CSC... Read More.

Chi-Yi Kuan
Chi-Yi Kuan (LinkedIn)

Chi-Yi Kuan is director of data science at LinkedIn. He has over 15 years of extensive experience applying big data analytics, business intelligence, risk and fraud management, data science, and marketing mix modeling across various... Read More.

Lenni Kuff
Lenni Kuff (Facebook)

Lenni Kuff is a software engineer at Cloudera working on the Impala project. Lenni graduated from the University of Wisconsin-Madison with degrees in Computer Science and Computer Engineering. Before joining Cloudera he worked at Microsoft... Read More.

Uri Laserson
Uri Laserson (Cloudera), @laserson

Uri Laserson is a data scientist at Cloudera. Previously, he obtained his PhD from MIT developing applications of high-throughput DNA sequencing to immunology. During that time, he co-founded Good Start Genetics, a next-generation... Read More.

Jean Lau (GE Software)

Jean Lau joined GE in August 2012 with 25+ years of experience in enterprise software development, which includes 10+ years in management. She is Engineering Manager, leading a team of talented developers who are building... Read More.

Sasha Laundy
Sasha Laundy (Warby Parker), @sashalaundy

Sasha is a senior data scientist at Warby Parker. Previously, she was a founding data scientist and engineer at Polynumeral, a data science consultancy in New York City. She also worked at Twilio and was... Read More.

Juan Lavista is a Principal Data Scientist at Microsoft, where he works with a team of data scientists searching for insights in petabytes of data. Juan joined Microsoft to work for the Microsoft... Read More.

Xavier Léauté
Xavier Léauté (Confluent), @xvrl

Xavier Léauté is a software engineer at Confluent as well as a founding Druid committer and PMC member. Prior to his current role he headed the backend engineering team at Metamarkets.

George Legendre
George Legendre (IJP Architects London)

George is a principal of IJP, an architectural firm which explores the natural intersections between space, mathematics and computation. IJP has built Henderson Waves, a 1000-foot long bridge in Singapore designed with a... Read More.

Matt LeMay
Matt LeMay (Constellate Data)

Matt LeMay is the co-founder of Constellate Data, where he designs human-centered systems for contextualizing and collaborating around data. In his work as a technology communicator, Matt has designed and led workshops about product management... Read More.

Alisa Lemberg (IDEO)

As a member of IDEO’s Design Research community, Alisa focuses on our emerging hybrid methods, combining qualitative story telling with quantitative reach and validation. She works closely with IDEO’s research vendors to discover new quantitative... Read More.

Josh Levy (Vast)

Josh Levy is Senior Director, Data Science at Vast, where he has built personalized recommenders for homes and used vehicles. Previously he worked at Demand Media, where he developed contextual recommendations for a multimillion document... Read More.

Haoyuan Li
Haoyuan Li (Alluxio), @haoyuan

Haoyuan Li is a Computer Science Ph.D. candidate in AMPLab at UC Berkeley, and he works with Prof. Scott Shenker and Prof. Ion Stoica on big data and cloud computing. He leads Tachyon, an open... Read More.

Lauro Lins (AT&T Labs), @laurolins

Lauro Lins received a BSc/MSc in Computer Science and a PhD in Computational Mathematics from Universidade Federal de Pernambuco (Brazil, 1996-2007). During this period, Lauro also worked as the main software designer and developer of... Read More.

Nicolas Liochon
Nicolas Liochon (Scaled Risk)

Nicolas has stayed focused on the software architecture business at various positions including Head of Architecture at Thomson Reuters for the Risk Management product line. He has been deeply part of the Big Data arena... Read More.

Edy Liongosari
Edy Liongosari (Accenture)

Edy Liongosari is a Managing Director of Accenture Technology Labs. He leads the Infrastructure and Systems Research Group, which covers the Industrial Internet. His teams are actively working on Smart Grid, Intelligent Cities, Autonomous Unmanned... Read More.

Lucian Lita
Lucian Lita (Yoyo Labs), @datariver

Director of data engineering at Intuit. Previously, founder of Level Up Analytics, lead analytics and engineering team at BlueKai, PhD in computer science from Carnegie Mellon

Stephen Lloyd
Stephen Lloyd (Transamerica)

Manager in a BI Architecture group. Responsible for front end tools and user experience. Helping to bring Hadoop into our enterprise. Enjoy data visualization, startups, and new technologies.

Jorge Lopez
Jorge Lopez (Amazon Web Services), @zanilli

Jorge A. Lopez, Director, Product Marketing
With over 14 years of experience in Business Intelligence and Data Integration, Jorge A. Lopez is responsible for product marketing and global education at Syncsort. He also... Read More.

Ben Lorica
Ben Lorica (O'Reilly), @bigdata

Ben Lorica is the chief data scientist at O’Reilly. Ben has applied business intelligence, data mining, machine learning, and statistical analysis in a variety of settings, including direct marketing, consumer and market research, targeted advertising,... Read More.

Anil Madan
Anil Madan (PayPal)

Anil is the Sr. Director of Engineering at PayPal running several of their Online & Offline systems around PayPal Behavioral analytics, Personalization and Marketing. Prior to this he built out eBay’s Big Data Hadoop platform... Read More.

Roger Magoulas
Roger Magoulas (O'Reilly Media), @rogerm

Roger Magoulas is the vice president of O’Reilly Radar. Previously, Roger was the research director at O’Reilly, where he and his team built the company’s analysis infrastructure and provided analytic services and insights on technology-adoption... Read More.

Alisher Maksumov
Alisher Maksumov (GE Software)

Alisher Maksumov is Principal Architect for the Industrial Internet Software Platform at GE Software. He is responsible for architecture, design, and roadmap of Intelligent Machine, Communication and Networking components of the platform. Since joining GE... Read More.

Ted Malaska
Ted Malaska (Capital One), @TedMalaska

Ted has worked on close to 60 clusters across 2-3 dozen clients, with hundreds of use cases. He has 18 years of professional experience working for start-ups, the US government, a number of the... Read More.

Gina Mancuso
Gina Mancuso (LoveThatFit™ (LTF))

Gina Mancuso is the President at LoveThatFit™ (LTF), a virtual fitting technology that addresses a classic dilemma in online shopping: buying clothes that fit. This technology lets shoppers try on clothing... Read More.

Bob Mankoff
Bob Mankoff (The New Yorker Magazine), @BobMankoff

A cartoonist and the cartoon editor of The New Yorker, Bob Mankoff is one of the nation’s leading commentators on the role of humor in American business, politics, and life.

He speaks on the appreciation... Read More.

Laura Manley
Laura Manley (The GovLab at NYU)

Laura Manley is Project Manager at The Governance Lab (GovLab) based at New York University. She developed and coordinates the Open Data 500, the first comprehensive study of U.S. companies that use open... Read More.

Hilary Mason
Hilary Mason (Cloudera Fast Forward Labs), @hmason

Hilary Mason is the Chief Scientist at, where she finds sense in vast data sets. Her work involves both pure research and development of product-focused features.

She’s also a co-founder of HackNY,... Read More.

Michael Massiello Hiskey

Michael is a world traveler who loves technology. An accomplished writer, speaker and blogger, he spends his days Imagineering how Big Data, analytics and the Internet of Things will change the way we live and work.... Read More.

Q McCallum
Q McCallum (@qethanm)

Q Ethan McCallum works as a professional-services consultant, speaker, and writer with a focus on strategic matters around data and technology. He is especially interested in helping companies build and shape their internal... Read More.

Arianna McClain

Arianna McClain is a design researcher – data specialist at IDEO. Arianna works at the intersection of technology, data, and human behavior. She leads hybrid research processes that merge quantitative (data) and qualitative (stories)... Read More.

Dan McClary

Dan McClary currently serves as Principal Product Manager for Big Data and Hadoop at Oracle. Prior to joining Oracle he served as Director of Business Intelligence at Red Robot Labs in Palo Alto, CA. He... Read More.

Patrick McFadin
Patrick McFadin (Datastax)

Patrick McFadin is regarded as a foremost expert for Apache Cassandra and data modeling. As Chief Evangelist for Apache Cassandra and consultant working for DataStax, he has been involved in some of the biggest deployments... Read More.

Stephanie McKinley
Stephanie McKinley (Independent Consultant), @slangenfeld

With over 15 years of data infrastructure and application experience, Stephanie has a track record of bringing new technologies to market and into the hands of business analysts. As VP of Marketing at Trifacta, Stephanie... Read More.

Wes McKinney
Wes McKinney (Two Sigma Investments), @wesmckinn
PyData at Strata Tutorial

Wes McKinney is a software architect at Two Sigma Investments. He is the creator of Python’s pandas library and a PMC member for Apache Arrow and Apache Parquet. He wrote the book Python for... Read More.

Steve McPherson (Amazon Web Services)

Steve McPherson is the Senior Manager of Amazon Elastic MapReduce, a managed Hadoop web service that enables businesses, researchers, data analysts, and developers to easily and cost-effectively process vast amounts of data. Before starting in... Read More.

Miriah Meyer
Miriah Meyer (University of Utah), @miriahmeyer

Miriah Meyer is an associate professor in the School of Computing at the University of Utah, where she runs the Visualization Design Lab. Her research focuses on the design of visualization systems for helping analysts... Read More.

Leo Meyerovich
Leo Meyerovich (Graphistry)

Leo Meyerovich is the founder of Graphistry, which achieves interactive visualizations of 2-5 orders of magnitude more data in web browsers by automatically exploiting client and cloud GPU hardware. Previously, he researched parallel language and browser... Read More.

Lelanie Moll
Lelanie Moll (FICO)

As Director of Database Engineering, provides strategic direction and leadership for dynamic, multi-platform, mission-critical database environments. Focus on internal Cloud Services (PaaS and SaaS) as well as enabling Big Data technologies.

Karen Moon
Karen Moon (Trendalytics), @KarenMoon140

Karen Moon is cofounder and CEO of Trendalytics, a style-centric visual data platform that measures consumer engagement with merchandise trends. Karen has more than 12 years of experience in retail and technology working with... Read More.

Douglas Moore
Douglas Moore (Think Big Analytics), @Douglas_MA

Mr. Moore is Principal Big Data Architect for Think Big Analytics and architect of the system described in this presentation. Mr. Moore’s experience spans 25 years of integrating data collection, analysis, OLTP, OLAP,... Read More.

Will Moss (Airbnb)

Will is a software engineer at Airbnb in San Francisco, CA. Airbnb is a community marketplace for people to list, discover, and book unique accommodations around the world. Before that, he worked for Bump Technologies... Read More.

Silaphet Mounkhaty

I come from a Microsoft SQL Server DBA background. I have been working with the Fraud Analytics team, supporting a Big Data Hadoop cluster for deploying Hadoop ecosystems.

John Mount
John Mount (Win-Vector LLC)

John Mount is a principal consultant at Win-Vector LLC, a San Francisco data science consultancy. John has worked as a computational scientist in biotechnology and a stock-trading algorithm designer and has managed... Read More.

Kathleen Moynahan
Kathleen Moynahan (Accenture Technology Labs), @mary13

I am a multidisciplinary interaction designer, currently researching visual literacy and applying its fundamental principles to data visualization in the Accenture Technology Labs. I love to design engaging, beautiful experiences – from interactive, experimental interfaces... Read More.

Sharmila Mulligan
Sharmila Mulligan (ClearStory Data), @ShahaniMulligan

Sharmila has spent 18+ years building game-changing software companies in a variety of markets. She has been EVP & CMO at numerous software companies, including Netscape, Kiva Software, AOL, Opsware, and Aster... Read More.

Kevin Murray (American Express)

Kevin Murray is VP of Information Management Infrastructure and Integration for American Express. Throughout his 25+ year career he has brought emerging technologies into large enterprises, and most recently launched the Big Data infrastructure platform... Read More.

Paco Nathan
Paco Nathan, @pacoid
Just Enough Math Tutorial
Spark Camp Tutorial

O’Reilly author (Enterprise Data Workflows with Cascading and the new “Just Enough Math”) and a “player/coach” who’s led innovative Data teams building large-scale apps. OSS evangelist for Apache Spark (Databricks), workshop instructor (Global... Read More.

Chris Nauroth
Chris Nauroth (Hortonworks)

Chris Nauroth is a software engineer on the HDFS team at Hortonworks. He is an active contributor across the lowest layers of the Hadoop ecosystem: Hadoop Common, HDFS, YARN, and MapReduce. His... Read More.

Trent Nelson (Continuum Analytics)
PyData at Strata Tutorial

Praveen Neppalli Naga (LinkedIn Corp)

Praveen Neppalli Naga leads LinkedIn’s Distributed Data Analytics team and is responsible for building a distributed infrastructure for all interactive analytics needs at LinkedIn. The infrastructure supports both LinkedIn’s member/customer facing analytics and internal analytics... Read More.

Scott Nicholson
Scott Nicholson (Poynt)

Scott Nicholson most recently was Chief Data Scientist at Accretive Health, where his team built out data infrastructure, predictive analytics and data visualizations to help healthcare providers make better clinical and financial decisions. Before moving... Read More.

Vaibhav Nivargi
Vaibhav Nivargi (ClearStory Data), @vnivargi

Vaibhav Nivargi is Chief Architect and Founder. Prior to ClearStory Data, Vaibhav was one of the first engineers at Aster Data, where he led development of key areas of the product through the acquisition of... Read More.

Michael O'Connell
Michael O'Connell (TIBCO Software Inc.)

Michael O’Connell is Chief Data Scientist at TIBCO Software, developing analytic solutions across a number of industries including Financial Services, Energy, Life Sciences, Consumer Goods & Retail, and Telco, Media & Networks. He has been working on statistical software applications... Read More.

Stephen O'Sullivan
Stephen O'Sullivan (Data Whisperers), @steveos

A leading expert on big data architecture and Hadoop, Stephen brings over 20 years of experience creating scalable, high-availability, data and applications solutions. A veteran of WalmartLabs, Sun and Yahoo!, Stephen leads data architecture and... Read More.

Matthew Ocko
Matthew Ocko (Data Collective), @mattocko

Matt Ocko has three decades of experience as a technology entrepreneur and VC. Over his career, he has invested in Cotendo, Zynga, Facebook, XenSource, UltraDNS, FlashSoft, Fortinet, Aggregate Knowledge, Virtuata, DataMirror, Couchbase, Ayasdi, Kenshoo, D-Wave... Read More.

Travis Oliphant
Travis Oliphant (Anaconda)
PyData at Strata Tutorial

Travis Oliphant has a Ph.D. from the Mayo Clinic and B.S. and M.S. degrees in Mathematics and Electrical Engineering from Brigham Young University. Since 1997, he has worked extensively with Python for numerical and scientific... Read More.

Mike Olson
Mike Olson (Cloudera), @mikeolson

Mike Olson cofounded Cloudera in 2008 and served as its CEO until 2013, when he took on his current role of chief strategy officer. As CSO, Mike is responsible for Cloudera’s product strategy,... Read More.

Sean Owen
Sean Owen (Cloudera), @sean_r_owen

Sean is Director of Data Science for EMEA at Cloudera, helping customers build large-scale machine learning solutions on Hadoop. Previously, Sean founded Myrrix Ltd, producing a real-time recommender and clustering product evolved from Mahout.... Read More.

Emil Ong
Emil Ong (Lookout), @OngEmil

Emil Ong is a Principal Software Engineer at Lookout, focusing on data and services infrastructure for the company’s mobile security offerings. Before arriving at Lookout, he took a brief detour out of modern technology into... Read More.

Nathan Oostendorp is a co-founder of Sight Machine, a data and application platform for manufacturing.

Nathan co-founded, worked 9 years as an architect and director for, and has developed several other successful online... Read More.

Todd Papaioannou (Splunk)

Todd Papaioannou has served as Splunk’s Chief Technology Officer since 2013. Prior to joining Splunk, Mr. Papaioannou was an Entrepreneur-in-Residence at Data Collective Venture Capital. From 2011 to 2013, he served as Chief Executive Officer... Read More.

Kayur Patel
Kayur Patel (Google)
PyData at Strata Tutorial

Kayur Patel makes data science tools easier to use and studies how people apply machine learning to solve problems and build software. Kayur received his PhD in Computer Science and Engineering from the University of... Read More.

Joshua Patterson

Joshua Patterson is a director of AI infrastructure at NVIDIA leading engineering for RAPIDS.AI. Previously, Josh was a White House Presidential Innovation Fellow and worked with leading experts across public sector, private sector,... Read More.

Fernando Perez
Fernando Perez (UC Berkeley and Lawrence Berkeley National Laboratory), @fperez_org
PyData at Strata Tutorial

Fernando Pérez is a research scientist at UC Berkeley, working at the intersection of brain imaging and open tools for scientific computing. He created IPython while a PhD student in Physics at the... Read More.

Claudia Perlich

Claudia Perlich currently acts as chief scientist at Dstillery (previously m6d) and in this role designs, develops, analyzes, and optimizes the machine learning that drives digital advertising. She has published more than 50 scientific articles... Read More.

Adam Pilz
Adam Pilz (SAS)

Adam applies data science techniques to acquire, manipulate, and model data using SAS advanced analytics technologies. He holds a master’s degree in economics from The Ohio State University and a master’s degree in analytics... Read More.

Brigitte Piniewski
Brigitte Piniewski (nonaffiliated)

Brigitte has been a primary care physician, researcher, author and multidisciplinary collaborator for more than two decades. Most recently she led several innovative projects as the Chief Medical Officer for PeaceHealth Laboratories, serving Alaska, Washington... Read More.

Don Pinto
Don Pinto (Couchbase), @NoSQLDon

Don Pinto is a Product Manager at Couchbase and is currently focused on advancing the server capabilities of Couchbase Server including Security. He is extremely passionate about data technology and in the past, has authored... Read More.

Sid Probstein

Sid Probstein is the Chief Technology Officer of Attivio, responsible for product & technology strategy & implementation.

Sid has over 20 years of experience in managing R&D organizations and delivering award-winning, high-value enterprise software and... Read More.

Sam Pullara
Sam Pullara (Sutter Hill Ventures), @sampullara

Sam Pullara is a managing director at Sutter Hill Ventures, where his current directorships include Boxer, FoundationDB, and Wavefront. He is also responsible for a number of the firm’s other investments, including Tomfoolery. Sam joined... Read More.

Karim Qazi
Karim Qazi

Karim Qazi is an accomplished software engineer and technical leader with extensive experience in using Agile and test-driven development best practices to build automated, fault-tolerant, and highly available software.

Xavier Quintuna
Xavier Quintuna (Orange)

Xavier is the principal Big Data architect for Orange Silicon Valley, a subsidiary of the global telecom provider Orange. He has developed Big Data solutions for Call Detail Records, Quality of Service, and CDNs across... Read More.

Mansour Raad

With over 25 years of experience in the IT/GIS field, Mansour is a Cloudera Certified Hadoop Developer, HBase Expert and a BigData advocate within Esri.
He has performed as team lead in architecting... Read More.

Mithun Radhakrishnan (Yahoo! Inc.)

Mithun Radhakrishnan is a committer on the HCatalog project, and a Hive developer at Yahoo. He’s the author of DistCp on Hadoop 0.23+. He’s an erstwhile firmware developer and is prone to flare-ups from C++... Read More.

Sanjay Radia
Sanjay Radia (Hortonworks)

Sanjay is founder and architect at Hortonworks, and an Apache Hadoop committer and member of the Apache Hadoop PMC. Prior to co-founding Hortonworks, Sanjay was the chief architect of core-Hadoop at Yahoo and part... Read More.

Kira Radinsky
Kira Radinsky (eBay | Technion), @kiraradinsky

Kira Radinsky is the chief scientist and director of data science at eBay, where she is building the next-generation predictive data mining, deep learning, and natural language processing solutions that will transform ecommerce. She also... Read More.

Greg Rahn
Greg Rahn (Cloudera), @gregrahn

Greg Rahn has been a performance engineer for over a decade working on both RDBMS and Hadoop SQL engines. He spent eight years running competitive data warehouse benchmarks at Oracle as a member... Read More.

John Rauser
John Rauser (Snapchat), @jrauser

John has been extracting value from large datasets for over 20 years at hedge funds, small data-driven startups, Amazon, Pinterest, and now Snapchat. He has deep experience in machine learning, data visualization, on-line experimentation, website... Read More.

Ben Recht
Ben Recht (University of California, Berkeley)

Ben Recht is an Assistant Professor in the Department of Electrical Engineering and Computer Sciences and the Department of Statistics at the University of California, Berkeley. Ben’s research focuses on scalable computational tools for large-scale... Read More.

Sridhar Reddy
Sridhar Reddy (MapR Technologies)

Sridhar leads the Professional Services organization at MapR, and also helps customers build HBase solutions and migrate data from SQL databases. He also developed the HBase training course. He has extensive experience working with... Read More.

Kim Rees
Kim Rees (Periscopic), @krees

Kim Rees is a founding partner of Periscopic, an award-winning information visualization firm. Their work has been featured in the MOMA, CommArts, PRINT, Adobe Success Stories, and others.

Kim is a prominent... Read More.

David Robson
David Robson (Dell Software)

David Robson is a principal technologist at Dell Software. He is the lead developer of the Dell Oracle connector for Hadoop which is currently in the process of being donated to the Apache SQOOP... Read More.

Michael Rosenbaum
Michael Rosenbaum (Pegged Software)

Mike Rosenbaum is the Founder and CEO of Pegged Software. Pegged applies big data to team assembly in the healthcare industry, and in doing so reduces employee turnover in hospitals and long term care... Read More.

Gilad Rosner
Gilad Rosner (Internet of Things Privacy Forum), @GiladRosner

Dr. Gilad Rosner is a researcher and consultant in the fields of privacy and identity management. He is interested in how society is becoming more electronic, and the ways that social interests like privacy adapt... Read More.

Indrajit Roy
Indrajit Roy (HP Labs)

Indrajit Roy is a principal researcher at HP Labs and part of the HP Vertica engineering team. He builds distributed systems for machine learning and graph analytics. Indrajit has multiple publications in systems research and... Read More.

Majken Sander
Majken Sander (Majken Sander), @majsander

Business Analyst, Business developer and a strongly analytical mind. Has been working with IT, Management Information, Analytics, BI, DW for 20+ years. Keen on everything data, math and ‘data driven’ as a management business principle.

... Read More.

Metro John Schitka

John Schitka is a Solution Marketing Manager on the SAP Big Data Solution Marketing team. His focus in the SAP Big Data arena is largely on Hadoop and SAP HANA smart... Read More.

Peter Schlampp

Peter is the VP of Products at Platfora where he is responsible for development, product design, and the roadmap for Platfora’s innovative business intelligence platform for Hadoop. Peter joined Platfora in October 2011. Prior to... Read More.

Jim Scott
Jim Scott (NVIDIA), @kingmesal

Jim has held positions running Operations, Engineering, Architecture and QA teams. Jim is the cofounder of the Chicago Hadoop Users Group (CHUG), where he has coordinated the Chicago Hadoop community for the past 4... Read More.

Shawn Scully
Shawn Scully (Dato)

Shawn is the Director of Product at GraphLab where he helps make it easy to build cool experiences with data. He is data geeky and loves inspired technologies, businesses, and gadgets. His technical background spans... Read More.

Jonathan Seidman

Jonathan has spent more than 15 years as a software developer, with a focus in the last few years on processing large data sets using tools such as Hadoop. Currently, Jonathan is a Solutions Architect... Read More.

Dafna Shahaf
Dafna Shahaf (The Hebrew University of Jerusalem)

Dafna Shahaf is an Assistant Professor at the School of Computer Science and Engineering at the Hebrew University of Jerusalem.
Her research is about making sense of massive amounts of data. She designs... Read More.

Vin Sharma
Vin Sharma (Intel)

Vin Sharma is responsible for strategic planning for Hadoop at Intel and marketing for the Intel Distribution for Apache Hadoop. In his previous role, Vin helped drive enterprise adoption of Linux, KVM, and OpenStack... Read More.

Jesse Shaw (LexisNexis)

Mr. Shaw is a consulting software engineer at LexisNexis Risk Solutions. He is responsible for leveraging the four-petabyte core of LexisNexis data assets and spearheads big data R&D using the LexisNexis Public... Read More.

Nathan Shetterley

Senior Manager for Accenture’s research into Emerging Data Architectures, Analytics, and Visualizations. Leading multiple teams of researchers and developers who are building data-driven, analytical solutions based on the next generation of distributed, horizontally-scaling data storage... Read More.

Max Shron
Max Shron (Warby Parker), @mshron

Max Shron is the founder of Polynumeral. Polynumeral specializes in making the right connection between business and data problems. They bring together data scientists, software engineers, and academics to translate tough business problems into solutions... Read More.

David Simchi-Levi

David Simchi-Levi is a Professor of Engineering Systems at MIT and Chairman of OPS Rules Management Consultants, an operations strategy consulting company. He is considered one of the premier thought leaders in supply... Read More.

Roy Singh
Roy Singh (Guavus)

Roy Singh is Chief Technology Officer at Guavus, the leading telecommunications data analytics solutions provider. He has been involved in the enterprise technology and data analytics industries for over 20 years. He spent 5 years... Read More.

Sid Sipes
Sid Sipes (SAP)

Sid joined SAP in 2010 via the Sybase Acquisition and has over 25 years of experience in the database technology field, and in particular in the Enterprise Data Warehouse and Large... Read More.

Joseph Sirosh

Joseph Sirosh is the corporate vice president of the Cloud AI Platform at Microsoft, where he leads the company’s enterprise AI strategy and products such as Azure Machine Learning, Azure Cognitive Services, Azure Search, and... Read More.

Laurie Skelly
Laurie Skelly (Datascope Analytics), @laurieskelly

Laurie Skelly is a Data Scientist at Chicago-based data consulting firm Datascope Analytics. She is passionate about tackling the intractable and removing the barriers between the human and the technical components of a problem. Connecting... Read More.

Ed Smith
Ed Smith (AutoTrader)

Ed Smith serves as chief technology officer for AutoTrader Group, the largest automotive marketplace and leading provider of software solutions to auto dealers throughout the U.S.

In this role, Smith leads all of AutoTrader Group’s... Read More.

Sunil Soares
Sunil Soares (Information Asset)

Sunil Soares is the Founder & Managing Partner of Information Asset, LLC. Prior to this role, Sunil was the Director of Information Governance at IBM. He is the author of four books including... Read More.

Leo Spiegel
Leo Spiegel (Pivotal)

Leo Spiegel is the senior vice president of strategy and corporate development at Pivotal. Spiegel is a managing partner at Mission Ventures, a Southern California venture capital firm that invests in early-stage IT companies which... Read More.

Suresh Srinivas
Suresh Srinivas (Hortonworks)

Suresh is an Apache Hadoop committer and member of Apache Hadoop Project Management Committee (PMC). He is a long term active contributor to the Apache Hadoop project and has designed and developed many significant... Read More.

M. C. Srivas
M. C. Srivas (Bridgewater Associates), @mcsrivas

Srivas is CTO and Founder of MapR Technologies. Srivas ran one of the major search infrastructure teams at Google where GFS, BigTable and MapReduce were used extensively. He wanted to provide that powerful... Read More.

Michael Stonebraker

Michael Stonebraker is an adjunct professor at MIT CSAIL and a database pioneer who has been involved with Postgres, SciDB, Vertica, VoltDB, Tamr and other database companies. He co-authored the paper... Read More.

Michael joined Yelp in 2007 as a software engineer to help to rebuild the search engine. Over the years, Michael was promoted to more senior roles and led developer recruiting; as Vice President of Engineering,... Read More.

Siva has led search & analytics engineering at Box for the last 2 years. Before Box, he was a principal engineer on the ad platform team at Yahoo!

Vijay Subramanian (Rent the Runway)

Chief Analytics Officer at Rent the Runway.
Previously Lead Scientist at ProfitLogic/Oracle Retail; PhD in Operations Research from Purdue.

Jagane Sundar
Jagane Sundar (WANdisco)

Jagane Sundar has extensive big data, cloud, virtualization, and networking experience and joined WANdisco through its acquisition of AltoStor, a Hadoop-as-a-Service platform company. Before AltoStor, Jagane was founder and CEO of AltoScale, a Hadoop... Read More.

Mike Sutten
Mike Sutten (Kaiser Permanente)

Mike Sutten joined Kaiser Permanente in 2013 as Chief Technology Officer (CTO) and Senior Vice President. Under his leadership, the CTO organization is focused on setting the future direction for technology across Kaiser... Read More.

Rajiv Synghal
Rajiv Synghal (Kaiser Permanente)

Rajiv Synghal is principal of big data strategy at Kaiser Permanente. Previously, he held delivery and architecture roles in Fortune 100 organizations, including Visa and Nokia, and startups, such as Kivera. An accomplished strategic thinker... Read More.

Ameet Talwalkar
Ameet Talwalkar (Carnegie Mellon University | Determined AI)
Spark Camp Tutorial

Ameet Talwalkar is cofounder and chief scientist at Determined AI and an assistant professor in the School of Computer Science at Carnegie Mellon University. His research addresses scalability and ease-of-use issues in the field of... Read More.

Halle Tecco
Halle Tecco (Rock Health)

Halle Tecco is a Founder & Managing Director of Rock Health, the first seed fund devoted exclusively to digital health companies. She is also an active angel investor in over 30 technology companies including Misfit... Read More.

Andy Terrel
Andy Terrel (NumFOCUS), @aterrel
PyData at Strata Tutorial

Andy is a computational scientist with experience implementing distributed, large data applications. Currently serving as the Chief Science Officer at Continuum Analytics, he leads the Blaze team taking the Python data stack to the next... Read More.

Rasmus Thofte
Rasmus Thofte (Virtusize), @Virtusize

Rasmus Thofte has a background in Digital Marketing, Global Sales, Music Production and Creative writing. His latest endeavour was a four-year run to build up world-renowned fashion sock brand Happy Socks. After having worked as... Read More.

Nellwyn is Director of Analytics at Etsy. She leads a team of data analysts who partner with product, marketing, and engineering to scout, build, instrument and improve Etsy’s product portfolio. Before Etsy, she worked on... Read More.

Michael Thompson (Children's Healthcare of Atlanta)

Kathleen Ting

Kathleen Ting (@kate_ting) is currently a technical account manager at Cloudera where she helps strategic customers deploy and use the Hadoop ecosystem in production. She’s a frequent conference speaker, has contributed to several projects in... Read More.

Kester Tong
Kester Tong (Google)
PyData at Strata Tutorial

I am a software engineer at Google Research. I work on machine learning algorithms and infrastructure, and on a product for collaborative data analysis, coLaboratory.

Matt Turck
Matt Turck (FirstMark Capital), @mattturck

Partner at FirstMark Capital. Previously, Managing Director at Bloomberg Ventures and before that, co-founder of enterprise search software company TripleHop, acquired by Oracle. Organizes Data Driven NYC (largest Big Data monthly event on the... Read More.

Cameron Turner
Cameron Turner (The Data Guild)

Combining an extensive background in product research, data analysis, program management, and software development, Cameron co-founded ClickStream Technologies in 2003, which was acquired by Microsoft in 2009. While at Microsoft, he managed the Windows Telemetry... Read More.

Jen van der Meer
Jen van der Meer (Reason Street), @jenvandermeer

Jen van der Meer is the founder and CEO of Reason Street, where she creates business models for social impact. A former Wall Street analyst and economist, Jen is a data doyen who masters... Read More.

Jake VanderPlas
Jake VanderPlas (eScience Institute, University of Washington)
PyData at Strata Tutorial

Jake VanderPlas is the director of research in the physical sciences at the University of Washington’s eScience Institute, where his research is primarily in the area of data-driven astronomy and astrophysics. In addition, Jake is... Read More.

Pramod Varma

Dr. Pramod Varma is currently the Chief Architect and Technology Advisor to the Unique Identification Authority of India. As Chief Architect of Aadhaar, the world’s largest biometric identity system, he is responsible for the entire system architecture and... Read More.

Shankar Vedantam

Shankar Vedantam is the author of The Hidden Brain: How our Unconscious Minds Elect Presidents, Control Markets, Wage Wars and Save Our Lives. He is also a social science correspondent at NPR – National... Read More.

Sunil Venkayala

Sunil Venkayala is a Senior Technical Product Manager at HP Vertica in Cambridge, Mass. He leads the Distributed R open-source technology initiative and advanced analytics features of the HP Vertica platform. Prior to joining HP, he was... Read More.

Merici Vinton
Merici Vinton (OI Engine @ IDEO), @merici

Merici Vinton was one of the first employees of the Consumer Financial Protection Bureau as digital lead. She assisted the agency’s chief technology officer and its special advisor (and now Senator) Elizabeth Warren with the... Read More.

Bradley Voytek
Bradley Voytek (UC San Diego), @bradleyvoytek

Brad is a professor of computational cognitive science and neuroscience at UC San Diego, and the Data Evangelist for Uber. He makes use of big data, mapping, and simulations to figure out cognition.

He’s created... Read More.

Hanna Wallach (Microsoft Research NYC & University of Massachusetts Amherst), @hannawallach

Hanna Wallach is a researcher at Microsoft Research in New York City and an assistant professor at the University of Massachusetts Amherst’s School of Computer Science, where she is one of five core faculty members... Read More.

Peter Wang
Peter Wang (Anaconda), @pwang
PyData at Strata Tutorial

Peter Wang is the cofounder and CTO of Anaconda, where he leads the product engineering team for the Anaconda platform and open source projects including Bokeh and Blaze. Peter’s been developing commercial scientific computing... Read More.

Tricia Wang
Tricia Wang (Constellate Data), @triciawang

Tricia Wang is a global tech ethnographer. She founded P&L Data, a research consultancy around data science and social science. Through extensive fieldwork in China and Mexico as a Fulbright Fellow and National Science Foundation... Read More.

Mary Ann Wayer
Mary Ann Wayer (Premier Inc)

Principal Solution Architect at Premier Inc. I have 25 years of experience as a software developer.

Daniel Weeks
Daniel Weeks (Netflix)

Daniel Weeks is the tech lead for the Big Data Platform team at Netflix. Prior to joining Netflix, he focused on research in big data solutions and distributed systems.

Patrick Wendell
Patrick Wendell (Databricks)
Spark Camp Tutorial

Patrick Wendell is a cofounder of Databricks as well as a founding committer and PMC member of Apache Spark. Patrick has acted as release manager for several Spark releases in addition to maintaining several... Read More.

Mike Wendt

Mike Wendt is an engineering manager in the AI Infrastructure Group at NVIDIA. His research work has focused on leveraging GPUs for big data analytics, data visualizations, and stream processing. Previously, Mike led engineering... Read More.

Ben Werther
Ben Werther (Platfora), @bwerther

Ben Werther is the Founder & CEO of Platfora. He founded the company in 2011 to realize his vision of how Big Data Analytics will transform the way businesses can use data. Under... Read More.

Brian Whitman

Brian is recognized as a leading scientist in the area of music and text retrieval and natural language processing.

He received his doctorate from MIT’s Media Lab in 2005 in Barry Vercoe’s Machine Listening group... Read More.

David Whittemore
David Whittemore (Clothes Horse), @dwhittemore

David Whittemore is a serial entrepreneur and is currently the co-founder of Clothes Horse, a fashion technology startup in New York focused on helping shoppers buy clothes that fit from online apparel retailers. Before Clothes... Read More.

Hadley Wickham
Hadley Wickham (Rice University / RStudio), @hadleywickham
R Day Tutorial

Hadley Wickham is an Assistant Professor at Rice University and Chief Scientist at RStudio. He is an active member of the R community, has written and contributed to over 30 R packages, and won the... Read More.

Edd Wilder-James

Edd Wilder-James is a strategist at Google, where he is helping build a strong and vital open source community around TensorFlow. A technology analyst, writer, and entrepreneur based in California, Edd previously helped... Read More.

Richard Williamson
Richard Williamson (Silicon Valley Data Science)

Richard has been at the cutting edge of big data since its inception, leading multiple efforts to build multi-petabyte Hadoop platforms, maximizing business value by combining data science with big data. He has extensive experience... Read More.

Chris Wilson
Chris Wilson (L.L.Bean), @LLBean

Chris is SVP of Direct Channel at L.L.Bean. Prior to L.L.Bean, he was CMO at eBags, where he was responsible for growing the business through customer loyalty and acquisition efforts, building the eBags... Read More.

Jonathan Wu
Jonathan Wu (LinkedIn), @jiyewu

Jonathan leads LinkedIn’s Business Analytics Solution team focused on technical solutions. As the team’s technical leader, Jonathan provides end-to-end big data analytics solutions including data integration, data processing and data visualization. His team delivers easy,... Read More.

Yihui Xie
Yihui Xie (RStudio, Inc.), @xieyihui
R Day Tutorial

Yihui Xie got his PhD from the Department of Statistics, Iowa State University. He is interested in interactive statistical graphics, statistical computing, and web applications. He is an active R user and the author of... Read More.

Reynold Xin
Reynold Xin (Databricks)
Spark Camp Tutorial

Reynold Xin is a cofounder and chief architect at Databricks as well as an Apache Spark PMC member and release manager for Spark’s 2.0 release. Prior to Databricks, Reynold was pursuing a PhD at... Read More.

Fangjin Yang
Fangjin Yang (Imply)

Fangjin Yang is a coauthor of the open source Druid project and a cofounder of Imply, a data analytics startup based in San Francisco. Previously, Fangjin held senior engineering positions at Metamarkets and Cisco Systems.... Read More.

Matei Zaharia
Matei Zaharia (Databricks)
Spark Camp Tutorial

Matei Zaharia is an assistant professor of computer science at MIT, and the creator of Apache Spark. He is currently on industry leave to start Databricks, a company commercializing Spark, where he is... Read More.

Jay Zaidi
Jay Zaidi (Fannie Mae)

During a professional career spanning more than 15 years, Jay has conceptualized and led several data management, business transformation and change management programs. He has worked in Management Consulting, Financial Services, and Healthcare domains in... Read More.

Jennifer Zeszut

Jennifer Zeszut is the CEO of Beckon, a marketing analytics company that helps top brands derive actionable insight from cross-channel marketing data. Prior to founding Beckon, Jennifer founded Scout Labs, a breakthrough social media... Read More.

Alice Zheng

Alice is the Director of Data Science at GraphLab, a Seattle-based startup that offers powerful large-scale machine learning and graph analytics tools. She loves playing with data and enabling others to play with data. She... Read More.

Wei Zheng
Wei Zheng (Trifacta)

As VP of Products, Wei combines her passion for technology with experience in Enterprise Software to define and shape Trifacta’s product offerings. Having founded several startups of her own, Wei believes strongly in innovative technology... Read More.

Paul Zikopoulos

Competitive Database and Big Data teams. Paul is an award winning writer and speaker with more than 20 years of experience in Information Management and is seen as a global expert in Big Data and... Read More.

Shivon Zilis
Shivon Zilis (Bloomberg Beta), @shivon

Shivon Zilis is a venture capitalist and founding member of Bloomberg Beta, where she focuses on early-stage data and machine-intelligence investments. Shivon has led 12 investments since launch. One, Newsle, was acquired by LinkedIn; others... Read More.

Monte Zweben
Monte Zweben (Splice Machine Inc.), @mzweben

Monte Zweben is the CEO and co-founder of Splice Machine, provider of the only Hadoop RDBMS. The Splice Machine database is designed to scale real-time applications using commodity hardware without application rewrites.

A... Read More.