Data Science on Hadoop: How Cloudera Impala Unlocks New Productivity and Insights

Data Science Hadoop: Tools & Technology, Beekman / Sutton North (NY Hilton)

This talk will cover which tools and techniques work well, and which do not, for data scientists working on Hadoop today, and how Cloudera Impala increases the productivity of data science and analysis on Hadoop. Cloudera Impala builds on experience and leading-edge technology from big data systems at Facebook, Google, and Yahoo.

Justin Erickson


Justin Erickson is a senior director of product management leading Cloudera’s platform team, which is responsible for the components of Cloudera’s Distribution Including Apache Hadoop (CDH) that sit above the storage layer. Previously, he led the high-availability and disaster-recovery areas of Microsoft SQL Server.

Marcel Kornacker


Marcel Kornacker is a tech lead at Cloudera and the architect of Apache Impala (incubating). Marcel has held engineering jobs at a few database-related startup companies and at Google, where he worked on several ad-serving and storage infrastructure projects. His last engagement was as the tech lead for the distributed query engine component of Google’s F1 project. Marcel holds a PhD in databases from UC Berkeley.

Comments on this page are now closed.


David Magaha
11/11/2012 4:29am EST


Are you publishing the materials in PDF format? What about the source Hive code and the configuration information?