Presented By O'Reilly and Cloudera
Make Data Work
December 1–3, 2015 • Singapore
Marcel Kornacker

Tech Lead, Cloudera

Marcel Kornacker is a tech lead at Cloudera and the architect of Apache Impala (incubating). Marcel has held engineering jobs at a few database-related startup companies and at Google, where he worked on several ad-serving and storage infrastructure projects. His last engagement was as the tech lead for the distributed query engine component of Google’s F1 project. Marcel holds a PhD in databases from UC Berkeley.


1:30pm–2:10pm Wednesday, 12/02/2015
Data Science and Advanced Analytics
Location: 321-322
Level: Intermediate
Marcel Kornacker (Cloudera), Skye Wanderman-Milne (Cloudera)
Average rating: 3.90 (10 ratings)
In this talk, we will explain how data scientists use nested data structures to increase analytic productivity. Using two well-known relational schemas, TPC-H and Twitter, we will demonstrate how nested schemas simplify data science workloads. We will also outline best practices for converting flat relational schemas into nested ones and give examples of data science-style analysis.