Tech lead at Cloudera for new products. Graduated in 2000 with a PhD in databases from UC Berkeley, followed by engineering jobs at a few database-related startup companies. Marcel joined Google in 2003, where he worked on several ads serving and storage infrastructure projects. His last engagement was as the tech lead for the distributed query engine component of Google’s F1 project.
11:00am–11:40am Thursday, 10/16/2014
Location: Hall A 23/24
Find out how to run real-time analytics over raw data without requiring a manual ETL process targeted at an RDBMS. This talk describes Impala’s approach to on-the-fly data transformation and its support for nested data; examples demonstrate how this can be used to query raw data feeds in formats such as text, JSON and XML, at a performance level commonly associated with specialized engines.