Presented By O'Reilly and Cloudera
Make Data Work
22–23 May 2017: Training
23–25 May 2017: Tutorials & Conference
London, UK

Tuning Impala: The top five performance optimizations for the best BI and SQL analytics on Hadoop

Marcel Kornacker (Cloudera), Mostafa Mokhtar (Cloudera)
14:55–15:35 Wednesday, 24 May 2017
Hadoop platform and applications
Location: Capital Suite 13
Level: Intermediate
Average rating: *****
(5.00, 2 ratings)

Who is this presentation for?

  • Architects, developers, and software engineers

Prerequisite knowledge

  • A basic understanding of SQL analytics principles

What you'll learn

  • Best practices for design choices involving SQL analytics on Hadoop

Description

When it comes to SQL on Hadoop, it is easy to feel overwhelmed by the number of choices available in tools, file formats, schema design, and configurations. Making good design choices at the start is the best way to avoid some of the common pitfalls. Marcel Kornacker and Mostafa Mokhtar simplify the process and cover the top performance optimizations for Apache Impala (incubating), from schema design and memory optimization to query tuning.

Topics include:

  • SQL on Hadoop: How to pick your tool based on the workload and understand where Hive, Impala, and Spark SQL are best used
  • Requirements and considerations for BI and SQL analytic workloads
  • Schema design
  • Memory usage, cluster size, and hardware recommendations
  • Multitenancy best practices
  • Query tuning basics for Impala
  • Impala performance and benchmarking

Marcel Kornacker

Cloudera

Marcel Kornacker is a tech lead at Cloudera and the architect of Apache Impala (incubating). Marcel has held engineering jobs at a few database-related startup companies and at Google, where he worked on several ad-serving and storage infrastructure projects. His last engagement was as the tech lead for the distributed query engine component of Google’s F1 project. Marcel holds a PhD in databases from UC Berkeley.

Mostafa Mokhtar

Cloudera

Mostafa Mokhtar is a performance engineer at Cloudera. Previously, he held similar roles at Hortonworks and on the SQL Server team at Microsoft.

Comments

Andrzej Jędrzejewski | WEBOPS ENGINEER
29/05/2017 12:24 BST

Hi Guys,
Could you upload your slides, please?