Presented By O'Reilly and Cloudera
Make Data Work
September 26–27, 2016: Training
September 27–29, 2016: Tutorials & Conference
New York, NY

Tuning Impala: The top five performance optimizations for the best BI and SQL analytics on Hadoop

Marcel Kornacker (Cloudera), Mostafa Mokhtar (Cloudera)
1:15pm–1:55pm Wednesday, 09/28/2016
Hadoop internals & development
Location: 3D 10 Level: Intermediate
Average rating: 4.80 (5 ratings)

Prerequisite knowledge

  • A basic understanding of SQL analytics principles
What you'll learn

  • Learn best practices for design choices involving SQL analytics on Hadoop
Description

    With SQL-on-Hadoop, it is easy to feel overwhelmed by the number of choices in tools, file formats, schema design, and configuration. Making good design choices at the start, however, helps you avoid common pitfalls. Marcel Kornacker and Mostafa Mokhtar simplify the process and cover the top performance optimizations for Apache Impala (incubating), from schema design and memory optimization to query tuning.

    Topics include:

    • SQL-on-Hadoop: Picking your tool based on the workload and understanding where Hive, Impala, and Spark SQL are best used
    • Requirements and considerations for BI and SQL analytic workloads
    • Schema design
    • Memory usage, cluster size, and hardware recommendations
    • Multitenancy best practices
    • Query tuning basics for Impala
    • Impala performance and benchmarking

    Marcel Kornacker

    Cloudera

    Marcel Kornacker is a tech lead at Cloudera and the architect of Apache Impala (incubating). Marcel has held engineering jobs at a few database-related startup companies and at Google, where he worked on several ad-serving and storage infrastructure projects. His last engagement was as the tech lead for the distributed query engine component of Google’s F1 project. Marcel holds a PhD in databases from UC Berkeley.

    Mostafa Mokhtar

    Cloudera

    Mostafa Mokhtar is a performance engineer at Cloudera. Previously, he held similar roles at Hortonworks and on the SQL Server team at Microsoft.

    Comments

    Alex Rivlin
    10/02/2016 8:54pm EDT

    Marcel, Mostafa, can you send the link to the slides, please? -thank you!
    Alex

    Kathy Yu
    09/29/2016 7:54am EDT

    Hi Giuseppe – we post all slides to strataconf.com/slides as soon as we receive them from speakers (if they choose to share them).

    09/29/2016 7:35am EDT

    Are slides available?