Presented By
O’Reilly + Cloudera
Make Data Work
March 25-28, 2019
San Francisco, CA

Presto: Tuning performance of SQL-on-anything analytics

Kamil Bajda-Pawlikowski (Starburst), Martin Traverso (Presto Software Foundation)
11:00am-11:40am Thursday, March 28, 2019
Secondary topics: Storage, Streaming, realtime analytics, and IoT
Average rating: 3.33 (3 ratings)

Who is this presentation for?

  • Data architects and engineers, data platform leads, and data analysts

Level

Intermediate

Prerequisite knowledge

  • General familiarity with using SQL with big data

What you'll learn

  • Explore Presto, an SQL-on-anything engine
  • Understand query optimization basics and how to deal with diverse data sources

Description

Presto, an open source distributed SQL engine, is widely recognized for its low-latency queries, high concurrency, and native ability to query multiple data sources. It has been proven at scale in a variety of use cases at Airbnb, Bloomberg, Comcast, Facebook, FINRA, LinkedIn, Lyft, Netflix, Twitter, and Uber. Over the last few years, Presto has seen unprecedented growth in popularity in both on-premises and cloud deployments over object stores, HDFS, NoSQL, and RDBMS data stores.

With the ever-growing list of connectors to new data sources such as Azure Blob Storage, Elasticsearch, Netflix Iceberg, Apache Kudu, and Apache Pulsar, Presto’s recently introduced cost-based optimizer must account for heterogeneous inputs with differing and often incomplete data statistics. Kamil Bajda-Pawlikowski and Martin Traverso explore this topic and detail use cases for Presto across several industries. They also share recent Presto advancements, such as geospatial analytics at scale, and the project roadmap going forward.

Kamil Bajda-Pawlikowski

Starburst

Kamil Bajda-Pawlikowski is cofounder and CTO of enterprise Presto company Starburst. Previously, Kamil was the chief architect at the Teradata Center for Hadoop in Boston, focusing on the open source SQL engine Presto, and the cofounder and chief software architect of Hadapt, the first SQL-on-Hadoop company (acquired by Teradata). Kamil began his journey with Hadoop and modern MPP SQL architectures about 10 years ago during a doctoral program at Yale University, where he co-invented HadoopDB, the original foundation of Hadapt’s technology. He holds an MS in computer science from Wroclaw University of Technology and both an MS and an MPhil in computer science from Yale University.

Martin Traverso

Presto Software Foundation

Martin Traverso is a cofounder of the Presto Software Foundation and one of the original creators of Presto. Previously, he was a software engineer at Facebook, where he led the Presto development team.