Presented By O'Reilly and Cloudera
Make Data Work
September 26–27, 2016: Training
September 27–29, 2016: Tutorials & Conference
New York, NY

Apache Kudu: 1.0 and beyond

Todd Lipcon (Cloudera)
4:35pm–5:15pm Thursday, 09/29/2016
Hadoop internals & development
Location: River Pavilion
Level: Intermediate

Prerequisite knowledge

  • General familiarity with the Apache Kudu (incubating) project (no experience contributing to Kudu or programming with the Kudu APIs is required)

What you'll learn

  • Receive an update on the latest news from Kudu development
  • Learn what to expect in upcoming releases
  • Explore the experience of real-life users who have deployed Kudu in production

Description

    Apache Kudu was first announced as a public beta release at Strata NYC 2015 and recently reached 1.0. This conference marks its one-year anniversary as a public open source project. Todd Lipcon offers a very brief refresher on the goals and feature set of the Kudu storage engine before covering the development that has taken place over the last year, including new features such as improved support for time series workloads, performance improvements, Spark integration, and highly available replicated masters. Along the way, Todd explores real-world production deployments and some of the tools that have been built to help operators manage a Kudu cluster. He ends with a view of the roadmap for the Kudu project over the upcoming year, including plans for security and other new features.

    Todd Lipcon


    Todd Lipcon is an engineer at Cloudera, where he primarily contributes to open source distributed systems in the Apache Hadoop ecosystem. Previously, he focused on Apache HBase, HDFS, and MapReduce, where he designed and implemented redundant metadata storage for the NameNode (QuorumJournalManager), ZooKeeper-based automatic failover, and numerous performance, durability, and stability improvements. In 2012, Todd founded the Apache Kudu project and has spent the last three years leading this team. Todd is a committer and PMC member on Apache HBase, Hadoop, Thrift, and Kudu, as well as a member of the Apache Software Foundation. Prior to Cloudera, Todd worked on web infrastructure at several startups and researched novel machine learning methods for collaborative filtering. Todd holds a bachelor’s degree with honors from Brown University.