Presented By
O’Reilly + Cloudera
Make Data Work
March 25-28, 2019
San Francisco, CA

Apache Druid autoscale-out/in for streaming data ingestion on Kubernetes

Jinchul Kim (SK Telecom)
4:40pm-5:20pm Thursday, March 28, 2019
Average rating: 2.17 (6 ratings)

Who is this presentation for?

  • Software engineers, cloud engineers, and product managers

Level

Intermediate

Prerequisite knowledge

  • Familiarity with distributed computing and cloud systems

What you'll learn

  • Learn how SK Telecom does elastic scaling on Kubernetes and why this approach is better than using Apache Druid's autoscaling feature

Description

Ingesting and processing streaming data plays an increasingly significant role in the telecommunications industry. SK Telecom, Korea’s number-one telecommunications provider, is always working on using infrastructure resources more efficiently. Apache Druid supports autoscaling for data ingestion, but it’s only available on AWS EC2, so you can’t rely on the feature in a private cloud.

Jinchul Kim demonstrates autoscale-out/in on Kubernetes, details the benefits of this approach, and discusses the development of Druid Helm charts, rolling updates, and custom metric usage for horizontal autoscaling. This approach is better than Druid’s scaling implementation because it can be used anywhere from private clouds to (managed) Kubernetes in Azure, AWS, and GKE. And while starting and terminating an AWS EC2 instance takes a few minutes, this approach takes only a few seconds. Finally, the scaling mechanism is decoupled from Druid’s source code.
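
The abstract does not include the autoscaling configuration itself. As a rough illustration of how horizontal autoscaling on a custom metric decides on a replica count, the sketch below follows the calculation documented for the Kubernetes Horizontal Pod Autoscaler: scale the current replica count by the ratio of the observed metric to its target, round up, and clamp to the configured bounds. The function name, the metric (pending ingestion events per pod), and all numbers are hypothetical and are not taken from SK Telecom's setup.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Replica count suggested by the documented HPA formula.

    desired = ceil(current_replicas * current_metric / target_metric)

    All parameter names and defaults are illustrative assumptions.
    """
    ratio = current_metric / target_metric
    # Inside the tolerance band the autoscaler leaves the workload alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured minimum and maximum replica counts.
    return max(min_replicas, min(max_replicas, desired))


# Hypothetical metric: pending ingestion events per pod, with a target of 100.
scale_out = desired_replicas(4, current_metric=200, target_metric=100)
scale_in = desired_replicas(4, current_metric=50, target_metric=100)
no_change = desired_replicas(4, current_metric=105, target_metric=100)
```

With these made-up values, a per-pod backlog at twice the target doubles the ingestion pods, a backlog at half the target halves them, and a 5% deviation stays inside the tolerance band, so nothing changes.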

Jinchul Kim

SK Telecom

Jinchul Kim is a senior software engineer at SK Telecom, where he leads cloud platform development using Kubernetes, Docker, Apache Druid, and Apache Hadoop. He also designed and implemented a Dockerized DevOps framework. Previously, he was a senior software engineer at SAP Labs working on the SAP HANA in-memory engine. Jinchul is a committer to the Apache Impala project.