Ingesting and processing streaming data plays an increasingly significant role in the telecommunications industry. SK Telecom, Korea’s number-one telecommunications provider, is continually working to use infrastructure resources more efficiently. Apache Druid supports autoscaling for data ingestion, but only on AWS EC2, so you can’t rely on the feature in a private cloud.
Jinchul Kim demonstrates scaling out and in automatically on Kubernetes, details the benefits of this approach, and discusses the development of Druid Helm charts, rolling updates, and the use of custom metrics for horizontal autoscaling. This approach is better than Druid’s scaling implementation because it can be used anywhere, from private clouds to managed Kubernetes offerings on Azure, AWS, and Google Cloud (GKE). And while starting and terminating AWS EC2 instances takes a few minutes, this approach takes only a few seconds. Finally, the scaling mechanism is decoupled from Druid’s source code.
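To make the idea of metric-driven horizontal autoscaling concrete, here is a minimal Python sketch of the replica calculation documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric). It is an illustration only, not the implementation presented in the talk. The function name, the replica bounds, and the example metric (pending ingestion tasks per worker) are assumptions; the abstract does not say which custom metric is used.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Sketch of the HPA-style scaling rule (hypothetical helper).

    current_metric and target_metric are per-replica averages of a custom
    metric, e.g. pending ingestion tasks per worker (an assumed metric).
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")

    ratio = current_metric / target_metric

    # If the metric is close enough to the target, do not scale at all.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas

    # Scale proportionally to the ratio, rounding up, then clamp the
    # result to the configured replica bounds.
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # Load is 1.5x the target: scale out from 2 to 3 workers.
    print(desired_replicas(2, 15, 10))
    # Load is half the target: scale in from 4 to 2 workers.
    print(desired_replicas(4, 5, 10))
```

Because a rule like this runs in the Kubernetes control plane and not inside Druid, it matches the abstract's point that the scaling mechanism is decoupled from Druid’s source code.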
Jinchul Kim is a senior software engineer at SK Telecom, where he leads cloud platform development using Kubernetes, Docker, Apache Druid, and Apache Hadoop, and where he designed and implemented a Dockerized DevOps framework. Previously, he was a senior software engineer at SAP Labs working on the SAP HANA in-memory engine. Jinchul is a committer to the Apache Impala project.
©2019, O'Reilly Media, Inc.