Presented By O'Reilly and Cloudera
Make Data Work
March 28–29, 2016: Training
March 29–31, 2016: Conference
San Jose, CA

Securing Apache Kafka

Jun Rao (Confluent)
11:50am–12:30pm Wednesday, 03/30/2016
Security

Location: LL21 B
Tags: real-time
Average rating: 4.33 (15 ratings)

Prerequisite knowledge

Attendees should have a high-level understanding of Kafka and its use cases.

Description

Kafka was developed at LinkedIn in 2010. To encourage adoption, it was originally an open system; developers could easily create new data streams, add data to the pipeline, and read data as it was created. Kafka succeeded brilliantly at encouraging developers to build new data applications, improved the reliability of systems and applications, and helped LinkedIn scale its logging and data infrastructure.

Unfortunately, as Kafka usage grew at LinkedIn (and at other sites), we discovered problems with a totally open system. Developers might inadvertently cause production problems when creating new Kafka streams, engineers might change the configuration of critical systems, and employees might get access to sensitive data. As Kafka has been adopted by larger enterprises with more complex security requirements, we have had to rethink our architecture.

Jun Rao explains how the community has secured Apache Kafka, discussing the threats that Kafka security mitigates, the changes made to Kafka to enable security, and the steps required to secure an existing Kafka cluster.

Topics include:

  • New security features in Kafka 0.9
  • The risks and threats with a distributed data-streaming system
  • Common issues with deploying a secure Kafka system
  • The access control model for Kafka
  • Configuring authentication, access control, and encryption
  • Using a secure Kafka cluster with other secure (and insecure) systems
  • Testing, monitoring, and tuning a secure Kafka cluster
  • Future work in Kafka security

Jun Rao

Confluent

Jun Rao is a cofounder of Confluent, a company that provides a streaming data platform on top of Apache Kafka. Previously, Jun was a senior staff engineer at LinkedIn, where he led the development of Kafka, and a researcher at IBM’s Almaden Research Center, where he conducted research on databases and distributed systems. Jun is the PMC chair of Apache Kafka and a committer on Apache Cassandra.