Presented By O’Reilly and Cloudera
Make Data Work
21–22 May 2018: Training
22–24 May 2018: Tutorials & Conference
London, UK

Kafka streaming microservices with Akka Streams and Kafka Streams

Dean Wampler (Anyscale), Boris Lublinsky (Lightbend)
13:3017:00 Tuesday, 22 May 2018
Average rating: ***..
(3.25, 4 ratings)

Who is this presentation for?

  • Data engineers and architects

Prerequisite knowledge

  • Programming experience, preferably with Java or Scala
  • Familiarity with Kafka, Kafka Streams, and Akka Streams (useful but not required)

Materials or downloads needed in advance

What you'll learn

  • Learn how to combine Kafka with Akka Streams and Kafka Streams to implement stream processing microservices, leveraging the strengths of these tools while avoiding their weaknesses
  • Understand how these libraries compare to Spark Streaming and Flink for stream processing

Description

If you’re building streaming data apps, your first inclination might be to reach for Spark Streaming, Flink, Apex, or similar tools, which run as services to which you submit jobs for execution. But sometimes, writing conventional microservices with embedded stream processing is a better fit for your needs.

Kafka Streams and Akka Streams are both libraries that you integrate into your microservices, which means you must manage their lifecycles yourself, but you also get lots of flexibility to do this as you see fit. Kafka Streams is purpose built for reading data from Kafka topics, processing it, and writing the results to new topics. With powerful stream and table abstractions, and an exactly once capability, it supports a variety of common scenarios involving transformation, filtering, and aggregation. Akka Streams, on the other hand, emerged as a dataflow-centric abstraction for the Akka Actors model, designed for general-purpose microservices, especially when per-event low latency is important, such as for complex event processing, where each event requires individual handling. Because of its general-purpose nature, Akka Streams supports a wider class of application problems and third-party integrations, but it’s less focused on Kafka-specific capabilities.

Dean Wampler and Boris Lublinsky walk you through building streaming apps as microservices using Akka Streams and Kafka Streams. Along the way, Dean and Boris discuss the strengths and weaknesses of each tool for particular design needs and contrast them with Spark Streaming and Flink, so you’ll know when to chose them instead.

Photo of Dean Wampler

Dean Wampler

Anyscale

Dean Wampler is an expert in streaming data systems, focusing on applications of machine learning and artificial intelligence (ML/AI). He’s head of developer relations at Anyscale, which is developing Ray for distributed Python, primarily for ML/AI. Previously, he was an engineering VP at Lightbend, where he led the development of Lightbend CloudFlow, an integrated system for building and running streaming data applications with Akka Streams, Apache Spark, Apache Flink, and Apache Kafka. Dean is the author of Fast Data Architectures for Streaming Applications, Programming Scala, and Functional Programming for Java Developers, and he’s the coauthor of Programming Hive, all from O’Reilly. He’s a contributor to several open source projects. A frequent conference speaker and tutorial teacher, he’s also the co-organizer of several conferences around the world and several user groups in Chicago. He earned his PhD in physics from the University of Washington.

Photo of Boris Lublinsky

Boris Lublinsky

Lightbend

Boris Lublinsky is a principal architect at Lightbend, where he specializes in big data, stream processing, and services. Boris has over 30 years’ experience in enterprise architecture. Previously, he was responsible for setting architectural direction, conducting architecture assessments, and creating and executing architectural road maps in fields such as big data (Hadoop-based) solutions, service-oriented architecture (SOA), business process management (BPM), and enterprise application integration (EAI). Boris is the coauthor of Applied SOA: Service-Oriented Architecture and Design Strategies, Professional Hadoop Solutions, and Serving Machine Learning Models. He’s also cofounder of and frequent speaker at several Chicago user groups.