Fueling innovative software
July 15-18, 2019
Portland, OR

Open source streaming analytics with the Kafka, Flink, Cassandra (KFC) stack

Bas Geerdink (Aizonic)
2:35pm-3:15pm Wednesday, July 17, 2019
Secondary topics: Data Driven

Who is this presentation for?

  • Software and solution architects and anyone interested in fast data and streaming analytics

Level

Intermediate

Description

Streaming analytics (or fast data) is an increasingly popular subject in enterprise organizations because customers want real-time experiences, such as notifications and advice based on their own online behavior and on other users’ actions. A typical streaming analytics solution follows a “pipes and filters” pattern that consists of three main steps: detecting patterns in raw event data (complex event processing), evaluating the outcomes with the aid of business rules and machine learning algorithms, and deciding on the next action.
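The three steps of the "pipes and filters" pattern can be illustrated with a minimal, in-memory sketch. It is not taken from the talk or from Styx (whose code is Scala on Kafka, Flink, and Cassandra); the `Event` and `Pattern` types, the failed-login rule, the scoring formula, and the action names are all hypothetical and chosen only to make the flow concrete.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Event:
    user: str
    kind: str
    timestamp: int  # seconds, hypothetical unit


@dataclass(frozen=True)
class Pattern:
    user: str
    name: str
    count: int


def detect_patterns(events: Iterable[Event], threshold: int = 3,
                    window: int = 60) -> Iterator[Pattern]:
    """Step 1 (complex event processing): emit a pattern when a user
    has `threshold` failed logins within `window` seconds."""
    recent: Dict[str, List[int]] = {}
    for e in events:
        if e.kind != "login_failed":
            continue
        # Keep only this user's failures that are still inside the window.
        times = [t for t in recent.get(e.user, []) if e.timestamp - t < window]
        times.append(e.timestamp)
        recent[e.user] = times
        if len(times) >= threshold:
            yield Pattern(e.user, "repeated_login_failure", len(times))
            recent[e.user] = []  # start counting afresh after a match


def evaluate(patterns: Iterable[Pattern]) -> Iterator[Tuple[Pattern, float]]:
    """Step 2: score each pattern. A fixed rule stands in for the
    business rules or machine learning model (assumption)."""
    for p in patterns:
        yield p, min(1.0, p.count / 5)


def decide(scored: Iterable[Tuple[Pattern, float]],
           cutoff: float = 0.5) -> Iterator[str]:
    """Step 3: choose the next action from the score."""
    for p, score in scored:
        action = "lock_account" if score >= cutoff else "notify_user"
        yield f"{action}:{p.user}"


def run_pipeline(events: Iterable[Event]) -> List[str]:
    # Each filter consumes the previous filter's output lazily.
    return list(decide(evaluate(detect_patterns(events))))
```

Because each step is a generator that only reads from the one before it, the steps can be developed and replaced independently. In a real deployment the same separation would presumably map onto stream operators rather than Python generators.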

Bas Geerdink details an open source reference solution for streaming analytics that covers many use cases that follow this pattern: actionable insights, fraud detection, log parsing, traffic analysis, factory data, the IoT, and others. The solution is built with the KFC stack: Kafka, Flink, and Cassandra. All source code is written in Scala.

Bas explores a few architectural challenges that arise when dealing with streaming data, such as latency issues, event time versus server time, and exactly-once processing. He provides architectural diagrams, explanations, a demo, and the source code. The solution (“Styx”) is open source and available on GitHub.
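The difference between event time and server time can be shown with a small sketch. It assumes, purely for illustration, that each record carries both the time the event occurred and the time it arrived at the server, and that counts are aggregated in fixed 10-second windows; the function names and sample data are hypothetical and are not taken from Styx.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

WINDOW_SECONDS = 10  # hypothetical window size


def window_start(ts: int, size: int = WINDOW_SECONDS) -> int:
    """Return the start of the fixed-size window that contains `ts`."""
    return ts - ts % size


def count_per_window(records: Iterable[Tuple[int, int]],
                     use_event_time: bool) -> Dict[int, int]:
    """Count records per window.

    Each record is an (event_time, arrival_time) pair in seconds.
    """
    counts: Counter = Counter()
    for event_time, arrival_time in records:
        ts = event_time if use_event_time else arrival_time
        counts[window_start(ts)] += 1
    return dict(counts)


# The third event occurred at t=8 but reached the server late, at t=13.
records = [(1, 2), (4, 5), (8, 13), (12, 14)]

by_event_time = count_per_window(records, use_event_time=True)
by_server_time = count_per_window(records, use_event_time=False)
```

With event time, the late record is still counted in the window in which it actually happened (three records in the first window, one in the second). With server time it moves into the next window (two and two). A streaming engine additionally has to decide how long to wait for late records before closing a window, which this sketch leaves out.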

Prerequisite knowledge

  • A basic knowledge of big data, fast data applications, and application and solution architecture
  • A working knowledge of reference architecture and how to use one

What you'll learn

  • How to set up a streaming analytics solution with the KFC stack, some basic concepts in this field, and an open source technology stack that follows the patterns and principles of the reference architecture

Bas Geerdink

Aizonic

Bas Geerdink is an independent technology lead focusing on AI and big data. He has worked in several industries on state-of-the-art data platforms and streaming analytics solutions, both in the cloud and on premises. Bas has a background in software development, design, and architecture, with broad technical experience ranging from C++ to Prolog to Scala. His academic background is in artificial intelligence and informatics. Bas’s research on reference architectures for big data solutions was published at the IEEE conference ICITST 2013. He occasionally teaches programming courses and is a regular speaker at conferences and informal meetings.