Everything open source
May 16–17, 2016: Training & Tutorials
May 18–19, 2016: Conference
Austin, TX

A data-streaming architecture with Apache Flink

Jamie Grier (data Artisans)
5:10pm–5:50pm Thursday, 05/19/2016
Location: Ballroom F
Level: Intermediate
Average rating: 4.33 (6 ratings)

Prerequisite knowledge

Attendees should have a basic familiarity with the applications of data analysis as well as the Hadoop ecosystem.


Data streaming is emerging as a new and increasingly popular architectural pattern for data infrastructure. Data-streaming architectures embrace the fact that data, in practice, never takes the form of static datasets but is produced continuously as streams of events over time. Moving away from centralized “state of the world” databases and warehouses, these applications work directly on the streams of events and on application-specific local state that is an aggregate of the history of events. Among the many disruptive promises of streaming architectures are decreased latency from signal to decision, a unified way of handling real-time and historical data processing, time-travel queries, simple versioning of applications and their state (think Git update/rollback), and simplification of the data processing stack.
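To make the idea of "local state as an aggregate of the history of events" concrete, here is a minimal sketch in plain Python. It is not Flink or Kafka code and does not use their APIs; the Event type, the LOG data, and the function names are all hypothetical, chosen only for illustration. It shows two things: state rebuilt by replaying an event log (optionally only up to a point in time, a crude form of time travel), and per-key totals grouped into fixed-size event-time windows.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Optional, Tuple


class Event(NamedTuple):
    """A hypothetical event record; the fields are illustrative only."""
    timestamp: int  # event time in seconds (assumed unit)
    key: str
    amount: int


def fold_state(events: Iterable[Event], as_of: Optional[int] = None) -> Dict[str, int]:
    """Rebuild per-key totals by replaying the log.

    If as_of is given, events with a later timestamp are skipped, so the
    result is the state as it would have looked at that moment.
    """
    state: Dict[str, int] = defaultdict(int)
    for event in events:
        if as_of is not None and event.timestamp > as_of:
            continue
        state[event.key] += event.amount
    return dict(state)


def tumbling_window_totals(
    events: Iterable[Event], size: int
) -> Dict[Tuple[int, str], int]:
    """Sum amounts per (window start, key) using fixed, non-overlapping windows."""
    windows: Dict[Tuple[int, str], int] = defaultdict(int)
    for event in events:
        # Assign each event to the window that contains its event time.
        window_start = event.timestamp - event.timestamp % size
        windows[(window_start, event.key)] += event.amount
    return dict(windows)


# A small, made-up event log standing in for a durable stream.
LOG = [
    Event(1, "alice", 5),
    Event(4, "bob", 2),
    Event(12, "alice", 3),
    Event(15, "bob", 7),
]

current_state = fold_state(LOG)
state_at_10 = fold_state(LOG, as_of=10)
windowed = tumbling_window_totals(LOG, size=10)
```

Because the state is derived entirely from the log, replaying the same log always yields the same state. That property is what makes ideas such as rollback and historical reprocessing plausible in this style of architecture. A real stream processor would additionally handle out-of-order events, checkpointing, and distribution, none of which this sketch attempts.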

Jamie Grier introduces the data-streaming paradigm, explains the building blocks of data streaming applications, and shows how to build a set of simple but representative applications using Apache Flink and Apache Kafka.

Topics include:

  • Event stream logs
  • Transformations and windows
  • Working with time
  • Application state and consistency

Jamie Grier

data Artisans

Jamie Grier is director of applications engineering at data Artisans, where he helps others realize the potential of Apache Flink in their own projects. Jamie has been working on stream processing for the last decade at companies such as Twitter, Gnip, and Boulder Imaging on projects spanning everything from ultra-high-performance video stream processing to social media analytics.