Presented By O'Reilly and Cloudera
Make Data Work
March 28–29, 2016: Training
March 29–31, 2016: Conference
San Jose, CA

Apache Flink: Streaming done right

Kostas Tzoumas (data Artisans)
2:40pm–3:20pm Thursday, 03/31/2016
IoT and Real-time

Location: 210 D/H
Tags: real-time
Average rating: 4.41 (17 ratings)

Data streaming is gaining popularity because it offers decreased latency, a radically simplified data-infrastructure architecture, and the ability to cope with new data that is generated continuously. Apache Flink realizes this vision with a full-featured stream-processing framework. Flink is used at several companies, including ResearchGate, Bouygues Telecom, the Otto Group, and Capital One, and has a large and active developer community of well over 120 contributors. Kostas Tzoumas offers an overview of Flink and its streaming-first philosophy, as well as the project roadmap and vision: fully unifying the now-separate worlds of “batch” and “streaming” analytics.

Kostas covers Flink’s many features and benefits, including:

  • Easy-to-use APIs embedded in Java and Scala that simplify stream analytics yet provide powerful tools to deal with time and uncertainty
  • Throughput of a million events per second per core
  • Latencies in the millisecond range
  • Exactly-once consistency guarantees and the ability to realize distributed transactional data movement between systems (e.g., between Kafka and HDFS)
  • Ease of configuration and separation between application logic and fault tolerance via a novel asynchronous checkpointing algorithm
  • No single point of failure
  • Integration with popular open source infrastructure (e.g., Hadoop, HBase, Kafka, Cascading, and Elasticsearch)
  • Support for event time and out-of-order arrivals with flexible windows, watermarks, and triggers
  • Batch processing as a special case of stream processing, including dedicated libraries for machine learning and graph processing, managed memory on- and off-heap, and query optimization
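
To make the event-time bullet above concrete, here is a minimal sketch in plain Python of tumbling event-time windows driven by a watermark. It does not use Flink's API. The function name, the parameters, and the rule "watermark = largest timestamp seen minus a fixed out-of-orderness bound" are assumptions chosen for illustration; Flink's own windowing, watermark, and trigger mechanisms are considerably richer.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def tumbling_event_time_counts(
    events: Iterable[Tuple[int, str]],
    window_size: int,
    max_out_of_orderness: int,
) -> List[Tuple[int, int, int]]:
    """Count events per tumbling event-time window (hypothetical helper).

    Each event is (event_timestamp, payload). Events may arrive out of order.
    Returns (window_start, window_end, count) tuples in the order windows fire.
    """
    open_windows: Dict[int, int] = defaultdict(int)  # window_start -> count
    fired: List[Tuple[int, int, int]] = []
    watermark = float("-inf")  # "no event older than this is still expected"

    for ts, _payload in events:
        start = ts - ts % window_size
        end = start + window_size

        # The event's window has already fired: treat it as late and drop it.
        if end <= watermark:
            continue

        open_windows[start] += 1

        # Assumed watermark rule: trail the largest timestamp seen by a bound.
        watermark = max(watermark, ts - max_out_of_orderness)

        # Fire every open window whose end the watermark has passed.
        ready = sorted(s for s in open_windows if s + window_size <= watermark)
        for s in ready:
            fired.append((s, s + window_size, open_windows.pop(s)))

    # End of stream: flush the windows that are still open.
    for s in sorted(open_windows):
        fired.append((s, s + window_size, open_windows[s]))
    return fired
```

With a window size of 10 and an out-of-orderness bound of 3, an event stamped 9 that arrives after one stamped 11 is still counted in the window [0, 10), while an event stamped 2 that arrives after one stamped 25 is dropped as late. The bound trades latency (windows fire later) against completeness (fewer dropped events).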

Kostas Tzoumas

data Artisans

Kostas Tzoumas is a PMC member of the Apache Flink project and cofounder of data Artisans, the company founded by the original development team that created Flink. Kostas has spoken extensively about Flink, including at Hadoop Summit San Jose 2015.