Apache Flink (http://flink.incubator.apache.org) is an open source project undergoing incubation at the Apache Software Foundation. Flink is a data analysis engine designed to match Hadoop in reliability and Spark in performance.
The project pushes the technology forward in several ways. Flink is compatible with the Hadoop ecosystem and runs on top of HDFS and YARN. Flink programs are not executed directly; they are first optimized by Flink’s cost-based optimizer, much as SQL engines optimize relational algebra programs. As a result, Flink applications require little (re-)configuration and little maintenance when cluster characteristics change and the data evolves over time.
Flink’s runtime implements a unique approach to memory management: it executes in memory as much as possible and degrades gracefully to disk-based execution when memory runs short. Flink also introduces native closed-loop iteration operators, which make graph analysis and machine learning applications very fast on the platform.
Finally, Flink’s runtime is a true data streaming engine that unifies batch processing and stream processing in a single system. Flink is an active open source project with more than 70 contributors from industry and academia.
Stephan Ewen is one of the originators and committers of the Apache Flink project and is CTO of a Berlin-based startup, where he leads the effort to create a novel distributed system for reliable large-scale data processing.
Stephan holds a Ph.D. from the Berlin University of Technology and is a co-author of the Stratosphere system. He previously worked on data processing technologies at IBM and Microsoft.