Presented By O'Reilly and Cloudera
Make Data Work
Sept 29–Oct 1, 2015 • New York, NY

What's new in Spark Streaming - a technical overview

Tathagata Das (Databricks)
11:20am–12:00pm Thursday, 10/01/2015
Spark & Beyond
Location: 1 E20 / 1 E21
Level: Advanced
Average rating: 4.20 (15 ratings)

As industry adoption of Spark Streaming grows, so does the community’s demand for more features. Since the beginning of this year, we have made significant improvements in performance, usability, and semantic guarantees. These improvements include:

  • New Kafka integration for exactly-once guarantees
  • Improved Kinesis integration for stronger guarantees
  • Addition of more sources to the Python API
  • Significantly improved UI for better monitoring and debuggability

In this talk, I will discuss these improvements as well as the features we plan to add in the near future.

Tathagata Das

Databricks

Tathagata Das is an Apache Spark committer and a member of the PMC. He is the lead developer behind Spark Streaming, which he started while a PhD student at the UC Berkeley AMPLab, and he currently works at Databricks. Prior to Databricks, Tathagata worked at the AMPLab, conducting research on data-center frameworks and networks with Scott Shenker and Ion Stoica.