Presented By O'Reilly and Cloudera
Make Data Work
31 May–1 June 2016: Training
1 June–3 June 2016: Conference
London, UK

Tathagata Das
Lead developer of Spark Streaming and software engineer, Databricks

Tathagata Das is an Apache Spark committer and a member of its PMC. He is the lead developer behind Spark Streaming, which he started while a PhD student at the UC Berkeley AMPLab, and he currently works at Databricks. Prior to Databricks, Tathagata worked at the AMPLab, conducting research on data-center frameworks and networks with Scott Shenker and Ion Stoica.

Sessions

11:15–11:55 Thursday, 2/06/2016
Spark & beyond
Location: Capital Suite 13
Level: Non-technical
Tathagata Das (Databricks)
Average rating: 4.08 (12 ratings)
Spark 2.0 is a major milestone for the project. It delivers significant performance improvements and introduces new initiatives to unify stream processing with Spark’s SQL engine. Tathagata Das explores these developments in Spark 2.0 as well as other major initiatives planned for the future.
11:15–11:55 Friday, 3/06/2016
Spark & beyond
Location: Capital Suite 13
Level: Intermediate
Tags: real-time
Tathagata Das (Databricks)
Average rating: 4.38 (8 ratings)
Tathagata Das explains how Spark 2.x takes Spark Streaming to its next stage by extending DataFrames and Datasets to handle streaming data. Streaming Datasets provide a single programming abstraction for batch and streaming data and also bring support for event-time-based processing, out-of-order data, sessionization, and tight integration with nonstreaming data sources.