Tathagata Das
Lead developer of Spark Streaming and Software engineer, Databricks

Tathagata Das is a third-year Ph.D. student in the AMP Lab at UC Berkeley, working with Scott Shenker and Ion Stoica. He leads the development of the Spark Streaming project. His research interests include datacenter networks and frameworks for large-scale data processing. Before graduate school, he worked as an Assistant Researcher at Microsoft Research Lab India.

Sessions

Hadoop and Beyond
GA Ballroom K
Tutorial. Please note: to attend, your registration must include Tutorials on Tuesday.
Sameer Agarwal (UC Berkeley), Tathagata Das (Databricks), Ali Ghodsi (UC Berkeley), Ion Stoica (UC Berkeley), Ameet Talwalkar (Carnegie Mellon University | Determined AI), Reynold Xin (Databricks), Matei Zaharia (Databricks), Joseph Gonzalez (UC Berkeley)
Average rating: 4.29 (7 ratings)
3-Hours: An introduction to the newest components of the open-source Berkeley Data Analytics Stack (BDAS) in development at UC Berkeley, with an overview of the existing ones. BlinkDB is a SQL engine that provides fast, approximate results for distributed queries. MLbase includes a library that makes machine learning at scale easy. Tachyon is a file system that provides memory-speed sharing across frameworks.
Hadoop and Beyond
GA Ballroom K
Tutorial. Please note: to attend, your registration must include Tutorials on Tuesday.
Andy Konwinski (Databricks), Sameer Agarwal (UC Berkeley), Tathagata Das (Databricks), Ameet Talwalkar (Carnegie Mellon University | Determined AI), Shivaram Venkataraman (UC Berkeley), Patrick Wendell (Databricks), Reynold Xin (Databricks), Matei Zaharia (Databricks), Joseph Gonzalez (UC Berkeley), Haoyuan Li (Alluxio)
Average rating: 3.10 (10 ratings)
3-Hours: Get hands-on training with the newest components of the open-source Berkeley Data Analytics Stack (BDAS). Lessons will cover BlinkDB, MLbase, Spark, Spark Streaming, and Shark. Each audience member will receive an EC2 cluster, and we will walk through hands-on exercises that use these technologies to analyze real-world datasets.