Sameer Agarwal
PhD Student, UC Berkeley

Sameer is a PhD student in the AMPLab at UC Berkeley. He has collaborated with Microsoft researchers on RoPE, an optimizer for parallel executions that has been deployed on production clusters at Microsoft Bing. He completed his undergraduate education in the Department of Computer Science and Engineering at the Indian Institute of Technology, Guwahati in 2009 and was awarded the President of India Gold Medal.

Sessions

Hadoop and Beyond
GA Ballroom K
Tutorial. Please note: to attend, your registration must include Tutorials on Tuesday.
Sameer Agarwal (UC Berkeley), Tathagata Das (Databricks), Ali Ghodsi (UC Berkeley), Ion Stoica (UC Berkeley), Ameet Talwalkar (Carnegie Mellon University | Determined AI), Reynold Xin (Databricks), Matei Zaharia (Databricks), Joseph Gonzalez (UC Berkeley)
Average rating: 4.29 (7 ratings)
3 hours: An introduction to the newest components of the open-source Berkeley Data Analytics Stack (BDAS) in development at UC Berkeley, with an overview of the existing ones. BlinkDB is a SQL engine that provides fast, approximate results for distributed queries. MLbase includes a library that makes machine learning at scale easy. Tachyon is a file system that provides memory-speed sharing across frameworks.
Hadoop and Beyond
GA Ballroom K
Tutorial. Please note: to attend, your registration must include Tutorials on Tuesday.
Andy Konwinski (Databricks), Sameer Agarwal (UC Berkeley), Tathagata Das (Databricks), Ameet Talwalkar (Carnegie Mellon University | Determined AI), Shivaram Venkataraman (UC Berkeley), Patrick Wendell (Databricks), Reynold Xin (Databricks), Matei Zaharia (Databricks), Joseph Gonzalez (UC Berkeley), Haoyuan Li (Alluxio)
Average rating: 3.10 (10 ratings)
3 hours: Get hands-on training with the newest components of the open-source Berkeley Data Analytics Stack (BDAS). Lessons will cover BlinkDB, MLbase, Spark, Spark Streaming, and Shark. We will provide each audience member with an EC2 cluster and walk through hands-on exercises that use these technologies to analyze real-world datasets.
Hadoop and Beyond
GA Ballroom J
Reynold Xin (Databricks), Sameer Agarwal (UC Berkeley)
Average rating: 3.50 (6 ratings)
BlinkDB is an approximate query engine that answers queries on extremely large datasets in seconds by running them on data samples. It exploits advances in machine learning and distributed query processing to let users trade off response time against accuracy. BlinkDB is being integrated into Shark and Presto. We will cover real-world use cases of BlinkDB at adopters such as Facebook.
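The trade-off between response time and accuracy described in this abstract can be illustrated with a minimal sketch. It is a generic example of sampling-based estimation and not BlinkDB's implementation or API: the function `approximate_mean`, the synthetic dataset, and the sample sizes are all hypothetical, and the error bound assumes a uniform random sample and a normal approximation.

```python
import math
import random
import statistics


def approximate_mean(values, sample_size, seed=0):
    """Estimate the mean of `values` from a uniform random sample.

    Returns (estimate, half_width), where half_width is an approximate
    95% confidence-interval half-width from the normal approximation.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)
    estimate = statistics.fmean(sample)
    # Standard error of the sample mean; 1.96 is the 95% normal quantile.
    std_error = statistics.stdev(sample) / math.sqrt(sample_size)
    return estimate, 1.96 * std_error


# Hypothetical dataset: synthetic values standing in for a large table column.
data_rng = random.Random(42)
data = [data_rng.gauss(100.0, 15.0) for _ in range(200_000)]

# A small sample is cheaper to scan but gives a wider error bound.
small_est, small_err = approximate_mean(data, 1_000)
# A larger sample costs more work but tightens the bound.
large_est, large_err = approximate_mean(data, 50_000)

print(f"n=1,000:  {small_est:.2f} +/- {small_err:.2f}")
print(f"n=50,000: {large_est:.2f} +/- {large_err:.2f}")
```

Scanning fewer rows shortens the response time at the cost of a wider error bound, which is the kind of trade-off the abstract refers to.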