Reynold Xin
Cofounder, Databricks

Reynold Xin is a PhD student in the AMP Lab at UC Berkeley. He leads the research and development of two open source systems: Shark, an analytical SQL engine that is up to 100x faster than Apache Hive, and SparkGraph, a distributed graph computation engine. He received Best Demo Awards at SIGMOD 2012 and VLDB 2011. Before graduate school, he worked on ads infrastructure at Google and distributed databases at IBM.

Sessions

Hadoop and Beyond
GA Ballroom K
Tutorial. Please note: to attend, your registration must include Tutorials on Tuesday.
Sameer Agarwal (UC Berkeley), Tathagata Das (Databricks), Ali Ghodsi (UC Berkeley), Ion Stoica (UC Berkeley), Ameet Talwalkar (Carnegie Mellon University and Determined AI), Reynold Xin (Databricks), Matei Zaharia (Databricks), Joseph Gonzalez (UC Berkeley)
Average rating: 4.29 (7 ratings)
3 hours: An introduction to the newest components of the open-source Berkeley Data Analytics Stack (BDAS) in development at UC Berkeley, with an overview of the existing ones. BlinkDB is a SQL engine that provides fast, approximate results for distributed queries. MLbase includes a library that makes machine learning at scale easy. Tachyon is a file system that provides memory-speed sharing across frameworks.
Hadoop and Beyond
GA Ballroom K
Tutorial. Please note: to attend, your registration must include Tutorials on Tuesday.
Andy Konwinski (Databricks), Sameer Agarwal (UC Berkeley), Tathagata Das (Databricks), Ameet Talwalkar (Carnegie Mellon University and Determined AI), Shivaram Venkataraman (UC Berkeley), Patrick Wendell (Databricks), Reynold Xin (Databricks), Matei Zaharia (Databricks), Joseph Gonzalez (UC Berkeley), Haoyuan Li (Alluxio)
Average rating: 3.10 (10 ratings)
3 hours: Get hands-on training with the newest components of the open-source Berkeley Data Analytics Stack (BDAS). Lessons will cover BlinkDB, MLbase, Spark, Spark Streaming, and Shark. We will provide each audience member with an EC2 cluster and walk through hands-on exercises that use these technologies to analyze real-world datasets.
Hadoop and Beyond
GA Ballroom J
Reynold Xin (Databricks), Sameer Agarwal (UC Berkeley)
Average rating: 3.50 (6 ratings)
BlinkDB is an approximate query engine that answers queries in seconds on extremely large datasets by leveraging data sampling. It exploits advances in machine learning and distributed query processing to let users trade off response time against accuracy. BlinkDB is being integrated into Shark and Presto. We will cover real-world use cases of BlinkDB at adopters such as Facebook.
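
The trade-off between response time and accuracy that sampling makes possible can be illustrated with a minimal sketch. This is not BlinkDB's implementation or API: the function name `approximate_mean`, its parameters, and the synthetic data are all assumptions made for illustration. It estimates an average from a uniform random sample and reports an approximate 95% error bound based on the normal approximation.

```python
import math
import random
import statistics


def approximate_mean(values, sample_size, z=1.96, seed=0):
    """Estimate the mean of `values` from a uniform random sample.

    Hypothetical helper, not part of BlinkDB. Returns (estimate, margin),
    where margin is the half-width of an approximate 95% confidence
    interval (normal approximation, no finite-population correction).
    """
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)  # sampling without replacement
    estimate = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(sample_size)
    return estimate, z * std_err


if __name__ == "__main__":
    # Synthetic data standing in for a large table column (an assumption).
    data_rng = random.Random(42)
    data = [data_rng.gauss(100.0, 15.0) for _ in range(200_000)]

    # Larger samples cost more to scan but give tighter error bounds.
    for n in (100, 1_000, 10_000):
        est, margin = approximate_mean(data, n)
        print(f"sample size {n}: mean ~ {est:.2f} +/- {margin:.2f}")
```

Because the standard error shrinks with the square root of the sample size, a sample 100 times larger narrows the error bound by roughly a factor of 10. That is the kind of time versus accuracy choice the session description refers to.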