Make Data Work
Oct 15–17, 2014 • New York, NY

Office Hour with Matei Zaharia, Michael Armbrust, Paco Nathan, and Tathagata Das (Databricks)

Matei Zaharia (Databricks), Michael Armbrust (Databricks), Paco Nathan, Tathagata Das (Databricks)
11:00am–11:40am Thursday, 10/16/2014
Office Hour
Location: Table A

Talk with Apache Spark developers about:

  • The latest updates in Spark
  • Use cases and deployment
  • Getting started in Spark

Matei Zaharia


Matei Zaharia is an assistant professor of computer science at MIT, and the creator of Apache Spark. He is currently on industry leave to start Databricks, a company commercializing Spark, where he is CTO.

Michael Armbrust


Michael Armbrust is the lead developer of the Spark SQL and Structured Streaming projects at Databricks. Michael’s interests broadly include distributed systems, large-scale structured storage, and query optimization. Michael holds a PhD from UC Berkeley, where his thesis focused on building systems that allow developers to rapidly build scalable interactive applications and specifically defined the notion of scale independence.

Paco Nathan

Paco Nathan is known as a “player/coach” with core expertise in data science, natural language processing, machine learning, and cloud computing. He has 35+ years of experience in the tech industry, at companies ranging from Bell Labs to early-stage startups. His recent roles include director of the Learning Group at O’Reilly and director of community evangelism at Databricks and Apache Spark. Paco is the cochair of the Rev conference and an advisor for Amplify Partners, Deep Learning Analytics, Recognai, and Primer. He was named one of the “top 30 people in big data and analytics” in 2015 by Innovation Enterprise.

Tathagata Das


Tathagata Das is an Apache Spark committer and a member of the PMC. He is the lead developer behind Spark Streaming, which he started while a PhD student in the UC Berkeley AMPLab, and is currently employed at Databricks. Prior to Databricks, Tathagata worked at the AMPLab, conducting research about data-center frameworks and networks with Scott Shenker and Ion Stoica.

Comments on this page are now closed.


Chiradeep Chakrabarti
10/14/2014 9:39am EDT

Hi folks, I am from Barclays Investment Bank, and we are planning to use Spark as a caching layer for a subset of our risk data in HDFS, for interactive access from code or BI platforms. I am hearing mixed feedback about where Spark should and should not be used, and it has been suggested specifically that it should not be used as a caching layer. Can you please comment?