Presented By O'Reilly and Cloudera
Make Data Work
December 1–3, 2015 • Singapore

Make Tachyon ready for next-gen data center platforms with NVM

Mingfei Shi (Intel), Bin Fan (Alluxio)
4:00pm–4:40pm Thursday, 12/03/2015
Hadoop & Beyond
Location: 328-329
Level: Intermediate
Average rating: 4.00 (3 ratings)

Prerequisite Knowledge

Some basic knowledge of big data software stacks such as Hadoop and Spark.

Description

Next-generation big data engines (Apache Spark, Tez, etc.) are known for the performance boost they get from in-memory computing. However, current memory sizes are far from enough to host a full data set. Non-volatile memory (NVM) has emerged in response to this need, but integrating NVM into a modern big data system is a challenge: for example, the system has to deal with GC overhead, and its APIs may need to be refactored. NVM brings benefits for in-memory and real-time computation, but it also raises new questions about memory management in big data.

In this talk, we present our efforts to build a tiered store in Tachyon, which provides a software solution for next-gen data center platforms with NVM. The tiered store is transparent to the end user but delivers better performance for real-world applications.

Mingfei Shi

Intel

Mingfei Shi is a senior software engineer on Intel’s big data technology team. He is one of the top contributors to the Tachyon project, and also a contributor to the Spark project.

Bin Fan

Alluxio

Bin Fan is a software engineer at Alluxio and a PMC member of the Alluxio project. Previously, Bin worked at Google, building next-generation storage infrastructure, where he won Google’s technical infrastructure award. He holds a PhD in computer science from Carnegie Mellon University.