Presented By O'Reilly and Cloudera
December 5–6, 2016: Training
December 6–8, 2016: Tutorials & Conference

Alluxio (formerly Tachyon): An open source memory-speed virtual distributed storage system

Jiri Simsa (Alluxio)
2:35pm–3:15pm Wednesday, December 7, 2016
Spark & beyond
Location: Summit 2 Level: Beginner
Average rating: 4.50 (2 ratings)

Prerequisite Knowledge

  • Basic knowledge of the big data stack

What you'll learn

  • Explore Alluxio's use cases, its community, and the value it brings


Alluxio (formerly Tachyon) is a memory-speed virtual distributed storage system. The Alluxio community is one of the fastest-growing open source communities in big data history, with more than 300 developers from over 100 organizations around the world. The Alluxio system has been deployed at a number of companies, including Alibaba, Baidu, Barclays, Intel, Huawei, and Qunar. In some of these deployments, Alluxio has been running in production for over a year, managing petabytes of data. Jiri Simsa offers an overview of Alluxio, covering its use cases, its community, and the value it brings.

In the past year, the Alluxio project has improved substantially in performance and scalability and has been extended with key new features, including tiered storage, transparent naming, and a unified namespace. At the same time, the Alluxio ecosystem has expanded to support more under storage systems and computation frameworks. In particular, Alluxio now supports a wide range of under storage systems, including Amazon S3, Google Cloud Storage, Gluster, Ceph, HDFS, NFS, and OpenStack Swift. These integrations make it possible to use Alluxio in many different environments.

This year, the goal is to make Alluxio accessible to an even wider set of users through a focus on security, new language bindings, and greater stability. In addition, the team is working on new APIs that allow applications to access data more efficiently and to manage data across different under storage systems.

Jiri Simsa


Jiri Simsa is a software engineer at Alluxio and one of the maintainers of, and top contributors to, the Alluxio open source project. Previously, he was a software engineer at Google, where he worked on the distributed framework for the IoT. Jiri holds a PhD in computer science from Carnegie Mellon University, where his work focused on systematic and scalable testing of concurrent systems.