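Alluxio (formerly Tachyon) is a virtual distributed storage system that runs at memory speed. It uses memory to store data and to speed up access to data held in different storage systems. Many organizations and applications already use Apache Spark together with Alluxio, and some of them have scaled beyond petabytes of data.
Alluxio can make Spark deployments more effective in both on-premises environments and the public cloud. Alluxio connects Spark applications to a variety of storage systems and further accelerates data-intensive applications, while also providing a unified namespace over the data in those storage systems, which is convenient for application developers. Alluxio also uses memory to hold hot data for applications that need fast access to important data. Although Spark has its own in-memory cache, Alluxio's in-memory storage can further improve Spark applications.
Gene Pang and Bin Fan will explain how Alluxio makes Spark more effective and share production deployment cases where Alluxio and Spark are used together. Gene and Bin will also discuss best practices for using Alluxio with Spark, including RDDs and DataFrames, and for deploying on premises and in the public cloud.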
Alluxio (formerly Tachyon) is a memory-speed virtual distributed storage system that uses memory both to store data and to accelerate access to data held in different storage systems. Many organizations run Alluxio with Apache Spark in production, and some of those deployments scale out to petabytes of data and beyond.
Alluxio can make Spark more effective in both on-premises and public cloud deployments. It bridges Spark applications with a variety of storage systems and further accelerates data-intensive applications. It also provides a unified namespace over the data in those storage systems, which is convenient for application developers. In addition, Alluxio keeps hot data in memory so that applications get fast access to the data that matters most. Although Spark has its own in-memory cache, Alluxio's in-memory storage can further improve Spark applications.
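To make the "unified namespace" idea concrete, here is a minimal toy sketch of a mount table that maps one logical path tree onto several backing stores. It is not Alluxio's API or implementation: the `UnifiedNamespace` class, the host name, and the bucket name are all hypothetical and exist only for illustration.

```python
from typing import Dict


class UnifiedNamespace:
    """Toy mount table (hypothetical): logical path prefixes -> backing-store URIs."""

    def __init__(self) -> None:
        self._mounts: Dict[str, str] = {}

    def mount(self, logical_prefix: str, backing_uri: str) -> None:
        # Normalise so that "/logs/" and "/logs" name the same mount point.
        self._mounts[logical_prefix.rstrip("/") or "/"] = backing_uri.rstrip("/")

    def resolve(self, logical_path: str) -> str:
        # Collect every mount point whose prefix contains the path.
        candidates = [
            p for p in self._mounts
            if p == "/" or logical_path == p or logical_path.startswith(p + "/")
        ]
        if not candidates:
            raise KeyError(f"no mount covers {logical_path}")
        # The longest matching prefix is the most specific mount.
        best = max(candidates, key=len)
        suffix = logical_path if best == "/" else logical_path[len(best):]
        return self._mounts[best] + suffix


# Assumed example stores; neither URI comes from the talk description.
ns = UnifiedNamespace()
ns.mount("/", "hdfs://namenode:8020/warehouse")
ns.mount("/logs", "s3://example-bucket/logs")

logs_uri = ns.resolve("/logs/2017/app.log")
table_uri = ns.resolve("/tables/users")
```

In this sketch an application only ever sees paths such as `/logs/2017/app.log`; which storage system actually holds the bytes is decided by the mount table, so the backing store can change without touching application code.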
Yupeng Fu explains how Alluxio helps Spark be more effective and shares examples of production deployments in which Alluxio and Spark work together. Yupeng also discusses best practices for using Alluxio with Spark, including with RDDs and DataFrames, in both on-premises and public cloud deployments.
Yupeng Fu is a software engineer at Alluxio and a PMC member of the Alluxio open source project. Previously, Yupeng worked at Palantir, where he led the efforts to build the company’s storage solution. Yupeng holds a BS and an MS from Tsinghua University and has completed coursework toward a PhD at UCSD.
View a complete list of Strata Data Conference contacts
©2017, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • firstname.lastname@example.org