Presented by O'Reilly and Cloudera
Make Data Work

Using Alluxio (formerly Tachyon) to speed up big data analytics

This session will be presented in Chinese

Yupeng Fu (Alluxio), Rong Gu (Nanjing University)
09:00–12:30 Thursday, 2017-07-13
Data engineering and architecture
Location: Function Room 5C
Level: Intermediate

Prerequisite knowledge

A basic understanding of Hadoop and Spark

Materials or downloads needed in advance

A laptop (Linux or macOS) with terminal access and Java installed

What you'll learn

Learn what Alluxio is, how to configure and run Alluxio, and how to build simple applications that benefit from Alluxio

Description

In this three-hour tutorial, Yupeng Fu and Rong Gu offer an overview of Alluxio basics, demonstrating how Alluxio works and how it enables distributed computation engines (like Spark or MapReduce) to share data at memory speed. In the hands-on portion, Yupeng and Rong walk you through deploying and running Alluxio, mounting external storage systems (like S3) into Alluxio's namespace, interacting with Alluxio through its command-line tools and web UI, and building a simple big data application with a common computation framework (e.g., Apache Spark or Hadoop MapReduce) that reads from and writes to Alluxio.

Yupeng Fu


Yupeng Fu is a software engineer at Alluxio and a PMC member of the Alluxio open source project. Previously, Yupeng worked at Palantir, where he led the efforts to build the company’s storage solution. Yupeng holds a BS and an MS from Tsinghua University and has completed coursework toward a PhD at UCSD.

Rong Gu


Rong Gu is a research assistant professor at Nanjing University, an Alluxio PMC member and maintainer, and an Apache Spark contributor. On the Alluxio project, he has worked on the performance evaluation framework Alluxio-Perf, Alluxio ecosystem exploration, and documentation development. Rong has also worked to bridge Spark and Alluxio, contributing the OFF_HEAP storage level feature to Spark 1.0, which allows Spark users to persist RDDs directly into Alluxio. Rong has been invited to share his work at many technical conferences, such as Spark Summit China, InfoQ Club, and Spark Meetup. He is the organizer of the Nanjing Big Data Technology meetup. He is the first author of 10 papers published in TPDS, JPDC, IPDPS, and IEEE Big Data and the author of several chapters in Understanding Big Data: Big Data Processing and Programming and Hadoop in Practice: Open the Shortcut to the Cloud Computing. Rong holds a PhD in computer science from Nanjing University. Over the last three years, he has held internships at several technology companies, including Microsoft, Intel, Baidu, and Transwarp.


