A few months ago, Baidu deployed Alluxio to accelerate its big data analytics workload. Bin Fan and Haojun Wang explain why Baidu chose Alluxio, as well as the details of how they achieved a 30x speedup with Alluxio in their production environment with hundreds of machines. Based on the success of the big data analytics engine, Baidu is currently expanding the Alluxio and Spark infrastructure to accelerate other applications, such as machine learning.
Bin and Haojun also delve into how they built a heterogenous computing platform to accelerate deep learning workloads. This platform consists of heterogeneous computing resources (CPU, GPU, FPGA) managed by a heterogeneous computing layer, as well as heterogeneous storage resources (memory, SSD, HDD) managed by Alluxio.
Bin Fan is a software engineer at Alluxio. Bin is one of the top committers on the Alluxio project. Prior to Alluxio, Bin worked at Google building next-generation storage infrastructure, where he won Google’s Technical Infrastructure award. Bin has a PhD in computer science from Carnegie Mellon University.
Haojun Wang is a tech lead on Baidu’s US autonomous driving car team. Currently, Haojun is driving the in-car computing platform and offline data platform. Prior to Baidu, he worked at the IBM Silicon Valley Lab, focusing on database core development and big data processing. Haojun received his PhD in computer science from the University of Southern California.
©2016, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • email@example.com
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse, or review the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.