Presented By O'Reilly and Cloudera
December 5–6, 2016: Training
December 6–8, 2016: Tutorials & Conference
Singapore

How Alluxio (formerly Tachyon) brings a 300x performance improvement to Qunar’s streaming processing

Xueyan Li (Qunar), Chunming Li (Garena)
11:15am–11:55am Wednesday, December 7, 2016
Spark & beyond
Location: Summit 1 Level: Beginner

Prerequisite Knowledge

  • Familiarity with stream processing and Spark

What you'll learn

  • Learn about stream processing on Alluxio from real-world workloads at Qunar, as well as how to position Alluxio in the streaming architecture

Description

Qunar is a Chinese-language online travel information provider and search engine serving web and mobile users in mainland China. Currently, Qunar’s streaming platform processes around 6 billion system log entries (~4.5 TB) daily. Many jobs running on the platform are business critical and therefore impose strict requirements on both stability and latency. For example, real-time user recommendations are generated mainly from log analysis of a user’s click behavior and search patterns. The faster the analysis iterates, the more accurate the feedback Qunar can deliver to its users. Low latency and high stability are therefore the top priorities for the system.

Alluxio is the first memory-speed virtual distributed storage system in the world. It unifies the interface between computing frameworks and underlying storage systems. Data access can be several orders of magnitude faster because of Alluxio’s memory-centric architecture. In addition, Alluxio’s tiered storage, unified namespace, flexible file API, web UI, and command-line tools improve its usability across different application scenarios.

Qunar has been running Alluxio in production for over a year. Xueyan Li and Chunming Li explore how stream processing on Alluxio has led to a 16x performance improvement on average and a 300x improvement at peak service time on workloads at Qunar.

Xueyan Li

Qunar

Xueyan Li is a data platform R&D engineer at Qunar, where he is mainly responsible for the ongoing integration and development of the resource management system Mesos and the distributed memory management system Alluxio, as well as for shared data services that support all of Qunar’s business lines. His other focuses include the ELK log ETL platform, Spark, Storm, Flink, and Zeppelin. He graduated from Heilongjiang University with a degree in software engineering.

Chunming Li

Garena

M.Sc. in mathematics (theoretical computer science)
Two years of experience in game technical operations
Has worked on Hadoop and Spark clusters since mid-2016