Qunar is a Chinese-language online travel information provider and mainland China search engine for web and mobile users. Qunar’s streaming platform currently processes around 6 billion system log entries (~4.5 TB) daily. Many jobs running on the platform are business critical and therefore impose strict requirements on both stability and latency. For example, real-time user recommendations are generated mainly from log analysis of a user’s click behavior and search patterns. The faster the analysis iterates, the more accurate the feedback Qunar can deliver to users. Low latency and high stability are therefore the top priorities of the system.
Alluxio is the first memory-speed virtual distributed storage system in the world. It provides a unified interface between computation frameworks and underlying storage systems. Data access can be several orders of magnitude faster because of Alluxio’s memory-centric architecture. In addition, Alluxio’s tiered storage, unified namespace, flexible file API, web UI, and command-line tools make it usable across different application scenarios.
Qunar has been running Alluxio in production for over a year. Lei Xu explores how stream processing on Alluxio has led to a 16x performance improvement on average, and a 300x improvement at peak service times, on Qunar’s workloads.
Xueyan Li is a data platform R&D engineer at Qunar, where he is mainly responsible for the continuous integration and development of the Mesos resource management system and the Alluxio distributed memory management system, as well as public data services that support all business lines. His other focuses include the ELK log ETL platform, Spark, Storm, Flink, and Zeppelin. He holds a degree in software engineering from Heilongjiang University.
M.Sc. in Math (Theoretical Computer Science)
2 years of Game Technical Operations experience
Working on Hadoop and Spark clusters since mid-2016