JD.com is China’s largest online retailer and its biggest overall retailer, as well as the country’s biggest internet company by revenue. Currently, JD.com’s big data platform (BDP) runs more than 400,000 jobs daily, processing more than 15 PB of data, on a system with more than 15,000 nodes and a total capacity of 210 PB.
Alluxio, formerly Tachyon, is the world’s first system that unifies disparate storage systems at memory speed. In the big data ecosystem, Alluxio lies between computation frameworks or jobs and various kinds of storage systems. Additionally, Alluxio’s memory-centric architecture enables data access orders of magnitude faster than existing solutions.
Alluxio has run in JD.com’s production environment on 100 nodes for six months. Tao Huang, Mang Zhang, and Bing Bai (白冰) explain how JD.com uses Alluxio to support ad hoc queries and real-time stream computing, relying on Alluxio-compatible HDFS URLs and treating Alluxio as a pluggable optimization component. As one example, the JDPresto framework has seen a 10x performance improvement on average. This work has also extended Alluxio and improved syncing between Alluxio and HDFS to keep the two consistent.
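The idea of keeping HDFS-style URLs while treating Alluxio as an optional layer can be illustrated with a small sketch. This is not JD.com's implementation; the host names, the `to_alluxio_uri` helper, and the list of accelerated path prefixes are all hypothetical, and the sketch assumes the HDFS namespace is mounted at the Alluxio root so that paths map one to one.

```python
from typing import Sequence
from urllib.parse import urlsplit

# Hypothetical master address; 19998 is Alluxio's default master RPC port.
ALLUXIO_MASTER = "alluxio-master.example.com:19998"


def to_alluxio_uri(hdfs_url: str,
                   accelerated_prefixes: Sequence[str],
                   master: str = ALLUXIO_MASTER) -> str:
    """Map an hdfs:// URL to an alluxio:// URI when its path falls under
    an accelerated prefix; otherwise return the URL unchanged.

    Assumption: HDFS is mounted at the Alluxio root, so the path is reused as-is.
    """
    parts = urlsplit(hdfs_url)
    if parts.scheme != "hdfs":
        return hdfs_url  # not an HDFS URL, leave it alone
    for prefix in accelerated_prefixes:
        p = prefix.rstrip("/")
        # Match the prefix itself or anything below it, but not siblings
        # such as /warehouse2 when the prefix is /warehouse.
        if parts.path == p or parts.path.startswith(p + "/"):
            return f"alluxio://{master}{parts.path}"
    return hdfs_url  # fall back to reading directly from HDFS


if __name__ == "__main__":
    print(to_alluxio_uri("hdfs://namenode.example.com:8020/warehouse/orders/part-0",
                         ["/warehouse"]))
```

Because unmatched paths fall through to the original URL, removing a prefix from the list is enough to bypass the cache layer for that data, which is one way to read "pluggable" here.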
Tao Huang is a big data platform development engineer at JD.com, where he mainly develops and maintains the company’s big data platform, using open source projects such as Hadoop, Spark, Alluxio, and Kubernetes. He focuses on migrating Hadoop to a Kubernetes cluster that runs both long-running services and batch jobs, in order to improve cluster resource utilization.
Mang Zhang is a big data platform development engineer at JD.com, where he mainly builds and develops the company’s big data platform, using open source projects such as Hadoop, Spark, Hive, Alluxio, and Presto. He focuses on the big data ecosystem and is an open source developer who contributes to Alluxio, Hadoop, Hive, and Presto.
Bing Bai (白冰) is a senior big data platform development engineer at JD.com focusing on computation and storage frameworks such as Spark, Hive, Presto, Alluxio, and HDFS. He is experienced in designing and developing architecture for deploying these frameworks into production on large-scale clusters.
©2018, O'Reilly Media, Inc. All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.