Presented by O'Reilly and Cloudera
Make Data Work
July 12-13, 2017: Training
July 13-15, 2017: Tutorials & Conference
Beijing, China

Growing pains: When your big data platform grows really big

This session will be presented in Chinese.

Zhe Zhang (LinkedIn)
09:20–09:35 Friday, 2017-07-14
Location: Grand Hall A (紫金大厅A)
Average rating: 4.50 (2 ratings)

LinkedIn was one of the earliest adopters of big data technologies. In 2008 it began running Hadoop on a single 20-node cluster supporting about 10 users. In less than a decade, the platform has grown nearly 500-fold: by 2016 it spanned more than 10 Hadoop clusters with more than 10,000 nodes in total, supporting more than 1,000 engineers, data scientists, and business analysts running large-scale data analysis. The diversity of workloads has grown even faster: the company began with only MapReduce/Pig jobs but now supports MR, Pig, Hive, Presto, Spark SQL, Spark ML, TensorFlow, Scalding, and Cascading.

Zhe Zhang explains how LinkedIn solves various challenges around scale.

Topics include:

  • System scalability, including resource scheduling and storage
  • How to accommodate vastly different workloads, from quick interactive SQL queries to long-running deep learning jobs
  • Human scalability, including how to hide complexity from service providers and consumers and how to architect data systems to avoid duplicated and short-term efforts

Zhe Zhang


Zhe Zhang is a senior manager of core big data infrastructure at LinkedIn, where he leads an excellent engineering team that provides big data services (HDFS, YARN, Spark, TensorFlow, and beyond) to power LinkedIn’s business intelligence and relevance applications. Zhe is an Apache Hadoop PMC member and led the design and development of HDFS Erasure Coding (HDFS-EC).
