Machine learning challenges at LinkedIn: Spark, TensorFlow, and beyond

Zhe Zhang (LinkedIn)

9:05–9:20 Thursday, 17 October 2019

Location: King's Suite

Secondary topics: Machine Learning, Machine Learning tools

From people you may know (PYMK) to economic graph research, machine learning is the oxygen that powers how LinkedIn serves its 630M+ members.

Zhe Zhang gives an architectural overview of LinkedIn’s typical machine learning pipelines and the key types of ML use cases they support. He explores the changes and challenges brought by the emergence of deep learning techniques, which affect hardware (GPUs, networking), data, tooling, and language (Python and C++ versus Java and Scala). You’ll also be introduced to the ongoing work to establish a unified ML infrastructure based on Spark and TensorFlow, one that aims to combine high performance and efficiency with ease of use.

What you'll learn

Learn how LinkedIn uses ML

Zhe Zhang

Zhe Zhang is a senior manager of core big data infrastructure at LinkedIn, where he leads an excellent engineering team that provides big data services, including the Hadoop Distributed File System (HDFS), YARN, Spark, TensorFlow, and beyond, to power LinkedIn’s business intelligence and relevance applications. Zhe is an Apache Hadoop PMC member and led the design and development of HDFS Erasure Coding (HDFS-EC).