Presented By O’Reilly and Cloudera

San Francisco • London • New York

Make Data Work

September 11, 2018: Training & Tutorials
September 12–13, 2018: Keynotes & Sessions
New York, NY

Job recommendations leveraging deep learning using Analytics Zoo on Apache Spark and BigDL

Guoqiong Song (Intel), Wenjing Zhan (Talroo), Jacob Eisinger (Talroo)

2:00pm–2:40pm Thursday, 09/13/2018

Big data and data science in the cloud, Data science and machine learning
Location: 1A 15/16
Level: Intermediate

Secondary topics: Deep Learning, Media, Marketing, Advertising

Who is this presentation for?

Machine and deep learning practitioners and big data professionals

Prerequisite knowledge

A basic understanding of Apache Spark, machine learning, and deep learning

What you'll learn

Learn how to use BigDL on Apache Spark, apply DL techniques to solve real-world use cases like job search, and deploy DL workloads in the cloud

Description

Collaborative filtering recommends items by identifying other users with similar tastes, but it tends to misfire when little is known about a user’s history or when new items are introduced into the mix. Incorporating context and natural language processing (NLP) is one way to improve recommendations. In addition, recently developed deep neural networks have shown promise by capturing nonlinear relationships in the user-item dataset.
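The cold-start weakness described above can be illustrated with a minimal sketch of user-based collaborative filtering. This is not the presenters' code: the interaction data, the user and job names, and the `recommend` helper are all hypothetical, and similarity is computed as plain cosine similarity over binary "applied to" sets.

```python
from math import sqrt

# Hypothetical toy data: user -> set of job IDs the user applied to.
interactions = {
    "alice": {"job1", "job2"},
    "bob": {"job1", "job2", "job3"},
    "carol": {"job4"},
    "dave": set(),  # new user with no history (cold start)
}


def cosine(a, b):
    """Cosine similarity between two sets treated as binary vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def recommend(user, interactions, top_n=3):
    """Score unseen jobs by summing the similarity of users who applied to them."""
    seen = interactions[user]
    scores = {}
    for other, items in interactions.items():
        if other == user:
            continue
        sim = cosine(seen, items)
        if sim == 0.0:
            continue  # dissimilar users contribute nothing
        for item in items - seen:
            scores[item] = scores.get(item, 0.0) + sim
    # Highest score first; ties broken by job ID for determinism.
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]


alice_recs = recommend("alice", interactions)  # bob is similar, so job3 surfaces
dave_recs = recommend("dave", interactions)    # no history, so nothing to go on
```

In this sketch the user with no history receives an empty list, and a job that nobody has applied to yet could never appear in any list, which is the gap that context and NLP features are meant to fill.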

In the talent attraction industry, short hiring cycles limit the history available for job advertisements and job seekers. As a result, most job recommendation systems search via keywords. Unfortunately, this short keyword context lacks the expressiveness to adequately describe the job seeker’s intent. In contrast, résumés offer a source of much richer context in natural language.

Guoqiong Song, Wenjing Zhan, and Jacob Eisinger demonstrate how to leverage the distributed deep learning framework BigDL on Apache Spark to predict a candidate’s probability of applying to specific jobs based on their résumé, including document embedding using the pretrained Global Vectors for Word Representation (GloVe) model and neural collaborative filtering using deep neural networks. The deep learning algorithms in BigDL yield much better results than a cosine similarity measure or traditional ALS (alternating least squares), as measured by precision and recall metrics.
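As a rough illustration of the cosine similarity baseline and the evaluation metrics mentioned above, the following sketch embeds a résumé and job descriptions by averaging word vectors, ranks jobs by cosine similarity, and computes precision and recall. It makes several assumptions not stated in the abstract: the three-dimensional vectors are made-up stand-ins for pretrained GloVe embeddings, mean pooling is only one possible way to build a document embedding, and all names are hypothetical. It does not use BigDL or reproduce the neural collaborative filtering model.

```python
from math import sqrt

# Hypothetical 3-dimensional vectors standing in for pretrained GloVe embeddings.
word_vectors = {
    "python": [0.9, 0.1, 0.0],
    "spark": [0.8, 0.2, 0.1],
    "engineer": [0.7, 0.3, 0.0],
    "nurse": [0.0, 0.9, 0.2],
    "hospital": [0.1, 0.8, 0.3],
}


def embed(text, vectors):
    """Mean-pool the vectors of known words; unknown words are skipped."""
    found = [vectors[w] for w in text.lower().split() if w in vectors]
    if not found:
        return None
    dim = len(found[0])
    return [sum(v[i] for v in found) / len(found) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def precision_recall(recommended, relevant):
    """Precision and recall of a recommended list against the relevant set."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical résumé and job descriptions.
resume_vec = embed("python spark engineer", word_vectors)
jobs = {"data_engineer": "spark engineer", "registered_nurse": "nurse hospital"}
ranked = sorted(
    jobs,
    key=lambda j: cosine(resume_vec, embed(jobs[j], word_vectors)),
    reverse=True,
)
precision, recall = precision_recall(ranked[:1], {"data_engineer"})
```

A learned model such as the one described in the session would replace the fixed cosine score with a predicted probability of applying, while the same precision and recall calculation could still be used to compare the approaches.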

Guoqiong Song

Intel

Guoqiong Song is a senior deep learning software engineer on the big data technology team at Intel. She’s interested in developing and optimizing distributed deep learning algorithms on Spark. She holds a PhD in atmospheric and oceanic sciences with a focus on numerical modeling and optimization from UCLA.

Wenjing Zhan

Talroo

Wenjing Zhan is a data scientist at Talroo, where she is in charge of predictive machine learning. Previously, Wenjing worked on search relevance through classification modeling and has done data engineering with Apache Spark and machine learning in Scala, R, and Python. She holds a master’s degree in statistics from the University of Texas at Austin.

Jacob Eisinger

Talroo

Jacob Eisinger is the director of data at Talroo, where he is responsible for the Special Projects initiative to pilot and validate high-impact business models and technologies. Previously, Jacob led search, personalization, data warehouse, bot detection, and machine learning at Talroo and was a member of the Emerging Technologies Group at IBM, where he worked with technologies like Bluemix, Apache Spark, Apache Kafka, OAuth, and web service standards. Jacob is an accomplished inventor with over 20 patent applications. He holds a bachelor’s degree in computer science from Virginia Tech.
