Presented By
O’Reilly + Cloudera
Make Data Work
March 25-28, 2019
San Francisco, CA

Schedule: Text and Language processing and analysis sessions

9:00am-12:30pm Tuesday, March 26, 2019
David Talby (Pacific AI), Alex Thomas (John Snow Labs), Claudiu Branzan (Accenture)
Average rating: 4.75 (8 ratings)
David Talby, Alex Thomas, and Claudiu Branzan lead a hands-on introduction to scalable NLP using the highly performant, highly scalable open source Spark NLP library. You’ll spend about half your time coding as you work through four sections, each with an end-to-end working codebase that you can change and improve.
9:00am-5:00pm Tuesday, March 26, 2019
Location: 2022
Alex Kudriashova (Astro Digital), Jonathan Francis (Starbucks), JoLynn Lavin (General Mills), Robin Way (Corios), June Andrews (GE), Kyungtaak Noh (SK Telecom), Taposh DuttaRoy (Kaiser Permanente), Sabrina Dahlgren (Kaiser Permanente), Craig Rowley (Columbia Sportswear), Ambal Balakrishnan (IBM), Benjamin Glicksberg (UCSF), Patrick Lucey (Stats Perform), Rhonda Textor (True Fit)
Hear practical insights from household brands and global companies: the challenges they tackled, the approaches they took, and the benefits and drawbacks of their solutions.
11:00am-11:40am Wednesday, March 27, 2019
Robert Horton (Microsoft), Mario Inchiosa (Microsoft), Ali Zaidi (Microsoft)
Average rating: 4.70 (10 ratings)
Robert Horton, Mario Inchiosa, and Ali Zaidi demonstrate how to use three cutting-edge machine learning techniques to up your modeling game: transfer learning from pretrained language models, active learning to make more effective use of a limited labeling budget, and hyperparameter tuning to maximize model performance.
11:50am-12:30pm Wednesday, March 27, 2019
Chakri Cherukuri (Bloomberg LP)
Average rating: 4.33 (3 ratings)
Quantitative finance is a rich field in which both sell-side and buy-side institutions employ advanced mathematical and statistical techniques. Chakri Cherukuri explains how machine learning and deep learning techniques are being used in quantitative finance and details how these models work under the hood.
11:50am-12:30pm Wednesday, March 27, 2019
Michael Johnson (Lockheed Martin), Norris Heintzelman (Lockheed Martin)
Average rating: 4.60 (15 ratings)
How do you train a machine learning model with no training data? Michael Johnson and Norris Heintzelman share their journey implementing multiple solutions to bootstrapping training data in the NLP domain, covering topics including weak supervision, building an active learning framework, and annotation adjudication for named-entity recognition.
2:40pm-3:20pm Wednesday, March 27, 2019
Divya Choudhary (University of Southern California)
Average rating: 4.50 (2 ratings)
Divya Choudhary explains how GO-JEK mines the random chat messages and notes, written in a local language, that customers send to their drivers while waiting for a ride to arrive. The company uses them to carve out unparalleled information about pickup points and their names (which sometimes even Google Maps does not know) and to help create a world-class customer pickup experience feature.
2:40pm-3:20pm Wednesday, March 27, 2019
Sonal Gupta (Facebook)
Average rating: 4.40 (5 ratings)
Sonal Gupta explores practical approaches to building a conversational AI system for task-oriented queries and details a way to achieve more advanced compositional understanding, which can handle cross-domain queries, using hierarchical representations.
4:20pm-5:00pm Wednesday, March 27, 2019
Yogesh Pandit (Roche), Saif Addin Ellafi (John Snow Labs), Vishakha Sharma (Roche Molecular Solutions)
Average rating: 4.67 (3 ratings)
Yogesh Pandit, Saif Addin Ellafi, and Vishakha Sharma discuss how Roche applies Spark NLP for healthcare to extract clinical facts from pathology and radiology reports. They then detail the design of the deep learning pipelines used to simplify training, optimization, and inference of such domain-specific models at scale.
4:20pm-5:00pm Wednesday, March 27, 2019
Gungor Polatkan (LinkedIn)
Average rating: 4.33 (3 ratings)
Talent search systems at LinkedIn strive to match potential candidates to the hiring needs of a recruiter, expressed as a search query. Gungor Polatkan shares the results of the company's deployment of deep learning models on a real-world production system serving 500M+ users through LinkedIn Recruiter.
3:50pm-4:30pm Thursday, March 28, 2019
Pierre Romera (International Consortium of Investigative Journalists (ICIJ))
Average rating: 4.67 (6 ratings)
The ICIJ was the team behind the Panama Papers and Paradise Papers. Pierre Romera offers a behind-the-scenes look into the ICIJ's process and explores the challenges in handling 1.4 TB of data (in many different formats) and making it available securely to journalists all over the world.
4:40pm-5:20pm Thursday, March 28, 2019
Case studies
Location: 2007
Nancy Rausch (SAS Institute)
Average rating: 4.80 (5 ratings)
For data to be meaningful, it needs to be presented in a way that people can relate to. Nancy Rausch explains how she combined streaming data from a solar array and machine learning techniques to create a live-action art piece, an approach that helped bring the data to life in a fun and compelling way.