Presented By
O’Reilly + Cloudera
Make Data Work
29 April–2 May 2019
London, UK

Schedule: Text and language processing and analysis sessions

13:30–17:00 Tuesday, 30 April 2019
Data Science, Machine Learning & AI
Location: Capital Suite 14
Alexander Thomas (John Snow Labs), Claudiu Branzan (Accenture AI)
Average rating: 4.00 (4 ratings)
Alexander Thomas and Claudiu Branzan lead a hands-on introduction to scalable NLP using the high-performance open source Spark NLP library. You’ll spend about half your time coding as you work through four sections, each with an end-to-end working code base that you can change and improve.
11:15–11:55 Wednesday, 1 May 2019
Data Engineering and Architecture
Location: Capital Suite 8/9
Moty Fania (Intel)
Average rating: 3.83 (6 ratings)
Moty Fania shares his experience implementing a sales AI platform that processes millions of web pages and sifts through millions of tweets per day. The platform is based on unique open source technologies and was designed for real-time data extraction and actuation.
11:15–11:55 Wednesday, 1 May 2019
Data Science, Machine Learning & AI
Location: Capital Suite 14
Alexander Thomas (John Snow Labs), Alexis Yelton (Indeed)
Average rating: 4.67 (3 ratings)
Alexander Thomas and Alexis Yelton demonstrate how to use Spark NLP and Apache Spark to standardize semistructured text, illustrated by Indeed's standardization process for résumé content.
12:05–12:45 Wednesday, 1 May 2019
Data Science, Machine Learning & AI
Location: Capital Suite 14
Yves Peirsman (NLP Town)
Average rating: 4.57 (7 ratings)
In this age of big data, NLP professionals are all too often faced with a lack of data: written language is abundant, but labeled text is much harder to come by. Yves Peirsman outlines the most effective ways of addressing this challenge, from the semiautomatic construction of labeled training data to transfer learning approaches that reduce the need for labeled training examples.
12:05–12:45 Wednesday, 1 May 2019
Data Science, Machine Learning & AI, Expo Hall
Location: Expo Hall (Capital Hall N24)
Matthew Honnibal (Explosion AI)
Average rating: 4.00 (4 ratings)
Matthew Honnibal shares "one weird trick" that can give your NLP project a better chance of success: avoid a waterfall methodology where data definition, corpus construction, modeling, and deployment are performed as separate phases of work.
14:05–14:45 Wednesday, 1 May 2019
Data Science, Machine Learning & AI
Location: Capital Suite 14
Maryam Jahanshahi (TapRecruit)
Average rating: 4.00 (3 ratings)
Maryam Jahanshahi explores exponential family embeddings: methods that extend the idea behind word embeddings to other data types. You'll learn how TapRecruit used dynamic embeddings on its large corpus of job descriptions to understand how data science skill sets have changed over the last three years and, more generally, how these models can enrich analysis of specialized datasets.
14:05–14:45 Wednesday, 1 May 2019
Data Science, Machine Learning & AI
Location: Capital Suite 17
David Low (Pand.ai)
Average rating: 3.57 (7 ratings)
Transfer learning has proven tremendously successful in computer vision, largely as a result of the ImageNet competition. In the past few months, there have been several breakthroughs in natural language processing with transfer learning, namely ELMo, OpenAI Transformer, and ULMFiT. David Low demonstrates how to use transfer learning on an NLP application with state-of-the-art (SOTA) accuracy.
17:25–18:05 Wednesday, 1 May 2019
Data Science, Machine Learning & AI
Location: Capital Suite 15/16
Weifeng Zhong (Mercatus Center at George Mason University)
Average rating: 4.75 (4 ratings)
Weifeng Zhong shares a machine learning algorithm built to “read” the People’s Daily (the official newspaper of the Communist Party of China) and predict changes in China’s policy priorities. The output of this algorithm, named the Policy Change Index (PCI) of China, turns out to be a leading indicator of the actual policy changes in China since 1951.
11:15–11:55 Thursday, 2 May 2019
Data Science, Machine Learning & AI
Location: Capital Suite 14
David Dogon (Van Lanschot Kempen)
Average rating: 4.75 (8 ratings)
David Dogon dives into a best-practice use case for detecting fraud at a financial institution and details a dynamic and robust monitoring system that successfully detects unwanted client behavior. Join in to learn how machine learning models can provide a solution in cases where traditional systems fall short.
12:05–12:45 Thursday, 2 May 2019
Data Science, Machine Learning & AI
Location: Capital Suite 14
Moshe Wasserblat
Average rating: 4.67 (3 ratings)
Moshe Wasserblat offers an overview of NLP Architect, an open source deep learning NLP library that provides state-of-the-art NLP models, making it easy for researchers to implement NLP algorithms and for data scientists to build NLP-based solutions that extract insight from textual data to improve business operations.