Presented By O’Reilly and Cloudera
Make Data Work
21–22 May 2018: Training
22–24 May 2018: Tutorials & Conference
London, UK

Schedule: Text and Language processing and analysis sessions

9:00–12:30 Tuesday, 22 May 2018
Barbara Fusinska (Google)
Average rating: 4.33 (3 ratings)
Natural language processing techniques help address tasks like text classification, information extraction, and content generation. Barbara Fusinska offers an overview of natural language processing and walks you through building a bag-of-words representation, using Python and its machine learning libraries, and then using it for text classification.
13:30–17:00 Tuesday, 22 May 2018
Data science and machine learning
Location: Capital Suite 13 Level: Intermediate
David Talby (Pacific AI), Claudiu Branzan (Accenture)
Average rating: 4.33 (3 ratings)
Natural language processing is a key component in many data science systems. David Talby and Claudiu Branzan lead a hands-on tutorial on scalable NLP using spaCy for building annotation pipelines, Spark NLP for building distributed natural language machine-learned pipelines, and Spark ML and TensorFlow for using deep learning to build and apply word embeddings.
16:35–17:15 Wednesday, 23 May 2018
Ran Taig (Dell), Omer Sagi (Dell)
Average rating: 2.00 (1 rating)
DevOps and QA engineers spend a significant amount of time investigating recurring issues. These issues are often represented by large configuration and log files, so determining whether two issues are duplicates can be very tedious. Ran Taig and Omer Sagi outline a solution that leverages NLP and machine learning algorithms to automatically identify duplicate issues.
16:35–17:15 Wednesday, 23 May 2018
Data science and machine learning
Location: Capital Suite 10/11 Level: Non-technical
Naveed Ghaffar (Narrative Economics), Rashed Iqbal (UCLA)
Average rating: 3.67 (3 ratings)
Narratives are significant vectors of rapid change in culture, economic behavior, and the Zeitgeist of a society. Narrative economics studies the impact of popular human-interest stories on economic fluctuations. Naveed Ghaffar and Rashed Iqbal outline a framework that uses natural language understanding to extract and analyze narratives in human communication.
17:25–18:05 Wednesday, 23 May 2018
Data science and machine learning, Emerging technologies and case studies
Location: Capital Suite 13 Level: Intermediate
Darren Cook (QQ Trend)
Darren Cook demonstrates how to use LSTMs, state-of-the-art tokenizers, dictionaries, and other data sources to tackle translation, focusing on one of the most difficult language pairs: Japanese to English.
14:05–14:45 Thursday, 24 May 2018
Data science and machine learning, Expo Hall
Location: Expo Hall Level: Intermediate
David Talby (Pacific AI), Saif Addin Ellafi (John Snow Labs), Paul Parau (UiPath)
Average rating: 4.50 (4 ratings)
Spark NLP natively extends Spark ML to provide natural language understanding capabilities at a performance and scale that were not previously possible. David Talby, Saif Addin Ellafi, and Paul Parau explain how Spark NLP was used to augment the Recognos smart data extraction platform in order to automatically infer fuzzy, implied, and complex facts from long financial documents.