Presented By O’Reilly and Cloudera

September 11, 2018: Training & Tutorials
September 12–13, 2018: Keynotes & Sessions
New York, NY

Schedule: Text and language processing and analysis sessions

9:00am–12:30pm Tuesday, 09/11/2018

Deep learning methods for natural language processing

Location: 1A 21/22 Level: Intermediate

Garrett Hoffman (StockTwits)

Average rating: 4.75 (4 ratings)

Garrett Hoffman walks you through deep learning methods for natural language processing and natural language understanding tasks, using a live example in Python and TensorFlow with StockTwits data. Methods include word2vec, recurrent neural networks and variants (LSTM, GRU), and convolutional neural networks.

1:30pm–5:00pm Tuesday, 09/11/2018

Natural language understanding at scale with Spark NLP

Location: 1A 21/22 Level: Intermediate

David Talby (Pacific AI), Claudiu Branzan (Accenture), Alex Thomas (John Snow Labs)

Average rating: 3.00 (7 ratings)

David Talby, Claudiu Branzan, and Alex Thomas lead a hands-on tutorial for scalable NLP using the highly performant, highly scalable open source Spark NLP library. You’ll spend about half your time coding as you work through four sections, each with an end-to-end working codebase that you can change and improve.

1:15pm–1:55pm Wednesday, 09/12/2018

Document vectors in the wild: Building a content recommendation system for Reuters.com

Location: 1A 06/07 Level: Intermediate

James Dreiss (Reuters)

Average rating: 3.67 (3 ratings)

James Dreiss discusses the challenges of building a content recommendation system for Reuters.com, one of the largest news sites in the world. Distinctive features of the system include a scrolling newsfeed and the use of document vectors for semantic representation of content.

4:35pm–5:15pm Wednesday, 09/12/2018

Anxiety at scale: How Investopedia used readership data to track market volatility

Location: 1A 06/07 Level: Beginner

Masha Westerlund (Investopedia)

Average rating: 5.00 (2 ratings)

Businesses rely on user data to power their sites, products, and sales. Can we give back by sharing those insights with users? Masha Westerlund explains how Investopedia harnessed reader data to build an index that tracks market anxiety and moves with the VIX, a proprietary measure of market volatility. You'll see how thinking outside the box helps turn data into tools for users, not stakeholders.

5:25pm–6:05pm Wednesday, 09/12/2018

From emotion analysis and topic extraction to narrative modeling

Location: 1A 08 Level: Beginner

Andreea Kremm (Netex Group), Mohammed Ibraaz Syed (UCLA)

Average rating: 4.00 (2 ratings)

Narrative economics studies the impact of popular narratives and stories on economic fluctuations in the context of human interests and emotions. Andreea Kremm and Mohammed Ibraaz Syed describe how emotion analysis, entity relationship extraction, and topic modeling can be used to model narratives from written human communication.

5:25pm–6:05pm Wednesday, 09/12/2018

Automating business processes with large-scale knowledge graphs

Location: Expo Hall

Mike Tung (Diffbot)

Mike Tung offers an overview of available open source and commercial knowledge graphs and explains how consumer and business applications are already taking advantage of them to provide intelligent experiences and enhanced business efficiency. Mike then discusses what's coming in the future.

11:20am–12:00pm Thursday, 09/13/2018

Applying petabyte-scale analytics and machine learning to billions of news reading sessions

Location: 1A 06/07 Level: Intermediate

Andrew Montalenti (Parse.ly)

Average rating: 5.00 (1 rating)

What can we learn from a one-billion-person live poll of the internet? Andrew Montalenti explains how Parse.ly has gathered a unique dataset of news reading sessions from billions of devices, peaking at over two million sessions per minute on thousands of high-traffic news and information websites, and how the company uses this data to unearth the secrets behind online content.

1:10pm–1:50pm Thursday, 09/13/2018

Spark NLP in action: How SelectData uses AI to better understand home health patients

Location: 1A 06/07 Level: Intermediate

David Talby (Pacific AI), Alberto Andreotti (John Snow Labs), Stacy Ashworth (SelectData), Tawny Nichols (SelectData)

Average rating: 3.00 (4 ratings)

David Talby, Alberto Andreotti, Stacy Ashworth, and Tawny Nichols outline a question-answering system for accurately extracting facts from free-text patient records and share best practices for training domain-specific deep learning NLP models. The solution is based on Spark NLP, an extension of Spark ML that provides state-of-the-art performance and accuracy for natural language understanding.

2:00pm–2:40pm Thursday, 09/13/2018

Digging for gold: Developing AI in healthcare against unstructured text data

Location: 1E 12/13 Level: Non-technical

Chiny Driscoll (MetiStream), Jawad Khan (Rush University Medical Center)

Average rating: 4.00 (5 ratings)

Chiny Driscoll and Jawad Khan offer an overview of a solution by Cloudera and MetiStream that lets healthcare providers automate the extraction, processing, and analysis of clinical notes within an electronic health record in batch or real time, improving care, identifying errors, and recognizing efficiencies in billing and diagnoses.
