Sep 23–26, 2019

Deep learning methods for natural language processing

Garrett Hoffman (StockTwits)
1:30pm–5:00pm Tuesday, September 24, 2019
Location: 1A 12/14
Average rating: 2.83 (6 ratings)

Who is this presentation for?

  • Data scientists, machine learning engineers, software engineers, and data engineers

Level

Intermediate

Description

Garrett Hoffman walks you through deep learning methods for natural language processing and natural language understanding tasks, using a live example in Python and TensorFlow with StockTwits data. Methods include Word2Vec, RNNs and variants (LSTM, GRU), and convolutional neural networks. You’ll explore use cases and motivations for using these methods, gain a conceptual intuition about how these models work, and briefly review the mathematics that underlie each methodology.

Outline:

Representation learning for text with Word2Vec word embeddings

  • The continuous bag of words (CBOW) and skip-gram models
  • How to train custom word embeddings
  • How to use pretrained word embeddings such as those trained on Google News
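As a rough illustration of how the two Word2Vec objectives differ, the sketch below builds training examples from a tokenized sentence. It is not taken from the tutorial materials: the function names, the window size, and the sample sentence are all assumptions made for illustration, and a real pipeline would add vocabulary building, subsampling, and negative sampling.

```python
from typing import List, Tuple


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Return (center, context) pairs; skip-gram predicts each context word from the center word."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def cbow_examples(tokens: List[str], window: int = 2) -> List[Tuple[List[str], str]]:
    """Return (context_words, center) examples; CBOW predicts the center word from its neighbors."""
    examples = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        examples.append((context, center))
    return examples


# Hypothetical sample sentence, used only to show the shape of the output.
sample = "buying the dip on tech stocks".split()
for center, context in skipgram_pairs(sample, window=1):
    print(center, "->", context)
```

Skip-gram yields one example per (center, neighbor) pair, while CBOW yields one example per position with all neighbors grouped together.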

Traditional RNNs

  • Why these types of models often perform better than traditional alternatives
  • Variants to traditional RNNs, such as LSTM cells and GRUs
  • Why these models provide improvements in accuracy
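To give a feel for the gating idea behind GRUs, here is a minimal sketch of one GRU time step with a single input feature and a single hidden unit. It is an illustrative assumption rather than the tutorial's code: the parameter names in the dictionary are hypothetical, and it uses the convention in which the update gate z weights the candidate state (some libraries use the opposite convention).

```python
import math
from typing import Dict


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x: float, h: float, p: Dict[str, float]) -> float:
    """One scalar GRU step: returns the new hidden state given input x and previous state h."""
    # Update gate: how much of the candidate state to mix in.
    z = sigmoid(p["wz"] * x + p["uz"] * h + p["bz"])
    # Reset gate: how much of the previous state feeds the candidate.
    r = sigmoid(p["wr"] * x + p["ur"] * h + p["br"])
    # Candidate state computed from the input and the reset-scaled previous state.
    h_cand = math.tanh(p["wh"] * x + p["uh"] * (r * h) + p["bh"])
    # Interpolate between the previous state and the candidate.
    return (1.0 - z) * h + z * h_cand


# Hypothetical parameters: a strongly negative update-gate bias keeps z near 0,
# so the previous hidden state is carried forward almost unchanged.
params = {"wz": 0.0, "uz": 0.0, "bz": -50.0,
          "wr": 1.0, "ur": 1.0, "br": 0.0,
          "wh": 1.0, "uh": 1.0, "bh": 0.0}
print(gru_step(5.0, 0.7, params))
```

When the update gate stays close to zero, the hidden state passes through a step nearly untouched, which is one intuition for why gated cells retain information over longer sequences than a plain RNN.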

Convolutional neural networks (CNNs)

  • Why CNNs that are traditionally applied to computer vision are now being applied to language models
  • Advantages over RNNs
  • How RNNs can be used to learn generative models for text synthesis and the applications of this method
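For the CNN portion, the following sketch shows one common way convolutions are applied to text: a single filter slides over windows of consecutive token embeddings, followed by ReLU and max-over-time pooling. The function name, the toy embeddings, and the filter values are assumptions for illustration only and are not drawn from the tutorial repository.

```python
from typing import List


def conv1d_max_pool(embeddings: List[List[float]],
                    kernel: List[List[float]],
                    bias: float = 0.0) -> float:
    """Apply one filter to every window of len(kernel) tokens, then ReLU and max-pool over time."""
    width = len(kernel)
    if len(embeddings) < width:
        raise ValueError("sequence is shorter than the filter width")
    activations = []
    for start in range(len(embeddings) - width + 1):
        total = bias
        for offset in range(width):
            # Dot product between one filter row and one token embedding.
            total += sum(w * e for w, e in zip(kernel[offset], embeddings[start + offset]))
        activations.append(max(0.0, total))  # ReLU
    return max(activations)  # max-over-time pooling


# Toy 2-dimensional embeddings for a 3-token sequence (hypothetical values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
bigram_filter = [[1.0, 0.0], [0.0, 1.0]]
print(conv1d_max_pool(tokens, bigram_filter))
```

Each window is scored independently of the others, so the windows can be computed in parallel, whereas an RNN must process tokens one step at a time.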

Prerequisite knowledge

  • A basic understanding of Python, machine learning, and feed-forward neural networks
  • Familiarity with TensorFlow (useful but not required)

Materials or downloads needed in advance

  • A laptop with the most recent version of Docker installed
  • Clone the course GitHub repository and, before the tutorial, follow its instructions to set up your working environment. Note: You are not required to follow along during the session. Approximately half of the session focuses on the models themselves and the underlying mathematics; the other half walks through implementations of the models in code. All of the models will be run in advance, since they take a few hours to train.

What you'll learn

  • Learn how deep learning is applied to NLP-related tasks and how to implement these models in Python using TensorFlow
  • Understand practical considerations for applying deep learning methods to business problems

Garrett Hoffman

StockTwits

Garrett Hoffman is a director of data science at StockTwits, where he leads efforts to use data science and machine learning to understand social dynamics and develop research and discovery tools that are used by a network of over one million investors. Garrett has a technical background in math and computer science but gets most excited about approaching data problems from a people-first perspective—using what we know or can learn about complex systems to drive optimal decisions, experiences, and outcomes.

Comments on this page are now closed.

Comments

Garrett Hoffman | Director, Data Science
09/24/2019 1:22pm EDT

Hi all! Thanks for coming to the talk. I hope you enjoyed. I fixed the small bug in the code that was brought up during the session. GitHub repo is updated.

Craig Palmer | Sr. Web Producer
09/18/2019 12:09pm EDT

Hello Ranajay. You are already registered for this tutorial.

Ranajay Nandy | Data Science Global Manager
09/18/2019 12:00pm EDT

Why cannot I register for this tutorial? There is no add to schedule button.
