Natural language processing with deep learning

Delip Rao (AI Foundation), Brian McMahan (Wells Fargo)

Monday, Sep 9 & Tuesday, Sep 10, 9:00am - 5:00pm

Location: Santa Clara Room (Hilton)

Secondary topics: Deep Learning, Text, Language, and Speech

Average rating: 4.50 (4 ratings)

Participants should plan to attend both days of this 2-day training course. To attend training courses, you must register for a Platinum or Training pass; the training course does not include access to tutorials on Tuesday.

Delip Rao and Brian McMahan explore natural language processing using a set of machine learning techniques known as deep learning. They walk you through neural network architectures and NLP tasks and teach you how to apply these architectures to those tasks.

What you'll learn, and how you can apply it

Understand basic concepts in natural language processing (NLP) and deep learning as they apply to NLP
Take a hands-on approach to framing a real-world problem as its underlying NLP task and building a solution using deep learning

Prerequisites:

Working knowledge of Python and command-line familiarity
Familiarity with precalculus-level math (matrix multiplication, dot products of vectors) and derivatives of simple functions (useful but not required)
General knowledge of machine learning (setting up experiments, evaluation, etc.) (useful but not required)
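
As a rough self-check for the math prerequisite, the sketch below writes out a dot product, a matrix multiplication, and a numerical derivative in plain Python. The function names (dot, matmul, derivative) are illustrative choices made here, not course material, and the level shown is only an assumption about what "useful but not required" covers.

```python
# Self-check sketch for the math prerequisite. All names are illustrative
# assumptions, not taken from the course materials.

def dot(u, v):
    """Dot product of two equal-length vectors given as lists."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


def matmul(A, B):
    """Multiply an (n x k) matrix by a (k x m) matrix, both lists of rows."""
    if len(A[0]) != len(B):
        raise ValueError("inner dimensions must match")
    cols_of_B = list(zip(*B))  # transpose B so each column is one tuple
    return [[dot(row, col) for col in cols_of_B] for row in A]


def derivative(f, x, h=1e-6):
    """Central-difference estimate of the derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)


print(dot([1, 2, 3], [4, 5, 6]))                      # 32
print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))     # [[19, 22], [43, 50]]
print(round(derivative(lambda x: x ** 2, 3.0), 3))    # 6.0
```

If these three results are easy to verify by hand, the math background is likely sufficient.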

Hardware and/or installation requirements:

A Wi-Fi-enabled laptop.

Outline

NLP involves the application of machine learning and other statistical techniques to derive insights from human language. With large volumes of data exchanged as text (in the form of documents, tweets, email, chat, and so on), NLP techniques are indispensable to modern intelligent applications. The applications range from enterprise systems to everyday consumer use.

Day 1

Environment setup and data download
Introduction to supervised learning
Introduction to computational graphs
Introduction to NLP and NLP tasks
Representations for words: Word embeddings
- Hands-on: Word analogy problems
Overview of deep learning frameworks
- Static versus dynamic
PyTorch basics
- Hands-on: PyTorch exercises
Feed-forward networks for NLP: Multi-layer perceptrons
- Hands-on: Chinese document classification
Convolutional networks: Modeling subword units
- Hands-on: Classifying names to ethnicities

Day 2

Sequence modeling: Basics of modeling sequences, representing sequences as tensors, the importance of the language modeling task
Recurrent neural networks (RNNs) to model sequences: Basic ideas
- Hands-on: Classification with an RNN
- Gated variants (long short-term memory (LSTM) and gated recurrent unit (GRU))
- Structural variants (bidirectional, stacked, tree)
- Hands-on: Generating sequences with an RNN
From sequence models to sequence-to-sequence models: Core ideas, encoder-decoder architectures, and applications (translation and summarization)
Attention: Core ideas and its role in Seq2Seq models
Advanced topics
- Self-attention and the Transformer
- Contextualized embedding models: BERT, ELMo
  - Hands-on: BERT
Overview of DL modeling for common NLP tasks
Choose your own adventure
- Hands-on: Work with an NLP problem end-to-end from a selection of tasks
DL for NLP: Best practices
Closing: When to use deep learning for NLP, when not to use deep learning for NLP, and summary

About your instructors

Delip Rao is the vice president of research at the AI Foundation, where he leads speech, language, and vision research efforts for generating and detecting artificial content. Previously, he founded the AI research consulting company Joostware and the Fake News Challenge, an initiative to bring AI researchers across the world together to work on fact-checking-related problems, and he worked at Google and Twitter. Delip is the author of a recent book on deep learning and natural language processing. His attitude toward production NLP research is shaped by the time he spent at Joostware working for enterprise clients, as the first machine learning researcher on the Twitter antispam team, and as an early researcher at Amazon Alexa.

Brian McMahan is a data scientist at Wells Fargo, working on projects that apply natural language processing (NLP) to solve real-world needs. Recently, he published a book with Delip Rao on PyTorch and NLP. Previously, he was a research engineer at Joostware, a San Francisco-based company specializing in consulting and building intellectual property in NLP and deep learning. Brian is wrapping up his PhD in computer science at Rutgers University, where his research focuses on Bayesian and deep learning models for grounding perceptual language in the visual domain. Brian has also conducted research in reinforcement learning and various aspects of dialogue systems.