Pensieve—a natural language processing (NLP) project that classifies reviews by their sentiment, the reason for that sentiment, and their high- and low-level content—is used in production to handle thousands of reviews daily across multiple domains. Megan Yetman offers an overview of Pensieve along with approaches for improving model reporting and enabling continuous model learning and improvement.
Raw text is tokenized against a custom vocabulary, then passed through an embedding layer, a convolutional neural network (CNN), and a bidirectional long short-term memory network (bi-LSTM) to produce softmax outputs over the classification options. Monte Carlo simulations are then run, generating multiple softmax outputs per classification per review, and nonparametric tests determine which outputs to report. This makes it possible to trade accuracy against model coverage: only classifications whose uncertainty is acceptably low are reported.
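The sample-then-decide step can be sketched roughly as follows. This is a minimal, pure-Python illustration, not Pensieve's actual code: the noisy toy "model," the thresholds, and the spread test are all assumptions standing in for the real Monte Carlo runs and nonparametric tests.

```python
import math
import random

random.seed(0)

def softmax(zs):
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    s = sum(exps)
    return [e / s for e in exps]

def mc_samples(logits, n_samples=50, noise=0.3):
    """Stand-in for Monte Carlo sampling: run the stochastic model
    n_samples times and collect one softmax vector per run."""
    out = []
    for _ in range(n_samples):
        noisy = [z + random.gauss(0.0, noise) for z in logits]
        out.append(softmax(noisy))
    return out

def report_or_abstain(samples, conf_threshold=0.7, spread_threshold=0.1):
    """Report the top class only when the mean softmax is confident and
    that class's probability is stable across samples; otherwise abstain.
    Tightening the thresholds raises accuracy but lowers coverage."""
    n_classes = len(samples[0])
    means = [sum(s[c] for s in samples) / len(samples) for c in range(n_classes)]
    top = max(range(n_classes), key=lambda c: means[c])
    col = [s[top] for s in samples]
    mu = means[top]
    spread = math.sqrt(sum((p - mu) ** 2 for p in col) / len(col))
    if mu >= conf_threshold and spread <= spread_threshold:
        return top, mu
    return None, mu  # abstain: the review is not reported on

label, conf = report_or_abstain(mc_samples([2.0, 0.2, -1.0]))
```

Here the clearly separated logits yield a confident, stable class 0, so the classification is reported; flatter or noisier logits would trip the abstention branch instead.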
Additionally, Pensieve has self-training capabilities. When review classifications are validated by a human, they are used to further train the model, and if the new model weights pass an added layer of tests, the model is updated, increasing the scope and accuracy of the classifications. Failure scenarios are also in place to handle poor-quality data and cases where the model stops performing as expected.
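The promotion gate described above might look something like this sketch. The function names, the accuracy floor, and the no-regression rule are hypothetical placeholders for Pensieve's "added layer of tests," not its real API.

```python
def should_promote(current_acc, candidate_acc, floor=0.75, min_gain=0.0):
    """Gate a retrained model: reject candidates that fail an absolute
    accuracy floor (e.g. weights trained on poor data) or that regress
    the currently deployed model on a held-out set."""
    if candidate_acc < floor:
        return False
    return candidate_acc >= current_acc + min_gain

def next_model(current, candidate):
    """Fail-safe update: keep the current (name, accuracy) pair unless
    the candidate passes the promotion gate."""
    _, cur_acc = current
    _, cand_acc = candidate
    return candidate if should_promote(cur_acc, cand_acc) else current
```

A candidate that underperforms is simply discarded, so a bad batch of validated data can never silently replace a working model.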
Megan Yetman is a machine learning engineer at the Center for Machine Learning at Capital One. Megan has production experience with natural language processing and neural networks as well as data migration and data science. She holds a BA and MS in statistics from the University of Virginia.
©2018, O'Reilly Media, Inc.