Mar 15–18, 2020

How biased is your natural language model? Assessing fairness in NLP

Moin Nadeem (Intel)
2:35pm–3:15pm Wednesday, March 18, 2020
Location: 210 F

Who is this presentation for?

  • Applied machine learning engineers

Level

Intermediate

Description

The rise of contextualized word embeddings (such as those from BERT and GPT-2) has enabled a plethora of applications on downstream tasks. However, because these reusable models are optimized to capture statistical properties of their training data, they also tend to exhibit the stereotypes that appear in that data.

Moin Nadeem details how Intel created the first large-scale dataset to quantify bias across race, religion, gender, and profession in contextualized language models; showed that this method is the first to demonstrate bias across a variety of common models (BERT, GPT-2, XLNet, etc.) and that it outperforms previous methods of assessing bias; and released the dataset, along with sample training data and code, for evaluating bias in custom language models. You'll come away able to consider bias more broadly in your own work and to incentivize the creation and distribution of unbiased models.
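As a rough illustration of what quantifying bias in a language model can look like, here is a minimal sketch. It assumes one common framing, not necessarily the method presented in this session: score a stereotypical sentence and an anti-stereotypical counterpart with the model under test, then report how often the stereotypical one scores higher. The function name `stereotype_preference_rate` and the scores are hypothetical; in practice the scores would come from your own model (for example, as sentence log-probabilities).

```python
from typing import Iterable, Tuple


def stereotype_preference_rate(scored_pairs: Iterable[Tuple[float, float]]) -> float:
    """Return the fraction of pairs where the stereotypical sentence scores higher.

    Each pair is (score_stereotype, score_anti_stereotype). The scores are
    assumed to be comparable model scores such as log-probabilities; how they
    are produced is outside this sketch.
    """
    pairs = list(scored_pairs)
    if not pairs:
        raise ValueError("need at least one scored pair")
    preferred = sum(1 for stereo, anti in pairs if stereo > anti)
    return preferred / len(pairs)


# Made-up log-probabilities for illustration only; not real model output.
example_scores = [(-12.1, -13.4), (-9.8, -9.2), (-15.0, -15.9), (-11.3, -10.7)]
rate = stereotype_preference_rate(example_scores)
print(f"stereotype preferred in {rate:.0%} of pairs")
```

Under this framing, a rate near 50% would suggest no systematic preference on the pairs tested, while a rate well above 50% would suggest the model favors the stereotypical phrasing.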

Prerequisite knowledge

  • A basic understanding of machine learning, including what a language model is and how to use a contextualized language model like BERT

What you'll learn

  • Learn about the biases that exist in common language models and how you can use this information to select a language model for your own use
  • Discover how to assess these biases in your own language models, with benefits for industrial applications

Moin Nadeem

Intel

Moin Nadeem is an undergraduate at MIT, where he studies computer science with a minor in negotiations. His research broadly studies applications of natural language. Most recently, he performed an extensive study on bias in language models, culminating in the release of the largest dataset on bias in NLP in the world. Previously, he cofounded the Machine Intelligence Community at MIT, which aims to democratize machine learning among undergraduates on campus, and received the Best Undergraduate Paper award at MIT.
