Presented By
O’Reilly + Intel AI
Put AI to Work
April 15-18, 2019
New York, NY

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Chang Ming-Wei (Google)
1:00pm-1:40pm Thursday, April 18, 2019
Machine Learning, Models and Methods
Location: Regent Parlor
Secondary topics: Models and Methods, Text, Language, and Speech

We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT representations can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications.
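The two ideas in the paragraph above can be illustrated with a minimal plain-Python sketch. It is not BERT itself: the Transformer encoder and the pre-training procedure are omitted, and the function names, toy dimensions, and weights are all hypothetical. The first function contrasts which positions a left-to-right model may attend to with the bidirectional case, where every position sees both left and right context. The second shows what "one additional output layer" can look like, assuming the encoder has already produced a fixed-size sentence representation: a single linear layer followed by a softmax.

```python
import math

def visibility_mask(seq_len, bidirectional):
    """Return mask[i][j] == True when position i may attend to position j."""
    return [
        [bidirectional or j <= i for j in range(seq_len)]
        for i in range(seq_len)
    ]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def output_layer(pooled, weights, bias):
    """Task-specific head: logits = W @ pooled + b, followed by softmax."""
    logits = [
        sum(w * x for w, x in zip(row, pooled)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)

# A left-to-right model sees only the current and earlier positions;
# a bidirectional model sees the whole sequence at every position.
left_to_right = visibility_mask(4, bidirectional=False)
both_directions = visibility_mask(4, bidirectional=True)

# Hypothetical 3-dimensional sentence representation and a 2-class head.
pooled = [0.5, -1.0, 2.0]
weights = [[0.1, 0.2, 0.3], [-0.3, 0.1, 0.0]]
bias = [0.0, 0.1]
probs = output_layer(pooled, weights, bias)
```

In this toy setup only `weights` and `bias` are new for the task; in fine-tuning as the abstract describes it, the pre-trained representations would be updated along with them.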
BERT is conceptually simple and empirically powerful. It obtains new state-of-the-art results on eleven natural language processing tasks, including pushing the GLUE benchmark to 80.4% (7.6% absolute improvement), MultiNLI accuracy to 86.7% (5.6% absolute improvement), and the SQuAD v1.1 question answering Test F1 to 93.2 (1.5% absolute improvement), outperforming human performance by 2.0%.
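To make the "absolute improvement" figures concrete, here is a small arithmetic sketch. It assumes that "absolute" means a difference in percentage points, so the previous best score is the new score minus the stated gain. The baselines it computes are implied by the abstract's own numbers and are not independently verified here; the helper names are hypothetical.

```python
# Figures copied from the abstract: (new score, stated absolute improvement).
reported = {
    "GLUE": (80.4, 7.6),
    "MultiNLI accuracy": (86.7, 5.6),
    "SQuAD v1.1 Test F1": (93.2, 1.5),
}

def implied_baseline(score, absolute_gain):
    """Previous best score, assuming the gain is in percentage points."""
    return round(score - absolute_gain, 1)

def relative_gain_percent(score, absolute_gain):
    """The same gain expressed relative to the implied baseline."""
    baseline = score - absolute_gain
    return round(100.0 * absolute_gain / baseline, 1)

baselines = {task: implied_baseline(*pair) for task, pair in reported.items()}

for task, (score, gain) in reported.items():
    print(task, baselines[task], relative_gain_percent(score, gain))
```

Under this reading, a gain of a few points on a high baseline is a much smaller relative change than the same gain on a low one, which is why the abstract states the improvements in absolute terms.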

Chang Ming-Wei


Ming-Wei Chang is a research scientist in Google AI Language, Seattle. He enjoys developing interesting machine learning algorithms for practical problems, especially in the field of natural language processing. He won an Outstanding Paper award at ACL 2015 for his work on question answering over knowledge bases. Over the years, he has published more than 35 papers at top-tier conferences and won several international machine learning competitions, including entity linking, power load forecasting, and sequential data classification. His recent paper with colleagues in Google AI Language, “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding”, demonstrated the power of language model pre-training and obtained new state-of-the-art results on 11 natural language processing tasks.
