Presented By O’Reilly and Intel AI
Put AI to Work
April 29-30, 2018: Training
April 30-May 2, 2018: Tutorials & Conference
New York, NY

Humanizing technology: Emotion detection from face and voice

Taniya Mishra (Affectiva)
11:05am–11:45am Tuesday, May 1, 2018
Models and Methods
Location: Concourse A

Prerequisite knowledge

  • A basic understanding of deep learning and its data requirements

What you'll learn

  • Explore emotion AI: emotion detection software from face and voice


Humans display and perceive emotions from multiple channels, including facial expressions, voice, gestures, and words. To power effective human-to-computer interactions, machines need to understand emotions the way humans do—from all of these channels. But while our smart devices and advanced AI systems have sophisticated cognitive capabilities, they lack social and emotional skills, rendering our interactions with or through them superficial and limited.

With over 6 million faces analyzed in 87 countries, Affectiva has been leading facial emotion analysis for years and has now added voice analytics capabilities to realize its larger vision of a multimodal emotion AI. Drawing on Affectiva’s experience building this emotion AI, which can detect human emotions from face and voice, Taniya Mishra outlines various deep learning approaches for building multimodal emotion detection. Along the way, Taniya explains how to mitigate the challenges of data collection and annotation and how to avoid bias in model training. She also covers various use cases for multimodal emotion AI, such as providing analytics for automotive applications by monitoring driver state and personalizing the transportation experience.


Taniya Mishra


Taniya Mishra is the lead speech scientist at Affectiva, where her current research focuses on developing techniques for estimating human emotion from spoken utterances, with the goal of improving human-machine and human-human communication. These techniques involve training deep learning models on speech, either alone or in conjunction with other information streams such as text or facial expressions, to estimate a speaker’s emotion about the topic at hand, their engagement in a task, their confidence, or their stress level. Taniya’s past research includes text-to-speech synthesis, voice search, and the use of the latter in child-directed and accessibility applications. Taniya has coauthored more than 25 technical publications and has been awarded more than 12 patents related to speech technology. She is passionate about STEM education and mentoring. Taniya holds a PhD in computer science from the OGI School of Science and Engineering at OHSU.

Comments on this page are now closed.


Peter Brooks | TEACHER
04/04/2018 9:36pm EDT

Recently, an article in the NYTimes highlighted that some major face-recognition programs from Microsoft and IBM misidentified the gender of black people in photographs at a much higher rate than that of white people. How does Affectiva deal with an issue like that?