Presented By O'Reilly and Cloudera
Make Data Work
March 13–14, 2017: Training
March 14–16, 2017: Tutorials & Conference
San Jose, CA

Automatic speaker segmentation: Using machine learning to identify who is speaking when

Matar Haller (Winton Capital)
5:10pm–5:50pm Wednesday, March 15, 2017
Data science & advanced analytics
Location: 212 A-B Level: Intermediate
Secondary topics: Financial services
Average rating: 5.00 (2 ratings)

Who is this presentation for?

  • Data scientists

Prerequisite knowledge

  • Familiarity with signal processing (time versus frequency space) and linear algebra (useful but not required)

What you'll learn

  • Understand how to scope down an open-ended machine-learning question in order to build a product that adds business value
  • Explore technical tools, including spectral decomposition, principal component analysis, and hierarchical clustering
  • Learn the importance of validation in the project iteration process


With the explosive growth of online video and audio content, there’s an increasing need for indexable and searchable audio. As a side project, Matar Haller used a corpus of audio recordings to build a production-ready tool that automatically identifies the speakers in a recorded conversation. The tool analyzes a recording and determines when each speaker is talking.

Matar shares how she approached this problem, the algorithms used, and steps taken to validate the results. She also outlines some of the challenges and pitfalls encountered and describes potential applications and extensions of the tool.
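
The abstract does not give implementation details, so the following is only a rough illustration of how the three tools named under "What you'll learn" (spectral decomposition, principal component analysis, and hierarchical clustering) could fit together. It is a minimal sketch in Python with NumPy and SciPy, not the speaker's actual method. Synthetic tones stand in for voices, and the helper names `synth_voice` and `segment_speakers`, the window sizes, and the number of components are all assumptions made for the example.

```python
import numpy as np
from scipy.signal import spectrogram
from scipy.cluster.hierarchy import linkage, fcluster


def synth_voice(f0, seconds, fs, rng):
    """Toy 'voice' (hypothetical): three harmonics of f0 plus light noise."""
    t = np.arange(int(seconds * fs)) / fs
    tone = sum(np.sin(2 * np.pi * f0 * k * t) / k for k in (1, 2, 3))
    return tone + 0.05 * rng.standard_normal(t.size)


def segment_speakers(signal, fs, n_speakers=2, n_components=2):
    """Assign a cluster label to each short time frame of the signal."""
    # 1. Spectral decomposition: short-time power spectrum per frame.
    _, times, power = spectrogram(
        signal, fs=fs, window="hann", nperseg=512, noverlap=256
    )
    features = np.log(power.T + 1e-10)  # shape: (frames, frequency bins)

    # 2. PCA via SVD: project centered frames onto the leading components.
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    reduced = centered @ vt[:n_components].T

    # 3. Hierarchical (Ward) clustering, cut into n_speakers groups.
    tree = linkage(reduced, method="ward")
    labels = fcluster(tree, t=n_speakers, criterion="maxclust")
    return times, labels


# Assumed toy conversation: speaker A, then speaker B, then A again.
fs = 8000
rng = np.random.default_rng(0)
audio = np.concatenate([
    synth_voice(120, 1.0, fs, rng),
    synth_voice(220, 1.0, fs, rng),
    synth_voice(120, 1.0, fs, rng),
])
times, labels = segment_speakers(audio, fs)

# Report the time of each change in cluster label.
for i in np.flatnonzero(np.diff(labels)) + 1:
    print(f"label changes to {labels[i]} near t = {times[i]:.2f} s")
```

Real speech is far less separable than these tones, which is one reason the validation step highlighted in the session matters: cluster labels need to be checked against recordings where the true speaker turns are known.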


Matar Haller

Winton Capital

Matar Haller is a data scientist at Winton Capital. Previously, Matar was a neuroscientist at UC Berkeley, where she recorded and analyzed signals from electrodes surgically implanted in human brains.