With the explosive growth of online video and audio content, there’s an increasing need for indexable and searchable audio. As a side project, Matar Haller built a production-ready tool to automatically identify the speakers in a recorded conversation using a corpus of audio recordings. The tool analyzes a recording and determines when each speaker is talking.
Matar shares how she approached this problem, the algorithms she used, and the steps she took to validate the results. She also outlines some of the challenges and pitfalls she encountered and describes potential applications and extensions of the tool.
Matar Haller is a data scientist at Winton Capital. Previously, Matar was a neuroscientist at UC Berkeley, where she recorded and analyzed signals from electrodes surgically implanted in human brains.