Neural network architectures with memory and attention mechanisms exhibit certain reasoning capabilities required for question answering. Stephen Merity, Richard Socher, and Caiming Xiong offer an overview of their work with the dynamic memory network (DMN), which uses both mechanisms to achieve state-of-the-art performance on the Visual Question Answering dataset and the bAbI-10k text question-answering dataset. Stephen, Richard, and Caiming demonstrate how attention mechanisms make deep learning models easier to inspect, helping practitioners understand the evidence behind specific decisions. The techniques they discuss apply to a wide range of tasks and can improve both the accuracy and the interpretability of the resulting models.
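To make the interpretability point concrete, here is a minimal sketch of soft attention over a small set of stored facts. It is not the DMN's own attention mechanism: it uses plain dot-product scoring with a softmax, and the function names (`softmax`, `attend`) and the toy three-dimensional embeddings are assumptions chosen purely for illustration. The point it shows is that the attention weights form a distribution over the input facts, so they can be read directly as a rough indication of which evidence the model relied on.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(question, facts):
    """Dot-product soft attention (illustrative, not the DMN's gating).

    Returns (weights, context): one weight per fact, plus the
    weighted sum of the fact vectors.
    """
    scores = [sum(q * f for q, f in zip(question, fact)) for fact in facts]
    weights = softmax(scores)
    dim = len(question)
    context = [
        sum(w * fact[i] for w, fact in zip(weights, facts))
        for i in range(dim)
    ]
    return weights, context


# Hypothetical toy embeddings; a real model would learn these.
facts = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.9, 0.1, 0.0],
]
question = [2.0, 0.0, 0.0]

weights, context = attend(question, facts)
top = max(range(len(facts)), key=lambda i: weights[i])

print("attention weights:", [round(w, 3) for w in weights])
print("most attended fact index:", top)
```

Inspecting `weights` alongside the original facts is the simplest form of the inspection described above: the facts with the largest weights are the ones that contributed most to the context vector.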
Stephen Merity is a senior research scientist at MetaMind, part of Salesforce Research, where he researches and implements deep learning models for vision and text, with a focus on memory networks and neural attention mechanisms for computer vision and natural language processing tasks. Previously, Stephen worked on big data at Common Crawl, data analytics at Freelancer.com, and online education at Grok Learning. Stephen holds a master’s degree in computational science and engineering from Harvard University and a bachelor of information technology from the University of Sydney.
Caiming Xiong is a senior researcher at MetaMind. Before that, he was a postdoctoral researcher in the Department of Statistics at the University of California, Los Angeles. Caiming holds a PhD in computer science and engineering from SUNY Buffalo and a BS and MS from Huazhong University of Science and Technology. His research interests include deep learning, computer vision, and human-robot interaction.
©2016, O'Reilly Media, Inc.