Which suppliers are most likely to have delivery or quality issues? Does service, product placement, or price make the biggest difference in customer sentiment? Finding the answers to these questions in structured data is often straightforward, but can we answer them using the unstructured data (free text) in emails, social media, call center transcripts, product reviews, and other sources?
Mark Turner explains how to reveal the associations between any two variables in text data by combining large-scale in-database text analytics with bipartite graph visualization. Mark describes which text analytics methods suit various operational business questions and how to display the resulting associations in a bipartite graph, showing which associations are strongest and which are weakest. This combination of methods extracts operational value from the growing volumes of text data in which customers express their likes, dislikes, preferences, and issues.
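To make the idea concrete, here is a minimal sketch of how two variables (a supplier and an issue mentioned in free text) might be linked as weighted edges of a bipartite graph. It is not the method presented in the session: the sample records, the `ISSUE_TERMS` lexicon, and the function names are all hypothetical, a simple keyword match stands in for the text analytics step, and everything runs in memory rather than in-database.

```python
from collections import Counter

# Hypothetical sample data: (supplier, free-text comment) pairs.
RECORDS = [
    ("Acme", "Shipment arrived late and two units were damaged"),
    ("Acme", "Late again, third late delivery this month"),
    ("Bolt", "Parts were damaged in transit"),
    ("Bolt", "Delivery on time, no problems"),
    ("Core", "Wrong item shipped, and it was late"),
]

# Hypothetical lexicon mapping surface terms to issue categories.
# A real system would use a proper text analytics step here.
ISSUE_TERMS = {
    "late": "delivery delay",
    "damaged": "damage",
    "wrong": "wrong item",
}


def extract_issues(text):
    """Return the set of issue categories mentioned in one comment."""
    tokens = {tok.strip(".,!?").lower() for tok in text.split()}
    return {issue for term, issue in ISSUE_TERMS.items() if term in tokens}


def bipartite_edges(records):
    """Count supplier-issue co-occurrences and return weighted edges.

    Each edge joins a node on the supplier side to a node on the issue
    side; the weight is the number of comments linking the two.
    """
    counts = Counter()
    for supplier, text in records:
        for issue in extract_issues(text):
            counts[(supplier, issue)] += 1
    # Strongest associations first; ties are broken alphabetically.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


edges = bipartite_edges(RECORDS)
for (supplier, issue), weight in edges:
    print(f"{supplier} --{weight}--> {issue}")
```

The sorted edge list is what a bipartite graph layout would draw, with edge thickness proportional to the weight. Raw counts are the simplest possible association measure; a normalized score that accounts for how often each supplier and issue appears overall would likely be a better choice on real data.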
Mark Turner is the text analytics lead at Teradata Aster, where he specializes in developing text analytics applications in a wide range of industries, including financial services, manufacturing, oil and gas, retail, and cable media. Prior to joining Teradata Aster, Mark was manager of the Natural Language Processing (NLP) Lab at Thomson Corporation (now Thomson Reuters) and was director of an applied research and development group at CACI, a major federal contractor. He also contributed to the development of the Unified Medical Language System (UMLS), a major biomedical vocabulary resource, at NIH. Mark holds an AB in linguistics from the University of Chicago and an MS in information and computer science from Georgia Tech. He was also a visiting scientist at Carnegie Mellon University.
©2016, O'Reilly Media, Inc. • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with, and does not endorse or review, the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.