Presented By O'Reilly and Cloudera
Make Data Work
September 26–27, 2016: Training
September 27–29, 2016: Tutorials & Conference
New York, NY

What ties to what? Visualizing large-scale customer text data with bipartite graphs

Mark Turner (Teradata)
4:35pm–5:15pm Wednesday, 09/28/2016
Visualization & user experience
Location: 1 E 10/1 E11
Level: Beginner
Average rating: 4.00 (1 rating)

What you'll learn

  • Learn a way to rapidly get insight into the "voice of the customer" as expressed in text: call center transcripts, product reviews, social media, and more

Description

    Which suppliers are most likely to have delivery or quality issues? Does service, product placement, or price make the biggest difference in customer sentiment? Finding the answers to these questions in structured data is often straightforward, but can we answer them using the unstructured data (free text) in emails, social media, call center transcripts, product reviews, and other sources?

    Mark Turner explains how to clearly see the associations between any two variables in text data by combining large-scale in-database text analytics with the bipartite graph visualization technique. Mark describes which text analytics methods to use for various operational business questions and how to display the resulting associations in a bipartite graph, making it easy to see which are strongest and which are weakest. This combination of methods gives operational value to the growing volume of text data in which customers express their likes, dislikes, preferences, and issues.
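
    The idea of linking two variables through text can be sketched in a few lines of Python. This is only an illustrative, in-memory sketch, not the speaker's method and not the in-database analytics described above. It assumes each variable is a short list of single-word terms, and it treats two terms as associated when they appear in the same document. The function name `bipartite_edges` and the sample reviews, suppliers, and issues are all hypothetical.

```python
from collections import Counter
from itertools import product


def bipartite_edges(documents, left_terms, right_terms):
    """Return ((left, right), weight) pairs, strongest association first.

    The weight of an edge is the number of documents that mention
    both the left-side term and the right-side term.
    """
    counts = Counter()
    for text in documents:
        # Naive tokenization (an assumption): lowercase, split on whitespace.
        words = set(text.lower().split())
        left_hits = [term for term in left_terms if term in words]
        right_hits = [term for term in right_terms if term in words]
        # Every left/right pair found in this document gains one unit of weight.
        counts.update(product(left_hits, right_hits))
    return counts.most_common()


# Hypothetical sample data, invented for illustration only.
reviews = [
    "acme delivery was late again",
    "acme parts arrived late and damaged",
    "globex delivery was fast",
    "globex parts were damaged",
]
suppliers = ["acme", "globex"]        # left-side nodes
issues = ["late", "damaged", "fast"]  # right-side nodes

edges = bipartite_edges(reviews, suppliers, issues)
for (supplier, issue), weight in edges:
    print(f"{supplier} -- {issue}: {weight}")
```

    In a bipartite graph drawing, the suppliers would sit on one side, the issues on the other, and each edge would be drawn with a thickness proportional to its weight, so the strongest associations stand out. A real application would replace the whitespace tokenizer with proper text analytics (entity extraction, sentiment scoring, and so on).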

    Mark Turner


    Mark Turner is the text analytics lead at Teradata Aster, where he specializes in developing text analytics applications in a wide range of industries, including financial services, manufacturing, oil and gas, retail, and cable media. Prior to joining Teradata Aster, Mark was manager of the Natural Language Processing (NLP) Lab at Thomson Corporation (now Thomson Reuters) and was director of an applied research and development group at CACI, a major federal contractor. He also contributed to the development of the Unified Medical Language System (UMLS), a major biomedical vocabulary resource, at NIH. Mark holds an AB in linguistics from the University of Chicago and an MS in information and computer science from Georgia Tech. He was also a visiting scientist at Carnegie Mellon University.