Extracting Business Value from Semi-Structured Data

Disruption & Opportunity
Location: Mission City B1

Treating data as either highly structured (databases) or completely unstructured (free text) overlooks the fact that much useful business data is in “semi-structured” form: government filings, insurance claims, customer comment forms, etc. Semi-structured data are often documents broken up into free-text fields. Individually, the content of these fields is highly variable but, given the context of the document, much more predictable than unstructured text. Without knowledge of that context, most search tools treat semi-structured data as free text and are far less useful than they could be. But a little structure goes a long way: this talk will describe how semi-structured data can be interpreted, summarized, and applied to produce business value, drawing on several real-life examples.

Cindi Thompson

Deloitte Consulting LLP

Cindi leads the Text Analytics team in Deloitte’s Analytics Institute. She has over 12 years of R&D and project management experience in industrial, consulting, and academic settings. Her research areas include text analytics, machine learning, and adaptive recommendation systems. She worked for 4 years as a professor before joining PwC, where she spent 6 years as a Research Manager. Cindi has a PhD and an MA in Computer Sciences from the University of Texas – Austin, and a BS in Computer Science from North Carolina State University.



02/13/2011 3:18am PST

It’s a great presentation. I got a lot of insights. Would it be possible to obtain the slide deck used for the presentation?

