Presented By O'Reilly and Cloudera
December 5–6, 2016: Training
December 6–8, 2016: Tutorials & Conference
Singapore

Dealing with device data

Mark Madsen (Teradata)
11:15am–11:55am Wednesday, December 7, 2016
Becoming a data-centric company
Location: 328/329
Level: Beginner

What you'll learn

  • Explore lessons learned from dealing with distributed devices generating data at scale

Description

In 2007, a computer game company decided to jump ahead of its competitors by capturing and using the data created during online gaming. It believed this data could be used not only to improve the in-game experience but also to improve marketing, provide insight into customers, deliver personalized recommendations, research new products, and aid the product managers responsible for the product lifecycle.

At the time, collecting and storing all the events generated by online game play was a novel idea. So was the idea of using this nontransactional data across multiple lines of business. The company thought its main problem would be dealing with Internet-scale data. Despite some bad technology choices and major project problems, it turned out that engineering was the easy part. None of the existing development or data practices prepared the company for dealing with the data management and process challenges stemming from distributed devices creating data: business estimation problems, distributed metadata, master data in operational systems and in firmware, varied SLAs, data quality problems, varied event data, and multiple engineering teams with different skills and expectations.

Mark Madsen shares a case study that explores the oversights, failures, and lessons the company learned along the way. The lessons from this project apply as much today, in a world with Hadoop, Kafka, and Spark, as they did back then. The only part that has gotten easier is collecting and storing data.

Mark Madsen

Teradata

Mark Madsen is a fellow at Teradata, where he’s responsible for understanding, forecasting, and defining the analytics ecosystem and architecture. Previously, he was CEO of Third Nature, where he advised companies on data strategy and technology planning and vendors on product management. Mark has designed analysis, machine learning, data collection, and data management infrastructure for companies worldwide.