Presented By O'Reilly and Cloudera
Make Data Work
22–23 May 2017: Training
23–25 May 2017: Tutorials & Conference
London, UK

Executive Briefing: Dealing with device data

Mark Madsen (Teradata)
16:35–17:15 Wednesday, 24 May 2017
Executive briefing, Strata Business Summit
Location: Capital Suite 17
Level: Beginner
Average rating: 4.75 (4 ratings)

Who is this presentation for?

  • Architects, data management professionals, and IT managers

Prerequisite knowledge

  • Knowledge of basic concepts in data technology and data management, databases, and development processes

What you'll learn

  • Explore lessons learned from managing data generated at scale by distributed devices


In 2007, a computer game company decided to jump ahead of competitors by capturing and using data created during online gaming. It believed that this data could be used not only to improve the in-game experience but also to improve marketing, provide insight into customers, deliver personalized recommendations, research new products, and aid product managers responsible for the product life cycle.

At the time, collecting and storing all the events generated by online game play was a novel idea. So was the idea of using this nontransactional data across multiple lines of business. The company thought its main problem would be dealing with internet-scale data. Despite some bad technology choices and major project problems, it turned out that engineering was the easy part. None of the existing development or data practices prepared the company for dealing with the data management and process challenges stemming from distributed devices creating data: business estimation problems, distributed metadata, master data in operational systems and in firmware, varied SLAs, data quality problems, varied event data, and multiple engineering teams with different skills and expectations.

Mark Madsen shares a case study that explores the oversights, failures, and lessons the company learned along the way. The lessons from this project apply as much today in the post-Hadoop, -Kafka, and -Spark world as they did back then. The only part that has gotten easier is the ability to collect and store data.


Mark Madsen


Mark Madsen is a Fellow at Teradata, where he’s responsible for understanding, forecasting, and defining analytics ecosystems and architectures. Previously, he was CEO of Third Nature, where he advised companies on data strategy and technology planning, and vendors on product management. Mark has designed analysis, machine learning, data collection, and data management infrastructure for companies worldwide.

Comments on this page are now closed.


2/06/2017 23:20 BST

Great session – the slides on the page seem to be from your other session. Any chance of sharing presentation?