Presented By O’Reilly and Cloudera
Make Data Work
March 5–6, 2018: Training
March 6–8, 2018: Tutorials & Conference
San Jose, CA

Data Case Studies

9:00am–5:00pm, Tuesday, March 6, 2018

From banking to biotech, retail to government, nonprofit to energy, every business sector is changing in the face of abundant data. Driven by competitive pressures and rising consumer expectations, firms are getting better at defining business problems and applying data solutions.


The road to a data-driven business is paved with hard-won lessons, painful mistakes, and clever insights. We’re introducing a new Tutorial Day track packed with case studies, where you can hear from practitioners across a wide range of industries. We call this track Data Case Studies. In a series of 12 half-hour talks aimed at a business audience, you’ll hear from household brands and global companies as they explain the challenges they wanted to tackle, the approaches they took, and the benefits—and drawbacks—of their solutions. If you want practical insights about applied data, look no further.

About your host

Taylor Martin is principal learning scientist at O’Reilly Media, where she helps a team of data scientists and engineers mix in just the right amount of data-driven learning engineering to personalize the learning experience across various forms of published media. Taylor’s research focuses on understanding how people learn, and she’s particularly interested in how adaptive and personalized learning can best be used to help people reach their learning goals faster. As an established academic and thought leader in the learning sciences, Taylor has spearheaded data-centric approaches to developing learning environments and measuring how people learn science, math, engineering, and computer science in environments that include online games, online programming environments, internship programs, Maker spaces, and engineering design labs.

Tuesday, 03/06/2018


9:00am–9:05am Tuesday, 03/06/2018
Location: LL20 A
Taylor Martin (O'Reilly Media)
Taylor Martin welcomes you to Data Case Studies.


9:05am–9:30am Tuesday, 03/06/2018
Data Case Studies
Data-driven business management
Location: LL20 A
Matt Conners (Microsoft)
Microsoft’s finance organization is reinventing forecasting using machine learning that its leaders describe as game changing. Matt Conners shares the lessons the data sciences and finance teams learned while bringing machine learning forecasting to the office of the CFO by improving forecast accuracy and frequency and driving cultural change through a finance center of excellence.


9:30am–10:00am Tuesday, 03/06/2018
Madhav Madaboosi and Meenakshisundaram Thandavarayan offer an overview of BP's self-service operational data lake, which improved operational efficiency, boosting productivity through fully identifiable data and reducing the risk of a data swamp. They cover the path and big data technologies that BP chose, lessons learned, and pitfalls encountered along the way.


10:00am–10:30am Tuesday, 03/06/2018
Data Case Studies
Data-driven business management
Location: LL20 A
Katie Malone (Civis Analytics)
Average rating: 5.00 (2 ratings)
The 2012 Obama campaign ran the first personalized presidential campaign in history. The data team was made up of people from diverse backgrounds who embraced data science in service of the goal. Civis Analytics emerged from this team and today enables organizations to use the same methods outside politics. Katie Malone shares lessons learned from these experiences for building effective teams.


10:30am–11:00am Tuesday, 03/06/2018
Location: Executive Concourse
Morning break (30m)


11:00am–11:30am Tuesday, 03/06/2018
Mike Prorock
Average rating: 5.00 (1 rating)
Mike Prorock offers an overview of a game-changing climate awareness solution that combines smart sensor technology, data transmission, and state-of-the-art visual analytics to transform the agricultural and turf management market. The solution enables growers to monitor areas of concern, providing immediate benefits to crop yield, supply costs, farm labor overhead, and water consumption.


11:30am–12:00pm Tuesday, 03/06/2018
Data Case Studies
Location: LL20 A
Divya Ramachandran (Captricity)
Divya Ramachandran explains how top insurance companies have used handwriting transcription powered by deep learning to achieve a more than 70% reduction in daily operational processing time, develop a best-in-industry predictive model for assessing mortality risk from decades of archived forms, and enable a smarter claims leakage review, which led to a 10x ROI in its first year.


12:00pm–12:30pm Tuesday, 03/06/2018
Data Case Studies
Data-driven business management
Location: LL20 A
Thomas Miller (Northwestern University)
Sports analytics today is more than a matter of analyzing box scores and play-by-play statistics. With detailed on-field or on-court data from every game, sports teams face challenges in data management, data engineering, and analytics. Thomas Miller details the challenges faced by a Major League Baseball team as it sought competitive advantage through data science and deep learning.


12:30pm–1:30pm Tuesday, 03/06/2018
Location: 230 A-C
Lunch (1h)


1:30pm–2:00pm Tuesday, 03/06/2018
Ann Nguyen (Whole Whale)
Power Poetry is the largest online platform for young poets, with over 350K users. Ann Nguyen explains how Power Poetry is extending the learning potential with machine learning and covers the technical elements of the Poetry Genome, a series of ML tools to analyze and break down similarity scores of the poems added to the site.


2:00pm–2:30pm Tuesday, 03/06/2018
Jennie Shin (Kaiser Permanente)
As healthcare data becomes increasingly digitized, medical centers are able to leverage data in new ways to improve patient care. Jennie Shin explains how Kaiser Permanente developed a sophisticated flu predictor model to better determine where resources were needed and how to reduce outbreaks.


2:30pm–3:00pm Tuesday, 03/06/2018
Data Case Studies
Location: LL20 A
Valentin Bercovici (PencilDATA)
Valentin Bercovici explores the challenges in securing, maintaining, and repairing the dynamic, heterogeneous software supply chain for modern self-driving cars, from levels 0 to 5. Along the way, Valentin reviews implementation options, from centralized certificate authority-based architectures to decentralized blockchains networked over a fleet of cars.


3:00pm–3:30pm Tuesday, 03/06/2018
Location: Executive Concourse
Afternoon break (30m)


3:30pm–4:00pm Tuesday, 03/06/2018
Data Case Studies
Data-driven business management
Location: LL20 A
Joe Dumoulin (Next IT)
AI is transformative for business, but it’s not magic; it’s data. Joe Dumoulin shares how Next IT's global enterprise customers have transformed their businesses with AI solutions and outlines how companies should build AI strategies, utilize data to develop and evolve conversational intelligence and business intents, and ultimately increase ROI.


4:00pm–4:30pm Tuesday, 03/06/2018
Jules Malin (GoPro)
Drones and smart devices are generating billions of event logs for companies, presenting the opportunity to discover insights that inform product, engineering, and marketing team decisions. Jules Malin explains how technologies like Spark and analytics and visualization tools like Python and Plotly enable those insights to be discovered in the data.


4:30pm–5:00pm Tuesday, 03/06/2018
Wayde Fleener (General Mills)
Average rating: 5.00 (1 rating)
Decision makers are busy. Businesses can hire people to analyze data for them, but most companies are resource constrained and can’t hire a small army to look through all their data. Wayde Fleener explains how General Mills implemented automation to enable decision makers to quickly focus on the metrics that matter and cut through everything else that does not.