Presented By O’Reilly and Cloudera
Make Data Work
March 5–6, 2018: Training
March 6–8, 2018: Tutorials & Conference
San Jose, CA

Crafting data products for the augmented writing experience

Chris Harland (Textio)
1:50pm–2:30pm Thursday, March 8, 2018

Who is this presentation for?

  • Product managers, engineering leaders, CEOs, IC engineers, data scientists, and machine learning engineers

Prerequisite knowledge

  • A basic understanding of the Agile software development process
  • Familiarity with modern web product deployment (useful but not required)

What you'll learn

  • Understand how complexity in delivering a data product is different from other engineering complexities
  • Learn how to spot specific hurdles in data product delivery and overcome them

Description

The mantra “be data driven” is likely spray-painted on the walls of every tech company, but when you look inside many products, what you’ll find is a far cry from an interconnected web of machine learning models all churning away to create user and customer value. Instead, you might stumble upon varying degrees of heuristics, long chains of if/else statements, and a patchwork of legacy machine learning models that have no hope of being retrained on new data.

This unfortunate situation isn’t intentional; rather, it’s an emergent phenomenon driven by the gap between how we as an industry produce machine learning models (or any artifact for statistical inference) and how we subsequently deliver them into production. While implementation complexity is readily acknowledged in most of software engineering, we tend to ignore it when tossing data products into production.

Resources explaining how to build a machine learning model from data greatly outnumber those explaining how to make real data products from such models, creating a gap between what machine learning engineers and data scientists know is possible and what users experience. Chris Harland explores this imbalance through examples drawn from popular web products around the globe and from his own career, which has moved from data science to machine learning engineering to heading data engineering at Textio. He also walks you through a detailed example of machine learning as a data product in Textio’s augmented writing platform.

Chris Harland

Textio

Chris Harland is director of data engineering at augmented writing platform Textio. Over his career, Chris has worked in a wide variety of fields spanning elementary science education, cutting-edge biophysical research, and recommendation and personalization engines. Previously, he was a data scientist and machine learning engineer at Versive (formerly Context Relevant) and a data scientist at Microsoft working on problems in Bing search, Xbox, Windows, and MSN. Chris holds a PhD in physics from the University of Oregon. Every year he thinks, “This is the year I’m going to stop thinking SQL is the best query language ever,” and every year he’s wrong.