Presented By O’Reilly and Cloudera

San Jose • London • New York

Make Data Work

March 5–6, 2018: Training
March 6–8, 2018: Tutorials & Conference
San Jose, CA

Crafting data products for the augmented writing experience

Chris Harland (Textio)

1:50pm–2:30pm Thursday, March 8, 2018

Data engineering and architecture
Location: LL21 E/F

Who is this presentation for?

Product managers, engineering leaders, CEOs, IC engineers, data scientists, and machine learning engineers

Prerequisite knowledge

A basic understanding of the Agile software development process
Familiarity with modern web product deployment (useful but not required)

What you'll learn

Understand how complexity in delivering a data product is different from other engineering complexities
Learn how to spot specific hurdles in data product delivery and how to overcome them

Description

The mantra “be data driven” is likely spray-painted on the walls of every tech company, but when you look inside many products, what you’ll find is a far cry from an interconnected web of machine learning models all churning away to create user and customer value. Instead, you might stumble upon varying degrees of heuristics, long chains of if/else statements, and a patchwork of legacy machine learning models that have no hope of being retrained on new data.

This unfortunate situation isn’t intentional; rather, it’s an emergent phenomenon driven by the gap between how we as an industry produce machine learning models (or any artifact for statistical inference) and how we subsequently deliver them into production. While implementation complexity is readily acknowledged in most of software engineering, we tend to ignore it when tossing data products into production.

The resources explaining how to build a machine learning model from data greatly outnumber those explaining how to make real data products from such models, creating a gap between what machine learning engineers and data scientists know is possible and what users experience. Chris Harland explores this imbalance through examples drawn from popular web products around the globe as well as his own career, which has taken him from data science to machine learning engineering to heading data engineering at Textio, and walks you through a detailed example of machine learning as a data product in Textio’s augmented writing platform.

Chris Harland

Textio

Chris Harland is director of data engineering at augmented writing platform Textio. Over his career, Chris has worked in a wide variety of fields spanning elementary science education, cutting-edge biophysical research, and recommendation and personalization engines. Previously, he was a data scientist and machine learning engineer at Versive (formerly Context Relevant) and a data scientist at Microsoft working on problems in Bing search, Xbox, Windows, and MSN. Chris holds a PhD in physics from the University of Oregon. Every year he thinks, “This is the year I’m going to stop thinking SQL is the best query language ever,” and every year he’s wrong.

