The mantra “be data driven” is likely spray-painted on the walls of every tech company, but when you look inside many products, what you’ll find is a far cry from an interconnected web of machine learning models all churning away to create user and customer value. Instead, you might stumble upon varying degrees of heuristics, long chains of if/else statements, and a patchwork of legacy machine learning models that have no hope of being retrained on new data.
This unfortunate situation isn’t intentional; rather, it’s an emergent phenomenon driven by the gap between how we as an industry produce machine learning models (or any artifact for statistical inference) and how we subsequently deliver them into production. While implementation complexity is readily acknowledged in most of software engineering, we tend to ignore it when tossing data products into production.
The number of resources explaining how to build a machine learning model from data greatly overshadows information on how to make real data products from such models, creating a gap between what machine learning engineers and data scientists know is possible and what users experience. Chris Harland explores this imbalance through examples drawn from popular web products around the globe and from his own career, which has taken him from data science to machine learning engineering to heading data engineering at Textio. He also walks you through a detailed example of machine learning as a data product in Textio’s augmented writing platform.
Chris Harland is director of data engineering at augmented writing platform Textio. Over his career, Chris has worked in a wide variety of fields spanning elementary science education, cutting-edge biophysical research, and recommendation and personalization engines. Previously, he was a data scientist and machine learning engineer at Versive (formerly Context Relevant) and a data scientist at Microsoft working on problems in Bing search, Xbox, Windows, and MSN. Chris holds a PhD in physics from the University of Oregon. Every year he thinks, “This is the year I’m going to stop thinking SQL is the best query language ever,” and every year he’s wrong.
©2018, O'Reilly Media, Inc.