14–17 Oct 2019

Creating smaller, faster, production-worthy mobile machine learning models

Jameson Toole (Fritz AI)
14:35–15:15 Thursday, 17 October 2019
Location: Westminster Suite

Who is this presentation for?

  • Developers, data scientists, AI enthusiasts, and mobile developers

Level

Intermediate

Description

Getting machine learning models ready for use on device is a major challenge. Drag-and-drop training tools can get you started, but the models they produce aren’t small enough or fast enough to ship. Jameson Toole walks you through optimization, pruning, and compression techniques to keep app sizes small and inference speeds high.

Jameson explores flexible model architectures that meet performance and accuracy requirements across devices and platforms. You’ll discover pruning and distillation techniques to optimize model performance and quantization tools to compress models to a fraction of their original size. Jameson gives you a practical example of this process as he creates an artistic style transfer model that’s just 17 KB. All of these techniques are applied to mobile machine learning frameworks such as Core ML and TensorFlow Lite.
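To make the pruning and quantization ideas above concrete, here is a minimal sketch in plain Python. It is not taken from the talk and does not use the Core ML or TensorFlow Lite APIs; the function names (`prune_by_magnitude`, `quantize_uint8`, `dequantize`) and the toy weight list are hypothetical, chosen only for illustration. It assumes simple magnitude pruning and linear 8-bit quantization, which are common forms of these techniques but may differ from what the session covers.

```python
import array


def prune_by_magnitude(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights (magnitude pruning)."""
    k = int(len(weights) * sparsity)  # how many weights to zero
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_uint8(weights):
    """Map floats onto 256 evenly spaced levels between min and max."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 if all weights are equal
    codes = array.array("B", (round((w - lo) / scale) for w in weights))
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate float weights from the 8-bit codes."""
    return [c * scale + lo for c in codes]


if __name__ == "__main__":
    weights = [-1.0, -0.4, 0.0, 0.02, 0.5, 1.0]  # toy values, not real model weights
    pruned = prune_by_magnitude(weights, sparsity=0.5)
    codes, scale, lo = quantize_uint8(pruned)
    restored = dequantize(codes, scale, lo)

    float32_bytes = 4 * len(weights)       # 4 bytes per float32 weight
    uint8_bytes = len(codes.tobytes())     # 1 byte per quantized weight
    print("pruned:", pruned)
    print("bytes as float32:", float32_bytes, "bytes as uint8:", uint8_bytes)
    print("max round-trip error:", max(abs(a - b) for a, b in zip(pruned, restored)))
```

Storing one byte per weight instead of four is where the "fraction of their original size" comes from in this sketch; the price is a rounding error of at most half a quantization step per weight. Pruned zeros save further space only once the file is compressed or stored in a sparse format.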

Prerequisite knowledge

  • Experience with mobile ML frameworks like Core ML or TensorFlow Lite
  • A basic understanding of how neural networks are designed and trained

What you'll learn

  • Specific techniques for optimizing mobile machine learning models for production, in both size and inference speed, through a case study

Jameson Toole

Fritz AI

Jameson Toole is the cofounder and CEO of Fritz AI, a company building tools to help developers optimize, deploy, and manage machine learning models on mobile devices. Previously, he built analytics pipelines for Google X’s Project Wing and ran the data science team at Boston technology startup Jana Mobile. He holds undergraduate degrees in physics, economics, and applied mathematics from the University of Michigan and both an MS and PhD in engineering systems from MIT, where he worked on applications of big data and machine learning to urban and transportation planning at the Human Mobility and Networks Lab.
