Put AI to work
June 26-27, 2017: Training
June 27-29, 2017: Tutorials & Conference
New York, NY

Building training data for autonomous driving

Matt Shobe (Mighty AI)
11:05am–11:45am Thursday, June 29, 2017
Implementing AI
Location: Sutton South/Regent Parlor
Level: Beginner
Secondary topics: Machine Learning, Transportation and Logistics, Vision
Average rating: 4.00 (1 rating)

Prerequisite Knowledge

  • A basic understanding of computer vision model training

What you'll learn

  • Lessons from building a training dataset for autonomous driving, including workflow design, annotation tools, performing these tasks at scale, and using a reference dataset


Building training data for computer vision models that can detect and recognize foreground objects in images, such as trees, pedestrians, and bicyclists, is one thing. Building training data for autonomous driving systems (which must see everything going on in the scene, from the objects to the environment itself) is quite another.

Say you have a photo of a street scene with multiple cars, pedestrians, trees, the sky, the road, etc. How do you label regions that aren’t as simple to define, like the gaps in the trees with sky in between them? What about defining the percentage of the sky in an image of the same road when the trees are in full leaf versus when the leaves have fallen to the ground? Or what about snow on the ground: do you tag the snow, the ground, or both? Questions like these challenged Mighty AI to reconsider its entire image-labeling workflow.
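To make the "percentage of the sky" question concrete, here is a minimal sketch of how per-class pixel coverage could be computed from a segmentation mask. It is an illustration only, not Mighty AI's tooling: the class IDs, the `class_fractions` helper, and the tiny 4x4 masks are all hypothetical, and the sketch assumes exactly one label per pixel (so snow-covered ground must be tagged as either snow or road, not both).

```python
from collections import Counter

# Hypothetical class IDs, chosen for this illustration only.
SKY, TREE, ROAD, SNOW = 0, 1, 2, 3
CLASS_NAMES = {SKY: "sky", TREE: "tree", ROAD: "road", SNOW: "snow"}


def class_fractions(mask):
    """Return the fraction of pixels assigned to each class in a 2D mask.

    `mask` is a list of rows, each row a list of integer class IDs.
    Assumes a single label per pixel.
    """
    counts = Counter(label for row in mask for label in row)
    total = sum(counts.values())
    return {CLASS_NAMES[class_id]: n / total for class_id, n in counts.items()}


# Toy masks of the same road: full foliage vs. bare trees with snow.
summer = [
    [TREE, TREE, TREE, SKY],
    [TREE, TREE, TREE, TREE],
    [ROAD, ROAD, ROAD, ROAD],
    [ROAD, ROAD, ROAD, ROAD],
]
winter = [
    [SKY, TREE, SKY, SKY],
    [TREE, SKY, TREE, SKY],
    [SNOW, ROAD, ROAD, SNOW],
    [SNOW, SNOW, ROAD, SNOW],
]

summer_fractions = class_fractions(summer)
winter_fractions = class_fractions(winter)

print("summer sky share:", summer_fractions["sky"])
print("winter sky share:", winter_fractions["sky"])
```

In this toy data the sky share changes between the two masks purely because of foliage, which is the kind of seasonal variation the questions above point to. Allowing a pixel to carry both a snow label and a ground label would require a different mask representation, such as one binary mask per class.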

Matt Shobe shares lessons Mighty AI has learned while creating a training dataset for autonomous driving. Matt drills into the challenges of creating semantic segmentation masks, including workflow design and annotation tools, as well as how to perform these tasks at scale. Along the way, he covers the successes (and failures) that led the company to publish an open source dataset for autonomous driving.


Matt Shobe

Mighty AI

Matt Shobe is the cofounder and chief product officer at Mighty AI, the world’s leading training-data-as-a-service platform. Mighty AI provides the highly accurate, domain-specific, structured human insights that companies need to apply their artificial intelligence and machine learning models (including autonomous driving solutions) and operates Spare5, the microtask platform that enables people to spend their spare time productively. Matt’s technology startup experience goes back nearly 20 years. He worked with the same three cofounders on three Chicago startups (FeedBurner, acquired by Google in 2007; Spyonit; and DKA) and learned the ground rules in user experience roles with Accenture and Microsoft. He holds an MS in human-centered design and engineering from the University of Washington. Matt is an avid distance runner, private pilot, and skier, although no such triathlon exists (yet).