Although it has long been used for use cases like simulation, training, and UX mockups, human in the loop (HITL) has emerged as a key design pattern for managing teams where people and machines collaborate. One approach, active learning (sometimes called semisupervised learning), employs mostly automated processes based on machine learning models but refers edge cases, typically those that are most uncertain or highest risk, to human experts, whose decisions help improve new iterations of the models. Meanwhile, an HITL practice can help organizations prepare datasets for use in deep learning.
Paco Nathan reviews case studies and management perspectives on HITL, along with related open source projects and commercial products. In particular, Paco examines a use case at O’Reilly Media in which ML pipelines for categorizing content are trained solely on examples provided by subject-matter experts, using an HITL approach implemented with Project Jupyter, Apache Spark, and scikit-learn.
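The active learning pattern described above (a model handles most cases and routes its least certain ones to a human expert, whose labels feed the next model iteration) can be illustrated with a minimal sketch in scikit-learn. This is not the O’Reilly pipeline: the choice of `LogisticRegression`, the least-confidence selection rule, the synthetic dataset, and the helper names `least_confident` and `active_learning_loop` are all assumptions made for illustration, and the human expert is simulated by looking up a held-back label array.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression


def least_confident(model, X_pool, batch_size):
    """Return positions in X_pool where the top class probability is lowest."""
    confidence = model.predict_proba(X_pool).max(axis=1)
    return np.argsort(confidence)[:batch_size]


def active_learning_loop(X, y_oracle, seed_idx, rounds=5, batch_size=10):
    """Hypothetical HITL loop: y_oracle stands in for a human expert's answers."""
    labeled = [int(i) for i in seed_idx]
    seen = set(labeled)
    pool = [i for i in range(len(X)) if i not in seen]
    model = LogisticRegression(max_iter=1000)

    for _ in range(rounds):
        # Train on everything the "expert" has labeled so far.
        model.fit(X[labeled], y_oracle[labeled])
        if not pool:
            break
        # Refer the most uncertain pool items to the expert.
        picks = least_confident(model, X[pool], batch_size)
        chosen = [pool[int(i)] for i in picks]
        labeled.extend(chosen)
        chosen_set = set(chosen)
        pool = [i for i in pool if i not in chosen_set]

    # Final iteration of the model includes the last batch of expert labels.
    model.fit(X[labeled], y_oracle[labeled])
    return model, labeled


# Synthetic data; the seed set holds five examples of each class.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
seed_idx = np.concatenate([np.where(y == c)[0][:5] for c in (0, 1)])
model, labeled = active_learning_loop(X, y, seed_idx, rounds=5, batch_size=10)
print(len(labeled))  # 10 seed labels + 5 rounds of 10 referrals = 60
```

In a real deployment the lookup into `y_oracle` would be replaced by a review queue for human experts, and the selection rule could weight risk as well as uncertainty.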
Paco Nathan is known as a “player/coach” with core expertise in data science, natural language processing, machine learning, and cloud computing. He has 35+ years of experience in the tech industry, at companies ranging from Bell Labs to early-stage startups. His recent roles include director of the Learning Group at O’Reilly and director of community evangelism at Databricks and Apache Spark. Paco is the cochair of the Rev conference and an advisor for Amplify Partners, Deep Learning Analytics, Recognai, and Primer. He was named one of the “top 30 people in big data and analytics” in 2015 by Innovation Enterprise.
©2018, O’Reilly UK Ltd