
Speed versus specificity: Designing text annotation tasks for the people and algorithms that drive human-in-the-loop (HIL) products

Jason Laska (Clara Labs)
11:55am-12:35pm Friday, September 7, 2018
Implementing AI, Interacting with AI
Location: Imperial A
Secondary topics: AI in the Enterprise; Text, Language, and Speech

Who is this presentation for?

  • Engineers, practitioners, founders, and investors

Prerequisite knowledge

  • A high-level understanding of how supervised learning problems are framed and why high-quality annotated data is important

What you'll learn

  • Understand how annotation tasks fit into a real-time AI human-in-the-loop architecture, and how to frame problems so those tasks can be completed both quickly and accurately
  • Learn how machine learning can enable simpler interfaces by taking on additional complexity

Description

Clara is a scheduling service that coordinates when, where, and how you meet with prospects, candidates, and collaborators. Clara delivers efficiency and accuracy by combining the precision and consistency of intelligent software with the judgment of an expert team. This human-in-the-loop approach ensures scheduling communication is always clear, swift, and professional.

Clara’s platform leverages human skill by breaking up work into small tasks of two types: stateless annotation tasks and stateful automation-review workflow tasks. Annotations are generated by “mapping” email message text to distinct scheduling-parameter and machine learning (ML) labeling tasks. The task results are aggregated and reduced with state to automate the final scheduling outputs. These outputs are then sometimes reviewed for quality assurance in workflow tasks.
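
As a rough sketch (the names, fields, and majority-vote reduction here are illustrative assumptions, not Clara’s actual schema), a stateless annotation task of this kind can be thought of as a snippet of email text paired with a single scheduling parameter, with independent annotator results reduced downstream:

    from collections import Counter
    from dataclasses import dataclass

    # Hypothetical task and result records; all fields are illustrative only.
    @dataclass(frozen=True)
    class AnnotationTask:
        task_id: str
        message_text: str   # email snippet shown to the annotator
        parameter: str      # e.g., "meeting_duration" or "proposed_time"

    @dataclass(frozen=True)
    class AnnotationResult:
        task_id: str
        annotator_id: str
        label: str          # the annotator's answer for this parameter

    def reduce_results(results, min_agreement=2):
        """Aggregate independent annotations into one label, accepting it
        only when enough annotators agree (a simple majority-vote reduction)."""
        if not results:
            return None
        counts = Counter(r.label for r in results)
        label, votes = counts.most_common(1)[0]
        return label if votes >= min_agreement else None

Keeping each task stateless in this way lets the same result serve two purposes: it can be folded into the live scheduling automation and logged as a labeled training example.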

In contrast to conventional offline annotation systems, Clara’s annotation tasks may be used to drive automation in real time and are therefore time-constrained. This means tasks must both yield machine learning training data and be completed quickly and accurately. The tasks also need to be easily understandable by nonexperts. These properties are sometimes in conflict and may not all be equally achievable via a single interface or label definition.

Jason Laska explores the trade-offs between text annotations defined for fast data entry and those defined with the high specificity and granularity needed to train ML models. The discussion is guided by a running example: datetime text describing meeting-attendee availability, modeled as recurrence-rule and coreference annotations.
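
To make the high-specificity end of that trade-off concrete, here is a minimal sketch (assuming the iCalendar-style recurrence rules implemented by Python’s dateutil library; the span offsets and dates are invented for illustration) of how a phrase like “Tuesdays and Thursdays after 2pm for the next three weeks” might be captured as a recurrence-rule annotation rather than a flat list of datetimes:

    from datetime import datetime
    from dateutil.rrule import TU, TH, WEEKLY, rrule

    # Hypothetical annotation: a character span in the source email plus
    # a recurrence rule encoding the availability it expresses.
    annotation = {
        "text_span": (42, 91),  # offsets of the phrase in the email (invented)
        "rrule": rrule(
            WEEKLY,
            byweekday=(TU, TH),
            dtstart=datetime(2018, 9, 11, 14, 0),
            count=6,            # 2 days/week * 3 weeks
        ),
    }

    # Expanding the rule yields concrete candidate slots for the scheduler:
    for slot in annotation["rrule"]:
        print(slot.isoformat())

An annotation this specific is well suited to training models but is far slower for a human to enter than a coarse label, which is exactly the tension the talk examines.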

This is joint work with Joey Carmello, a software engineer at Clara Labs.

Jason Laska

Clara Labs

Jason Laska is the head of engineering at Clara Labs. Previously, he spearheaded the computer vision program at Dropcam (acquired by Google in 2014), developing massive-scale online vision systems for the product. Jason holds a PhD in electrical engineering from Rice University, where he made contributions to inverse problems, dimensionality reduction, and optimization. He briefly dabbled in publishing as a cofounder and editor of Rejecta Mathematica, a publication for previously rejected mathematics articles.