Sep 23–26, 2019

Executive Briefing: What it takes to use machine learning in fast data pipelines

Dean Wampler (Lightbend)
4:35pm–5:15pm Thursday, September 26, 2019
Location: 1E 10/11

Who is this presentation for?

Business executives and managers who want to understand the importance and implications of combining ML/AI and streaming data pipelines.

Level

Intermediate

Description

This talk helps you develop a conceptual understanding of the challenges faced by your teams as they develop and deploy Machine Learning and Artificial Intelligence (ML/AI) services integrated with fast data (streaming) pipelines. While combining these technologies is challenging, the benefits include timely delivery of innovative services to your customers.

We’ll begin with a brief overview of the following topics:

  • The business justification for integrating ML/AI and streaming
  • ML/AI scenarios which are best delivered through streaming

With that background, we’ll pursue the following goals:

  • Understand the main challenges of using these technologies together
  • Explore ways to bridge the gap between data science and production teams, whose tools, methods, and goals sometimes conflict: for example, exploration of ideas and optimal scoring results vs. production reliability and efficiency
  • Understand that streaming ML/AI services must run reliably and handle variable loads for a long time, requiring us to leverage best practices from the microservices world
  • Understand how to update models in a running streaming application before they become stale, without incurring downtime or other practical problems

Prerequisite knowledge

Prior awareness of ML/AI and streaming ideas is useful, but not required.

What you'll learn

The attendee will understand:

  1. The business motivations for serving ML/AI in streaming pipelines
  2. The organizational and technical challenges of combining data science and production-hardened, streaming pipelines
  3. Approaches to several specific issues, such as updating models in running pipelines without downtime

Dean Wampler

Lightbend

Dean Wampler, Ph.D., is VP of Fast Data Engineering at Lightbend. He leads the team behind the Lightbend Fast Data Platform, a scalable, distributed stream data processing stack using Kubernetes, Spark, Flink, Kafka, and Akka, with machine learning and management tools. Dean is the author of Fast Data Architectures for Streaming Applications; Programming Scala, Second Edition; and Functional Programming for Java Developers, and the coauthor of Programming Hive, all from O’Reilly Media. He is a contributor to several open source projects, a frequent Strata speaker, and the co-organizer of several conferences around the world and several user groups in Chicago. Dean yells at clouds on Twitter, @deanwampler.

