Presented By
O’Reilly + Cloudera

Make Data Work

29 April–2 May 2019
London, UK

Executive Briefing: What it takes to use machine learning in fast data pipelines

Dean Wampler (Anyscale)

16:35–17:15 Thursday, 2 May 2019

Executive Briefing and best practices, Strata Business Summit
Location: Capital Suite 13

Secondary topics: Data Integration and Data Pipelines, Streaming and realtime analytics

Average rating: 5.00 (4 ratings)

Level

Beginner

What you'll learn

Understand the business justification for moving from batch-oriented big data to stream-oriented fast data, including the delivery of stream-based ML/AI services
Discover the main challenges faced when deploying these technologies together
Explore solutions to these challenges, including criteria to use when evaluating choices

Description

Dean Wampler helps you develop a conceptual understanding of the challenges faced by your teams as they develop and deploy machine learning and artificial intelligence services integrated with fast data (streaming) pipelines. While combining these technologies is challenging, the benefits include timely delivery of innovative services to your customers.

Dean begins by briefly discussing machine learning use cases that are best delivered as streaming data applications. He then explores the main challenges faced when deploying these technologies together and outlines solutions to these challenges, including criteria to use when evaluating choices. Along the way, he explains the tools your teams are already talking about and the role they play.

Topics include:

Bridging the gap between data science tools and methods and the data engineering tools and methods needed for robust production delivery
How fast data pipelines are forcing changes to data architectures to meet higher demands for reliability, resiliency, and dynamic scalability
Performance implications of different AI/ML and fast data tools and techniques
Deploying updates to ML/AI capabilities into running pipelines without forcing restarts

Dean Wampler

Anyscale

Dean Wampler is an expert in streaming data systems, focusing on applications of machine learning and artificial intelligence (ML/AI). He’s head of developer relations at Anyscale, which is developing Ray for distributed Python, primarily for ML/AI. Previously, he was an engineering VP at Lightbend, where he led the development of Lightbend CloudFlow, an integrated system for building and running streaming data applications with Akka Streams, Apache Spark, Apache Flink, and Apache Kafka. Dean is the author of Fast Data Architectures for Streaming Applications, Programming Scala, and Functional Programming for Java Developers, and he’s the coauthor of Programming Hive, all from O’Reilly. He’s a contributor to several open source projects. A frequent conference speaker and tutorial teacher, he’s also the co-organizer of several conferences around the world and several user groups in Chicago. He earned his PhD in physics from the University of Washington.

©2019, O’Reilly UK Ltd • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • confreg@oreilly.com