Online Machine Learning in Streaming Applications
Who is this presentation for?
ML engineers, Data scientists, Data engineers, Data architects, IoT experts
Applications such as:
- smart homes
- smart monitoring of industrial environments
- augmented reality in retail
- auto-connected cars
are driving a new era in online machine learning, where machine learning (ML) algorithms run at the edge rather than in the cloud.
These applications are constrained in terms of:
- resources such as power, CPU, and memory
- responsiveness: data flows through the system, and the application needs to interact with its surrounding environment within a given time window
We will describe the algorithmic foundations of these applications (Hoeffding Adaptive Trees, classic sketch data structures, drift detection algorithms [MLDS]) and dive into the details of how they can be implemented and deployed efficiently in production.
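As an illustration of the "classic sketch data structures" mentioned above, here is a minimal Count-Min sketch in Python. It is a sketch under stated assumptions, not code from the talk: the class name, the default width and depth, and the use of salted BLAKE2b hashes are all choices made for this example. It counts item frequencies in fixed memory, which is the property that matters on resource-constrained edge devices.

```python
import hashlib


class CountMinSketch:
    """Fixed-memory approximate frequency counter for a stream of items.

    Hypothetical example class; width and depth defaults are arbitrary.
    """

    def __init__(self, width=272, depth=5):
        self.width = width
        self.depth = depth
        # depth rows of width counters; memory use never grows with the stream.
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # One hash function per row, derived by salting BLAKE2b with the row number.
        digest = hashlib.blake2b(
            str(item).encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Collisions can only inflate a counter, so the minimum over rows
        # is the tightest estimate and never undercounts.
        return min(
            self.table[row][self._index(item, row)] for row in range(self.depth)
        )


sketch = CountMinSketch()
for event in ["door", "window", "door", "door"]:
    sketch.add(event)
```

The estimate for an item is always at least its true count and at most the total number of items added, so here `sketch.estimate("door")` is 3 or 4. A wider table lowers the chance of overcounting at the cost of more memory.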
We will evaluate production concerns like:
- performance (latency, memory footprint, etc.)
- techniques for updating models that are being served in a running pipeline, and future trends [LWFES] such as feature space representation and sampling
- tools [SPAF] to use for the actual implementation and deployment of these algorithms
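One simple way to picture updating a model that is being served in a running pipeline is to keep the model behind a holder whose reference can be swapped while scoring continues. The following Python sketch assumes a model is just a callable; `ModelHolder`, `predict`, and `swap` are hypothetical names for this example and do not come from Spark, Flink, or any other tool named here.

```python
import threading


class ModelHolder:
    """Serves predictions from a model that can be replaced at runtime.

    Hypothetical example class; a "model" is any callable taking features.
    """

    def __init__(self, model):
        self._model = model
        self._lock = threading.Lock()

    def predict(self, features):
        # Take a reference under the lock, then score outside it so a slow
        # prediction never blocks a model update.
        with self._lock:
            model = self._model
        return model(features)

    def swap(self, new_model):
        # In-flight predictions finish on the old model; later calls use the new one.
        with self._lock:
            self._model = new_model


holder = ModelHolder(lambda features: sum(features) > 1.0)
before = holder.predict([0.5, 0.2])   # scored by the initial model
holder.swap(lambda features: sum(features) > 0.5)
after = holder.predict([0.5, 0.2])    # scored by the updated model
```

The same input is scored by the old model before the swap and by the new one after it, with no restart of the serving loop. In a real streaming job the new model would typically arrive on a separate control stream; that part is left out here.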
The concepts we describe also apply in a cloud setting, so this talk covers many practical aspects that are universal and will benefit any ML practitioner. Our main focus, though, is cutting-edge applications and technologies. Who wants to miss a glimpse of the future?
[MLDS] Machine Learning for Data Streams: https://moa.cms.waikato.ac.nz/author/moa/
[SPAF] Streaming Predictive Analytics on Flink: http://www.diva-portal.org/smash/get/diva2:843219/FULLTEXT01.pdf
[LWFES] Learning with Feature Evolvable Streams: https://papers.nips.cc/paper/6740-learning-with-feature-evolvable-streams.pdf
Prerequisite knowledge
ML basics, streaming basics, basic knowledge of tools for writing streaming applications like Apache Spark, Apache Flink and tools for application deployment/orchestration like Kubernetes
What you'll learn
Stavros is a senior engineer on the data systems team at Lightbend, where he helps implement Lightbend’s fast data strategy. He has worked for several years building scalable software solutions in different verticals, such as telecoms and marketing. His interests include distributed system design, streaming technologies, and NoSQL databases.
Debasish Ghosh is a principal software engineer at Lightbend. Passionate about technology and open source, he loves functional programming and has been trying to learn math and machine learning. Debasish is an occasional speaker at technology conferences worldwide, including QCon, Philly ETE, Code Mesh, Scala World, Functional Conf, and GOTO. He is the author of DSLs In Action and Functional & Reactive Domain Modeling. Debasish is a senior member of ACM. He’s also a father, husband, avid reader, and Seinfeld fanboy who loves spending time with his beautiful family.