Put open source to work
July 16–17, 2018: Training & Tutorials
July 18–19, 2018: Conference
Portland, OR

Distributed systems for stream processing: Apache Kafka and Spark Streaming

Lena Hall (Microsoft)
11:00am–11:40am Thursday, July 19, 2018
Distributed computing
Location: Portland 255
Level: Intermediate
Average rating: 4.75 (4 ratings)

Who is this presentation for?

  • Engineers who work with real-time data

Prerequisite knowledge

  • A basic understanding of Scala (useful but not required)

What you'll learn

  • Learn how to set up and build a distributed streaming architecture on Azure using open source frameworks like Apache Kafka and Spark Streaming
  • Understand how to process data coming from multiple sources in real time and perform machine learning tasks

Description

Everything is a data source, and today’s online activities, financial operations, and IoT devices and sensors generate data at an ever-increasing rate. So how do we ingest, process, and manage that data? Handling this constant influx requires an architecture that is flexible, scalable, fast, and resilient.

Lena Hall walks you through setting up and building a distributed streaming architecture on Azure using open source frameworks like Apache Kafka and Spark Streaming. You’ll use these distributed systems to process data coming from multiple sources in real time and perform machine learning tasks. Along the way, you’ll discover how to effectively and interactively experiment with streams.

Lena Hall

Microsoft

Lena Hall is a senior software engineer and developer advocate at Microsoft working on Azure, where she focuses on large-scale distributed systems and modern architectures. Lena has more than 10 years of experience in software engineering with a focus on distributed cloud programming, real-time system design, highly scalable and performant systems, big data analysis, data science, functional programming, and machine learning. Previously, she was a senior software engineer at Microsoft Research. She’s an elected member of the F# Software Foundation’s board of trustees, co-organizes a conference called ML4ALL, and often serves on program committees for conferences such as Kafka Summit and Lambda World. Lena holds a master’s degree in computer science.