Running streaming workloads successfully is a challenge, whether you’re deploying on-premises or in the cloud. While buying a managed service is an option, it’s usually quite expensive. Therefore, many companies opt for open source streaming engines like Apache Spark’s Structured Streaming.
Apache Spark’s Structured Streaming brings batch and stream processing under a unified API. Built on the Spark SQL engine, it lets developers express the same queries for batch and for streaming data, and it supports different execution strategies for stream processing: microbatching for high throughput or continuous processing for low latency.
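The "same query for batch and streaming" idea can be illustrated with a minimal pure-Python sketch. This is not Spark's API: the names count_events, micro_batches, run_batch, and run_streaming, and the sample events, are hypothetical and chosen only for illustration. It shows one query function applied either to the whole dataset at once or incrementally to small batches with carried-over state, with both runs ending at the same result.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Optional

# Hypothetical event type for illustration: each event is just a key.
Event = str


def count_events(events: Iterable[Event], state: Optional[Counter] = None) -> Counter:
    """The 'query': count occurrences per key, optionally continuing from earlier state."""
    result = Counter() if state is None else state.copy()
    result.update(events)
    return result


def micro_batches(events: List[Event], batch_size: int) -> Iterator[List[Event]]:
    """Split the input into consecutive small batches, as a microbatch engine would."""
    for start in range(0, len(events), batch_size):
        yield events[start:start + batch_size]


def run_batch(events: List[Event]) -> Counter:
    """Batch execution: apply the query once to the full dataset."""
    return count_events(events)


def run_streaming(events: List[Event], batch_size: int) -> Counter:
    """Microbatch execution: apply the same query per batch, carrying state forward."""
    state: Counter = Counter()
    for batch in micro_batches(events, batch_size):
        state = count_events(batch, state)
    return state


events = ["click", "view", "click", "purchase", "view", "click"]
batch_result = run_batch(events)
stream_result = run_streaming(events, batch_size=2)
print(batch_result == stream_result)
```

In this sketch, a smaller batch_size means results are updated more often at the cost of more per-batch overhead, which loosely mirrors the throughput and latency trade-off described above.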
Bill Chambers shares a decision-making framework for determining the best tools and technologies for successfully deploying and maintaining streaming data pipelines to solve business problems. Bill then offers an overview of Apache Spark’s Structured Streaming processing engine and shares lessons learned running hundreds of Structured Streaming workloads in the cloud. Along the way, Bill dives into the internals of the Structured Streaming engine and explains why it’s suitable for a variety of use cases.
Topics include:
William Chambers is a product manager at Databricks, where he works on Structured Streaming and data science products. He is lead author of Spark: The Definitive Guide, coauthored with Matei Zaharia. Bill also created SparkTutorials.net as a way to teach Apache Spark basics. Bill holds a master’s degree in information management and systems from UC Berkeley’s School of Information. During his time at school, Bill was also creator of the Data Analysis in Python with pandas course for Udemy and cocreator of and first instructor for Python for Data Science, part of UC Berkeley’s Masters of Data Science program.
©2018, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • confreg@oreilly.com