People often think of stream processing as a toolset for running “big data” computations on high-velocity data streams: problems that are too fast to be computed in batches. But stream processing is really a far richer field, one concerned with applying correctness guarantees to arbitrary graphs of intertwined functions—a problem that complex serverless applications are forced to contend with in one way or another. Serverless technologies, however, optimize for a different goal: abstracting your programs away from the underlying infrastructure, executing when triggered, and scaling automatically. So what happens if we mix the best of these two worlds?
Ben Stopford explores this question by examining how stream processing systems have evolved away from being simple functions sewn together by a messaging system. Ben looks at the core building blocks (e.g., exactly-once semantics and the buffering and joining of event streams) as well as higher-level patterns used by contemporary event-driven applications (e.g., event sourcing, CQRS, and event streams as a source of truth). This inevitably leads to a model where stream processors play the role of a kind of database that derives rich, use-case-specific event streams, keeping your functions simple, stateless, and scalable. Ben concludes by reflecting on what the future likely holds for these two fields as the approaches converge in serverless frameworks with far richer capabilities than today's state of the art.
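To make the pattern concrete, here is a minimal sketch (not from the talk, and deliberately free of any messaging infrastructure) of event sourcing with a CQRS-style read model: an event log acts as the source of truth, and a use-case-specific view is derived by folding stateless functions over it—the same shape a stream processor gives you at scale.

```python
# Hypothetical example: an event log as the source of truth, with a
# derived read model (CQRS). The projection step is a pure, stateless
# function, so it could run anywhere—including a serverless function.
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    account: str
    amount: int  # positive = deposit, negative = withdrawal

def apply_event(view: dict, event: Event) -> dict:
    """Stateless projection step: fold one event into the read model."""
    updated = dict(view)
    updated[event.account] = updated.get(event.account, 0) + event.amount
    return updated

def materialize(log: list) -> dict:
    """Derive the use-case-specific view (balances) from the event log."""
    view = {}
    for event in log:
        view = apply_event(view, event)
    return view

log = [Event("alice", 100), Event("bob", 50), Event("alice", -30)]
balances = materialize(log)
print(balances)  # {'alice': 70, 'bob': 50}
```

Because the log is immutable and the projection is a pure function, the view can be rebuilt, reshaped, or recomputed for a new use case at any time—which is what lets the functions themselves stay simple and stateless.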
Ben Stopford is a technologist working in the Office of the CTO at Confluent, a company backing the popular Apache Kafka messaging system. He has two decades of experience in the field and has focused on distributed data infrastructure for the last half of it. He’s the author of Designing Event-Driven Systems from O’Reilly.
©2019, O'Reilly Media, Inc.