Storm makes it easy to write and scale complex realtime computations on a cluster of computers, doing for realtime processing what Hadoop did for batch processing. Storm guarantees that every message will be processed. And it’s fast — you can process millions of messages per second with a small cluster. Best of all, you can write Storm topologies using any programming language. Storm was open-sourced by Twitter in September of 2011 and has since been adopted by numerous companies around the world.
Storm provides a small set of simple, easy-to-understand primitives. These primitives can be used to solve a stunning number of realtime computation problems, from stream processing to continuous computation to distributed RPC. In this talk you’ll learn:
- The concepts of Storm: streams, spouts, bolts, and topologies
- Developing and testing topologies using Storm’s local mode
- Deploying topologies on Storm clusters
- How Storm achieves fault-tolerance and guarantees data processing
- Computing intense functions on the fly in parallel using Distributed RPC
- Making realtime computations idempotent using transactional topologies
- Examples of production usage of Storm
Nathan Marz is the lead engineer on Twitter’s Publisher Analytics team. He was previously the lead engineer at BackType, which Twitter acquired in July of 2011.
Nathan is the author of numerous open-source projects relied upon by companies all around the world. These include Storm, Cascalog, and ElephantDB.
Nathan is also the author of an upcoming book from Manning Publications called “Big Data: principles and best practices of scalable realtime data systems”.
He has spoken about his work at conferences such as the Hadoop Summit, Strange Loop, Gluecon, Clojure/conj, and POSSCON. He writes a blog at nathanmarz.com.