Walmart handles more than 1 million customer transactions every hour. The average Boeing 737 engine generates 10 terabytes of data every 30 minutes in flight. By 2020, researchers estimate there will be 100 million internet-connected devices. Processing this data in real time, whether it comes from mobile phones or jet engines, will be the new normal. How are companies adapting to this real-time stream of data, and how are they using open source projects such as Apache Kafka, Apache Storm, Apache Samza, and Apache Spark to process streams at scale?
At the same time, how are organizations adjusting compute power to match business needs or to accommodate the relentless growth in inbound traffic? True elastic stream processing can be achieved by combining a highly scalable platform like Apache Mesos with frameworks that run on top of it, such as Marathon, Spark, Kafka, and newly emerging solutions. In this talk, we will discuss the use cases and requirements and demonstrate a Mesos-based solution for elastically processing data streams.
Michael Hausenblas is a data center application architect with Mesosphere. He helps DevOps teams build and operate scalable and elastic distributed applications. His background is in large-scale data integration, Hadoop, and NoSQL. Michael also contributes to open source software at Apache (Myriad, Drill).