Walmart handles more than 1 million customer transactions every hour. The average Boeing 737 engine generates 10 terabytes of data every 30 minutes in flight. Researchers estimate that by 2020 there will be 100 million internet-connected devices. Processing this data in real time, whether it comes from mobile phones or jet engines, will be the new normal. How are companies adapting to this real-time stream of data, and how are they using open source projects such as Apache Kafka, Apache Storm, Apache Samza, and Apache Spark to do stream processing at scale?
At the same time, how are organizations adjusting compute power to match business needs or to accommodate the relentless growth in inbound traffic? True elastic stream processing can be achieved by combining a highly scalable platform like Apache Mesos with stream-processing frameworks and services that run on top of it, such as Marathon, Spark, Kafka, and newly emerging solutions. In this talk, we will discuss the use cases and requirements and demonstrate a Mesos-based solution for elastically processing data streams.
Michael Hausenblas is a developer advocate for OpenShift and Kubernetes at Red Hat, where he helps app ops engineers build and operate distributed services. Michael shares his experience with distributed systems and large-scale data processing through demos, blog posts, and public speaking engagements and contributes to open source software such as OpenShift and Kubernetes. Previously, Michael was a developer advocate at Mesosphere, chief data engineer at MapR Technologies, and a research fellow at the National University of Ireland, Galway, where he researched large-scale data integration and the internet of things and gained experience in advocacy and standardization (World Wide Web Consortium, IETF).
©2015, O'Reilly Media, Inc. • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse, or review the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.