Presented By O’Reilly and Cloudera
Make Data Work
21–22 May 2018: Training
22–24 May 2018: Tutorials & Conference
London, UK

Executive Briefing: What you need to know about fast data

Dean Wampler (Anyscale)
14:55–15:35 Wednesday, 23 May 2018
Average rating: 4.00 (2 ratings)

Who is this presentation for?

  • Business executives

What you'll learn

  • Learn the business motivations for fast data applications, the organizational challenges that arise when moving to streaming architectures, and how what your organization already knows about microservices can help meet those challenges

Description

Streaming data systems, so-called fast data, promise accelerated access to information, leading to new innovations and competitive advantages. But they aren’t just faster versions of big data. They force architecture changes to meet new demands for reliability and dynamic scalability, more like microservices. Dean Wampler outlines what you need to know to exploit fast data successfully.

Big data started with an emphasis on batch-oriented architectures, where data is captured in large, scalable stores and then processed using batch jobs. To reduce the gap between data arrival and information extraction, these architectures are now evolving to be stream oriented, where data is processed as it arrives. Fast data may be a new buzzword, but it is also a new opportunity for innovation in how your organization leverages data.

However, fast data architectures introduce new challenges for your organization. Whereas a batch job might run for hours, a stream processing application might run for weeks or months. This raises the bar for making these systems resilient against traffic spikes, hardware and network failures, and so forth. The microservice world has faced these challenges for a while. Your data teams will likely need to evolve to resemble the teams you already have for your microservices-based systems. In fact, you’ll probably merge these teams over time, as your microservices do more data processing and your data systems leverage your microservices.

Topics include:

  • The business justification for transitioning from batch-oriented big data to stream-oriented fast data
  • The organizational changes that streaming architectures require to meet their higher demands for reliability, resiliency, dynamic scalability, etc.
  • How some of these requirements can be met by leveraging what your organization already knows about microservice architectures

Dean Wampler

Anyscale

Dean Wampler is an expert in streaming data systems, focusing on applications of machine learning and artificial intelligence (ML/AI). He’s head of developer relations at Anyscale, which is developing Ray for distributed Python, primarily for ML/AI. Previously, he was an engineering VP at Lightbend, where he led the development of Lightbend CloudFlow, an integrated system for building and running streaming data applications with Akka Streams, Apache Spark, Apache Flink, and Apache Kafka. Dean is the author of Fast Data Architectures for Streaming Applications, Programming Scala, and Functional Programming for Java Developers, and he’s the coauthor of Programming Hive, all from O’Reilly. He’s a contributor to several open source projects. A frequent conference speaker and tutorial teacher, he’s also the co-organizer of several conferences around the world and several user groups in Chicago. He earned his PhD in physics from the University of Washington.