Engineering the Future of Software
November 13–14, 2016: Training
November 14–16, 2016: Tutorials & Conference
San Francisco, CA

An architecture for merging fast data and enterprise applications: The SMACK stack

Dean Wampler (Anyscale)
3:50pm–4:40pm Wednesday, 11/16/2016
Integration architecture
Location: Georgian
Level: Intermediate
Average rating: 4.40 (5 ratings)

Prerequisite knowledge

  • Prior experience building architectures for data-centric or enterprise systems (useful but not required)

What you'll learn

  • Understand the weaknesses of current, popular approaches and how the SMACK stack addresses current needs in both data-centric systems and general enterprise systems


Big data architectures (those using large frameworks like Spark, YARN, HBase or Cassandra, HDFS, and Kafka) have been slow to embrace microservices. Everything else, i.e., enterprise architectures (whether microservice-based or not), has been less concerned with large data volumes and more interested in reactive tools that are flexible, adaptive, scalable, resilient, and event- or message-driven.

These two spheres are now slowly converging, as data teams need answers faster (hence the growing interest in streaming architectures) and enterprises become more data-driven (hence the need for sophisticated, scalable data processing options that are still event-driven).

Another trend affecting both spheres is the need to optimize resource utilization and lower costs, which has led to the growth of virtualized services on flexible, efficient clusters that are capable of running all services. While Hadoop has dominated the data world, it is a first-generation architecture that isn’t well suited for more general enterprise needs.

Dean Wampler explores the SMACK stack—Spark, Mesos, Akka, Cassandra, and Kafka—discussing the role each tool plays in addressing the needs of fast data and enterprise environments, as well as what’s missing and what areas need to mature.

Dean Wampler


Dean Wampler is an expert in streaming data systems, focusing on applications of machine learning and artificial intelligence (ML/AI). He’s head of developer relations at Anyscale, which is developing Ray for distributed Python, primarily for ML/AI. Previously, he was an engineering VP at Lightbend, where he led the development of Lightbend CloudFlow, an integrated system for building and running streaming data applications with Akka Streams, Apache Spark, Apache Flink, and Apache Kafka. Dean is the author of Fast Data Architectures for Streaming Applications, Programming Scala, and Functional Programming for Java Developers, and he’s the coauthor of Programming Hive, all from O’Reilly. He’s a contributor to several open source projects. A frequent conference speaker and tutorial teacher, he’s also the co-organizer of several conferences around the world and several user groups in Chicago. He earned his PhD in physics from the University of Washington.