Presented By O'Reilly and Cloudera
Make Data Work
March 13–14, 2017: Training
March 14–16, 2017: Tutorials & Conference
San Jose, CA

Stream me up, Scotty: Transitioning to the cloud using a streaming data platform

Gwen Shapira (Confluent), Bob Lehmann (Bayer)
2:40pm–3:20pm Wednesday, March 15, 2017
Big data and the Cloud, Enterprise adoption
Location: 230 A
Level: Intermediate
Secondary topics: Architecture, Data Platform
Average rating: 4.50 (2 ratings)

Who is this presentation for?

  • Data architects and engineers

Prerequisite knowledge

  • A basic understanding of Apache Kafka and AWS concepts

What you'll learn

  • A reference architecture for cloud migration with Apache Kafka


Many enterprises carry substantial technical debt in legacy applications hosted in on-premises data centers. There is a strong desire to modernize and move to a cloud-based infrastructure, but the world won’t stop for you to transition. Existing applications need to be supported and enhanced, and data from legacy platforms is still required to make the decisions that drive the business. At the same time, data from cloud-based applications does not exist in a vacuum: legacy applications need access to these cloud data sources and vice versa.

Can an enterprise have it both ways? Can new applications be built in the cloud while existing applications are maintained in a private data center?

Monsanto has adopted a cloud-first mentality, and today most new development is focused on the cloud. However, this transition did not happen overnight. Gwen Shapira and Bob Lehmann share their experience and the patterns they used in building and implementing a Kafka-based cross-data-center “data hub” to facilitate the move to the cloud, which in the process kick-started Monsanto’s transition from batch to stream processing. They give an overview of the challenges involved in transitioning to the cloud, then take a deep dive into the cross-data-center stream platform architecture, covering best practices for running it in production and the benefits seen after deployment.

Gwen Shapira


Gwen Shapira is a system architect at Confluent, where she helps customers achieve success with their Apache Kafka implementations. She has 15 years of experience working with code and customers to build scalable data architectures, integrating relational and big data technologies. Gwen currently specializes in building reliable real-time data processing pipelines using Apache Kafka. Gwen is an Oracle ACE Director, the coauthor of Hadoop Application Architectures, and a frequent presenter at industry conferences. She is also a committer on Apache Kafka and Apache Sqoop. When Gwen isn’t coding or building data pipelines, you can find her pedaling her bike, exploring the roads and trails of California and beyond.

Bob Lehmann


Bob Lehmann is an architect on the Data Platform team at Monsanto, where he leads efforts to both modernize enterprise technology and transition to the cloud. Bob has held a number of positions in IT and engineering working with data ranging from high-volume sensor data to enterprise data (and everything in between). He holds a master’s degree in electrical engineering from Missouri University of Science and Technology.