Engineering the Future of Software
16–18 October 2017: Conference & Tutorials
18–19 October 2017: Training
London, UK

Rethinking microservices with stateful streams

Ben Stopford (Confluent)
15:50–16:40 Monday, 16 October 2017
Microservices, pros and cons
Location: King's Suite - Sandringham Level: Intermediate
Secondary topics: Best Practice
Average rating: 4.38 (13 ratings)

Prerequisite Knowledge

  • Familiarity with microservices and messaging

What you'll learn

  • Understand why you should question your reliance on request-response protocols and rethink service communication with retentive channels
  • Learn how stream processing makes the job of blending data from different services both easier and less tightly coupled

Description

When building service-based systems, we don’t generally think too much about data. If we need data from another service, we ask for it. This pattern works well for whole swathes of use cases, particularly ones where datasets are small and requirements are simple. But real business services have to join and operate on datasets from many different sources. This can be slow and cumbersome in practice.

These problems stem from an underlying dichotomy. Data systems are built to make data as accessible as possible—a mindset that focuses on getting the job done. Services, instead, focus on encapsulation—a mindset that allows independence and autonomy as we evolve and grow. But these two forces inevitably compete in most serious service-based architectures.

Ben Stopford explains why understanding and accepting this dichotomy is an important part of designing service-based systems at any significant scale. He looks at how companies use log-backed architectures to build an immutable narrative that balances the data that sits inside their services with the data that is shared, an approach that allows the likes of Uber, Netflix, and LinkedIn to scale to millions of events per second.

Ben concludes with a set of implementation patterns, starting lightweight and gradually getting more functional, paving the way for an evolutionary approach to building log-backed microservices.
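
By way of illustration (this is not code from the talk): a minimal Kafka Streams sketch of the pattern, in which one service blends an order stream owned by another service with a customer table owned by a third, joining against local state rather than making a request-response call. The topic names and String payloads here are illustrative assumptions.

    // Hypothetical sketch: enriching an "orders" stream with a "customers"
    // changelog, both replicated locally via Kafka. Topic names are assumed.
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.KTable;

    import java.util.Properties;

    public class OrderEnricher {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "order-enricher");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
            props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

            StreamsBuilder builder = new StreamsBuilder();

            // Events published by the Orders service, keyed by customerId.
            KStream<String, String> orders = builder.stream("orders");
            // The Customers service's dataset, materialized locally as a table.
            KTable<String, String> customers = builder.table("customers");

            // The join runs against local state: no call to the Customers
            // service is made at processing time.
            orders.join(customers, (order, customer) -> order + " / " + customer)
                  .to("orders-enriched");

            new KafkaStreams(builder.build(), props).start();
        }
    }

For the join to work, the two topics must be co-partitioned, that is, share the same number of partitions, which is the same invariant that comes up in the comments below.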

Ben Stopford

Confluent

Ben Stopford is an engineer and architect on the Apache Kafka core team at Confluent (the company behind Apache Kafka). A specialist in data, from both a technology and an organizational perspective, Ben previously spent five years leading data integration at a large investment bank, using a central streaming database. His earlier career spanned a variety of projects at Thoughtworks and UK-based enterprise companies. He writes at Benstopford.com.

Comments

Ben Stopford | ENGINEER
16/10/2017 22:46 BST

Hi Eitan

It depends a little on what you are trying to achieve.
The invariant is basically: if you want things to end up on the same machine, they need to land on the same partition number (this can be the same partition in different topics, provided those topics have the same number of partitions).
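
Roughly, a sketch of that invariant on the producer side (the topic names here are hypothetical, and both topics are assumed to have the same partition count):

    // With Kafka's default partitioner, a record's partition is
    // hash(key) % numPartitions, so records sharing a key land on the same
    // partition *number* in any topic with the same partition count.
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    import java.util.Properties;

    public class CoPartitioningSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // Same key => same partition number in "orders" and "payments"
                // (hypothetical topics), assuming equal partition counts.
                producer.send(new ProducerRecord<>("orders", "customer-42", "order-1"));
                producer.send(new ProducerRecord<>("payments", "customer-42", "payment-1"));
            }
        }
    }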

But if you’re talking about creating a view over a set of nodes, then Kafka provides an API that lets you discover where keys are located. I’m going to push some demo code for this, but there is already an example that demonstrates the approach at the link below. It checks where the key is located and forwards the request to that node if necessary.

https://github.com/confluentinc/examples/blob/233c2bfedd68c0f032650c11969e54f2947b2581/kafka-streams/src/main/java/io/confluent/examples/streams/interactivequeries/kafkamusic/MusicPlaysRestService.java#L138
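
The gist of that example, as a trimmed sketch (the store name "plays-store", the types, and the forward() helper are placeholders; see the link above for the full version):

    // Key discovery with Kafka Streams interactive queries: ask which
    // instance hosts the partition for a key, then query locally or forward.
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.state.HostInfo;
    import org.apache.kafka.streams.state.QueryableStoreTypes;
    import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;
    import org.apache.kafka.streams.state.StreamsMetadata;

    public class KeyLookup {
        private final KafkaStreams streams;
        private final HostInfo thisHost;

        KeyLookup(KafkaStreams streams, HostInfo thisHost) {
            this.streams = streams;
            this.thisHost = thisHost;
        }

        Long fetch(String key) {
            // Ask Kafka Streams which instance hosts this key's partition.
            StreamsMetadata metadata =
                streams.metadataForKey("plays-store", key, Serdes.String().serializer());

            if (!metadata.hostInfo().equals(thisHost)) {
                // Key lives on another node: forward the request there,
                // e.g. an HTTP GET to metadata.host() + ":" + metadata.port().
                return forward(metadata.hostInfo(), key); // hypothetical helper
            }
            // Key is local: query the local state store directly.
            ReadOnlyKeyValueStore<String, Long> store =
                streams.store("plays-store", QueryableStoreTypes.keyValueStore());
            return store.get(key);
        }

        private Long forward(HostInfo host, String key) {
            throw new UnsupportedOperationException("HTTP forwarding elided");
        }
    }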

Eitan Yarden | ARCHITECT
16/10/2017 18:48 BST

Thanks for your presentation. You mentioned the option to have services that consume data from Kafka and expose a REST API.
What solutions are available if we use a consumer group to partition our data and, as a result, have to route API requests to the correct node based on the partitioning key / partition assignment?