Presented By O’Reilly and Cloudera

Make Data Work

September 11, 2018: Training & Tutorials
September 12–13, 2018: Keynotes & Sessions
New York, NY

Managing data chaos in the world of microservices

Oleksii Kachaiev (Attendify)

3:30pm–4:10pm Thursday, 09/13/2018

Data engineering and architecture
Location: 1A 10
Level: Intermediate

Who is this presentation for?

Software engineers, software architects, and CTOs

Prerequisite knowledge

Familiarity with microservices and distributed data challenges
A basic understanding of why we need different storage solutions for different needs

What you'll learn

Understand why encapsulating data is good for your services but bad for your data architecture; explore solutions to the problems of data observability, fetching data from different origins, and data versioning; and learn how to approach these problems in your organization.

Description

Microservices have been one of the hottest topics of recent years, and the industry is shifting toward splitting applications into smaller and smaller independent units. This is happening for very good reasons: you can gain a lot in terms of both technology and organizational scalability. Many infrastructure tools have been developed to support the movement, from schedulers, deployment automation, and service discovery systems to development tools such as distributed tracers, log aggregators, and analyzers, and we’ve invented and reinvented protocols to make communication between microservices even more efficient. However, one problem is often overlooked: the data layer is being diluted by the very encapsulation that microservices need in order to grow and evolve.

As we move toward more independently encapsulated services, managing data becomes dramatically harder. The challenges include:

Observability, knowledge sharing, and data discovery (Who owns that piece of the data? Where can I find that thing?)
Querying the data (What API should I expose for others? How can I get this info from that dataset? Should I cache this or re-query when necessary?)
Structural and semantic changes in the datasets (Can I add a new field here? Who’s using this record, and how can I update it without breaking other services?)
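
The questions in the last item can be made concrete with a small sketch. It is an illustration only, not a method taken from the talk: it assumes a schema can be represented as a mapping from field name to type name, and that each consuming service declares the fields it reads. The names `breaking_changes`, `affected_consumers`, and the example schemas are hypothetical.

```python
from typing import Dict, List

# Assumption: a schema is a plain mapping of field name -> type name.
Schema = Dict[str, str]


def breaking_changes(old: Schema, new: Schema) -> List[str]:
    """List changes that could break a reader written against `old`."""
    problems = []
    for field, type_name in old.items():
        if field not in new:
            problems.append(f"removed field: {field}")
        elif new[field] != type_name:
            problems.append(f"type changed: {field} ({type_name} -> {new[field]})")
    return problems


def affected_consumers(
    consumers: Dict[str, List[str]], old: Schema, new: Schema
) -> List[str]:
    """Names of consumers that read at least one removed or retyped field."""
    changed = {f for f in old if f not in new or new[f] != old[f]}
    return sorted(name for name, fields in consumers.items() if changed & set(fields))


# Hypothetical versions of a "user" record.
user_v1 = {"id": "int", "email": "str"}
user_v2 = {"id": "int", "email": "str", "locale": "str"}  # only adds a field
user_v3 = {"id": "str", "locale": "str"}  # retypes id and drops email

# Hypothetical declaration of which service reads which fields.
consumers = {"billing": ["id", "email"], "search": ["locale"]}

print(breaking_changes(user_v1, user_v2))
print(breaking_changes(user_v2, user_v3))
print(affected_consumers(consumers, user_v2, user_v3))
```

Under these assumptions, adding a field is reported as safe, while removing or retyping one is flagged together with the services that read it. A check like this only works if consumers actually declare what they read, which is the observability problem from the first item.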

These problems are common, but most of our effort and attention goes to infrastructure, for which generic solutions are easier to find. Making sense of the data, on the other hand, is hardly a generalizable problem. There have been many attempts to tame the chaos associated with independent dataset management. Oleksii Kachaiev discusses high-level approaches to building a shareable abstraction layer that separates “physical” details from logical concerns, as well as specific technologies you can leverage.
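
One possible reading of such an abstraction layer, sketched here under stated assumptions rather than taken from the talk: callers ask a shared catalog for a logical dataset by name, and the catalog records who owns it and how to fetch it. The `Catalog` class, the dataset names, and the in-memory stand-ins for a database table and a remote service are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, object]


@dataclass(frozen=True)
class Dataset:
    owner: str  # team or service responsible for the data
    fetch: Callable[[List[int]], List[Record]]  # hides the physical source


class Catalog:
    """Maps logical dataset names to an owner and a fetch function."""

    def __init__(self) -> None:
        self._datasets: Dict[str, Dataset] = {}

    def register(
        self, name: str, owner: str, fetch: Callable[[List[int]], List[Record]]
    ) -> None:
        self._datasets[name] = Dataset(owner, fetch)

    def owner_of(self, name: str) -> str:
        return self._datasets[name].owner

    def fetch(self, name: str, ids: List[int]) -> List[Record]:
        return self._datasets[name].fetch(ids)


# In-memory stand-ins for two different physical origins.
USERS_TABLE = {1: "ada@example.com", 2: "alan@example.com"}
EVENTS_SERVICE = {10: "Strata NY"}


def fetch_users(ids: List[int]) -> List[Record]:
    # Stand-in for a SQL query against the accounts database.
    return [{"id": i, "email": USERS_TABLE[i]} for i in ids if i in USERS_TABLE]


def fetch_events(ids: List[int]) -> List[Record]:
    # Stand-in for an HTTP call to an events service.
    return [{"id": i, "title": EVENTS_SERVICE[i]} for i in ids if i in EVENTS_SERVICE]


catalog = Catalog()
catalog.register("users", owner="accounts-team", fetch=fetch_users)
catalog.register("events", owner="events-team", fetch=fetch_events)

print(catalog.owner_of("users"))
print(catalog.fetch("events", [10, 11]))
```

In this sketch, a caller never learns whether "users" lives in a table or behind an API, so the owning team can change the physical storage without touching consumers, and "who owns that piece of the data?" has a single place to be answered.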

The growing complexity of your data layer may overshadow the benefits of the microservices architecture you’ve deployed, so the sooner you start working on a solution, the easier it will be to manage the chaos.

Oleksii Kachaiev

Attendify

Oleksii (Alexey) Kachaiev is the CTO at Attendify, where he spends his days coding in Clojure, Haskell, and Rust. His interests include algebra and protocols. Alexey is the author of the Muse and Fn.py libraries and is an active contributor to Aleph and other open source projects.

©2018, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • confreg@oreilly.com