Engineering the Future of Software
Feb 3–4, 2019: Training
Feb 4–6, 2019: Tutorials & Conference
New York, NY

ETL and event sourcing

Marc Siegel (Panorama Education)
10:45am–12:15pm Wednesday, February 6, 2019
Integration architecture
Location: Trianon Ballroom
Secondary topics: Best Practice, Case Study
Average rating: 4.12 (8 ratings)

Who is this presentation for?

  • Software architects, information architects, engineers, and engineering managers

Level

Intermediate

Prerequisite knowledge

  • Experience with systems that extract data from external systems, transform it, and load it to be queried

What you'll learn

  • Understand how event sourcing can work in practice when applied to traditional ETL problem domains
  • Learn how to evolve a pipeline while retaining determinism as a property

Description

Traditional ETL pipelines, consisting of extract, transform, and load stages, are a staple integration architecture pattern used in a wide variety of business domains. They present entire classes of familiar frustrations and impedance mismatches that many engineers have encountered firsthand.

More recently, the concept of a data lake has grown in popularity, bringing certain ideas from domain-driven design (DDD), such as bounded contexts, to bear on these problem domains. Can you go even further into the world of DDD? What costs and benefits would you observe if you did?

Marc Siegel shares a real-life case study and lessons learned from going further into domain-driven design and applying the event sourcing pattern to the traditional problem domain of an ETL pipeline. If your work entails bringing lots of data into your system and building state off of it, you may find this talk interesting.

Topics include:

  • ETL and its challenges
  • Event sourcing basics
  • Case study: Event sourcing in your ETL
  • Lessons learned: Thinnest extractions possible
  • Lessons learned: Extracted files as a source of truth
  • Lessons learned: Better iteration on transformations
  • Lessons learned: Why transform and load (TL) must be kept fast and run often
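
As a rough illustration of the topics above, here is a minimal Python sketch of how extracted files might be treated as an append-only event log, with queryable state rebuilt by replaying a deterministic transform. It is not taken from the talk: the names ExtractedFile, transform, and load, the "sis" source label, and the sample rows are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class ExtractedFile:
    """One extraction, stored untouched and treated as an immutable event.

    Hypothetical structure for illustration only.
    """
    sequence: int   # position in the append-only log
    source: str     # name of the external system (assumed label)
    rows: tuple     # raw rows exactly as extracted, with no cleanup applied


def transform(event: ExtractedFile) -> List[dict]:
    """Pure function: the same event always yields the same records."""
    return [
        {
            "id": row["id"],
            "name": row["name"].strip().title(),
            "source": event.source,
        }
        for row in event.rows
    ]


def load(log: Iterable[ExtractedFile]) -> Dict[str, dict]:
    """Rebuild the queryable state from scratch by replaying the log in order."""
    state: Dict[str, dict] = {}
    for event in sorted(log, key=lambda e: e.sequence):
        for record in transform(event):
            state[record["id"]] = record  # later events overwrite earlier ones
    return state


# The extraction step stays thin: it only appends raw files to the log.
log = [
    ExtractedFile(1, "sis", ({"id": "s1", "name": " ada lovelace "},)),
    ExtractedFile(
        2,
        "sis",
        ({"id": "s1", "name": "ada king"}, {"id": "s2", "name": "alan turing"}),
    ),
]

state = load(log)
print(state["s1"]["name"])
```

Because the raw extractions are never modified, a change to transform can be tried by replaying the whole log again, which is one reason the transform and load steps would need to stay fast in a design like this.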

Marc Siegel

Panorama Education

Marc Siegel is an engineering manager at Panorama Education, an education technology firm based in Boston. He has experience in web applications and event-driven systems, sometimes simultaneously. He’s passionate about building systems that tell the truth when asked questions. He got his start developing heterogeneous mobile network nodes at MIT Lincoln Laboratory and led development efforts at a number of startups, for everything from bidding systems for internet advertising to mobile handheld inspection software for tower cranes, although he is afraid of heights. Marc holds a BS in computer science from Brown University. You can find the slides from his last talk at the O’Reilly Software Architecture conference here.

Comments on this page are now closed.

Comments

Marc Siegel | ENGINEERING MANAGER
02/06/2019 12:36pm EST

Thanks to everyone who came to the session, and for all the great questions!