Sep 23–26, 2019

Orchestrating Data Workflows Using a Fully Serverless Architecture

Tomer Levi (Fundbox)
2:05pm–2:45pm Wednesday, September 25, 2019
Location: 1E 07/08
Secondary topics: Cloud Platforms and SaaS, Data, Analytics, and AI Architecture

Who is this presentation for?

Senior Data Engineer at Fundbox

Level

Intermediate

Description

Fundbox is a growing fintech company that provides an automated underwriting platform based on data and AI.
While scheduling a limited number of data workflows is generally manageable, scaling to hundreds of workflows with dependencies and diverse job types requires substantial custom engineering, adds complexity, and consumes expensive resources. Serverless architectures offer an alternative to traditional resource management.

Tomer Levi explains how the data engineering team at Fundbox uses AWS Step Functions, Docker containers, and Spark to build a live serverless data orchestration platform, focusing on the team's decision to build a solution that is user-friendly yet powerful and scalable. Tomer then describes AWS Step Functions state machines, their limitations, and how the team overcame them by building custom job scheduling and dependency features. Finally, he illustrates how the team overcame resource bottlenecks using Docker containers and AWS Fargate.
Fundbox’s architecture is scalable and already serves dozens of engineers, BI developers, and data scientists in the company.
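
The abstract does not publish Fundbox's implementation, so the following is only a minimal sketch of the general idea: turning a job dependency graph into an AWS Step Functions state machine definition (Amazon States Language) whose tasks run as containers on Fargate. The function name `build_state_machine`, the job names, and the ARNs are hypothetical placeholders. The sketch runs jobs one after another in dependency order and omits settings a real Fargate task would need, such as `NetworkConfiguration`.

```python
import json
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def build_state_machine(dependencies, cluster_arn, task_definitions):
    """Build an Amazon States Language definition that runs jobs in dependency order.

    dependencies maps each job name to the set of jobs it depends on.
    task_definitions maps each job name to an ECS task definition ARN.
    (Both structures are illustrative assumptions, not Fundbox's format.)
    """
    # Predecessors always come before the jobs that depend on them.
    order = list(TopologicalSorter(dependencies).static_order())
    states = {}
    for index, job in enumerate(order):
        state = {
            "Type": "Task",
            # Step Functions service integration that waits for the ECS task to finish.
            "Resource": "arn:aws:states:::ecs:runTask.sync",
            "Parameters": {
                "LaunchType": "FARGATE",
                "Cluster": cluster_arn,
                "TaskDefinition": task_definitions[job],
                # NetworkConfiguration omitted for brevity.
            },
        }
        if index + 1 < len(order):
            state["Next"] = order[index + 1]
        else:
            state["End"] = True
        states[job] = state
    return {"StartAt": order[0], "States": states}


# Hypothetical three-step workflow: extract, then transform, then load.
dependencies = {"extract": set(), "transform": {"extract"}, "load": {"transform"}}
task_definitions = {
    job: f"arn:aws:ecs:us-east-1:123456789012:task-definition/{job}"
    for job in dependencies
}
definition = build_state_machine(
    dependencies,
    "arn:aws:ecs:us-east-1:123456789012:cluster/example-cluster",
    task_definitions,
)
print(json.dumps(definition, indent=2))
```

A strictly sequential chain is the simplest correct ordering; independent jobs could instead be grouped into `Parallel` states, at the cost of a more involved generator.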

Prerequisite knowledge

A basic understanding of serverless solutions. Familiarity with the challenges introduced by enterprise architectures.

What you'll learn

Learn how Fundbox used AWS Step Functions, Docker containers, and ECS Fargate to build a serverless data workflow platform. Understand key considerations, from a data engineering perspective, for deploying data workflow jobs.

Tomer Levi

Fundbox

Tomer is a senior data engineer on the DataOps team at Fundbox, where he helps shape the data platform architecture to drive business goals.
Previously, he was a data engineer in Intel’s advanced analytics group, where he helped build the data platform supporting the data storage and analysis needs of the Intel® Pharma Analytics Platform, an edge-to-cloud artificial intelligence solution for remote monitoring of patients during clinical trials.
He is incredibly passionate about the power of data. Tomer holds a BSc in software engineering.
