Sep 23–26, 2019

The why and how of data lineage

Neelesh Salian (Stitch Fix)
4:35pm–5:15pm Thursday, September 26, 2019
Location: 1A 15/16

Who is this presentation for?

Software Engineers, Data Engineers, Data Architects, Product Managers

Level

Intermediate

Description

Every data team has to build an ecosystem that sustains the data, its users, and the use of the data itself. This ecosystem brings its own challenges, both while it is being built and while it is maintained and enhanced. One of these challenges is the need for a mechanism that lets data be reliably monitored, associated with a purpose, and traced and retrieved. That need is the foundation of data lineage.

It is important to understand the role of data lineage in an organization and how it can affect the data ecosystem. This talk focuses on the why and the how of having a data lineage mechanism in your organization. The why covers the exact need for lineage, examined through the use cases it would power. The how covers the requirements and design needed to build such a mechanism.

After covering the philosophy behind building data lineage, the talk compares using readily available tools with building something of your own.

Prerequisite knowledge

Understanding of data infrastructure

What you'll learn

This session's goal is to raise awareness about data lineage as a mechanism that needs to exist in an organization to enhance the use of its own data.

Neelesh Salian

Stitch Fix

Neelesh Srinivas Salian is a Software Engineer on the Data Platform team at Stitch Fix, where he works on the compute infrastructure used by the company’s data scientists. Previously, he was at Cloudera, where he worked with Apache projects such as YARN, Spark, and Kafka.

