Presented By O’Reilly and Cloudera
Make Data Work
21–22 May 2018: Training
22–24 May 2018: Tutorials & Conference
London, UK

Big data, big quality: Data quality at Spotify

Irene Gonzálvez (Spotify)
11:15–11:55 Thursday, 24 May 2018
Data engineering and architecture
Location: S11B Level: Intermediate
Secondary topics: Data Integration and Data Pipelines sessions, Data Platforms, Media, Advertising, Entertainment
Average rating: 3.88 (8 ratings)

Who is this presentation for?

  • Data engineers, data scientists, and managers

Prerequisite knowledge

  • A basic understanding of big data concepts and technologies (e.g., pipelines, Hadoop, and Google Cloud Platform)

What you'll learn

  • Explore Spotify's process for ensuring data quality

Description

The data quality domain is enormous, so you need to understand your company's pain points to know what to focus on first. Data quality is not an add-on but an equal component in the data strategy of every single company. Irene Gonzálvez shares Spotify’s process for ensuring data quality, covering why and how the company became aware of its importance, the products it has developed, and its future strategy.

Topics include:

  • What data quality is
  • The most important data quality dimensions for Spotify
  • How and why Spotify became aware of the importance of quality in the data domain
  • The products Spotify has built to tackle each of the data quality dimensions
  • TC4D (Test Certified for Data): An educational program to enforce best practices when creating new pipelines
  • The evolution of the current products and Spotify’s future strategy

Irene Gonzálvez

Spotify

Irene Gonzálvez is a product manager at Spotify. Passionate about innovation and the transformation of business values and customers’ needs into new technical solutions, Irene combines highly technical expertise with accurate planning and leadership capabilities.

Comments

Manohar Patil | SENIOR BIG DATA ARCHITECT
31/05/2018 14:42 BST

It was a great session with a pragmatic approach to the most common problems found in most organizations. I would appreciate it if you could share the slides.

Fran De Backer | DATA PRACTICE
30/05/2018 10:15 BST

Hi Irene,

I saw your presentation. Thank you for that.
Can you share your slides?

Thx