Presented By
O’Reilly + Cloudera
Make Data Work
March 25-28, 2019
San Francisco, CA

Building Rakuten analytics: A story of evolutions

11:50am-12:30pm Thursday, March 28, 2019
Average rating: 4.75 (4 ratings)

Who is this presentation for?

  • Data architects, data engineers, and engineering managers

Level

Beginner

Prerequisite knowledge

  • A basic understanding of data pipelines and batch and real-time processing for analytics
  • Familiarity with productionalizing software

What you'll learn

  • Learn the challenges in transporting data in real time
  • Understand the trade-offs in building an analytics service on-premises
  • Discover how a diverse and agile team culture can enable a small team to have a big impact on the organization
  • See the impact real-time data can have on businesses that used to rely on traditional batch reporting

Description

For many years, Rakuten relied on a third-party analytics service. Due to the rising cost that accompanies increasing web and app traffic and the complexity of user data governance, Rakuten decided to build its own. Rakuten Analytics is an on-premises analytics service created in-house to help stakeholders accelerate their decision-making process by having deeper insights into their businesses.

Juan Paulo Gutierrez discusses the evolutions the service went through, covering requirements (from data collection to data engineering and data visualization and reporting); the team (required and learned technical skills); the infrastructure (operations, automation); processing (from daily batch to hourly batch to real time); and architecture (from monolith to SOA/microservices and containerization). Join in to learn how a small team approached challenges and designed the architecture and why diversity and agility were critical to delivering a product that was designed to evolve over time.

Juan Paulo Gutierrez

Rakuten

Juan Paulo Gutierrez is a senior software engineer at Rakuten, where he leads data architecture, data engineering, and data visualization teams. Paulo contributes to open source projects through code, documentation, feature requests, and discussions. Previously, he was the product development lead for Media Links’s network management software.