Genji: A framework for building resilient near-real-time data pipelines

Swaminathan Sundaramurthy (Salesforce Inc), Mark Cho (Pinterest)

2:25pm–3:05pm Wednesday, October 4, 2017

Distributed Data & Databases, Real time, events, streams & scale
Location: Grand Ballroom West

Who is this presentation for?

Data leaders, engineers, and engineering managers

Prerequisite knowledge

A basic understanding of data warehousing (useful but not required)

What you'll learn

Explore Pinterest's near-real-time data warehouses

Description

Pinterest operates on data at petabyte scale. Previously, the company’s fact tables were generated daily using Hadoop, so the data was frequently 24–48 hours old. To support real-time decision making, stats, and analytics, Pinterest modeled its warehouse on a quasi-Kappa architecture, treating batch processing as a special case of stream processing and warehousing data with sub-15-minute lag.
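
As a rough illustration of "batch as a special case of stream processing," the sketch below runs one windowed aggregation over any iterable of events, so replaying a stored log (the batch case) uses the same code path as a live feed. This is not Genji's implementation or API; the event shape, the `window_counts` function, and the 15-minute window are assumptions made purely for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical window size, chosen to echo the sub-15-minute lag in the abstract.
WINDOW_SECONDS = 15 * 60

# Hypothetical event shape: (epoch_seconds, ad_id).
Event = Tuple[int, str]


def window_counts(events: Iterable[Event]) -> Iterator[Tuple[int, Dict[str, int]]]:
    """Yield (window_start, counts) each time a window closes.

    Assumes events arrive in timestamp order. Works on any iterable,
    whether bounded (a replayed log) or unbounded (a live stream).
    """
    current_window = None
    counts: Dict[str, int] = defaultdict(int)
    for ts, ad_id in events:
        window = ts - ts % WINDOW_SECONDS
        if current_window is not None and window != current_window:
            # The previous window is complete: emit it and start a new one.
            yield current_window, dict(counts)
            counts = defaultdict(int)
        current_window = window
        counts[ad_id] += 1
    if current_window is not None:
        # Only reached when the input is bounded: flush the last window.
        yield current_window, dict(counts)


# "Batch" run: replay a bounded, stored log through the same function.
stored_log = [(0, "ad_a"), (10, "ad_b"), (899, "ad_a"), (900, "ad_a"), (1805, "ad_b")]
batch_result = list(window_counts(stored_log))

# "Stream" run: the same events arriving one at a time from a generator.
stream_result = list(window_counts(event for event in stored_log))
```

Because both runs share one function, a backfill and the live pipeline cannot drift apart in logic, which is the main appeal of a Kappa-style design over maintaining separate batch and streaming code.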

Swaminathan Sundaramurthy and Mark Cho offer an overview of Pinterest’s real-time data pipeline, discussing the company’s decision to warehouse data in near real time so that downstream systems can operate on much fresher data, the platform’s architecture, and its impact on Pinterest’s systems, tools, and processes. They conclude by demonstrating how Pinterest models real-time ads analytics use cases on the platform and sharing lessons learned along the way.

Swaminathan Sundaramurthy

Salesforce Inc

Swaminathan Sundaramurthy is a director of engineering at Salesforce Einstein, where he manages the Machine Learning Services and Orchestration teams. Prior to Salesforce, Swami worked at Pinterest, where he initiated and managed the company’s stream platform and machine learning training platform and led anti-spam and fraud efforts. He began his career as an individual contributor, spending more than 12 years building distributed systems and cloud platforms at Amazon, Yahoo, Microsoft, and Ask Jeeves. Swami is passionate about technology, distributed systems, promoting diversity, and eliminating bias in the workplace.

Mark Cho

Mark Cho is a software engineer at Pinterest.

