Continuous integration (CI) pipelines generate massive amounts of messy log data. Pure Storage engineering runs over 70,000 tests per day, a volume that would otherwise require at least 20 triage engineers. Instead, Spark’s unified computing platform lets the company write a single application for both streaming and batch jobs, so a team of only three triage engineers can understand the state of the company’s CI pipeline. Spark indexes log data for real-time reporting (streaming), uses machine learning for performance modeling and prediction (batch), and reindexes old data against newly encoded patterns (batch). Ivan Jibaja discusses the use case for big data analytics technologies, the architecture of the solution, and lessons learned.
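To illustrate the "single application for both streaming and batch jobs" idea, here is a minimal sketch in plain Python (not Spark, and not Pure Storage's actual code): one shared parsing function backs both a streaming-style indexer and a batch reindexer. The log format, function names, and test names are invented for illustration only.

```python
import re
from typing import Iterable, Iterator, Optional

# Hypothetical log-line format for illustration: "<timestamp> <test_id> <status>"
LINE_RE = re.compile(r"^(?P<ts>\S+)\s+(?P<test>\S+)\s+(?P<status>PASS|FAIL)$")

def parse_line(line: str) -> Optional[dict]:
    """Shared parsing logic: the same function serves both paths."""
    m = LINE_RE.match(line.strip())
    return m.groupdict() if m else None

def index_stream(lines: Iterable[str]) -> Iterator[dict]:
    """'Streaming' path: index records one at a time as they arrive,
    skipping lines that do not match the known pattern."""
    for line in lines:
        rec = parse_line(line)
        if rec is not None:
            yield rec

def reindex_batch(lines: list[str]) -> list[dict]:
    """'Batch' path: reprocess historical logs with the same parser,
    e.g. after a new pattern has been encoded in LINE_RE."""
    return list(index_stream(lines))

logs = [
    "2018-03-07T10:00:00 test_fs_write PASS",
    "garbage line with no status",
    "2018-03-07T10:00:05 test_fs_read FAIL",
]
failures = [r["test"] for r in reindex_batch(logs) if r["status"] == "FAIL"]
print(failures)  # prints ['test_fs_read']
```

In Spark itself the same effect comes from sharing one codebase between streaming and batch jobs; this sketch only shows the shape of that reuse, not the distributed execution.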
This session is sponsored by Pure Storage.
Ivan Jibaja is a tech lead for the big data analytics team at Pure Storage. Previously, he was part of the core development team that built the FlashBlade from the ground up. Ivan holds a PhD in computer science with a focus on systems and compilers from the University of Texas at Austin.
©2018, O'Reilly Media, Inc.