Build resilient systems at scale
May 27–29, 2015 • Santa Clara, CA

Building a system for machine and event-oriented data

Eric Sammer (Rocana)
11:50am–12:30pm Thursday, 05/28/2015
Location: Ballroom CD
Average rating: 4.00 (16 ratings)

Prerequisite Knowledge

This session assumes a basic understanding of monitoring and metric collection, as well as a rudimentary knowledge of typical big data systems.

Description

In this session, we’ll follow the flow of data through an end-to-end system built to handle tens of terabytes an hour of event-oriented data, providing real-time streaming, in-memory, SQL, and batch access to this data. We’ll go into detail on how open source systems such as Hadoop, Kafka, Solr, and Impala/Hive can be stitched together to form the base platform; describe how and where to perform data transformation and aggregation; provide a simple and pragmatic way of managing event metadata; and talk about how applications built on top of this platform get access to data and extend its functionality.

Attendees will leave this session knowing not just which open source projects go into a system such as this, but also how they work together, what tradeoffs and decisions need to be addressed, and how to present a single general-purpose data platform to multiple applications. This session is intended for data infrastructure engineers and ops engineers who are planning, building, or maintaining similar systems, and for those looking to centralize and correlate user activity, quality of service, operational, and other forms of data.

Eric Sammer

Rocana

Eric Sammer is the CTO and co-founder of ScalingData. Prior to ScalingData, he was an engineering manager at Cloudera. His background is in the development and operations of distributed, highly concurrent, data ingest and processing systems. He’s been involved in the open source community and has contributed to a large number of projects over the last decade. Eric is the author of Hadoop Operations (O’Reilly).

Learn more about Hadoop Operations: http://oreil.ly/1I0ddf6