Build resilient systems at scale
May 27–29, 2015 • Santa Clara, CA

Scaling ingest pipelines with high performance computing principles

Rajiv Kurian (SignalFx)
5:05pm–5:45pm Friday, 05/29/2015
Location: Mission City M1-2

Prerequisite Knowledge

A very basic understanding of how modern hardware works (e.g., the cache hierarchy) and familiarity with basic data structures like hash maps and arrays. All prerequisites will be covered in the talk, though, so it mostly requires an interest in making software go fast.

Description

At SignalFx, we deal with high-volume, high-resolution data from our users. This requires a high-performance ingest pipeline. Over time we’ve found that we needed to adapt architectural principles from specialized fields such as HPC to get beyond the performance plateaus we encountered with more generic approaches. Some key examples include:

  • Write very simple single-threaded code, instead of complex algorithms
  • Parallelize by running multiple copies of simple single-threaded code, instead of using concurrent algorithms
  • Separate the data plane from the control plane, instead of slowing down the data path to handle control operations
  • Write compact, array-based data structures with minimal indirection, instead of pointer-based data structures and uncontrolled allocation
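
As a rough illustration of the second and fourth principles, here is a minimal sketch in Python. It is not SignalFx's implementation: the names `FlatCounterMap` and `ShardedIngest`, the integer metric keys, and the fixed capacity are all assumptions made for the example. The map keeps keys and values in two flat arrays with linear probing (no per-entry objects or pointers), and parallelism comes from routing each key to one of several independent maps, each of which would be owned by exactly one thread or process, so no locks are needed.

```python
from array import array

EMPTY = -1  # sentinel for an unused slot (assumes real keys are non-negative)


class FlatCounterMap:
    """Open-addressing map from int key to int sum, stored in two flat arrays."""

    def __init__(self, capacity_pow2=1024):
        if capacity_pow2 & (capacity_pow2 - 1) != 0:
            raise ValueError("capacity must be a power of two")
        self.mask = capacity_pow2 - 1
        self.keys = array("q", [EMPTY]) * capacity_pow2   # contiguous 64-bit ints
        self.values = array("q", [0]) * capacity_pow2
        self.size = 0

    def _slot(self, key):
        # Multiplicative hash, then linear probing over neighbouring slots.
        i = ((key * 0x9E3779B97F4A7C15) >> 32) & self.mask
        while self.keys[i] != EMPTY and self.keys[i] != key:
            i = (i + 1) & self.mask
        return i

    def add(self, key, delta):
        i = self._slot(key)
        if self.keys[i] == EMPTY:
            # Always keep one empty slot so that probing terminates.
            if self.size == self.mask:
                raise OverflowError("map is full; a real version would resize")
            self.keys[i] = key
            self.size += 1
        self.values[i] += delta

    def get(self, key):
        i = self._slot(key)
        return self.values[i] if self.keys[i] == key else 0


class ShardedIngest:
    """Routes each key to one of N independent single-threaded maps."""

    def __init__(self, num_shards=4):
        # Hypothetical setup: in a real pipeline each shard would live on its
        # own thread or process; here they are simply called in sequence.
        self.shards = [FlatCounterMap() for _ in range(num_shards)]

    def shard_for(self, key):
        return key % len(self.shards)

    def ingest(self, key, value):
        # A given key always lands on the same shard, so shards never share state.
        self.shards[self.shard_for(key)].add(key, value)
```

Python will not show the cache effects the talk is concerned with; the sketch only conveys the shape of the design. In a language with control over memory layout, the same two-array structure avoids pointer chasing and per-entry allocation.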

This presentation will show examples of putting these principles into practice, along with the before/after performance results from our own services. We believe these lessons will be useful to anyone building services that have to consume large amounts of data, which is probably everyone. :)

Rajiv Kurian

SignalFx

Rajiv Kurian is a software engineer with over five years of experience building high-performance distributed systems, including databases, networking protocols, and image processing. At SignalFx, Rajiv works on improving the performance of the ingest pipeline.