Build & maintain complex distributed systems
October 1–2, 2017: Training
October 2–4, 2017: Tutorials & Conference
New York, NY

Scalable, fluent time series data analysis

Leif Walsh (Two Sigma)
2:25pm–3:05pm Tuesday, October 3, 2017
Distributed Data & Databases
Location: Regent
Average rating: 2.00 (1 rating)

What you'll learn

  • Explore Flint, Two Sigma's open source time series extension to Spark

Description

Time series analysis has become a central requirement for data science across many disciplines, including IoT, finance and econometrics, advertising, public policy, and systems operations. Yet the ability to understand and semantically manipulate time-ordered events is notably missing from many databases and analytics tools.

Two Sigma has extended the pandas and PySpark analytics stack to provide integrated support for transformations and analytics that understand time as a first-class construct. Leif Walsh offers an overview of Flint, Two Sigma’s open source time series extension to Spark, explains how it fits in with the Spark programming model, and lays out the roadmap for the future of pandas, PySpark, and Flint.
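
As a rough illustration of what treating time as a first-class construct can mean in practice, the sketch below performs a time-aware ("as-of") join in plain pandas. It is not Flint's API and makes no claim about how Flint implements such joins; the trade and quote data, the column names, and the two-second tolerance are all invented for the example.

```python
import pandas as pd

# Hypothetical trades and quotes, each already sorted by timestamp.
trades = pd.DataFrame({
    "time": pd.to_datetime([
        "2017-10-03 14:25:01",
        "2017-10-03 14:25:04",
        "2017-10-03 14:25:09",
    ]),
    "price": [100.0, 101.0, 99.5],
})
quotes = pd.DataFrame({
    "time": pd.to_datetime([
        "2017-10-03 14:25:00",
        "2017-10-03 14:25:03",
        "2017-10-03 14:25:05",
    ]),
    "bid": [99.8, 100.7, 100.9],
})

# For each trade, pick the most recent quote at or before the trade time,
# but only if that quote is no more than 2 seconds old.
joined = pd.merge_asof(
    trades,
    quotes,
    on="time",
    direction="backward",
    tolerance=pd.Timedelta("2s"),
)
print(joined)
```

An ordinary equality join on `time` would match none of these rows, since no trade and quote share an exact timestamp. The as-of join instead uses the ordering of events, and the tolerance leaves the last trade without a bid because its nearest earlier quote is four seconds old.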

Leif Walsh

Two Sigma

Leif Walsh is an engineering manager at Two Sigma, where he works on the company’s next-generation data analysis platform for distributed time series research and simulation. Leif’s background is in high-performance storage. Previously, he built fractal trees at Tokutek. He loves the Oxford comma, cooking, and playing with cats.