Presented By O’Reilly and Cloudera
Make Data Work
September 11, 2018: Training & Tutorials
September 12–13, 2018: Keynotes & Sessions
New York, NY

Performant time series data management and analytics with Postgres

Michael Freedman (TimescaleDB)
2:55pm–3:35pm Wednesday, 09/12/2018
Data engineering and architecture, Expo Hall
Location: Expo Hall
Level: Intermediate

Who is this presentation for?

  • Engineers, DBAs, product managers, and data analysts

Prerequisite knowledge

  • A basic understanding of databases and data storage

What you'll learn

  • Explore TimescaleDB, the open source time series database engineered as a Postgres plug-in, and its new time series data management features

Description

Time series databases are one of the fastest-growing segments of the database market. Common requirements include ingesting high volumes of structured data, answering complex queries quickly for both recent and historical time intervals, and performing specialized time-centric analysis and data management.

Today, many developers working with time series data turn to polyglot solutions: a NoSQL database to store their time series data (for scale) and a relational database for associated metadata and key business data. Yet this leads to engineering complexity, operational challenges, and even referential integrity concerns.

Michael Freedman explains how to avoid these operational problems by reengineering Postgres to serve as a general data platform, including for high-volume time series workloads. TimescaleDB, an open source time series database implemented as a Postgres plug-in, delivers insert rates 20x higher than vanilla Postgres and much faster queries, while still offering full SQL (including JOINs). TimescaleDB achieves this by storing data on an individual server in a manner more common to distributed systems: heavily partitioning (sharding) data into chunks to ensure that hot chunks corresponding to recent time records are maintained in memory.
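
The chunking idea described above can be illustrated with a small Python sketch. This is not TimescaleDB's implementation; the class name ChunkedTable, the fixed one-hour interval, and the in-memory dictionary are assumptions made purely for illustration. It shows only the core idea: each row is routed to a chunk by its timestamp, so recent inserts land in a small number of "hot" chunks, and a time-range query can skip chunks that do not overlap the range.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# Hypothetical fixed chunk interval, chosen only for this illustration.
CHUNK_INTERVAL = timedelta(hours=1)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def chunk_start(ts, interval=CHUNK_INTERVAL):
    """Return the start time of the chunk that contains ts."""
    n = (ts - EPOCH) // interval  # whole intervals elapsed since the epoch
    return EPOCH + n * interval


class ChunkedTable:
    """Toy in-memory table partitioned into fixed-width time chunks."""

    def __init__(self, interval=CHUNK_INTERVAL):
        self.interval = interval
        self.chunks = defaultdict(list)  # chunk start -> list of (ts, value)

    def insert(self, ts, value):
        # Route the row to its chunk; recent rows share a few hot chunks.
        self.chunks[chunk_start(ts, self.interval)].append((ts, value))

    def query(self, start, end):
        """Return rows with start <= ts < end, skipping non-overlapping chunks."""
        rows = []
        for cstart, chunk in self.chunks.items():
            if cstart < end and cstart + self.interval > start:
                rows.extend(r for r in chunk if start <= r[0] < end)
        return sorted(rows)
```

In a real database the chunks would be separate on-disk tables with their own indexes; the dictionary here stands in for that mapping only.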

Drawing on real-world use cases, Michael focuses on two newly released features of TimescaleDB and covers how they ease time series data management. The first is automated adaptation of time-partitioning intervals, which the database learns by observing data volumes. The second is continuous aggregation in near real time, which is robust to late-arriving data and transparently supports queries across different aggregation levels.
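
To make the continuous aggregation idea concrete, here is a minimal Python sketch under stated assumptions. It is not how TimescaleDB implements the feature; the ContinuousAggregate class and its methods are hypothetical. The sketch keeps a running (sum, count) pair per one-minute bucket, so a late-arriving row simply updates the bucket it belongs to, and coarser aggregation levels (for example hourly) are answered by rolling up the fine-grained buckets.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def bucket(ts, width):
    """Return the start of the width-sized time bucket that contains ts."""
    return EPOCH + ((ts - EPOCH) // width) * width


class ContinuousAggregate:
    """Toy aggregate that keeps per-bucket (sum, count) current as rows arrive."""

    def __init__(self, width=timedelta(minutes=1)):
        self.width = width
        self.state = defaultdict(lambda: [0.0, 0])  # bucket start -> [sum, count]

    def ingest(self, ts, value):
        # Arrival order does not matter: a late row updates its own bucket.
        s = self.state[bucket(ts, self.width)]
        s[0] += value
        s[1] += 1

    def average(self, start, end, width):
        """Average per bucket of size width over [start, end).

        width is assumed to be a multiple of the base bucket width, so each
        base bucket falls entirely inside one coarser bucket.
        """
        rolled = defaultdict(lambda: [0.0, 0])
        for b, (total, count) in self.state.items():
            if start <= b < end:
                r = rolled[bucket(b, width)]
                r[0] += total
                r[1] += count
        return {b: total / count for b, (total, count) in sorted(rolled.items())}
```

Storing sums and counts rather than finished averages is the design choice that makes both late updates and roll-ups to coarser levels cheap in this sketch.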

Michael Freedman

TimescaleDB

Michael J. Freedman is the cofounder and CTO of TimescaleDB, an open source database that scales SQL for time series data, and a professor of computer science at Princeton University, where his research focuses on distributed systems, networking, and security. Previously, Michael developed CoralCDN (a decentralized CDN serving millions of daily users) and Ethane (the basis for OpenFlow and software-defined networking) and cofounded Illuminics Systems (acquired by Quova, now part of Neustar). He is a technical advisor to Blockstack. Michael’s honors include the Presidential Early Career Award for Scientists and Engineers (PECASE, given by President Obama), the SIGCOMM Test of Time Award, the Caspar Bowden Award for Privacy Enhancing Technologies, a Sloan Fellowship, the NSF CAREER Award, the Office of Naval Research Young Investigator Award, a DARPA Computer Science Study Group membership, and multiple award-winning publications. He holds a PhD in computer science from NYU’s Courant Institute and bachelor’s and master’s degrees from MIT.