Sep 23–26, 2019

Performant time-series data management and analytics with Postgres

Michael Freedman (TimescaleDB)
11:20am–12:00pm Thursday, September 26, 2019
Location: 1A 15/16
Secondary topics:  Data Management and Storage, Streaming and IoT

Who is this presentation for?

Software developers, DBAs, product managers, and data analysts

Level

Intermediate

Prerequisite knowledge

A basic understanding of databases and data storage

What you'll learn

1. When dealing with time-series data, using a combination of NoSQL databases and relational databases often leads to unnecessary complexity.
2. This complexity can be avoided by re-engineering Postgres to serve as a general data platform.
3. TimescaleDB, which is implemented as a Postgres extension, improves insert rates by 20x over vanilla Postgres and achieves much faster queries while offering full SQL.

Description

Time-series databases are one of the fastest-growing segments of the database market, spreading across industries and use cases. Common requirements include ingesting high volumes of structured data; answering complex queries quickly for both recent and historical time intervals; and performing specialized time-centric analysis and data management.

Today, many developers working with time-series data turn to polyglot solutions: a NoSQL database to store their time-series data (for scale) and a relational database for associated metadata and key business data. Yet this leads to engineering complexity, operational challenges, and even referential integrity concerns.

In this talk, I will explain how one can avoid these operational problems by re-engineering Postgres to serve as a general data platform, including for high-volume time-series workloads. In particular, TimescaleDB is an open-source time-series database, implemented as a Postgres extension, that improves insert rates by 20x over vanilla Postgres and delivers much faster queries, all while offering full SQL (including JOINs). TimescaleDB achieves this by storing data on an individual server in a manner more common to distributed systems: heavily partitioning (sharding) data into chunks to ensure that hot chunks corresponding to recent time records are maintained in memory.
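To make the chunking idea concrete, here is a minimal Python sketch of time-based partitioning. It is a toy model, not TimescaleDB's implementation: the names `ChunkedTable`, `chunk_start`, and the one-day `CHUNK_INTERVAL` are hypothetical and chosen only for illustration. It shows how rows can be routed to a chunk by timestamp, so that recent inserts touch only a small, recent chunk and a time-range query skips chunks that cannot overlap it.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# Hypothetical fixed chunk width; an assumption made for this sketch.
CHUNK_INTERVAL = timedelta(days=1)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def chunk_start(ts, interval=CHUNK_INTERVAL):
    """Return the start of the time chunk that contains ts."""
    return EPOCH + ((ts - EPOCH) // interval) * interval


class ChunkedTable:
    """Toy table that routes each row to a per-interval chunk."""

    def __init__(self, interval=CHUNK_INTERVAL):
        self.interval = interval
        self.chunks = defaultdict(list)  # chunk start -> list of (ts, value)

    def insert(self, ts, value):
        # Rows with recent timestamps all land in the same few chunks.
        self.chunks[chunk_start(ts, self.interval)].append((ts, value))

    def query(self, start, end):
        """Return rows with start <= ts < end, scanning only overlapping chunks."""
        rows = []
        for c_start, c_rows in self.chunks.items():
            # Skip chunks whose time range cannot intersect [start, end).
            if c_start < end and c_start + self.interval > start:
                rows.extend(r for r in c_rows if start <= r[0] < end)
        return sorted(rows)
```

In this sketch, a query over the last day inspects at most two chunks regardless of how much historical data the table holds, which is the property that lets recent chunks stay small enough to remain in memory.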

I will focus on two newly released features of TimescaleDB and discuss how they ease time-series data management: (1) automated adaptation of time-partitioning intervals, which the database learns by observing data volumes; and (2) continuous aggregations computed in near real time, in a manner that is robust to late-arriving data and transparently supports queries across different aggregation levels. I will also describe how these capabilities have been leveraged across several different use cases.
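The continuous-aggregation idea can be illustrated with a small Python sketch. This is an assumption-laden toy, not TimescaleDB's actual mechanism: the class `ContinuousAverage`, its `dirty` set, and the one-hour `BUCKET_SECONDS` are hypothetical names introduced here. It shows one way to keep per-bucket aggregates materialized, mark a bucket as needing recomputation when a late row arrives, and answer a coarser-grained query from the finer-grained materialized state.

```python
from collections import defaultdict

# Hypothetical bucket width in seconds; an assumption made for this sketch.
BUCKET_SECONDS = 3600


class ContinuousAverage:
    """Toy per-bucket average that tolerates late-arriving rows."""

    def __init__(self, bucket_seconds=BUCKET_SECONDS):
        self.bucket_seconds = bucket_seconds
        self.raw = defaultdict(list)  # bucket start -> raw values
        self.materialized = {}        # bucket start -> (sum, count)
        self.dirty = set()            # buckets changed since the last refresh

    def insert(self, ts, value):
        # ts is an integer Unix timestamp; late rows simply dirty an old bucket.
        bucket = ts - ts % self.bucket_seconds
        self.raw[bucket].append(value)
        self.dirty.add(bucket)

    def refresh(self):
        """Recompute only the buckets touched since the previous refresh."""
        for bucket in self.dirty:
            values = self.raw[bucket]
            self.materialized[bucket] = (sum(values), len(values))
        self.dirty.clear()

    def average(self, bucket):
        total, count = self.materialized[bucket]
        return total / count

    def rollup(self, start, end):
        """Average over [start, end), combined from materialized buckets."""
        parts = [v for b, v in self.materialized.items() if start <= b < end]
        total = sum(s for s, _ in parts)
        count = sum(c for _, c in parts)
        return total / count
```

Storing a (sum, count) pair per bucket, rather than the average itself, is the design choice that lets `rollup` combine hourly buckets into a coarser average without rereading the raw rows.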


Michael Freedman

TimescaleDB

Michael J. Freedman is the cofounder and CTO of TimescaleDB, an open source database that scales SQL for time-series data, and a professor of computer science at Princeton University, where his research focuses on distributed systems, networking, and security. Previously, Michael developed CoralCDN (a decentralized CDN serving millions of daily users) and Ethane (the basis for OpenFlow and software-defined networking) and cofounded Illuminics Systems (acquired by Quova, now part of Neustar). He is a technical advisor to Blockstack. Michael’s honors include the Presidential Early Career Award for Scientists and Engineers (PECASE, given by President Obama), the SIGCOMM Test of Time Award, the Caspar Bowden Award for Privacy Enhancing Technologies, a Sloan Fellowship, the NSF CAREER Award, the Office of Naval Research Young Investigator Award, a DARPA Computer Science Study Group membership, and multiple award-winning publications. He holds a PhD in computer science from NYU’s Courant Institute and bachelor’s and master’s degrees from MIT.

