Sep 23–26, 2019

Performant time series data management and analytics with PostgreSQL

Michael Freedman (TimescaleDB | Princeton University)
11:20am–12:00pm Thursday, September 26, 2019
Location: 1A 23/24
Average rating: 4.00 (3 ratings)

Who is this presentation for?

  • Software developers, database administrators (DBAs), product managers, and data analysts

Level

Intermediate

Description

Time series databases are one of the fastest-growing segments of the database market, spreading across industries and use cases. Common requirements include ingesting high volumes of structured data; answering complex queries quickly over both recent and historical time intervals; and performing specialized time-centric analysis and data management. Today, many developers working with time series data turn to polyglot solutions: a NoSQL database to store their time series data (for scale) and a relational database for associated metadata and key business data. Yet this leads to engineering complexity, operational challenges, and even referential integrity concerns.

Michael Freedman explains how you can avoid these operational problems by re-engineering PostgreSQL to serve as a general data platform, including for high-volume time series workloads. In particular, TimescaleDB is an open source time series database, implemented as a PostgreSQL extension, that improves insert rates by 20x over vanilla PostgreSQL and delivers much faster queries, all while offering full SQL (including JOINs). TimescaleDB achieves this by storing data on an individual server in a manner more common to distributed systems: heavily partitioning (sharding) data into chunks to ensure that hot chunks corresponding to recent time records are maintained in memory.
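The idea of time-based chunking can be illustrated with a minimal Python sketch. This is a toy model and not TimescaleDB's implementation: the `ChunkedTable` class, the `chunk_start` helper, and the one-day interval are all hypothetical names and values chosen for illustration. It shows only how rows can be routed to a chunk by timestamp, so that recent inserts land in a small number of "hot" chunks and a time-range query can skip chunks that do not overlap the range.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# Assumed chunk width, picked only for this illustration.
CHUNK_INTERVAL = timedelta(days=1)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def chunk_start(ts, interval=CHUNK_INTERVAL):
    """Return the start time of the chunk that contains ts."""
    return EPOCH + ((ts - EPOCH) // interval) * interval


class ChunkedTable:
    """Toy table that routes each row to a per-interval chunk."""

    def __init__(self, interval=CHUNK_INTERVAL):
        self.interval = interval
        self.chunks = defaultdict(list)  # chunk start -> list of (ts, value)

    def insert(self, ts, value):
        # Recent timestamps map to the same few chunks, which stay small.
        self.chunks[chunk_start(ts, self.interval)].append((ts, value))

    def query(self, start, end):
        """Return rows with start <= ts < end, skipping non-overlapping chunks."""
        rows = []
        for c_start, chunk in self.chunks.items():
            if c_start < end and c_start + self.interval > start:
                rows.extend(r for r in chunk if start <= r[0] < end)
        return sorted(rows)
```

In this sketch, a query over a single day touches only the chunk for that day, no matter how much historical data the table holds.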

You’ll discover two newly released features of TimescaleDB that ease time series data management: automated adaptation of time-partitioning intervals, which the database learns by observing data volumes; and continuous aggregations computed in near real time, in a manner robust to late-arriving data and transparently supporting queries across different aggregation levels. You’ll also see how these capabilities have been leveraged across several different use cases.
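To make the notion of a continuous aggregation that tolerates late-arriving data more concrete, here is a minimal Python sketch. It is an assumption-laden toy, not how TimescaleDB implements the feature: the `ContinuousAggregate` class and its `ingest`, `average`, and `rollup` methods are hypothetical. The sketch keeps a running sum and count per time bucket, so a late row simply updates the bucket it belongs to, and coarser aggregation levels can be derived from the finer ones.

```python
from collections import defaultdict


class ContinuousAggregate:
    """Toy incrementally maintained per-bucket average (hypothetical)."""

    def __init__(self, bucket_seconds):
        self.bucket_seconds = bucket_seconds
        self.sums = defaultdict(float)   # bucket start -> sum of values
        self.counts = defaultdict(int)   # bucket start -> number of rows

    def ingest(self, epoch_seconds, value):
        # Rows may arrive in any order; each one updates only its own bucket.
        bucket = epoch_seconds - epoch_seconds % self.bucket_seconds
        self.sums[bucket] += value
        self.counts[bucket] += 1

    def average(self, bucket):
        if bucket not in self.counts:
            raise KeyError(bucket)
        return self.sums[bucket] / self.counts[bucket]

    def rollup(self, factor):
        """Build a coarser aggregate from the stored sums and counts."""
        coarse = ContinuousAggregate(self.bucket_seconds * factor)
        for bucket, total in self.sums.items():
            target = bucket - bucket % coarse.bucket_seconds
            coarse.sums[target] += total
            coarse.counts[target] += self.counts[bucket]
        return coarse
```

Storing sums and counts rather than finished averages is the design choice that matters here: both can be updated incrementally and merged across buckets, whereas an average of averages would weight buckets incorrectly.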

Prerequisite knowledge

  • A basic understanding of databases and data storage

What you'll learn

  • Understand that when dealing with time series data, using a combination of NoSQL databases and relational databases often leads to unnecessary complexity; how this complexity can be avoided by re-engineering PostgreSQL to serve as a general data platform; and how TimescaleDB, which is implemented as a PostgreSQL extension, improves insert rates by 20x over vanilla PostgreSQL and achieves much faster queries while offering full SQL

Michael Freedman

TimescaleDB | Princeton University

Michael J. Freedman is the cofounder and CTO of TimescaleDB and a full professor of computer science at Princeton University. His work broadly focuses on distributed and storage systems, networking, and security, and his publications have more than 12,000 citations. He developed CoralCDN (a decentralized content distribution network serving millions of daily users) and helped design Ethane (which formed the basis for OpenFlow and software-defined networking). Previously, he cofounded Illuminics Systems (acquired by Quova, now part of Neustar) and served as a technical advisor to Blockstack. Michael’s honors include a Presidential Early Career Award for Scientists and Engineers (given by President Obama), the SIGCOMM Test of Time Award, a Sloan Fellowship, an NSF CAREER award, the Office of Naval Research Young Investigator award, and support from the DARPA Computer Science Study Group. He earned his PhD at NYU and Stanford and his undergraduate and master’s degrees at MIT.

