Time series data is emerging everywhere, from IoT sensors and industrial machines to transportation and logistics to DevOps and monitoring to finance. Many users start by storing their time series data in a relational database, but once their data reaches a certain scale, they give up its query power and ecosystem by migrating to some NoSQL or “modern” time series architecture.
Michael Freedman offers an overview of TimescaleDB, a new open source, scale-out database designed for time series workloads and engineered as an extension to PostgreSQL. TimescaleDB is available under the Apache 2 license. It supports full SQL while offering performance improvements for both single-node and cluster deployments.
Michael explains why the characteristics and needs of time series workloads (compared to general OLTP and even OLAP workloads) present a new point in the design space of databases and how TimescaleDB was architected to embrace these differences. TimescaleDB automatically partitions data across both time and space, yet it exposes the illusion of a single, continuous table (a hypertable) over all your data, whether that data lives on one server or many. Michael details the design of TimescaleDB’s dynamic chunking mechanisms, which reason about both time intervals and table sizes to provide scalable performance without manual tuning or configuration. He also covers its distributed query optimizations, which hide the fact that users are interacting with many chunks of data spread across one or many servers and minimize the number of chunks accessed to answer a query. Along the way, Michael shares benchmarks demonstrating that TimescaleDB provides constant insert performance as the database scales and avoids the “performance cliff” that vanilla PostgreSQL experiences when writing to tables of tens to hundreds of millions of rows, while also offering superior query performance to both Postgres and other time series databases across a variety of complex queries.
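The idea of partitioning by time and space behind a single table can be illustrated with a toy model. The Python sketch below is not TimescaleDB's implementation: the fixed one-day chunk interval, the four hash partitions, and the names `chunk_key` and `ToyHypertable` are all assumptions made for illustration. It only shows how rows might be routed to chunks and how a time-range query could skip chunks that cannot contain matching rows.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone
import zlib

# Illustrative assumptions, not TimescaleDB defaults.
CHUNK_INTERVAL = timedelta(days=1)   # width of each time slice
SPACE_PARTITIONS = 4                 # number of hash partitions on the key
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def chunk_key(ts, device_id):
    """Map a row to a (time slot, space slot) pair identifying its chunk."""
    time_slot = (ts - EPOCH) // CHUNK_INTERVAL
    space_slot = zlib.crc32(device_id.encode()) % SPACE_PARTITIONS
    return time_slot, space_slot


class ToyHypertable:
    """Hypothetical single-table facade over many in-memory chunks."""

    def __init__(self):
        self.chunks = defaultdict(list)

    def insert(self, ts, device_id, value):
        # Each row lands in exactly one chunk, chosen by time and key.
        self.chunks[chunk_key(ts, device_id)].append((ts, device_id, value))

    def query(self, start, end):
        """Return rows with start <= ts < end and the number of chunks scanned."""
        first = (start - EPOCH) // CHUNK_INTERVAL
        last = (end - EPOCH) // CHUNK_INTERVAL
        rows, scanned = [], 0
        for (time_slot, _space_slot), chunk in self.chunks.items():
            # Skip chunks whose time slice cannot overlap the query range.
            if first <= time_slot <= last:
                scanned += 1
                rows.extend(r for r in chunk if start <= r[0] < end)
        return sorted(rows), scanned
```

Callers of `insert` and `query` never see the chunks, which mirrors the single-table illusion described above. A real system would also size chunks dynamically and place them on different servers, which this sketch does not attempt.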
Michael J. Freedman is the cofounder and CTO of TimescaleDB, an open source database that scales SQL for time series data, and a professor of computer science at Princeton University, where his research focuses on distributed systems, networking, and security. Previously, Michael developed CoralCDN (a decentralized CDN serving millions of daily users) and Ethane (the basis for OpenFlow and software-defined networking) and cofounded Illuminics Systems (acquired by Quova, now part of Neustar). He is a technical advisor to Blockstack. Michael’s honors include the Presidential Early Career Award for Scientists and Engineers (PECASE, given by President Obama), the SIGCOMM Test of Time Award, the Caspar Bowden Award for Privacy Enhancing Technologies, a Sloan Fellowship, the NSF CAREER Award, the Office of Naval Research Young Investigator Award, a DARPA Computer Science Study Group membership, and multiple award publications. He holds a PhD in computer science from NYU’s Courant Institute and bachelor’s and master’s degrees from MIT.
©2017, O'Reilly Media, Inc.