Sensors, IoT devices, servers, and other data sources are generating increasingly huge volumes of time series data. This data is critical for many important use cases, from operational monitoring, troubleshooting, and DevOps to condition-based and predictive maintenance, real-time control, asset tracking, and more, which require similar innovations in data infrastructure.
Most time series architectures today use column stores to support fast roll-ups on individual metrics (e.g., compute the average CPU load over the last 10-minute interval). However, these new applications often need to query data in more-complex and arbitrary ways (e.g., compute how many sensors had temperatures greater than X, grouped by location, over the last 10-minute interval). Michael Freedman outlines a new distributed time series database designed for such workloads (i.e., one that supports highly efficient queries, including complex predicates across many metrics).
Michael describes how the characteristics of these time series workloads allow specific design decisions that enable both elastic scaling and efficient SQL-like queries, even though achieving both properties has remained elusive for general OLTP workloads. Michael explains how you can leverage these workload characteristics to perform smart time/space partitioning and placement of data, even though it exposes the abstraction of a simple continuous table across all devices and time intervals. Time permitting, Michael will also cover various query optimizations such as architecture enables.
Michael presents these architecture insights in the context of TimescaleDB, a new distributed time-series database. He demonstrates TimescaleDB’s use and provides performance numbers and intuition about the scenarios in which its design shines (and those in which it does not).
Michael J. Freedman is the co-founder and CTO of TimescaleDB, an open-source database that scales SQL for time-series data, and a Professor of Computer Science at Princeton University. His research focuses on distributed systems, networking, and security.
Previously, Freedman developed CoralCDN (a decentralized CDN serving millions of daily users) and Ethane (the basis for OpenFlow / software-defined networking). He co-founded Illuminics Systems (acquired by Quova, now part of Neustar) and is a technical advisor to Blockstack.
Honors include: Presidential Early Career Award for Scientists and Engineers (PECASE, given by President Obama), SIGCOMM Test of Time Award, Caspar Bowden Award for Privacy Enhancing Technologies, Sloan Fellowship, NSF CAREER Award, Office of Naval Research Young Investigator Award, DARPA Computer Science Study Group membership, and multiple award publications. Prior to joining Princeton in 2007, he received his Ph.D. in computer science from NYU’s Courant Institute, and his bachelors and masters degrees from MIT.
Comments on this page are now closed.
©2017, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • email@example.com
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse, or review the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.