Sep 23–26, 2019

Handling data gaps in time series using imputation

Alfred Whitehead (Klick), Clare Jeon (Klick)
1:15pm–1:55pm Thursday, September 26, 2019
Location: 1A 08/10
Average rating: 4.33 (3 ratings)

Who is this presentation for?

  • Early career data scientists handling time series signals from sensors or other real-world sources

Level

Beginner

Description

Time series forecasting is everywhere. It tells you what tomorrow’s temperature will be, your company’s stock price on Friday, and your blood glucose levels before bed. Often these forecasts depend on sensors or measurements made out in the real, messy world. Those sensors flake out, get turned off, disconnect, and otherwise conspire to cause missing data in your signals.

Alfred Whitehead and Clare Jeon explore a number of methods for handling data gaps and advise you on which to consider and when. You’ll see how to perform tests to determine which method suits your problem the best. And all of this is illustrated with real data from a continuous blood glucose monitor.
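One common way to run such a test is to hide values you actually observed, impute them, and measure how far each method lands from the truth. The sketch below illustrates that idea under stated assumptions and is not the speakers' own code: the function name `score_imputers`, the synthetic sine-wave signal, the 10% gap fraction, and the choice of RMSE as the error metric are all hypothetical choices made for this example.

```python
import numpy as np
import pandas as pd


def score_imputers(series, imputers, gap_fraction=0.1, seed=0):
    """Hide a random subset of observed values, impute them, and return RMSE per method."""
    rng = np.random.default_rng(seed)
    # Positions of observed values; drop the first and last so every
    # method has an anchor point on both sides of any artificial gap.
    candidates = np.flatnonzero(series.notna().to_numpy())[1:-1]
    n_hide = max(1, int(len(candidates) * gap_fraction))
    hidden = rng.choice(candidates, size=n_hide, replace=False)

    masked = series.copy()
    masked.iloc[hidden] = np.nan

    scores = {}
    for name, impute in imputers.items():
        filled = impute(masked)
        error = filled.iloc[hidden] - series.iloc[hidden]
        scores[name] = float(np.sqrt((error ** 2).mean()))
    return pd.Series(scores).sort_values()


# Synthetic stand-in for a sensor signal (assumption: smooth and gap-free).
index = pd.date_range("2019-09-26", periods=200, freq="5min")
signal = pd.Series(np.sin(np.linspace(0, 4 * np.pi, 200)), index=index)

imputers = {
    "mean": lambda s: s.fillna(s.mean()),
    "locf": lambda s: s.ffill(),
    "linear": lambda s: s.interpolate(method="linear"),
}

scores = score_imputers(signal, imputers)
print(scores)  # lowest RMSE first
```

The ranking depends on the signal and on the gap pattern. Random single-point gaps favor interpolation, while long outages on real sensor data can rank the methods quite differently, so the masking scheme should resemble the gaps you expect in practice.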

The methods they handle include random assignment, average-based imputation, last observed carried forward, linear interpolation, spline interpolation, moving average, Kalman smoothing with structural model, Kalman smoothing with auto-ARIMA model, Stineman interpolation, k-nearest neighbors, and seasonality with Prophet.
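Several of the simpler methods in this list map onto one-line pandas calls. The following is a minimal sketch, not the speakers' implementation: the readings are made-up values, and the names `glucose`, `random_fill`, and `imputed` are hypothetical. Spline interpolation (which in pandas requires SciPy), Kalman smoothing, Stineman interpolation, k-nearest neighbors, and Prophet are left out because they need additional libraries.

```python
import numpy as np
import pandas as pd

# Hypothetical readings at 5-minute intervals; NaN marks a sensor gap.
index = pd.date_range("2019-09-26 00:00", periods=8, freq="5min")
glucose = pd.Series([5.0, 5.2, np.nan, np.nan, 6.0, 6.4, np.nan, 7.0], index=index)

# Random assignment: draw each missing value from the observed values.
rng = np.random.default_rng(0)
missing = glucose.isna()
random_fill = glucose.copy()
random_fill[missing] = rng.choice(glucose.dropna().to_numpy(), size=int(missing.sum()))

# Centered moving average over up to 5 points, ignoring NaN inside the window.
moving_average = glucose.rolling(window=5, min_periods=1, center=True).mean()

imputed = pd.DataFrame({
    "observed": glucose,
    "random": random_fill,
    "mean": glucose.fillna(glucose.mean()),           # average-based imputation
    "locf": glucose.ffill(),                          # last observed carried forward
    "linear": glucose.interpolate(method="linear"),   # straight line across each gap
    "moving_avg": glucose.fillna(moving_average),
})
print(imputed.round(2))
```

Note that `interpolate(method="linear")` treats rows as equally spaced. For irregularly sampled signals, `method="time"` uses the datetime index instead.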

Prerequisite knowledge

  • A basic understanding of statistics and of how to load and manipulate data in R or in Python with pandas

What you'll learn

  • Understand the variety of methods available to impute missing data and gain a sense of how to apply them effectively

Alfred Whitehead

Klick

Alfred Whitehead is the senior vice president of data science at Klick, where he’s responsible for the delivery of data science solutions and oversees a team of data scientists and AI researchers. He brings over 15 years of experience in data science, software development, and high-performance computing to the Klick team, combining his scientific background with an appreciation of the craft of code writing. Previously, he was an information security officer, technology vice president, and acting chief technology officer. He holds two master’s degrees in physical sciences, including thesis work in computational astrophysics, and is also a certified information systems security professional (CISSP).

Clare Jeon

Klick

Clare Jeon is a data scientist at Klick, where she focuses on identifying digital biomarkers for diagnosis, risk assessment of diseases, and prevention of health problems. She also explores the applications of machine learning to optimize clinic performance. Previously, she was involved in working on the systems biology of cancer and the development of the computational pipeline to identify key genomic and clinical signatures for cancer treatment. She holds a PhD in bioinformatics and systems biology.

Comments

Alfred Whitehead | Senior Vice President, Data Science
09/26/2019 10:14am EDT

Thanks for attending, everyone! Our slides and code can be found here: https://github.com/KlickInc/datasci-strata-talk-missing-data
