Sep 23–26, 2019

Handling Data Gaps in Time Series Using Imputation

Alfred Whitehead (Klick), Clare Jeon (KLICK INC)
1:15pm–1:55pm Thursday, September 26, 2019
Location: 1A 08/10
Secondary topics: Temporal data and time-series analytics

Who is this presentation for?

Early career data scientists handling time-series signals coming from sensors or other real-world sources.

Level

Beginner

Description

Time series forecasting is everywhere. What will tomorrow’s temperature be? How about my company’s stock price on Friday? My blood glucose levels tonight before bed? Often these forecasts depend on sensors or measurements made out in the real, messy world. Those sensors flake out, get turned off, disconnect, and otherwise conspire to cause missing data in our signals.

In this talk, we’ll show a number of methods for handling data gaps and give advice on which to consider and when. We’ll also show how to perform tests to determine which method suits your problem the best. All of this will be illustrated with real data from a continuous blood glucose monitor.
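One common way to run such a test is to hide values that were actually observed, impute them, and score each method against the hidden truth. The sketch below assumes this approach, which may differ from what the speakers present. The synthetic signal, the 20% masking rate, and the three candidate methods are all illustrative assumptions, not the glucose data from the talk.

```python
import numpy as np
import pandas as pd

# Hypothetical complete signal: a slow cycle plus measurement noise.
rng = np.random.default_rng(0)
t = np.arange(200)
truth = pd.Series(6.0 + np.sin(2 * np.pi * t / 50) + rng.normal(0, 0.1, t.size))

# Hide 40 interior points (20%) so every method has a neighbour on each side.
masked = truth.copy()
hidden = rng.choice(np.arange(1, t.size - 1), size=40, replace=False)
masked.iloc[hidden] = np.nan

# Candidate imputation methods, each mapping a gappy Series to a filled one.
candidates = {
    "mean": lambda s: s.fillna(s.mean()),
    "locf": lambda s: s.ffill(),
    "linear": lambda s: s.interpolate(method="linear"),
}

def rmse(filled):
    """Root-mean-square error on the hidden points only."""
    err = filled.iloc[hidden] - truth.iloc[hidden]
    return float(np.sqrt((err ** 2).mean()))

scores = {name: rmse(fill(masked)) for name, fill in candidates.items()}
best = min(scores, key=scores.get)
print(scores, "lowest error:", best)
```

Random single-point masking is only one choice. If real gaps come as long outages, hiding contiguous blocks of similar length is likely to give a more honest comparison.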

Methods covered will include:

  • Random assignment
  • Average-based imputation
  • Last observed carried forward
  • Linear interpolation
  • Spline interpolation
  • Moving average
  • Kalman smoothing with structural model
  • Kalman smoothing with auto-ARIMA model
  • Stineman interpolation
  • k-Nearest Neighbours
  • Seasonality with Prophet
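A few of the simpler methods in the list above can be sketched with pandas alone. This is a minimal illustration rather than the speakers' code: the readings, timestamps, and moving-average window size are made-up assumptions. The remaining methods typically need additional libraries and are not shown.

```python
import numpy as np
import pandas as pd

# Hypothetical readings at 5-minute intervals; NaN marks a gap.
index = pd.date_range("2019-09-26 13:15", periods=8, freq="5min")
signal = pd.Series([5.0, 5.5, np.nan, np.nan, 7.0, 6.5, np.nan, 6.0], index=index)

imputed = pd.DataFrame({
    # Average-based imputation: every gap gets the overall mean.
    "mean": signal.fillna(signal.mean()),
    # Last observed carried forward.
    "locf": signal.ffill(),
    # Linear interpolation between the neighbouring observations.
    "linear": signal.interpolate(method="linear"),
    # Moving average: mean of the observed values in a centred 5-point window.
    "moving_average": signal.fillna(
        signal.rolling(window=5, center=True, min_periods=1).mean()
    ),
})
print(imputed)
```

Putting the filled series side by side makes the differences visible: mean imputation flattens every gap to one value, carrying the last observation forward produces steps, and linear interpolation draws a straight line across each gap.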

Prerequisite knowledge

Basic statistics, and the ability to load and manipulate data in R or in Python with pandas.

What you'll learn

An understanding of the wide variety of methods available to impute missing data, and a sense of how to apply them effectively.

Alfred Whitehead

Klick

Alf is responsible for the delivery of data science solutions at Klick Health, where he oversees a team of data scientists and AI researchers. He brings over 15 years of experience in data science, software development, and high-performance computing to the Klick team, combining his scientific background with an appreciation of the craft of code-writing. He has previously served as an information security officer, technology VP, and acting Chief Technology Officer. He holds two master's degrees in the physical sciences, including thesis work in computational astrophysics, and is also a Certified Information Systems Security Professional (CISSP).


Clare Jeon

KLICK INC

Clare is a data scientist at Klick Health, where she focuses on identifying digital biomarkers for diagnosis, disease risk assessment, and prevention of health problems. She is also exploring applications of machine learning to optimize clinic performance. She previously worked on the systems biology of cancer and on the development of a computational pipeline to identify key genomic and clinical signatures for cancer treatment. She holds a Ph.D. in bioinformatics and systems biology.
