Presented By
O’Reilly + Cloudera
Make Data Work
March 25-28, 2019
San Francisco, CA

Machine learning prediction of blood alcohol content: A digital signature of behavior

Kirstin Aschbacher (UCSF Cardiology)
2:40pm-3:20pm Thursday, March 28, 2019
Secondary topics: Health and Medicine

Who is this presentation for?

  • Data scientists and analysts, researchers, and product managers



Prerequisite knowledge

  • A basic understanding of statistics and machine learning concepts
  • Familiarity with SQL, Python, and AWS (useful but not required)

What you'll learn

  • Learn how to use principles from the neuropsychology of reward and habit formation to create predictive features from device-based data
  • See an example of how integrating data from publicly available sources that reflect the social determinants of health and health behaviors can improve prediction models
  • Build skills in the application of the XGBoost gradient-boosted classification tree algorithm to a dataset with many user-entries but few initial features


Individuals can track their blood alcohol content (BAC) using commercially available devices, providing an opportunity for behavioral health interventions such as personalized messaging to target specific users at high-risk times or locations. To do so, a machine learning (ML) model would need to predict user BAC levels with high precision based on minimal information, such as timestamps, geolocation, and device/app engagement.

Kirstin Aschbacher shares a machine learning approach to identify a digital signature of self-monitored BAC levels that predicts the times, locations, and circumstances under which a user is likely to exceed the legal BAC driving limit of 0.08%. Kirstin and her teammates analyzed over 1 million data points from 33,452 distinct users of the BACtrack device (with established accuracy comparable to police-grade devices) collected between 2013 and 2017. They performed extensive feature generation on BAC levels, app engagement, timestamps, and geolocation, used census data to quantify zip codes by rural/urban percentage, and integrated state-level motor vehicle death rates from the Centers for Disease Control and Prevention (CDC). Feature selection was performed using a gradient-boosted classification tree model (XGBoost; learning rate=0.1, max depth=5, boosting rounds=30). They optimized for precision because commercial devices that use ML-driven strategies to reach out to users risk harming product trust or engagement when recommendations rest on low-precision predictions (that is, predicting that a user has a high BAC when they do not).
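
As a rough illustration of the kind of feature generation described above, the following Python sketch derives temporal, prior-behavior, and distance features from a list of readings. It is not the authors' pipeline: the `Reading` record, its field names, and the `build_features` helper are hypothetical, and only the 0.08% threshold comes from the abstract. Each row uses only a user's earlier readings, so the prior-behavior features do not leak the value being predicted.

```python
from dataclasses import dataclass
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

LEGAL_LIMIT = 0.08  # BAC percent, the driving limit cited in the abstract


@dataclass
class Reading:
    """One hypothetical BAC measurement; field names are assumptions."""
    user_id: str
    taken_at: datetime
    lat: float
    lon: float
    bac: float  # BAC in percent, e.g. 0.08


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    earth_radius_km = 6371.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * earth_radius_km * asin(sqrt(a))


def build_features(readings):
    """Return one feature dict per reading, in (user, time) order."""
    rows = []
    history = {}  # user_id -> that user's earlier readings
    for r in sorted(readings, key=lambda x: (x.user_id, x.taken_at)):
        prior = history.setdefault(r.user_id, [])
        prev = prior[-1] if prior else None
        rows.append({
            # Temporal features
            "hour_of_day": r.taken_at.hour,
            "day_of_week": r.taken_at.weekday(),  # Monday is 0
            # Prior-behavior features (None when there is no history yet)
            "prior_count": len(prior),
            "prior_mean_bac": (sum(p.bac for p in prior) / len(prior)
                               if prior else None),
            # Geolocation feature: distance from the previous measurement
            "km_since_prev": (haversine_km(prev.lat, prev.lon, r.lat, r.lon)
                              if prev else None),
            # Classification target
            "over_limit": r.bac >= LEGAL_LIMIT,
        })
        prior.append(r)
    return rows
```

Rows like these could then be passed to a gradient-boosted tree classifier; missing values are left as `None` here on the assumption that the downstream model or a preprocessing step handles them.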

In a separate test set, they predicted whether BAC≥0.08% for a given user at a given time and location, with an average precision (positive predictive value) of 79%. The most predictive features in rough order of importance were users’ prior behavior (average BAC, subjective estimation of their BAC, tracking frequency, engagement quantity), temporal features (time of day/day of week), and geolocation (elevation, distances traveled between subsequent measurements, country, rural/urban percentage).
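
For readers less familiar with the metric, precision (positive predictive value) is the share of predicted high-BAC cases that were truly at or above the limit. The sketch below computes it from binary labels and predictions; the `precision` helper and the sample data are illustrative assumptions, not the study's evaluation code or results.

```python
def precision(y_true, y_pred):
    """Positive predictive value: true positives / all predicted positives.

    Returns None when the model predicts no positives, since the ratio
    is undefined in that case.
    """
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if p and t)
    false_pos = sum(1 for t, p in zip(y_true, y_pred) if p and not t)
    predicted_pos = true_pos + false_pos
    if predicted_pos == 0:
        return None
    return true_pos / predicted_pos


# Illustrative labels only: 1 means BAC >= 0.08%, 0 means below the limit.
actual = [1, 0, 1, 1, 0]
predicted = [1, 1, 1, 0, 0]
print(precision(actual, predicted))  # 2 of the 3 predicted positives are correct
```

A false positive here corresponds to messaging a user about a high BAC that they do not have, which is the error the authors chose to minimize.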

Join in to learn how BAC levels exceeding the safe legal driving limit of 0.08% can be predicted with good precision using machine learning to quantify a digital phenotype. BAC prediction from minimal information establishes the foundation to conduct precision medicine behavioral interventions using a digital app and BAC tracking device.

Kirstin wants to acknowledge and thank her UCSF coauthors on this work: R. Avram, G. Tison, K. Rutledge, M. Pletcher, J. Olgin, and G. Marcus.

Kirstin Aschbacher

UCSF Cardiology

Kirstin Aschbacher is a data scientist, a licensed clinical psychologist with a specialty in behavioral medicine, an associate professor in cardiology at UCSF, and the data team lead on the Health eHeart (HeH)/Eureka Digital Research Platform. One of her passions is to bridge the worlds of behavior change and data science in order to transform health, and she enjoys finding creative ways to take knowledge from psychology, neuroscience, and biology and apply it to discover new insights in large datasets. At UCSF, she builds active partnerships with companies in the behavior change and lifestyle medicine space. Previously, she was a data scientist at Silicon Valley startup Jawbone, where she helped design, test, and analyze mini-interventions to help users make healthier behavior choices and lose weight. When she’s not at work, she enjoys being a mother to her two children, biking and dancing, and learning to speak Mandarin.