Presented By O'Reilly and Cloudera
Make Data Work
March 28–29, 2016: Training
March 29–31, 2016: Conference
San Jose, CA

Health conference sessions

2:30pm–3:00pm Tuesday, 03/29/2016
Mr Prabhat (Berkeley Lab)
Prabhat reviews the top data analytics problems in modern science—covering problems at all scales, from full-scale astronomy surveys to subatomic physics—and outlines Berkeley Lab's hardware and software strategy for dealing with these daunting challenges.
4:20pm–5:00pm Wednesday, 03/30/2016
Brandon Ballinger (Cardiogram), Johnson Hsieh (Cardiogram)
Each year, 15 million people suffer strokes, and at least a fifth of those are due to atrial fibrillation, the most common heart arrhythmia. Brandon Ballinger reports on a collaboration between UCSF cardiologists and ex-Google data scientists that detects atrial fibrillation with deep learning.
4:20pm–5:00pm Thursday, 03/31/2016
Timothy Danford (Tamr, Inc.)
To keep up with the DNA-sequencing-technology revolution, bioinformaticians need more-scalable tools for genomics analysis. Timothy Danford outlines one possible solution in a case study of a cancer genomics analysis pipeline implemented as part of the open source genomics software project, ADAM, which uses Apache Spark-generated abstractions executed on commodity computing infrastructure.
11:00am–11:40am Wednesday, 03/30/2016
Jake Porway (DataKind), Rachel Quint (Hewlett Foundation), Sue-Ann Ma, Jeremy Anderson (IBM)
So many of the data projects making headlines—from a new app for finding public services to a new probabilistic model for predicting weather patterns for subsistence farmers—are great accomplishments but don’t seem to have end users in mind. Discover how organizations are designing with, not for, people, accounting for what drives them in order to make long-lasting impact.
5:10pm–5:50pm Wednesday, 03/30/2016
Moderated by:
Michael Dauber (Amplify Partners)
Yael Garten (LinkedIn), Monica Rogati (Data Natives), Daniel Tunkelang (Various)
We’ve all heard that rare breed, the data scientist, described as a unicorn. In building your data science team, should you hold out for that unicorn or create groups of specialists who can work together? Michael Dauber, Yael Garten, Monica Rogati, and Daniel Tunkelang discuss the pros and cons of various team models to help you decide what works best for your particular situation and organization.
11:50am–12:30pm Thursday, 03/31/2016
Jeremy Howard (USF)
In his 20+ years of applying machine learning and data analysis to a wide range of industries, Jeremy Howard never felt that his work really changed anyone's life in a deep and positive way, so he spent a year researching ways he might effect real change. Jeremy outlines the impact that deep learning is going to make on the world and explains how you too can make a difference.
5:10pm–5:50pm Wednesday, 03/30/2016
Josh Patterson (Patterson Consulting), Dave Kale (Skymind), Zachary Lipton (University of California, San Diego)
Time series data is increasingly ubiquitous with both the adoption of electronic health record (EHR) systems in hospitals and clinics and the proliferation of wearable sensors. Josh Patterson, David Kale, and Zachary Lipton bring the open source deep learning library DL4J to bear on the challenge of analyzing clinical time series using recurrent neural networks (RNNs).
4:00pm–4:30pm Tuesday, 03/29/2016
Matt Butner (Stride Health)
Matt Butner shares real-world data, methods, and outcomes from helping consumers make smarter health decisions at scale.
1:50pm–2:30pm Thursday, 03/31/2016
Linus Liang (Embrace), Brad Allen (Silicon Valley Data Science)
Linus Liang and Brad Allen explain how big data is helping Embrace save millions of babies around the world. Embrace invented the world's most affordable infant incubator, but the data it collects—from the hardest to reach and most rural parts of the world—will actually save more lives than the device will.
2:40pm–3:20pm Wednesday, 03/30/2016
Robert Grossman (University of Chicago)
There is a big difference between running a machine-learning algorithm manually from time to time and building a production system that runs thousands of machine-learning algorithms each day on petabytes of data, while also dealing with all the edge cases that arise. Robert Grossman discusses some of the lessons learned when building such a system and explores the tools that made the job easier.
11:50am–12:30pm Thursday, 03/31/2016
Jeffrey Shmain (Cloudera), Mohammad Quraishi (Cigna)
How do you implement Apache Hadoop in a large healthcare company with a mature data-analysis infrastructure? Jeffrey Shmain and Mohammad Quraishi describe Cigna's journey toward big data and Hadoop, including an overview of new Hadoop capabilities like heterogeneous data integration and large-scale machine learning.
11:50am–12:30pm Thursday, 03/31/2016
David Beyer (Amplify Partners)
Over the past decade, machine learning has become intertwined with newer, Internet-born businesses, even though the vast majority of global GDP turns on larger, less visible industries like energy and construction. David Beyer explores the ways these backbone industries are adopting machine-intelligent applications and the trends underlying this shift.
9:40am–10:00am Tuesday, 03/29/2016
Trina Chiasson (Tableau Software)
The Quantified Self movement continues to grow apace. Trina Chiasson explains what we can learn from these hobbyist data hackers about how to make data fun, personally relevant, and actionable.
1:50pm–2:10pm Tuesday, 03/29/2016
Robin Thottungal (US Environmental Protection Agency)
A key challenge of today’s federal government is to ensure that data and evidence are used to inform regulation and policy decisions. Data-driven decision making uses analytics techniques to transform data into information and, ultimately, actionable knowledge. Robin Thottungal discusses the analytical approaches that support the EPA's mission of protecting the environment and human health.