Presented By O'Reilly and Cloudera
Make Data Work
September 25–26, 2017: Training
September 26–28, 2017: Tutorials & Conference
New York, NY

Challenges in using machine learning to direct healthcare services

Brian Dalessandro (Capital One)
2:55pm–3:35pm Wednesday, September 27, 2017
Secondary topics: Data for good, Ecommerce, Healthcare
Average rating: 4.67 (3 ratings)

Who is this presentation for?

  • Data scientists, CTOs, CMOs, machine learning engineers, information security engineers, and data engineers

What you'll learn

  • Understand the data and machine learning constraints faced by the healthcare industry
  • Explore the machine learning methods Zocdoc has implemented


Zocdoc is an online marketplace for finding doctors and booking appointments instantly online. Many of Zocdoc’s core product and business functions are fairly common in internet services, and each function can be cast as a well-defined and well-researched machine learning opportunity. However, dealing with healthcare involves many constraints and challenges that render standard approaches to common problems infeasible. Brian Dalessandro surveys the various machine learning problems Zocdoc has faced and shares the data, legal, and ethical constraints that shape its solution space, explaining how regulatory-induced business model constraints translate into new formulations of common machine learning problems.

Healthcare service companies are required to comply with a number of regulations (HIPAA being the most notable). Operating in a HIPAA-regulated data environment imposes restrictions that contradict the “open data” cultural values of many data-driven organizations and limits the way certain data may be used. Nonetheless, HIPAA concerns can often be addressed with existing technology and policies. Other regulations restrict how a company may monetize its service, which in turn constrains its business model. For instance, Zocdoc uses a subscription-based rather than a transaction-based model. Such a basic change in business model has many downstream effects on how machine learning problems are formulated and solved.

Beyond regulatory constraints, there are ethical considerations that are particularly important in making decisions related to healthcare. Brian walks you through the common machine learning development pipeline (from problem formulation to research, then to deployment and evaluation) and shows how the aforementioned constraints can be worked into the process, with a focus on how to adapt to new constraints and develop appropriate patterns for working within such a constrained data mining framework.

Brian Dalessandro

Capital One

Brian d’Alessandro is a senior director of data science at Capital One (Financial Services). Brian is also an active professor in NYU’s Center for Data Science graduate degree program. Previously, Brian built and led data science programs for several NYC tech startups, including Zocdoc and Dstillery. A veteran data scientist and leader with over 18 years of experience developing machine learning-driven practices and products, Brian holds several patents and has published dozens of peer-reviewed articles on the subjects of causal inference, large-scale machine learning, and data science ethics. When not doing data science, Brian likes to cook, create adventures with his family, and surf in the frigid North Atlantic waters.