Zocdoc is an online marketplace that makes it easy to find a doctor and book an appointment online instantly. Many of Zocdoc’s core product and business functions are common among internet services, and each can be cast as a well-defined, well-researched machine learning opportunity. However, healthcare brings constraints and challenges that render standard approaches to these common problems infeasible. Brian d’Alessandro surveys the machine learning problems Zocdoc has faced and shares the data, legal, and ethical constraints that shape its solution space, explaining how regulation-induced business model constraints translate into new formulations of common machine learning problems.
Healthcare service companies must comply with a number of regulations, HIPAA being the most notable. Operating in a HIPAA-regulated data environment imposes restrictions that run counter to the “open data” cultural values of many data-driven organizations and limit the way certain data may be used. Nonetheless, HIPAA concerns can often be addressed with existing technology and policies. Other policies restrict how a company may monetize its service, which in turn constrains its business model. For instance, Zocdoc uses a subscription-based rather than a transaction-based model. Such a basic change in business model has many downstream effects on how machine learning problems are formulated and solved.
Beyond regulatory constraints, ethical considerations are particularly important when making decisions related to healthcare. Brian walks you through the common machine learning development pipeline (from problem formulation to research, then to deployment and evaluation) and shows how the aforementioned constraints can be worked into the process, with a focus on how to adapt to new constraints and develop appropriate patterns for working within such a constrained data mining framework.
Brian d’Alessandro is a senior director of data science at Capital One (Financial Services). Brian is also an active professor in NYU’s Center for Data Science graduate degree program. Previously, Brian built and led data science programs for several NYC tech startups, including Zocdoc and Dstillery. A veteran data scientist and leader with over 18 years of experience developing machine learning-driven practices and products, Brian holds several patents and has published dozens of peer-reviewed articles on the subjects of causal inference, large-scale machine learning, and data science ethics. When not doing data science, Brian likes to cook, create adventures with his family, and surf in the frigid North Atlantic waters.
©2017, O'Reilly Media, Inc. • (800) 889-8969 or (707) 827-7019 • Monday-Friday 7:30am-5pm PT • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners. • confreg@oreilly.com