As children, we are taught that sharing is caring. As data scientists, we often succeed by building on the work of others and ensuring that others can build on ours. This means being able to easily find datasets, projects, and models, and being able to hand off our own projects to another data scientist or a data engineer who can help operationalize the latest model. However, this is easier said than done.
Differing Python and R environments, multiple algorithm implementations, “unscalable” machine learning, and novel libraries (e.g., deep learning) not yet approved by IT are only a few of the challenges customers encounter on a regular basis. When coupled with the security requirements of regulated industries, these challenges can make it incredibly difficult to share data and analysis, let alone reproduce or deploy machine learning in the enterprise. As a result, many organizations struggle to build a scalable data science practice.
Thomas Dinsmore and Johnson Poh share common technology considerations and patterns for collaboration between data scientists, data engineers, and the business teams they support, as well as best practices for moving machine learning into production at scale.
Thomas W. Dinsmore is a Senior Director at DataRobot. Previously, he served as Director of Product Marketing for Cloudera Data Science; as a Knowledge Expert on the Strategic Analytics team at the Boston Consulting Group; as Director of Product Management for Revolution Analytics; and in consulting roles at IBM Big Data Solutions, SAS, PricewaterhouseCoopers, and Oliver Wyman. Thomas has led or contributed to analytic solutions for more than five hundred clients across vertical markets and around the world, including AT&T, Banco Santander, Citibank, Dell, J.C.Penney, Monsanto, Morgan Stanley, Office Depot, Sony, Staples, United Health Group, UBS, and Vodafone. His international experience includes work for clients in the United States, Puerto Rico, Canada, Mexico, Venezuela, Brazil, Chile, the United Kingdom, Belgium, Spain, Italy, Turkey, Israel, Malaysia, and Singapore.
Johnson Poh heads the data science practice at DBS’s Big Data Analytics Center of Excellence, where he drives the development of core data science capabilities for enhancing decision analysis. He spent the past decade leading teams in applying statistical learning models across the government, pharmaceutical, and financial industries. Johnson holds a postgraduate degree in statistical computing from Yale University and bachelor's degrees in mathematics, statistics, and economics from UC Berkeley.
©2017, O'Reilly Media, Inc. • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.