Presented By O’Reilly and Cloudera
Make Data Work
21–22 May 2018: Training
22–24 May 2018: Tutorials & Conference
London, UK

A roadmap for open data science

Thomas Dinsmore (DataRobot)
14:05–14:45 Thursday, 24 May 2018

Who is this presentation for?

  • Analytics and data science leaders

Prerequisite knowledge

  • A basic understanding of commercial and open source data science tools and enterprise software provisioning and deployment

What you'll learn

  • Best practices for developing a culture of open data science


Working data scientists prefer to use open source software, such as Python, R, and Apache Spark, for many reasons, including comprehensive functionality, flexibility and extensibility, transparency, and innovation. Open source software can scale to support the needs of large enterprises at an acceptable cost. However, migrating to open data science is challenging for several reasons: existing users of legacy software often have strong personal preferences, and resist switching; programs written with legacy software must be rebuilt in new tools; and data may be siloed within the legacy platform. Complicating matters, commercial software vendors use community-building techniques to cultivate loyalty among end users. At the same time, many organizations have a large footprint of legacy analytics software. Executives in these organizations often struggle to both manage the growing cost to provision this software and encourage users to adopt open source tooling.

There are, however, clear best practices for accelerating adoption of, and success with, open data science. Thomas Dinsmore shares a model to help organizations begin the journey, build momentum, and reduce reliance on legacy software. The model covers executive leadership, cost transparency, and clear metrics of user adoption and success with open data science tools.

Topics include:

  • Understanding the needs of users
  • Aligning software (commercial or open source) to actual user needs
  • Avoiding duplication and overlicensing
  • Options for code migration and rebuilding
  • Eliminating data silos
  • The most effective way to train and retrain users

Thomas Dinsmore


Thomas W. Dinsmore is a Senior Director at DataRobot. Previously, he served as Director of Product Marketing for Cloudera Data Science; as a Knowledge Expert on the Strategic Analytics team at the Boston Consulting Group; as Director of Product Management for Revolution Analytics; and in consulting roles at IBM Big Data Solutions, SAS, PricewaterhouseCoopers, and Oliver Wyman. Thomas has led or contributed to analytic solutions for more than five hundred clients across vertical markets and around the world, including AT&T, Banco Santander, Citibank, Dell, J.C.Penney, Monsanto, Morgan Stanley, Office Depot, Sony, Staples, UnitedHealth Group, UBS, and Vodafone. His international experience includes work for clients in the United States, Puerto Rico, Canada, Mexico, Venezuela, Brazil, Chile, the United Kingdom, Belgium, Spain, Italy, Turkey, Israel, Malaysia, and Singapore.
