Managing data science in the enterprise
Who is this presentation for?
- Data science managers and executives overseeing data science organizations
The honeymoon era of data science is ending and accountability is coming. Not content to wait for results that may or may not arrive, successful data science leaders must deliver measurable impact on an increasing share of an enterprise’s KPIs. You’ll learn how leading organizations take a holistic approach to people, process, and technology to build a sustainable competitive advantage.
- How to select the right data science project: Many organizations start with the data and look for something “interesting” rather than building a deep understanding of the existing business process and then pinpointing the decision point that can be augmented or automated.
- How to organize data science within the enterprise: There are trade-offs between centralized and federated models; alternatively, you could use a hybrid approach with something like a center of excellence.
- Why rapid prototyping and design sprints aren’t just for software developers: Leading organizations put prototyping ahead of the data collection process to ensure that stakeholder feedback is captured, increasing the probability of adoption. Some organizations even create synthetic data and naive baseline models to show how the model would impact existing business processes.
- Why order-of-magnitude ROI math should be on every hiring checklist: The ability to estimate the potential business impact of a change in a statistical measure is one of the best predictors of success for a data science team.
- The difference between “pure research” and applied templates: Ask data scientists, and 80% will say they’re doing the former, but realistically, the vast majority are applying well-known templates to novel business cases. Knowing which is which and how to manage them differently improves morale and output.
- Defining a stakeholder-centric project management process: The most common failure mode is when data science delivers results that are either too late or don’t fit into how the business works today, so results gather dust. Share insights early and often.
- Building for the scale that really matters: Many organizations optimize for scale of data but ultimately are overwhelmed by the scale of the growing data science team and its business stakeholders. Team throughput grinds to a crawl as information loss compounds from the number of interactions in a single project, much less a portfolio of hundreds or thousands of projects.
- Why time to iterate is the most important metric: Many organizations consider model deployment to be a moonshot when it really should be laps around a racetrack. Minimizing the obstacles (without sacrificing rigorous review and checks) to testing real results is another great predictor of data science success. Facebook and Google deploy new models in minutes, whereas large financial services companies can take 18 months.
- Why delivered is not done: Many organizations have such a hard time deploying a model into production that the data scientists breathe a sigh of relief and move on to the next project. Yet this neglects the critical process of monitoring to ensure the model performs as expected and is used appropriately.
- Measure everything, including yourself: Ironically, data scientists live in the world of measurement yet rarely turn that lens on themselves. Tracking patterns in aggregate workflows helps create modular templates and guides investment in internal tooling and people to alleviate bottlenecks.
- Risk and change management aren’t just for consultants: Data science projects don’t usually fail because of the math but rather because of the humans who use the math. Establish training, provide predetermined feedback channels, and measure usage and engagement to ensure success.
Prerequisites
- A basic understanding of data science
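As a minimal sketch of the order-of-magnitude ROI math described above, consider estimating the dollar impact of a precision improvement in a churn model. All figures and names here are illustrative assumptions, not from the talk:

```python
# Hypothetical back-of-envelope estimate: a retention campaign sends an offer
# to every customer the model flags as a churn risk. The offer costs money on
# every flagged customer but only recovers value on true churners, so value
# scales with precision. All inputs are assumed, illustrative numbers.
def roi_of_precision_gain(n_flagged, offer_cost, value_saved,
                          precision_before, precision_after):
    """Order-of-magnitude dollar impact of a precision improvement."""
    def net_value(precision):
        true_churners_reached = n_flagged * precision
        return true_churners_reached * value_saved - n_flagged * offer_cost

    return net_value(precision_after) - net_value(precision_before)

# Moving precision from 0.20 to 0.25 on 100,000 flagged customers, with a
# $10 offer cost and $200 of retained value per true churner:
impact = roi_of_precision_gain(100_000, 10, 200, 0.20, 0.25)
print(f"~${impact:,.0f} incremental value per campaign")
```

The point is not the exact figure but the habit: a candidate who can reduce "the model got better" to a rough dollar range in a few lines of arithmetic is far likelier to pick projects that matter.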
What you'll learn
- Learn how to design and run data science organizations to have a sustained, scalable, and predictable impact on business outcomes
- Understand why data science may have more to learn from product management than software engineering
Alex Izydorczyk leads the data science team at Coatue, overseeing the engineering and statistical teams’ process of integrating “alternative data” into the investment process. The team uses cutting-edge methods from machine learning and statistics to digest and analyze a broad universe of data points to identify market and investing trends. Alex is also involved on the private investment side, particularly on topics of cryptocurrency and data science infrastructure. He graduated from the Wharton School at the University of Pennsylvania in 2015 with a degree in statistics.
Benjamin Singleton is the Director of Data Science & Analytics at JetBlue. He leads the design and development of data and analytics solutions for multiple functional areas, establishes JetBlue’s strategic roadmap for data governance, and oversees data architecture, data science, and analytics. Previously, he was the Director of Analytics at the New York Police Department, where he focused on building data platform capabilities and data products to support operational and strategic needs.
Josh Poduska is the chief data scientist at Domino Data Lab. He has 17 years of experience in analytics. His work experience includes leading the statistical practice at one of Intel’s largest manufacturing sites, working on smarter cities data science projects with IBM, and leading data science teams and strategy with several big data software companies. Josh holds a master’s degree in applied statistics from Cornell University.