Executive Briefing: The black box—Interpretability, reproducibility, and data management
The growing complexity of data science leads to black box solutions that few people in an organization understand. You often hear that interpretability (explaining how an analytic model works) is difficult and that you need it to deploy models. But people use many black boxes without understanding them, as long as they’re reliable. It’s when the black box becomes unreliable that people lose trust.
Mistrust more often stems from a lack of reliability, which in turn is often the result of misunderstanding essential elements of analytics infrastructure and practice. Reproducibility, the ability to get the same results given the same information, extends your view to include the environment and the data used to build and execute models.
Mark Madsen examines reproducibility and the areas that underlie production analytics and explores the most frequently ignored and yet most essential capability, data management. The industry needs to consider its practices so that systems are more transparent and reliable, improving trust and increasing the likelihood that your analytic solutions will succeed.
Mark Madsen is a fellow at Teradata, where he’s responsible for understanding, forecasting, and defining the analytics ecosystem and architecture. Previously, he was CEO of Third Nature, where he advised companies on data strategy and technology planning and vendors on product management. Mark has designed analysis, machine learning, data collection, and data management infrastructure for companies worldwide.