When building machine learning models for large projects, data scientists typically train dozens of models using different combinations of hyperparameters, data configurations, and training settings. Most lack the time to keep track of all these configurations; those who do tend to record them in spreadsheets or save only the best configurations for final modeling. Several approaches have been published, across machine learning frameworks, for inventorying these settings to support model reproducibility, with a focus on the emerging area of explainable AI. Keeping track of how models were developed enables an empirical approach to iterative modeling and makes it easier to communicate transparently to clients and stakeholders how the models were designed.
Catherine Ordun shares current insights based on published literature in model reproducibility, illustrating how tracking model development can aid in empirical design. She then reviews a use case in developing recurrent neural networks for forecasting and walks you through a Jupyter notebook on how to build a leaderboard using models built in Keras.
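To make the leaderboard idea concrete, here is a minimal sketch in plain Python. It is not the notebook from the session: the `Run` and `Leaderboard` classes, the model names, the hyperparameter keys, and the loss values are all hypothetical and chosen only for illustration. It assumes each trained model is summarized by a single validation loss, where lower is better.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Run:
    """One trained model: its settings and its validation score."""
    name: str
    hyperparameters: Dict[str, Any]
    val_loss: float


@dataclass
class Leaderboard:
    """Hypothetical in-memory leaderboard of model runs."""
    runs: List[Run] = field(default_factory=list)

    def log(self, name: str, hyperparameters: Dict[str, Any], val_loss: float) -> None:
        # Copy the dict so later edits to the caller's config do not alter the record.
        self.runs.append(Run(name, dict(hyperparameters), float(val_loss)))

    def ranked(self) -> List[Run]:
        # Lower validation loss is better, so sort ascending.
        return sorted(self.runs, key=lambda r: r.val_loss)

    def best(self) -> Run:
        return self.ranked()[0]


board = Leaderboard()
# Made-up configurations and losses, for illustration only.
board.log("rnn_small", {"units": 32, "lookback": 7, "lr": 1e-3}, val_loss=0.42)
board.log("rnn_wide", {"units": 128, "lookback": 7, "lr": 1e-3}, val_loss=0.35)
board.log("rnn_long", {"units": 64, "lookback": 28, "lr": 1e-4}, val_loss=0.39)

for rank, run in enumerate(board.ranked(), start=1):
    print(rank, run.name, run.val_loss, run.hyperparameters)
```

In a Keras workflow, the logged loss could plausibly come from the value returned by `model.evaluate` or from `history.history["val_loss"]` on the `History` object that `model.fit` returns when validation data is supplied. Recording every run, not just the winner, is what keeps the comparison empirical and reproducible.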
Catherine Ordun is a senior data scientist at Booz Allen focused on growing AI capabilities for biosurveillance and biodefense clients across the public health and defense markets. She specializes in leading teams to develop machine learning models for computer vision, natural language processing, and time series forecasting, and she collaborates with modern software and Agile development teams to build environments for deployable models. Over the course of her career at Booz Allen, Catherine has served clients in the intelligence community, the Centers for Disease Control and Prevention (CDC), the Food and Drug Administration (FDA), the Department of Veterans Affairs (VA), the US Army, and the Department of the Treasury. The breadth of her experience is reflected in the diversity of the data, use cases, and client requirements across these organizations: leading prototypes that combine computer vision and robotic process automation at the Department of the Treasury, predicting hostile work environment risk at the VA, developing time series disease forecasting models for the DoD, and developing cognitive search capabilities for the US Army. Recently, Catherine has been leading a team of data scientists to develop prototype sentiment modeling on images and is helping to lead investments in model reproducibility and interpretability at Booz Allen. She’s passionate about mentoring junior talent and promoting education for the firm’s Women in Data Science group. Previously, Catherine worked for the CDC, the Defense Advanced Research Projects Agency (DARPA), and the US intelligence community. She holds a BS in applied biology from Georgia Tech, an MPH in environmental and occupational health from Emory University, and an MBA from George Washington University. She’s also a Booz Allen NVIDIA-certified Deep Learning Instructor.
©2019, O'Reilly Media, Inc.