On most large machine learning projects, data scientists design dozens of models using different combinations of hyperparameters, data configurations, and training settings. Most lack the time to keep track of all these configurations, and those who do typically rely on spreadsheets or record only the best-performing configurations for final modeling. A variety of approaches have been published, across different machine learning frameworks, for inventorying these settings to support model reproducibility, with a focus on the emerging area of “Explainable AI.” Keeping track of how models were developed enables an empirical approach to iterative modeling and makes it easier to communicate transparently to clients and stakeholders how models were designed. For data scientists seeking to inventory model configurations and settings in the Keras framework, we provide an overview and a template for creating your own leaderboard to track your machine learning models.
This session will share current insights based on published literature in model reproducibility, illustrating how tracking model development can aid in empirical design. The presenter will review a use case in developing recurrent neural networks for forecasting and walk through a Jupyter Notebook on how to build a leaderboard using models built in Keras.
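To make the leaderboard idea concrete, here is a minimal sketch in plain Python. It is not the presenter's notebook: the `Leaderboard` and `LeaderboardEntry` classes, the hyperparameter names, and the validation-loss values are all hypothetical placeholders chosen for illustration. In a Keras workflow, the score passed to `log` would typically come from the `History` object returned by `model.fit`, for example `min(history.history["val_loss"])` when validation data is supplied.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class LeaderboardEntry:
    """One trained model: its settings and the score it achieved."""
    model_name: str
    hyperparameters: Dict[str, Any]
    val_loss: float


@dataclass
class Leaderboard:
    """Keeps every run, not only the best one, so each result can be traced."""
    entries: List[LeaderboardEntry] = field(default_factory=list)

    def log(self, model_name: str, hyperparameters: Dict[str, Any], val_loss: float) -> None:
        # Copy the dict so later edits to the caller's config do not alter the record.
        self.entries.append(
            LeaderboardEntry(model_name, dict(hyperparameters), float(val_loss))
        )

    def ranked(self) -> List[LeaderboardEntry]:
        # Lower validation loss ranks higher.
        return sorted(self.entries, key=lambda e: e.val_loss)

    def to_csv(self) -> str:
        # Use the union of all hyperparameter names as columns; runs that did
        # not set a given hyperparameter get an empty cell.
        keys = sorted({k for e in self.entries for k in e.hyperparameters})
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerow(["rank", "model_name", "val_loss"] + keys)
        for rank, entry in enumerate(self.ranked(), start=1):
            row = [rank, entry.model_name, entry.val_loss]
            row += [entry.hyperparameters.get(k, "") for k in keys]
            writer.writerow(row)
        return buffer.getvalue()


board = Leaderboard()
# Placeholder scores for illustration only, not real results.
board.log("lstm_small", {"units": 32, "dropout": 0.2, "lookback": 7}, val_loss=0.41)
board.log("lstm_large", {"units": 128, "dropout": 0.5, "lookback": 14}, val_loss=0.35)
board.log("gru_small", {"units": 32, "lookback": 7}, val_loss=0.38)

print(board.to_csv())
```

Logging every run, rather than only the winner, is what keeps the comparison empirical: the ranked table shows which settings were tried and how much each change mattered.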
Catherine Ordun is a Senior Data Scientist at Booz Allen focused on growing AI capabilities for biosurveillance and biodefense clients across the public health and defense markets. She specializes in leading teams to develop machine learning models for computer vision, natural language processing, and time series forecasting, and collaborates with modern software and Agile development teams to build environments for deployable models. Over the course of her career at Booz Allen, Catherine has served clients in the Intelligence Community, the Centers for Disease Control and Prevention (CDC), the Food and Drug Administration (FDA), the Department of Veterans Affairs (VA), the U.S. Army, and the Department of Treasury. The breadth of her experience is reflected in the diversity of data, use cases, and client requirements across these organizations, ranging from leading prototypes that combine computer vision and robotic process automation at the Department of Treasury, to predicting hostile work environment risk at the VA, to developing time series disease forecasting models for the DoD, to developing cognitive search capabilities for the U.S. Army. Recently, Catherine has been leading a team of data scientists developing prototype sentiment models for images and is helping to lead investments in model reproducibility and interpretability at Booz Allen. She is passionate about mentoring junior talent and promoting education for the firm’s Women in Data Science group. Prior to joining Booz Allen, Catherine worked for the Centers for Disease Control and Prevention (CDC), the Defense Advanced Research Projects Agency (DARPA), and the U.S. Intelligence Community. She has a B.S. in Applied Biology from Georgia Tech, an M.P.H. in Environmental and Occupational Health from Emory University, and an MBA from George Washington University, and is a Booz Allen NVIDIA certified Deep Learning Instructor.
©2019, O'Reilly Media, Inc.