Principled tools for analyzing weight matrices of production-scale deep neural networks
Who is this presentation for?
An important practical challenge is developing theoretically principled tools that can guide the use of production-scale deep neural networks. Much of the existing theory applies, at best, to small toy models, while much of practice is driven by heuristics that are reasonable but not particularly principled.
Michael Mahoney explores recent work that uses spectral-based methods from scientific computing and statistical mechanics to develop such tools. These tools can be used to develop metrics that characterize the quality of models without even examining training or test data, and those metrics can in turn be used to understand why the learning process works as it does. They can predict trends in generalization (and not just bounds on generalization) for state-of-the-art production-scale models. Related tools can exploit adversarial data to characterize and modify the curvature properties of the penalty landscape and to perform tasks such as model quantization in a more automated way. You’ll learn the basic ideas underlying these methods and see their use for analyzing production-scale deep neural networks in computer vision, natural language processing, and related tasks.
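To make the idea of a data-free, spectral-based metric concrete, here is a minimal sketch in Python with NumPy. It is not the speaker's method or any particular library's API: the function name `spectral_summary`, the normalization by the larger matrix dimension, and the choice of summaries (log spectral norm and stable rank) are assumptions made purely for illustration. The only point it demonstrates is that such quantities depend on a layer's weight matrix alone, with no training or test data involved.

```python
import numpy as np


def spectral_summary(weight):
    """Illustrative (hypothetical) data-free summaries of one 2-D weight matrix."""
    W = np.asarray(weight, dtype=np.float64)
    n = max(W.shape)  # assumed normalization: the larger of the two dimensions

    # Singular values of W; their squares are the eigenvalues of W^T W.
    singular_values = np.linalg.svd(W, compute_uv=False)

    # Eigenvalues of the normalized correlation matrix (1/n) W^T W.
    eigenvalues = singular_values ** 2 / n

    return {
        "eigenvalues": eigenvalues,
        # Log of the largest eigenvalue (a scale-type summary).
        "log_spectral_norm": float(np.log10(eigenvalues.max())),
        # Squared Frobenius norm over squared spectral norm (a shape-type summary).
        "stable_rank": float(eigenvalues.sum() / eigenvalues.max()),
    }


if __name__ == "__main__":
    # A random Gaussian matrix stands in for a trained layer here.
    rng = np.random.default_rng(0)
    layer = rng.normal(size=(512, 256))
    summary = spectral_summary(layer)
    print(summary["log_spectral_norm"], summary["stable_rank"])
```

In practice one would apply a function like this to each layer of a pretrained model and compare the resulting summaries across layers or across models, again without touching any data.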
Prerequisite knowledge
- Familiarity with the basic ideas of machine learning or neural networks (useful but not required)
What you'll learn
- Learn the basic ideas behind principled tools for analyzing weight matrices of production-scale deep neural networks and where to go to find code that implements these ideas
Michael W. Mahoney is a professor in the Department of Statistics and the International Computer Science Institute (ICSI) at the University of California, Berkeley. He works on the algorithmic and statistical aspects of modern large-scale data analysis. He’s also the director of the NSF/TRIPODS-funded Foundations of Data Analysis (FODA) Institute at UC Berkeley. Much of his recent research has focused on large-scale machine learning, including randomized matrix algorithms and randomized numerical linear algebra, geometric network analysis tools for structure extraction in large informatics graphs, scalable implicit regularization methods, computational methods for neural network analysis, and applications in genetics, astronomy, medical imaging, social network analysis, and internet data analysis. Previously, he worked and taught in the Mathematics Department at Yale University, at Yahoo Research, and in the Mathematics Department at Stanford University. Among other things, he’s on the national advisory committee of the Statistical and Applied Mathematical Sciences Institute (SAMSI), he was on the National Research Council’s Committee on the Analysis of Massive Data, he co-organized the Simons Institute’s fall 2013 and 2018 programs on the foundations of data science, and he runs the biennial MMDS Workshops on Algorithms for Modern Massive Data Sets. He earned his PhD from Yale University with a dissertation in computational statistical mechanics. More information is available at https://www.stat.berkeley.edu/~mmahoney/.