Principled tools for analyzing weight matrices of production-scale deep neural networks

Michael Mahoney (UC Berkeley)

11:05–11:45 Thursday, 17 October 2019

Location: King's Suite - Balmoral

Models and Methods

Secondary topics: Deep Learning, Deep Learning tools

Who is this presentation for?

Engineers

Level

Intermediate

Description

An important practical challenge is developing theoretically principled tools that can guide the use of production-scale deep neural networks. Much of the theory applies, at best, to small toy models, and much of the practice is driven by heuristics that are reasonable but not particularly principled.

Michael Mahoney explores recent work that uses spectral-based methods from scientific computing and statistical mechanics to develop such tools. These tools can be used to develop metrics that characterize the quality of models without even examining training or test data, and those metrics can in turn help explain why the learning process works as it does. They can predict trends in generalization (and not just bounds on generalization) for state-of-the-art production-scale models. Related tools can exploit adversarial data to characterize and modify the curvature properties of the penalty landscape and to perform tasks such as model quantization in a more automated way. You’ll learn the basic ideas underlying these methods and see how they’re used to analyze production-scale deep neural networks in computer vision, natural language processing, and related tasks.
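The description does not say which spectral quantities these tools compute. As one illustration of a metric derived from a weight matrix alone, with no training or test data, here is a minimal Python sketch, assuming NumPy. It computes the eigenvalue spectrum of a layer's weight matrix and a Hill-type estimate of how heavy the tail of that spectrum is. The function names `eigenvalue_spectrum` and `hill_tail_exponent`, the `tail_fraction` parameter, and the choice of a tail exponent as the metric are assumptions made for illustration, not the talk's actual tooling. The random matrix only stands in for the weights of a trained layer.

```python
import numpy as np


def eigenvalue_spectrum(weights):
    """Sorted eigenvalues of W^T W / N for a 2-D weight matrix W with N rows.

    Returns the min(N, M) eigenvalues that can be nonzero.
    """
    w = np.asarray(weights, dtype=float)
    if w.ndim != 2:
        raise ValueError("expected a 2-D weight matrix")
    # Squared singular values of W are the eigenvalues of W^T W.
    singular_values = np.linalg.svd(w, compute_uv=False)
    return np.sort(singular_values ** 2 / w.shape[0])


def hill_tail_exponent(values, tail_fraction=0.1):
    """Hill-type estimate of alpha, assuming density ~ x^(-alpha) in the tail.

    Hypothetical helper: uses the largest `tail_fraction` of the values.
    """
    v = np.sort(np.asarray(values, dtype=float))
    k = max(2, int(len(v) * tail_fraction))
    if len(v) <= k:
        raise ValueError("need more values than tail points")
    x_min = v[-k - 1]  # threshold: the largest value just below the tail
    if x_min <= 0:
        raise ValueError("tail threshold must be positive")
    return 1.0 + k / np.sum(np.log(v[-k:] / x_min))


# Stand-in for a trained layer: a random 300 x 100 matrix (assumption).
rng = np.random.default_rng(0)
layer = rng.normal(size=(300, 100))

spectrum = eigenvalue_spectrum(layer)
alpha = hill_tail_exponent(spectrum)
print(f"{len(spectrum)} eigenvalues, tail exponent estimate: {alpha:.2f}")
```

In practice one would load the weight matrices of a real model and compare such a metric across layers or across models. Whether this particular exponent tracks model quality is the kind of question the talk addresses, and this sketch makes no claim about it.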

Prerequisite knowledge

Familiarity with the basic ideas of machine learning or neural networks (useful but not required)

What you'll learn

Learn the basic ideas behind principled tools for analyzing weight matrices of production-scale deep neural networks and where to find code that implements them

Michael Mahoney

UC Berkeley

Michael W. Mahoney is a professor in the Department of Statistics and the International Computer Science Institute (ICSI) at the University of California, Berkeley. He works on the algorithmic and statistical aspects of modern large-scale data analysis. He’s also the director of the NSF/TRIPODS-funded Foundations of Data Analysis (FODA) Institute at UC Berkeley. Much of his recent research has focused on large-scale machine learning, including randomized matrix algorithms and randomized numerical linear algebra, geometric network analysis tools for structure extraction in large informatics graphs, scalable implicit regularization methods, computational methods for neural network analysis, and applications in genetics, astronomy, medical imaging, social network analysis, and internet data analysis. Previously, he worked and taught in the Mathematics Department at Yale University, at Yahoo Research, and in the Mathematics Department at Stanford University. Among other things, he’s on the national advisory committee of the Statistical and Applied Mathematical Sciences Institute (SAMSI), he was on the National Research Council’s Committee on the Analysis of Massive Data, he co-organized the Simons Institute’s fall 2013 and 2018 programs on the foundations of data science, and he runs the biennial MMDS Workshops on Algorithms for Modern Massive Data Sets. He earned his PhD from Yale University with a dissertation in computational statistical mechanics. More information is available at https://www.stat.berkeley.edu/~mmahoney/.