Pratik Chaudhari, “Does the Data Induce Capacity Control in Deep Learning?”
Pratik Chaudhari, PhD
Assistant Professor
University of Pennsylvania
Abstract: Accepted statistical wisdom suggests that the larger the model class, the more likely it is to overfit the training data. And yet, deep networks generalize extremely well: the larger the deep network, the better its accuracy on new data. This talk seeks to shed light upon this apparent paradox.
We will argue that deep networks are successful because of a characteristic structure in the space of learning tasks. The input correlation matrix for typical tasks has a peculiar (“sloppy”) eigenspectrum where, in addition to a few large eigenvalues (salient features), there are a large number of small eigenvalues that are distributed uniformly over exponentially large ranges. This structure in the input data is strongly mirrored in the representation learned by the network. A number of quantities, such as the Hessian and the Fisher Information Matrix, as well as others such as activation correlations and Jacobians, are also sloppy. Even if the model class for deep networks is very large, there is an exponentially small subset of models (exponentially small in the number of data points) that fit such sloppy tasks. This talk will demonstrate the first analytical non-vacuous generalization bound for deep networks that does not use compression. We will also discuss an application of these concepts that develops new algorithms for semi-supervised learning.
References
1. Does the data induce capacity control in deep learning? Rubing Yang, Jialin Mao, and Pratik Chaudhari. [ICML ’22] https://arxiv.org/abs/2110.14163
2. Deep Reference Priors: What is the best way to pretrain a model? Yansong Gao, Rahul Ramesh, and Pratik Chaudhari. [ICML ’22] https://arxiv.org/abs/2202.00187
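To make the notion of a "sloppy" eigenspectrum in the abstract concrete, the following is a minimal sketch and not the speaker's code. It assumes NumPy, builds synthetic zero-mean data whose correlation-matrix eigenvalues decay exponentially (so they are evenly spread on a logarithmic scale), and then recovers that decay rate from the data. The function names, the decay rate of 0.5, and the data sizes are all illustrative assumptions.

```python
import numpy as np


def correlation_eigenvalues(X):
    """Eigenvalues of the (uncentered) input correlation matrix X^T X / n,
    sorted from largest to smallest."""
    n = X.shape[0]
    C = X.T @ X / n
    # eigvalsh returns eigenvalues of a symmetric matrix in ascending order.
    return np.linalg.eigvalsh(C)[::-1]


def log_decay_rate(eigs):
    """Negative least-squares slope of log(eigenvalue) against its index.
    A roughly constant slope means the eigenvalues are spread evenly on a
    log scale, which is the hypothetical 'sloppiness' check used here."""
    idx = np.arange(len(eigs))
    slope, _intercept = np.polyfit(idx, np.log(eigs), 1)
    return -slope


# Synthetic data (an assumption for illustration, not a real task).
rng = np.random.default_rng(0)
d, n = 30, 5000
true_eigs = np.exp(-0.5 * np.arange(d))             # exponentially decaying spectrum
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))    # random orthogonal basis
X = (rng.standard_normal((n, d)) * np.sqrt(true_eigs)) @ Q.T

eigs = correlation_eigenvalues(X)
rate = log_decay_rate(eigs)
print(f"largest / smallest eigenvalue: {eigs[0] / eigs[-1]:.3g}")
print(f"estimated log-decay rate: {rate:.3f}")  # should be close to the 0.5 used above
```

On real inputs, one would replace `X` with a matrix of flattened, zero-mean samples; whether a straight-line fit on the log scale is the right diagnostic for a given dataset is a judgment call rather than something this sketch establishes.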
Biography: Pratik Chaudhari is an Assistant Professor in Electrical and Systems Engineering and Computer and Information Science at the University of Pennsylvania. He is a member of the GRASP Laboratory. From 2018 to 2019, he was a Senior Applied Scientist at Amazon Web Services and a Postdoctoral Scholar in Computing and Mathematical Sciences at Caltech. Pratik received his PhD (2018) in Computer Science from UCLA, and his Master’s (2012) and Engineer’s (2014) degrees in Aeronautics and Astronautics from MIT. He was a part of NuTonomy Inc. (now Hyundai-Aptiv Motional) from 2014 to 2016. He received the NSF CAREER award and the Intel Rising Star Faculty Award in 2022.
Tuesdays, 12pm-1:15pm
Held in person at Clark 110 and virtually over Zoom
Check for event details: https://www.minds.jhu.edu/events/calendar/
Join Zoom Meeting
https://wse.zoom.us/j/98624413365
Meeting ID: 986 2441 3365
One tap mobile
+13017158592,,98624413365# US (Washington DC)
+16469313860,,98624413365# US
Dial by your location
+1 301 715 8592 US (Washington DC)
+1 646 931 3860 US
+1 309 205 3325 US
+1 312 626 6799 US (Chicago)
+1 646 558 8656 US (New York)
+1 669 900 6833 US (San Jose)
+1 719 359 4580 US
+1 253 215 8782 US (Tacoma)
+1 346 248 7799 US (Houston)
+1 386 347 5053 US
+1 564 217 2000 US
+1 669 444 9171 US
Find your local number: https://wse.zoom.us/u/asoOElnUp
Join by H.323
162.255.37.11 (US West)
162.255.36.11 (US East)
115.114.131.7 (India Mumbai)
115.114.115.7 (India Hyderabad)
213.19.144.110 (Amsterdam Netherlands)
213.244.140.110 (Germany)
103.122.166.55 (Australia Sydney)
103.122.167.55 (Australia Melbourne)
149.137.40.110 (Singapore)
64.211.144.160 (Brazil)
149.137.68.253 (Mexico)
69.174.57.160 (Canada Toronto)
65.39.152.160 (Canada Vancouver)
207.226.132.110 (Japan Tokyo)
149.137.24.110 (Japan Osaka)