Ambar Pal – “A Regularization view of Dropout in Neural Networks”
Talk 2: A Regularization view of Dropout in Neural Networks, by Ambar Pal (CS, JHU)
Abstract: Dropout is a popular training technique used to improve the performance of Neural Networks. However, a complete understanding of the theoretical underpinnings behind this success remains elusive. In this talk, we will take a regularization view to explain the empirically observed properties of Dropout. In the first part, we will investigate the case of a single-hidden-layer linear neural network with Dropout applied to the hidden layer, and observe how the Dropout algorithm can be seen as an instance of Gradient Descent applied to a changing objective. Then we will see how training with Dropout is equivalent to adding a regularizer to the original network's objective. With these tools we will be able to show that Dropout is equivalent to a nuclear-norm regularized problem, where the nuclear norm is taken on the product of the weight matrices of the network.
Inspired by the success of Dropout, several variants have recently been proposed in the community. In the second part of the talk, we will analyze some of these variants (DropBlock and DropConnect), and obtain theoretical reasons for their success over vanilla Dropout. Finally, we will end with a unified theory for analyzing Dropout variants, and discuss some of its implications.
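To make the abstract's claim that "training with Dropout is equivalent to adding a regularizer" concrete, here is a minimal numerical sketch. It is not taken from the talk. It assumes a linear network with one hidden layer, squared loss on a single sample, independent Bernoulli(theta) masks on the hidden units, and the common 1/theta rescaling of kept units. Under those assumptions the expected Dropout loss equals the plain loss plus a penalty weighted by (1 - theta)/theta. The names `U`, `V`, `theta`, `dropout_expected_loss`, and `regularized_loss` are illustrative, and the further step to the nuclear norm is not shown.

```python
import itertools
import numpy as np


def dropout_expected_loss(U, V, x, y, theta):
    """Exact expected squared loss, averaging over all 2^d hidden-unit masks.

    Assumes each hidden unit is kept independently with probability theta
    and kept units are rescaled by 1/theta.
    """
    d = U.shape[1]
    h = V.T @ x  # hidden activations, shape (d,)
    total = 0.0
    for mask in itertools.product([0, 1], repeat=d):
        r = np.array(mask, dtype=float)
        kept = r.sum()
        prob = theta ** kept * (1 - theta) ** (d - kept)  # probability of this mask
        pred = U @ (r * h) / theta
        total += prob * np.sum((y - pred) ** 2)
    return total


def regularized_loss(U, V, x, y, theta):
    """Plain squared loss plus the penalty induced by Dropout (same assumptions)."""
    h = V.T @ x
    fit = np.sum((y - U @ h) ** 2)
    # Per hidden unit i: squared norm of column u_i times squared activation (v_i^T x)^2.
    penalty = (1 - theta) / theta * np.sum(np.sum(U ** 2, axis=0) * h ** 2)
    return fit + penalty


rng = np.random.default_rng(0)
U = rng.normal(size=(3, 4))  # output weights: 3 outputs, 4 hidden units
V = rng.normal(size=(5, 4))  # input weights: 5 inputs, 4 hidden units
x = rng.normal(size=5)
y = rng.normal(size=3)
theta = 0.7

lhs = dropout_expected_loss(U, V, x, y, theta)
rhs = regularized_loss(U, V, x, y, theta)
print(np.isclose(lhs, rhs))
```

The enumeration over masks is exact rather than sampled, so the two quantities agree up to floating-point error. It is only practical for a handful of hidden units, since the number of masks grows as 2^d.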
Bio: Ambar is a PhD student in the Computer Science Department at the Johns Hopkins University. He is advised by René Vidal, and is affiliated with the Mathematical Institute for Data Science and the Vision Lab at JHU. Previously he obtained his Bachelor’s degree in Computer Science from IIIT Delhi. His current research interest lies in the theory of deep learning, specifically in understanding theoretically how common deep learning techniques affect the optimization of deep architectures. He is currently working on understanding the regularization properties induced by common tricks used in training DNNs. He has a secondary interest in understanding adversarial examples generated for computer vision systems. He is a MINDS Data Science Fellow and his research has been supported by the IARPA DIVA and DARPA GARD grants.