Maryam Fazel: Policy Gradient Descent for Control: Global Optimality via Convex Parameterization
Abstract: Motivated by the widespread use of "policy gradient" and related methods in reinforcement learning, there has been recent interest in revisiting these methods in the context of classical control problems. In this talk, we start by examining the convergence and optimality of these methods for the infinite-horizon Linear Quadratic Regulator (LQR), where we show that despite nonconvexity (with respect to the policy parameters), gradient descent converges to the optimal policy under mild assumptions. Next, we make a connection between classical convex parameterizations in control theory on one hand, and the gradient dominance property of the nonconvex cost function on the other. This connection between the nonconvex and convex landscapes gives a unified way to prove similar results for a whole range of control design problems, as long as they admit a convex parameterization.
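To make the setting concrete, below is a minimal numerical sketch (not the speaker's code) of exact policy gradient descent on a small discrete-time LQR instance, in the spirit of the results the abstract describes. The cost and gradient expressions follow the standard LQR policy-gradient formulas (as in Fazel et al., 2018); the system matrices A, B, Q, R, Sigma0 and the backtracking step-size rule are illustrative assumptions, and the run is checked against the Riccati solution.

import numpy as np
from scipy.linalg import solve_discrete_lyapunov, solve_discrete_are

# Illustrative dynamics x_{t+1} = A x_t + B u_t with policy u_t = -K x_t and
# cost C(K) = E_{x0 ~ N(0, Sigma0)} sum_t (x_t' Q x_t + u_t' R u_t).
A = np.array([[0.9, 0.5],
              [0.0, 0.8]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)
R = np.eye(1)
Sigma0 = np.eye(2)

def lqr_cost(K):
    """C(K) = trace(P_K Sigma0); infinite if A - BK is not Schur stable."""
    Acl = A - B @ K
    if np.max(np.abs(np.linalg.eigvals(Acl))) >= 1.0:
        return np.inf
    P = solve_discrete_lyapunov(Acl.T, Q + K.T @ R @ K)
    return np.trace(P @ Sigma0)

def lqr_grad(K):
    """Exact policy gradient 2[(R + B'P_K B)K - B'P_K A] Sigma_K."""
    Acl = A - B @ K
    P = solve_discrete_lyapunov(Acl.T, Q + K.T @ R @ K)          # value matrix P_K
    Sigma = solve_discrete_lyapunov(Acl, Sigma0)                  # state correlation Sigma_K
    return 2.0 * ((R + B.T @ P @ B) @ K - B.T @ P @ A) @ Sigma

# Gradient descent with a simple backtracking (Armijo) step size,
# starting from the stabilizing gain K = 0 (A itself is Schur stable here).
K = np.zeros((1, 2))
for _ in range(300):
    c, g = lqr_cost(K), lqr_grad(K)
    eta = 1.0
    while eta > 1e-12 and lqr_cost(K - eta * g) > c - 1e-4 * eta * np.sum(g * g):
        eta *= 0.5
    K = K - eta * g

# Sanity check against the Riccati (DARE) solution.
P_star = solve_discrete_are(A, B, Q, R)
K_star = np.linalg.solve(R + B.T @ P_star @ B, B.T @ P_star @ A)
print("cost gap :", lqr_cost(K) - np.trace(P_star @ Sigma0))
print("gain gap :", np.linalg.norm(K - K_star))

Despite the cost being nonconvex in K, the gradient dominance property discussed in the talk is what guarantees that plain gradient descent of this kind converges to the globally optimal gain rather than to a spurious stationary point.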
Bio: Maryam Fazel is the Moorthy Family Professor of Electrical and Computer Engineering at the University of Washington, with adjunct appointments in Computer Science and Engineering, Mathematics, and Statistics. Maryam received her MS and PhD from Stanford University and her BS from Sharif University of Technology in Iran, and was a postdoctoral scholar at Caltech before joining UW. She is a recipient of the NSF CAREER Award, the UWEE Outstanding Teaching Award, and a Best Student Paper Award at the UAI conference (with her student). She directs the Institute for Foundations of Data Science (IFDS), a multi-site, collaborative NSF TRIPODS Institute. Her current research interests are in the area of optimization in machine learning and control.