Dmitrii M. Ostrovskii, “Recent advances in nonconvex-(non)concave min-max optimization”

/ January 25, 2022/

When:
February 9, 2022 @ 12:00 pm – 1:00 pm

In the classical formulation of min-max optimization, which dates back to the seminal works of
Nash and von Neumann from the 1930s–1940s, one focuses on optimization problems of the form

min_{x ∈ X} max_{y ∈ Y} f(x, y),

where X, Y is a pair of convex sets, and the objective function f is smooth (in both variables)
and convex-concave; that is, f(·, y) is convex for any y ∈ Y and f(x, ·) is concave for any x ∈ X.
The convexity-concavity of f guarantees computational tractability, while also being satisfied in
many practical scenarios. However, this assumption is too restrictive for some applications in data
science, in particular those related to Generative Adversarial Networks and adversarially robust
training. Hence, a lot of effort has recently been exerted to go beyond the convex-concave setup.
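As a minimal illustration of the tractable convex-concave case (not taken from the talk), the sketch below runs the classical extragradient method on the bilinear toy objective f(x, y) = x·y, whose unique saddle point is (0, 0); the step size and iteration count are arbitrary choices for this example.

```python
# Extragradient method on the bilinear convex-concave toy problem
#   min_x max_y f(x, y) = x * y,  with saddle point at (0, 0).
# Illustrative sketch only; the talk's algorithms are more sophisticated.
x, y = 1.0, 1.0   # arbitrary starting point
eta = 0.5         # step size (chosen for this example)
for _ in range(200):
    # Extrapolation (look-ahead) step: one gradient descent/ascent step.
    x_half = x - eta * y   # grad_x f(x, y) = y
    y_half = y + eta * x   # grad_y f(x, y) = x
    # Main step: update using the gradients at the look-ahead point.
    x, y = x - eta * y_half, y + eta * x_half
# (x, y) converges to the saddle point (0, 0).
```

The look-ahead step is essential here: plain simultaneous gradient descent-ascent spirals outward on bilinear objectives, while extragradient contracts toward the saddle point.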
In this talk, I will present two results pertaining to min-max optimization beyond the classical
setup. The first result, which will be presented only briefly, deals with the nonconvex-concave case,
where f(x, ·) is still concave, but f(·, y) is not assumed convex; e.g., this encompasses the task
of minimizing the pointwise maximum over a finite family of smooth nonconvex functions. Our
algorithm admits a convergence guarantee that is the best known in the literature and is conjectured
to be optimal. I will then move to the more challenging nonconvex-nonconcave setup. While
the general case of smooth nonconvex-nonconcave min-max optimization is widely believed to be
computationally intractable (this being justified by some partial theoretical evidence), one may get
tractability by imposing on the nested maximization problem some structure other than concavity.
We consider a very natural way of doing so, by assuming that the maximization domain Y is small.
Our methodology then relies upon replacing f(x, ·), at any x ∈ X, with its Taylor approximation
(in y) of order k, solving the resulting surrogate min-max problem, and certifying that the resulting
surrogate solution also “works” for the initial problem if Y has a “small enough” Euclidean diameter.
To guarantee its success, we provide nearly matching upper and lower bounds on the maximal
diameter of Y allowing for such a reduction, for any k ∈ {0, 1, 2, …}. Using Taylor approximations with
k ≤ 2 then leads to efficient algorithms with provable guarantees.
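To make the Taylor-surrogate idea concrete, here is a minimal numerical sketch (not from the talk) on a hypothetical toy objective that is nonconvex in x and nonconcave in y, with a small maximization domain Y = [−r, r]: the order-2 Taylor surrogate of f(x, ·) around y = 0 yields an outer minimizer that nearly coincides with the exact one when r is small.

```python
import numpy as np

# Toy nonconvex-nonconcave objective, chosen only for this illustration:
#   f(x, y) = (x^2 - 1)^2 + 0.1*x + x*y - y^3
def f(x, y):
    return (x**2 - 1.0)**2 + 0.1 * x + x * y - y**3

r = 0.05                           # small diameter of Y = [-r, r]
xs = np.linspace(-2.0, 2.0, 4001)  # grid over X = [-2, 2]
ys = np.linspace(-r, r, 201)       # grid over Y

# Exact inner maximization max_{y in Y} f(x, y), evaluated on the grid.
exact = np.array([f(x, ys).max() for x in xs])

# Order-2 Taylor surrogate of f(x, .) around y = 0:
#   f(x, y) ~ f(x, 0) + d_y f(x, 0) * y + (1/2) d_yy f(x, 0) * y^2.
# For this f: d_y f(x, 0) = x and d_yy f(x, 0) = 0, so the surrogate
# inner problem is max_{y in Y} [ f(x, 0) + x * y ].
surrogate = np.array([(f(x, 0.0) + x * ys).max() for x in xs])

x_exact = xs[exact.argmin()]       # exact outer minimizer (on the grid)
x_surr = xs[surrogate.argmin()]    # surrogate outer minimizer
# For small r the two outer minimizers (nearly) coincide.
```

For this particular f the inner maximum is attained at an endpoint of Y, so the exact and surrogate inner values differ only by the cubic remainder of order r³, which is why the two outer minimizers agree once r is small.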

Zoom link: https://wse.zoom.us/j/95448608570
