Abstract
Compared to minimization problems, the min-max landscape in machine learning
applications is considerably more convoluted because of the existence of cycles
and similar phenomena. Such oscillatory behaviors are well-understood in the
convex-concave regime, and many algorithms are known to overcome them. In this
paper, we go beyond the convex-concave setting and characterize the
convergence properties of a wide class of zeroth-, first-, and (scalable)
second-order methods in non-convex/non-concave problems. In particular, we show
that these state-of-the-art min-max optimization algorithms may converge with
arbitrarily high probability to attractors that are in no way min-max optimal
or even stationary. Spurious convergence phenomena of this type can arise even
in two-dimensional problems, a fact which corroborates the empirical evidence
surrounding the formidable difficulty of training GANs.
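As a minimal illustration of the oscillatory behavior the abstract refers to (a toy sketch, not taken from the paper), consider the bilinear convex-concave problem min_x max_y f(x, y) = x*y, whose unique min-max point is (0, 0): plain gradient descent-ascent spirals away from it, while the extragradient method, one of the algorithms known to overcome such cycles, converges.

```python
import math

# Toy bilinear problem f(x, y) = x * y (illustrative only, not from the paper).
# GDA's distance to the min-max point (0, 0) grows by sqrt(1 + eta^2) per step;
# extragradient shrinks it and converges.

def grad(x, y):
    return y, x  # df/dx = y, df/dy = x

eta, steps = 0.1, 1000

# Plain gradient descent-ascent: x descends, y ascends, iterates spiral outward.
x, y = 1.0, 1.0
for _ in range(steps):
    gx, gy = grad(x, y)
    x, y = x - eta * gx, y + eta * gy
print(f"GDA distance to (0,0):           {math.hypot(x, y):.3f}")  # ~205, diverging

# Extragradient: take a look-ahead step, then update with the look-ahead gradient.
x, y = 1.0, 1.0
for _ in range(steps):
    gx, gy = grad(x, y)
    xh, yh = x - eta * gx, y + eta * gy   # extrapolation (look-ahead) point
    gx, gy = grad(xh, yh)
    x, y = x - eta * gx, y + eta * gy     # update using the look-ahead gradient
print(f"Extragradient distance to (0,0): {math.hypot(x, y):.3f}")  # ~0.01, converging
```

This contrast is exactly the convex-concave case where cycling is well understood and curable; the paper's point is that beyond this regime, even such corrected methods can converge to spurious attractors.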