When: September 14, 2016, 12:30 PM
Location: 3rd Floor Orchard View Room , Discovery Building
Contact: 608-316-4401, firstname.lastname@example.org
Faster Projection-free Algorithms for Optimization and Learning
Projected gradient descent (PGD), and its close variants, are often considered the method of choice for solving a very large variety of machine learning optimization problems, including sparse recovery problems, empirical risk minimization, stochastic optimization, and online convex optimization. This is not surprising, since PGD is often optimal in a very appealing information-theoretic sense. However, for many problems PGD is infeasible both in theory and practice since each step requires to compute an orthogonal projection onto the feasible set. In many important cases, such as when the feasible set is a non-trivial polytope, or a convex surrogate for a low-rank structure, computing the projection is computationally inefficient in high-dimensional settings. An alternative is the conditional gradient method (aka Frank-Wolfe algorithm) that replaces the expensive projection step with a linear optimization step over the feasible set. Indeed in many problems of interest, the linear optimization step admits much more efficient algorithms than the projection step, which is the reason to the substantial regained interest in this method in the past decade. On the downside, the convergence rates of the CG method often falls behind that of PGD and its variants. In this talk I will survey an ongoing effort to design CG variants that on one hand enjoy the cheap iteration complexity of the original method, and on the other hand converge provably faster, and are applicable to a wider variety of machine learning settings. In particular I will focus on the cases in which the feasible set is either a polytope or a surrogate for low-rank matrices. Results will be demonstrated on applications including: matrix completion, multi-class classification, video co-localization, and optical character recognition problems.
The weekly SILO seminar series is made possible through the generous support of the 3M Company and its Advanced Technology Group
SILO is a lecture series with speakers from the UW faculty, graduate students or invited researchers that discuss mathematical related topics. The seminars are organized by WID’s Optimization research group.
SILO’s purpose is to provide a forum that helps connect and recruit mathematically-minded graduate students. SILO is a lunch-and-listen format, where speakers present interesting math topics while the audience eats lunch.