12/07/2020

On the Iteration Complexity of Hypergradient Computations

Riccardo Grazzi, Saverio Salzo, Massimiliano Pontil, Luca Franceschi

Keywords: Transfer, Multitask and Meta-learning

Abstract: We study a general class of bilevel optimization problems, in which the upper-level objective is defined via the solution of a fixed point equation. Important instances arising in machine learning include hyper-parameter optimization, meta-learning, and graph and recurrent neural networks. Typically, the gradient of the upper-level objective (the hypergradient) is not known explicitly or is hard to compute exactly, which has raised interest in approximation methods. We investigate two popular approaches to compute the hypergradient, based on reverse-mode iterative differentiation and approximate implicit differentiation. We present a unified analysis which, for the first time, allows us to quantitatively compare these methods, providing explicit bounds on their iteration complexity. This analysis suggests a hierarchy in terms of computational efficiency among the above methods, with approximate implicit differentiation based on conjugate gradient performing best. We present an extensive experimental comparison among the methods, which confirms the theoretical findings.
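To make the approximate implicit differentiation (AID) approach concrete, the following is a minimal sketch (not the authors' code) of a hypergradient computation with a conjugate-gradient linear solve, assuming the lower-level solution is the fixed point w = Phi(w, lam) and the upper-level objective is E(w, lam). The names Phi, E and hypergradient_aid_cg, as well as the iteration counts, are illustrative choices, not part of the paper.

    # Sketch of AID with conjugate gradient (CG), under the assumptions above.
    import jax
    from jax.scipy.sparse.linalg import cg

    def hypergradient_aid_cg(Phi, E, lam, w0, K=100, cg_iters=20):
        # 1) Approximate the lower-level fixed point by K iterations of Phi.
        w = w0
        for _ in range(K):
            w = Phi(w, lam)

        # 2) Gradients of the upper-level objective at the approximate solution.
        grad_w_E, grad_lam_E = jax.grad(E, argnums=(0, 1))(w, lam)

        # 3) Solve (I - d_w Phi)^T v = grad_w_E with CG.  vjp gives products with
        #    the transposed Jacobian of Phi w.r.t. w.  CG assumes this operator is
        #    symmetric positive definite, which holds e.g. when Phi is a gradient
        #    step on a strongly convex lower-level objective.
        _, vjp_w = jax.vjp(lambda w_: Phi(w_, lam), w)
        linear_op = lambda v: v - vjp_w(v)[0]
        v, _ = cg(linear_op, grad_w_E, maxiter=cg_iters)

        # 4) Approximate hypergradient: grad_lam_E + (d_lam Phi)^T v.
        #    (lam is assumed to be a single array here, not a pytree.)
        _, vjp_lam = jax.vjp(lambda lam_: Phi(w, lam_), lam)
        return grad_lam_E + vjp_lam(v)[0]

In the hyper-parameter optimization instance mentioned in the abstract, Phi would typically be one gradient-descent step on the training loss and E the validation loss, so the sketch returns an approximation of the gradient of the validation loss with respect to the hyper-parameters.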

The talk and the corresponding paper were published at the ICML 2020 virtual conference.

