06/12/2021

Implicit Sparse Regularization: The Impact of Depth and Early Stopping

Jiangyuan Li, Thanh Nguyen, Chinmay Hegde, Ka Wai Wong

Keywords: optimization

Abstract: In this paper, we study the implicit bias of gradient descent for sparse regression. We extend results on regression with quadratic parametrization, which amounts to depth-2 diagonal linear networks, to more general depth-$N$ networks, under more realistic settings of noise and correlated designs. We show that early stopping is crucial for gradient descent to converge to a sparse model, a phenomenon that we call \emph{implicit sparse regularization}. This result is in sharp contrast to known results for noiseless and uncorrelated-design cases. We characterize the impact of depth and early stopping and show that for a general depth parameter $N$, gradient descent with early stopping achieves minimax optimal sparse recovery with sufficiently small initialization $w_0$ and step size $\eta$. In particular, we show that increasing depth enlarges the scale of working initialization and the early-stopping window so that this implicit sparse regularization effect is more likely to take place.
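As a rough illustration of the setting described in the abstract, the sketch below runs gradient descent on a depth-$N$ Hadamard-product parametrization $\beta = u^{\odot N} - v^{\odot N}$ for sparse regression, starting from a small initialization $w_0$ with step size $\eta$ and early-stopping via a held-out set. The data, noise level, and validation-based stopping rule are assumptions made for this example, not the paper's exact construction or theoretical stopping time.

```python
import numpy as np

# Minimal sketch (not the authors' code) of the setting in the abstract:
# sparse linear regression solved by gradient descent on a depth-N
# "diagonal linear network" parametrization beta = u**N - v**N, started
# from a small initialization w0 with step size eta and early-stopped on
# a held-out set. The problem sizes, noise level, and stopping rule are
# illustrative assumptions.

rng = np.random.default_rng(0)
n, d, s, sigma = 100, 500, 5, 0.5              # samples, dimension, sparsity, noise std
X = rng.standard_normal((n, d))
beta_star = np.zeros(d)
beta_star[:s] = 1.0                            # s-sparse ground truth
y = X @ beta_star + sigma * rng.standard_normal(n)

X_val = rng.standard_normal((50, d))           # held-out data for early stopping
y_val = X_val @ beta_star + sigma * rng.standard_normal(50)

N, w0, eta, T_max = 3, 1e-2, 0.05, 10_000      # depth, init scale, step size, budget
u = np.full(d, w0)
v = np.full(d, w0)                             # beta = u**N - v**N starts near zero

best_err, best_beta, best_t = np.inf, None, 0
for t in range(T_max):
    beta = u**N - v**N
    g = X.T @ (X @ beta - y) / n               # gradient of the squared loss w.r.t. beta
    u = u - eta * N * u**(N - 1) * g           # chain rule through the depth-N factors
    v = v + eta * N * v**(N - 1) * g
    val_err = np.mean((X_val @ (u**N - v**N) - y_val) ** 2)
    if val_err < best_err:                     # early stopping: keep the best iterate
        best_err, best_beta, best_t = val_err, u**N - v**N, t

print(f"early-stopped at t={best_t}; largest coordinates:",
      np.sort(np.argsort(-np.abs(best_beta))[:s]))
```

Tracking the best iterate on held-out data stands in for the early-stopping window the abstract refers to: run too long and the off-support coordinates eventually grow to fit the noise, while stopping within the window keeps the estimate sparse.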

Talk and paper published at the NeurIPS 2021 virtual conference.
