Sharper Generalization Bounds for Learning with Gradient-dominated Objective Functions

Abstract: Stochastic optimization has become the workhorse behind many successful machine learning applications, which has motivated extensive theoretical analysis of its empirical behavior. By comparison, far less work studies its generalization behavior, especially in non-convex learning settings. In this paper, we study the generalization behavior of stochastic optimization by leveraging algorithmic stability for learning with $\beta$-gradient-dominated objective functions. We develop generalization bounds of the order $O(1/(n\beta))$ plus the convergence rate of the optimization algorithm, where $n$ is the sample size. Our stability analysis significantly improves the existing non-convex analysis by removing the bounded gradient assumption and implying better generalization bounds. We achieve this improvement by exploiting the smoothness of loss functions instead of the Lipschitz condition used in Charles & Papailiopoulos (2018). We apply our general results to various stochastic optimization algorithms, which shows clearly how variance-reduction techniques improve not only training but also generalization. Furthermore, our discussion explains how interpolation helps generalization for highly expressive models.
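
The abstract does not spell out what "$\beta$-gradient-dominated" means. A minimal sketch follows, assuming the common Polyak-Lojasiewicz form $F(w) - \inf F \le \|\nabla F(w)\|^2 / (2\beta)$; the paper's exact definition and constants may differ. The toy objective $f(w) = w^2 + 3\sin^2(w)$, the constant `BETA = 1/32`, and all function names are illustrative assumptions, not taken from the paper. The script checks the inequality numerically on a grid, confirms that the function is non-convex, and runs plain gradient descent to illustrate the linear convergence that gradient dominance gives for smooth objectives.

```python
import math

BETA = 1.0 / 32.0   # assumed gradient-dominance constant for this toy function
SMOOTH = 8.0        # |f''(w)| = |2 + 6 cos(2w)| <= 8, so f is 8-smooth


def f(w):
    """Toy non-convex objective with minimum value 0 at w = 0."""
    return w * w + 3.0 * math.sin(w) ** 2


def grad(w):
    """Derivative of f: 2w + 6 sin(w) cos(w) = 2w + 3 sin(2w)."""
    return 2.0 * w + 3.0 * math.sin(2.0 * w)


def second_derivative(w):
    """Second derivative of f; negative values show non-convexity."""
    return 2.0 + 6.0 * math.cos(2.0 * w)


def pl_holds(w, beta=BETA):
    """Check f(w) - f* <= |f'(w)|^2 / (2 * beta), with f* = 0."""
    return f(w) <= grad(w) ** 2 / (2.0 * beta) + 1e-12


def gradient_descent(w0, steps):
    """Plain gradient descent with step size 1 / SMOOTH."""
    w = w0
    for _ in range(steps):
        w -= grad(w) / SMOOTH
    return w


# Numerical check on a grid over [-10, 10]; this is evidence, not a proof.
grid = [i / 100.0 for i in range(-1000, 1001)]
pl_ok = all(pl_holds(w) for w in grid)
is_nonconvex = any(second_derivative(w) < 0 for w in grid)

# For an L-smooth, beta-gradient-dominated function, gradient descent with
# step 1/L satisfies f(w_T) - f* <= (1 - beta / L)^T * (f(w_0) - f*).
w0, steps = 3.0, 200
w_final = gradient_descent(w0, steps)
rate_bound = (1.0 - BETA / SMOOTH) ** steps * f(w0)

print("gradient dominance holds on grid:", pl_ok)
print("non-convex:", is_nonconvex)
print("f(w_final) within rate bound:", f(w_final) <= rate_bound)
```

The point of the sketch is only that gradient dominance can hold without convexity, which is the regime the abstract targets. It says nothing about the generalization bound itself, which depends on the stability analysis in the paper.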

06/12/2020

Yunwen Lei, Yiming Ying

Similar Papers

Efficient Learning of Generative Models via Finite-Difference Score Matching

Tianyu Pang, Kun Xu, Chongxuan LI, Yang Song, Stefano Ermon, Jun Zhu

Learning to Guide Random Search

Ozan Sener, Vladlen Koltun

Keywords: Random search, Derivative-free optimization, Learning continuous control

Extrapolation for Large-batch Training in Deep Learning

Tao LIN, Lingjing Kong, Sebastian Stich, Martin Jaggi

Keywords: Deep Learning - Algorithms

Concentration of Non-Isotropic Random Tensors with Applications to Learning and Empirical Risk Minimization

Mathieu Even, Laurent Massoulie

A Value-Function-based Interior-point Method for Non-convex Bi-level Optimization

Risheng Liu, Xuan Liu, Xiaoming Yuan, Shangzhi Zeng, Jin Zhang

Keywords: Optimization, Non-Convex Optimization

One Sample Stochastic Frank-Wolfe

Mingrui Zhang, Zebang Shen, Aryan Mokhtari, Hamed Hassani, Amin Karbasi

Leveraging Non-uniformity in First-order Non-convex Optimization

Jincheng Mei, Yue Gao, Bo Dai, Csaba Szepesvari, Dale Schuurmans

Keywords: Theory, RL, Decisions and Control Theory

A simpler approach to accelerated optimization: iterative averaging meets optimism

Pooria Joulani, Anant Raj, András György, Csaba Szepesvari

Keywords: Online Learning, Active Learning, and Bandits

Communication-Efficient Frank-Wolfe Algorithm for Nonconvex Decentralized Distributed Learning

Wenhan Xian, Feihu Huang, Heng Huang

Geometric Insights into the Convergence of Nonlinear TD Learning

David Brandfonbrener, Joan Bruna

Keywords: TD, nonlinear, convergence, value estimation, reinforcement learning

Ordered SGD: A New Stochastic Optimization Framework for Empirical Risk Minimization

Kenji Kawaguchi, Haihao Lu

Fine-grained Generalization Analysis of Vector-Valued Learning

Liang Wu, Antoine Ledent, Yunwen Lei, Marius Kloft

Closing the Gap: Tighter Analysis of Alternating Stochastic Gradient Methods for Bilevel Problems

Tianyi Chen, Yuejiao Sun, Wotao Yin

Keywords: theory, optimization, reinforcement learning and planning, machine learning

Improving the Sample and Communication Complexity for Decentralized Non-Convex Optimization: Joint Gradient Estimation and Tracking

Haoran Sun, Songtao Lu, Mingyi Hong

Keywords: Optimization - Non-convex

On the Loss Landscape of Adversarial Training: Identifying Challenges and How to Overcome Them

Chen Liu, Mathieu Salzmann, Tao Lin, Ryota Tomioka, Sabine Süsstrunk

Keywords: Algorithms -> Representation Learning, Applications -> Dialog- or Communication-Based Learning

Adaptive Discretization for Model-Based Reinforcement Learning

Sean Sinclair, Tianyu Wang, Gauri Jain, Sid Banerjee, Christina Yu

DP-LSSGD: A Stochastic Optimization Method to Lift the Utility in Privacy-Preserving ERM

Bao Wang, Quanquan Gu, March Boedihardjo, Lingxiao Wang, Farzin Barekat, Stanley J. Osher

Generalization Bound of Gradient Descent for Non-Convex Metric Learning

MINGZHI DONG, Xiaochen Yang, Rui Zhu, Yujiang Wang, Jing-Hao Xue

DAGs with No Fears: A Closer Look at Continuous Optimization for Learning Bayesian Networks

Dennis Wei, Tian Gao, Yue Yu

Smooth Bilevel Programming for Sparse Regularization

Clarice Poon, Gabriel Peyré

Keywords: machine learning

A Novel Sequential Coreset Method for Gradient Descent Algorithms

Jiawei Huang, Ruomin Huang, wenjie liu, Nikolaos Freris, Hu Ding

Keywords: Optimization

An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits

Andrea Tirinzoni, Matteo Pirotta, Marcello Restelli, Alessandro Lazaric
