14/09/2020

Adaptive Momentum Coefficient for Neural Network Optimization

Zana Rashidi, Kasra Ahmadi K. A., Aijun An, Xiaogang Wang

Keywords: adaptive momentum, neural networks, optimization, accelerated gradient descent, convex optimization

Abstract: We propose a novel and efficient momentum-based first-order algorithm for optimizing neural networks that uses an adaptive coefficient for the momentum term. Our algorithm, called Adaptive Momentum Coefficient (AMoC), uses the inner product of the gradient and the previous parameter update to control the weight placed on the momentum term based on changes of direction along the optimization path. The algorithm is easy to implement, and its computational overhead over standard momentum methods is negligible. Extensive empirical results on both convex and neural network objectives show that AMoC performs well in practice and compares favourably with other first- and second-order optimization algorithms. We also provide a convergence analysis and a convergence rate for AMoC, showing theoretical guarantees similar to those of other efficient first-order methods.
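
To make the idea concrete, here is a minimal NumPy sketch of an AMoC-style update. The abstract only states that the momentum coefficient is adapted via the inner product of the current gradient and the previous parameter update; the specific scaling rule, the sign convention, and the names amoc_step and lam below are illustrative assumptions, not the authors' published formula.

```python
import numpy as np

def amoc_step(params, grad, prev_update, lr=0.01, beta=0.9, lam=0.1):
    """One hypothetical AMoC-style heavy-ball update (illustrative only)."""
    # Alignment between the new gradient and the previous step.
    # alignment > 0: the gradient points back along the previous update,
    # suggesting an overshoot / change of direction -> damp the momentum.
    # alignment < 0: the previous step is still a descent direction
    # -> keep (or mildly amplify) the momentum.
    alignment = np.dot(grad.ravel(), prev_update.ravel())
    # Normalize so the adjustment is scale-free; epsilon avoids 0/0
    # on the very first step, when prev_update is all zeros.
    norm = np.linalg.norm(grad) * np.linalg.norm(prev_update) + 1e-12
    adaptive_beta = np.clip(beta * (1.0 - lam * alignment / norm), 0.0, 0.999)

    update = adaptive_beta * prev_update - lr * grad
    return params + update, update

# Usage on a toy quadratic f(x) = 0.5 * ||x||^2, whose gradient is x.
x = np.array([1.0, -2.0])
v = np.zeros_like(x)
for _ in range(100):
    x, v = amoc_step(x, grad=x, prev_update=v)
```

Relative to classical momentum, the only extra work per step is one inner product and two norms, which matches the abstract's claim that the overhead over momentum methods is negligible.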

Talk and paper published at the ECML PKDD 2020 virtual conference.
