Abstract:
The replica exchange method (RE), also known as parallel tempering, is an important technique for accelerating the convergence of conventional Markov chain Monte Carlo (MCMC) algorithms. However, RE requires evaluating the energy function on the full dataset and therefore does not scale to big data. A na\"ive mini-batch implementation of RE introduces large biases, so RE cannot be directly extended to stochastic gradient MCMC (SG-MCMC), the standard sampling method for deep neural networks (DNNs). In this paper, we propose an adaptive replica exchange SG-MCMC (reSG-MCMC) that automatically corrects the bias, and we study its theoretical properties. The analysis implies an acceleration-accuracy trade-off in the numerical discretization of a Markov jump process in a stochastic environment. Empirically, we validate the algorithm through extensive experiments on various setups and obtain state-of-the-art results on CIFAR10, CIFAR100, and SVHN in both supervised and semi-supervised learning tasks.