Nondeterminism and Instability in Neural Network Optimization

18/07/2021

Cecilia Summers, Michael J Dinneen

Keywords: Deep Learning, Optimization for Deep Networks

Abstract: Nondeterminism in neural network optimization produces uncertainty in performance, making small improvements difficult to discern from run-to-run variability. While uncertainty can be reduced by training multiple model copies, doing so is time-consuming, costly, and harms reproducibility. In this work, we establish an experimental protocol for understanding the effect of optimization nondeterminism on model diversity, allowing us to isolate the effects of a variety of sources of nondeterminism. Surprisingly, we find that all sources of nondeterminism have similar effects on measures of model diversity. To explain this intriguing fact, we identify the instability of model training, taken as an end-to-end procedure, as the key determinant. We show that even one-bit changes in initial parameters result in models converging to vastly different values. Last, we propose two approaches for reducing the effects of instability on run-to-run variability.
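To give a sense of scale for the "one-bit changes in initial parameters" mentioned in the abstract, here is a minimal sketch of what flipping a single bit of a 32-bit floating-point value looks like. It is not the authors' experimental protocol: the helper names `to_float32` and `flip_lowest_bit`, the choice of the least-significant mantissa bit, and the example value 0.1 are all assumptions made for illustration.

```python
import struct


def to_float32(value: float) -> float:
    """Round a Python float to the nearest float32 value (hypothetical helper)."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def flip_lowest_bit(value: float) -> float:
    """Return `value` as float32 with its least-significant mantissa bit flipped.

    Hypothetical helper for illustration; which bit is perturbed is an assumption.
    """
    # Reinterpret the float32 bytes as an unsigned 32-bit integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    # XOR with 1 toggles the lowest bit, then reinterpret as float32 again.
    return struct.unpack("<f", struct.pack("<I", bits ^ 1))[0]


# Example: perturb one illustrative "initial parameter".
w = to_float32(0.1)
w_flipped = flip_lowest_bit(w)
relative_change = abs(w_flipped - w) / abs(w)

print(f"original: {w!r}")
print(f"flipped:  {w_flipped!r}")
print(f"relative change: {relative_change:.3e}")
```

The relative change is on the order of float32 machine precision, which is what makes the abstract's claim notable: a perturbation this small is reported to lead to models that converge to vastly different values.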

The talk and the accompanying paper were published at the ICML 2021 virtual conference.

Similar Papers

06/12/2021

Adversarial Robustness with Semi-Infinite Constrained Learning

Alexander Robey, Luiz Chamon, George J. Pappas, Hamed Hassani, Alejandro Ribeiro

Keywords: theory, deep learning, optimization, robustness, adversarial robustness and security

14:48

03/05/2021

Influence Functions in Deep Learning Are Fragile

Samyadeep Basu, Phil Pope, Soheil Feizi

Keywords: Influence Functions, Interpretability

6:15

06/12/2021

Bayesian Adaptation for Covariate Shift

Aurick Zhou, Sergey Levine

Keywords: deep learning, machine learning, robustness, vision, domain adaptation

8:21

18/07/2021

A Distribution-dependent Analysis of Meta Learning

Mikhail Konobeev, Ilja Kuzborskij, Csaba Szepesvari

Keywords: Theory, Statistical Learning Theory

5:06

06/12/2020

Learning from Failure: De-biasing Classifier from Biased Classifier

Junhyun Nam, Hyuntak Cha, Sungsoo Ahn, Jaeho Lee, Jinwoo Shin

3:21

18/07/2021

Examining and Combating Spurious Features under Distribution Shift

Chunting Zhou, Xuezhe Ma, Paul Michel, Graham Neubig

Keywords: Deep Learning, Embedding and Representation learning

5:53

06/12/2021

Adaptive Sampling for Minimax Fair Classification

Shubhanshu Shekhar, Greg Fields, Mohammad Ghavamzadeh, Tara Javidi

Keywords: deep learning, machine learning, fairness

15:19

14/06/2020

Learning to Forget for Meta-Learning

Sungyong Baik, Seokil Hong, Kyoung Mu Lee

Keywords: meta learning, few-shot learning, reinforcement learning

1:01

06/12/2021

Reliable Estimation of KL Divergence using a Discriminator in Reproducing Kernel Hilbert Space

Sandesh Ghimire, Aria Masoomi, Jennifer Dy

Keywords: theory, deep learning, machine learning, kernel methods

14:58

06/12/2021

Integrated Latent Heterogeneity and Invariance Learning in Kernel Space

Jiashuo Liu, Zheyuan Hu, Peng Cui, Bo Li, Zheyan Shen

Keywords: deep learning, reinforcement learning and planning, machine learning

11:11

06/12/2020

Posterior Re-calibration for Imbalanced Datasets

Junjiao Tian, Yen-Cheng Liu, Nathaniel Glaser, Yen-Chang Hsu, Zsolt Kira

Keywords: Algorithms -> Few-Shot Learning, Applications -> Computer Vision

3:23

02/02/2021

Amata: An Annealing Mechanism for Adversarial Training Acceleration

Nanyang Ye, Qianxiao Li, Xiao-Yun Zhou, Zhanxing Zhu

14:30

26/04/2020

Why Not to Use Zero Imputation? Correcting Sparsity Bias in Training Neural Networks

Joonyoung Yi, Juhyuk Lee, Kwang Joon Kim, Sung Ju Hwang, Eunho Yang

Keywords: Missing Data, Collaborative Filtering, Health Care, Tabular Data, High Dimensional Data, Deep Learning, Neural Networks

5:00

06/12/2021

Joint Inference for Neural Network Depth and Dropout Regularization

Kishan K C, Rui Li, MohammadMahdi Gilany

Keywords: deep learning, generative model, continual learning

11:01

06/12/2021

Leveraging Recursive Gumbel-Max Trick for Approximate Inference in Combinatorial Spaces

Kirill Struminsky, Artyom Gadetsky, Denis Rakitin, Danil Karpushkin, Dmitry Vetrov

Keywords: deep learning, optimization

14:26

18/07/2021

Fundamental Tradeoffs in Distributionally Adversarial Training

Mohammad Mehrabi, Adel Javanmard, Ryan A. Rossi, Anup Rao, Tung Mai

Keywords: Theory

5:50

06/12/2021

Model-Based Domain Generalization

Alexander Robey, George J. Pappas, Hamed Hassani

Keywords: theory, deep learning, optimization, robustness, domain adaptation

15:08

02/02/2021

Improving Sample Efficiency in Model-Free Reinforcement Learning from Images

Denis Yarats, Amy Zhang, Ilya Kostrikov, Brandon Amos, Joelle Pineau, Rob Fergus

12:19

14/06/2020

Conditional Channel Gated Networks for Task-Aware Continual Learning

Davide Abati, Jakub Tomczak, Tijmen Blankevoort, Simone Calderara, Rita Cucchiara, Babak Ehteshami Bejnordi

Keywords: continual learning, channel gating, conditional computation, incremental learning, lifelong learning, hard attention

5:01

03/05/2021

Modeling the Second Player in Distributionally Robust Optimization

Paul Michel, Tatsunori Hashimoto, Graham Neubig

Keywords: adversarial learning, deep learning, robustness, distributionally robust optimization

5:09

18/07/2021

Whitening and Second Order Optimization Both Make Information in the Dataset Unusable During Training, and Can Reduce or Prevent Generalization

Neha Wadia, Daniel Duckworth, Samuel Schoenholz, Ethan Dyer, Jascha Sohl-Dickstein

Keywords: Optimization, Probabilistic Methods, Topic Models, Probabilistic Methods, Latent Variable Models

5:17

12/07/2020

Invariant Risk Minimization Games

Kartik Ahuja, Karthikeyan Shanmugam, Kush Varshney, Amit Dhurandhar

Keywords: Causality

14:57

12/07/2020

Generalization Error of Generalized Linear Models in High Dimensions

Melikasadat Emami, Mojtaba Sahraee-Ardakan, Parthe Pandit, Sundeep Rangan, Alyson Fletcher

Keywords: Supervised Learning

15:08

03/05/2021

Overparameterisation and worst-case generalisation: friend or foe?

Aditya Krishna Menon, Ankit Singh Rawat, Sanjiv Kumar

Keywords: worst-case generalisation, overparameterisation

5:01

18/07/2021

Towards Better Robust Generalization with Shift Consistency Regularization

Shufei Zhang, Zhuang Qian, Kaizhu Huang, Qiufeng Wang, Rui Zhang, Xinping Yi

Keywords: Algorithms, Adversarial Examples

5:44

18/07/2021

RATT: Leveraging Unlabeled Data to Guarantee Generalization

Saurabh Garg, Sivaraman Balakrishnan, Zico Kolter, Zachary Lipton

Keywords: Probabilistic Methods, Graphical Models, Theory, Computational Complexity, Theory, Models of Learning and Generalization

17:27

06/12/2021

Overparameterization Improves Robustness to Covariate Shift in High Dimensions

Nilesh Tripuraneni, Ben Adlam, Jeffrey Pennington

Keywords: theory, deep learning, machine learning, robustness

15:11

18/07/2021

DORO: Distributional and Outlier Robust Optimization

Runtian Zhai, Chen Dan, Zico Kolter, Pradeep Ravikumar

Keywords: Probabilistic Methods, Robust statistics

5:06

13/04/2021

On the generalization properties of adversarial training

Yue Xing, Qifan Song, Guang Cheng

3:05

09/07/2020

Precise Tradeoffs in Adversarial Training for Linear Regression

Adel Javanmard, Mahdi Soltanolkotabi, Hamed Hassani

Keywords: Adversarial learning and robustness, High-dimensional statistics, Regression

15:49

06/12/2020

The Generalization-Stability Tradeoff In Neural Network Pruning

Brian Bartoldson, Ari Morcos, Adrian Barbu, Gordon Erlebacher

3:12

20/07/2020

A type of generalization error induced by initialization in deep neural networks

Yaoyu Zhang, Zhi-Qin John Xu, Tao Luo, Zheng Ma

17:33

18/07/2021

Guarantees for Tuning the Step Size using a Learning-to-Learn Approach

Xiang Wang, Shuai Yuan, Chenwei Wu, Rong Ge

Keywords: Theory, Computational Learning Theory

5:20

03/05/2021

Understanding the failure modes of out-of-distribution generalization

Vaishnavh Nagarajan, Anders J Andreassen, Behnam Neyshabur

Keywords: theoretical study, spurious correlations, out-of-distribution generalization, empirical risk minimization

5:12

18/07/2021

Enhancing Robustness of Neural Networks through Fourier Stabilization

Netanel Raviv, Aidan Kelley, Minzhe Guo, Yevgeniy Vorobeychik

Keywords: Probabilistic Methods, Variational Inference, Algorithms, Boosting and Ensemble Methods; Probabilistic Methods; Probabilistic Methods, Bayesian Theory, Social Aspects of Machine Learning, Privacy, Anonymity, and Security

4:57

18/07/2021

PACOH: Bayes-Optimal Meta-Learning with PAC-Guarantees

Jonas Rothfuss, Vincent Fortuin, Martin Josifoski, Andreas Krause

Keywords: Algorithms, Multitask, Transfer, and Meta Learning

5:46

12/07/2020

Double Trouble in Double Descent: Bias and Variance(s) in the Lazy Regime

Stéphane d'Ascoli, Maria Refinetti, Giulio Biroli, Florent Krzakala

Keywords: Deep Learning - Theory

15:11

03/05/2021

Linear Mode Connectivity in Multitask and Continual Learning

Seyed Iman Mirzadeh, Mehrdad Farajtabar, Dilan Gorur, Razvan Pascanu, Hassan Ghasemzadeh

Keywords: multitask learning, mode connectivity, continual learning, catastrophic forgetting

5:31

18/07/2021

On Monotonic Linear Interpolation of Neural Network Parameters

James Lucas, Juhan Bae, Michael Zhang, Stanislav Fort, Richard Zemel, Roger Grosse

Keywords: Deep Learning, Others

5:03

02/02/2021

Tackling Instance-Dependent Label Noise via a Universal Probabilistic Model

Qizhou Wang, Bo Han, Tongliang Liu, Gang Niu, Jian Yang, Chen Gong

14:56