An Investigation Into the Stochasticity of Batch Whitening

14/06/2020

An Investigation Into the Stochasticity of Batch Whitening

Lei Huang, Lei Zhao, Yi Zhou, Fan Zhu, Li Liu, Ling Shao

Keywords: batch normalization, whitening, stochasticity analysis, conditioning, optimization, generalization, stochastic noise, deep learning, gans, classification

Abstract Paper Similar Papers

Abstract: Batch Normalization (BN) is extensively employed in various network architectures by performing standardization within mini-batches. A full understanding of the process has been a central target in the deep learning communities. Unlike existing works, which usually only analyze the standardization operation, this paper investigates the more general Batch Whitening (BW). Our work originates from the observation that while various whitening transformations equivalently improve the conditioning, they show significantly different behaviors in discriminative scenarios and training Generative Adversarial Networks (GANs). We attribute this phenomenon to the stochasticity that BW introduces. We quantitatively investigate the stochasticity of different whitening transformations and show that it correlates well with the optimization behaviors during training. We also investigate how stochasticity relates to the estimation of population statistics during inference. Based on our analysis, we provide a framework for designing and comparing BW algorithms in different scenarios. Our proposed BW algorithm improves the residual networks by a significant margin on ImageNet classification. Besides, we show that the stochasticity of BW can improve the GAN's performance with, however, the sacrifice of the training stability.

0

0

0

0

Share

This is an embedded video. Talk and the respective paper are published at CVPR 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment

no comments yet

Similar Papers

03/05/2021

Exploring Balanced Feature Spaces for Representation Learning

Bingyi Kang, Yu Li, Sain Xie and
Zehuan Yuan, Jiashi Feng

Keywords Paper

Representation Learning, Contrastive Learning, Long-Tailed Recognition

0

0

0

0

7:18

03/05/2021

Neurally Augmented ALISTA

Freya Behrens, Jonathan Sauder, Peter Jung

Keywords Paper

learned ISTA, unrolled algorithms, compressed sensing, sparse reconstruction

0

0

0

0

5:18

18/07/2021

Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Training and Effective Adaptation

Haoxiang Wang, Han Zhao, Bo Li

Keywords Paper

Algorithms, Multitask, Transfer, and Meta Learning

0

0

0

0

5:01

18/07/2021

Can Subnetwork Structure Be the Key to Out-of-Distribution Generalization?

Dinghuai Zhang, Kartik Ahuja, Yilun Xu and
Yisen Wang, Aaron Courville

Keywords Paper

Deep Learning, Algorithms, Theory; Theory, Regularization

0

0

0

0

20:10

06/12/2021

Time-series Generation by Contrastive Imitation

Daniel Jarrett, Ioana Bica, Mihaela van der Schaar

Keywords Paper

generative model

0

0

0

0

8:47

22/11/2021

SamplingAug: On the Importance of Patch Sampling Augmentation for Single Image Super-Resolution

Shizun Wang, Ming Lu, Kaixin Chen and
Jiaming Liu, Xiaoqi Li, Chuang Zhang, Ming Wu

Keywords Paper

Super-Resolution, Patch Sampling

0

0

0

0

2:18

19/04/2021

Keep learning: Self-supervised meta-learning for learning from inference

Akhil Kedia, Sai Chetan Chinthakindi

Keywords Paper

0

0

0

0

11:27

06/12/2021

CAM-GAN: Continual Adaptation Modules for Generative Adversarial Networks

Sakshi Varshney, Vinay Kumar Verma, P. K. Srijith and
Lawrence Carin, Piyush Rai

Keywords Paper

generative model, representation learning, continual learning

0

0

0

0

14:50

02/02/2021

Delving into Variance Transmission and Normalization: Shift of Average Gradient Makes the Network Collapse

Yuxiang Liu, Jidong Ge, Chuanyi Li, Jie Gui

Keywords Paper

0

0

0

0

14:49

14/06/2020

GreedyNAS: Towards Fast One-Shot NAS With Greedy Supernet

Shan You, Tao Huang, Mingmin Yang and
Fei Wang, Chen Qian, Changshui Zhang

Keywords Paper

neural architecture search, supernet, one-shot nas, single path, greedy algorithm, exploration and exploitation, searching efficiency

0

0

0

0

1:01

03/05/2021

Initialization and Regularization of Factorized Neural Layers

Misha Khodak, Neil Tenenholtz, Lester Mackey, Nicolo Fusi

Keywords Paper

matrix factorization, knowledge distillation, multi-head attention, model compression

0

0

0

0

4:25

18/07/2021

A Unified Generative Adversarial Network Training via Self-Labeling and Self-Attention

Tomoki Watanabe, Paolo Favaro

Keywords Paper

Deep Learning, Generative Models, Applications, Matrix and Tensor Factorization, Algorithms, Collaborative Filtering; Algorithms, Large Scale Learning; Applications, Denoising

0

0

0

0

5:12

12/07/2020

A Generic First-Order Algorithmic Framework for Bi-Level Programming Beyond Lower-Level Singleton

Risheng Liu, Pan Mu, Xiaoming Yuan and
Shangzhi Zeng, Jin Zhang

Keywords Paper

Optimization - Non-convex

0

0

0

0

13:51

03/05/2021

QPLEX: Duplex Dueling Multi-Agent Q-Learning

Jianhao Wang, Zhizhou Ren, Terry Liu and
Yang Yu, Chongjie Zhang

Keywords Paper

Dueling structure, Value factorization, Multi-agent reinforcement learning

0

0

0

0

4:52

13/04/2021

Fractional moment-preserving initialization schemes for training deep neural networks

Mert Gurbuzbalaban, Yuanhan Hu

Keywords Paper

0

0

0

0

3:05

12/07/2020

Minimax Weight and Q-Function Learning for Off-Policy Evaluation

Masatoshi Uehara, Jiawei Huang, Nan Jiang

Keywords Paper

Reinforcement Learning - Theory

0

0

0

0

14:20

06/12/2021

Adversarial Reweighting for Partial Domain Adaptation

Xiang Gu, Xi Yu, yan yang and
Jian Sun, Zongben Xu

Keywords Paper

domain adaptation

0

0

0

1

9:03

06/12/2020

Delta-STN: Efficient Bilevel Optimization for Neural Networks using Structured Response Jacobians

Juhan Bae, Roger Grosse

Keywords Paper

0

0

0

0

3:20

02/02/2021

Harmonized Dense Knowledge Distillation Training for Multi-Exit Architectures

Xinglu Wang, Yingming Li

Keywords Paper

0

0

0

0

15:12

18/07/2021

Order-Agnostic Cross Entropy for Non-Autoregressive Machine Translation

Cunxiao Du, Zhaopeng Tu, Jing Jiang

Keywords Paper

Applications, Natural Language Processing

0

0

0

0

17:21

06/12/2021

Tailoring: encoding inductive biases by optimizing unsupervised objectives at prediction time

Ferran Alet, Maria Bauza, Kenji Kawaguchi and
Nurullah Giray Kuru, Tomás Lozano-Pérez, Leslie Kaelbling

Keywords Paper

deep learning, optimization, machine learning, self-supervised learning, meta learning

0

0

0

0

15:05

04/07/2020

Estimating the influence of auxiliary tasks for multi-task learning of sequence tagging tasks

Fynn Schröder, Chris Biemann

Keywords Paper

multi-task tasks, MTL, TL, MTL setups

0

0

0

0

12:02

22/11/2021

Adaptive End-to-End Budgeted Network Learning via Inverse Scale Space

Zuyuan Zhong, Chen Liu, Yanwei Fu

Keywords Paper

deep learning, network architecture, growing network, budgeted network learning, pruning

0

0

0

0

2:58

06/12/2021

Beyond BatchNorm: Towards a Unified Understanding of Normalization in Deep Learning

Ekdeep S Lubana, Robert Dick, Hidenori Tanaka

Keywords Paper

deep learning

0

0

0

0

8:28

06/12/2021

Sliced Mutual Information: A Scalable Measure of Statistical Dependence

Ziv Goldfeld, Kristjan Greenewald

Keywords Paper

theory, machine learning

0

0

0

0

13:59

14/06/2020

Exemplar Normalization for Learning Deep Representation

Ruimao Zhang, Zhanglin Peng, Lingyun Wu and
Zhen Li, Ping Luo

Keywords Paper

normalization, learning to normalize, sample-adaptive, deep learning, image classification, semantic segmentation

0

0

0

0

1:00

06/12/2021

Generalized Proximal Policy Optimization with Sample Reuse

James Queeney, Yannis Paschalidis, Christos G Cassandras

Keywords Paper

optimization, reinforcement learning and planning

0

0

0

0

13:45

02/02/2021

Infinite Gaussian Mixture Modeling with an Improved Estimation of the Number of Clusters

Avi Matza, Yuval Bistritz

Keywords Paper

0

0

0

0

20:14

18/07/2021

Unsupervised Representation Learning via Neural Activation Coding

Yookoon Park, Sangho Lee, Gunhee Kim, David Blei

Keywords Paper

Deep Learning, Embedding and Representation learning

0

0

0

0

13:50

06/12/2021

$(\textrm{Implicit})^2$: Implicit Layers for Implicit Representations

Zhichun Huang, Shaojie Bai, J. Zico Kolter

Keywords Paper

deep learning, representation learning

1

0

0

1

12:23

02/02/2021

Joint-Label Learning by Dual Augmentation for Time Series Classification

Qianli Ma, Zhenjing Zheng, Jiawei Zheng and
Sen Li, Wanqing Zhuang, Garrison W. Cottrell

Keywords Paper

0

0

0

0

15:59

06/12/2020

Revisiting the Sample Complexity of Sparse Spectrum Approximation of Gaussian Processes

Minh Hoang, Nghia Hoang, Hai Pham, David Woodruff

Keywords Paper

, Deep Learning

0

0

0

0

3:25

06/12/2021

Functional Regularization for Reinforcement Learning via Learned Fourier Features

Alexander Li, Deepak Pathak

Keywords Paper

deep learning, optimization, reinforcement learning and planning

0

0

0

0

14:35

02/02/2021

Domain General Face Forgery Detection by Learning to Weight

Ke Sun, Hong Liu, Qixiang Ye and
Yue Gao, Jianzhuang Liu, Ling Shao, Rongrong Ji

Keywords Paper

0

0

0

0

14:07

14/06/2020

APQ: Joint Search for Network Architecture, Pruning and Quantization Policy

Tianzhe Wang, Kuan Wang, Han Cai and
Ji Lin, Zhijian Liu, Hanrui Wang, Yujun Lin, Song Han

Keywords Paper

efficiency, model compression, joint design, neural architecture search, channel pruning, mixed-precision quantization

0

0

0

0

1:00

13/04/2021

Sample elicitation

Jiaheng Wei, Zuyue Fu, Yang Liu and
Xingyu Li, Zhuoran Yang, Zhaoran Wang

Keywords Paper

0

0

0

0

3:16

12/07/2020

Data Valuation using Reinforcement Learning

Jinsung Yoon, Sercan Arik, Tomas Pfister

Keywords Paper

Deep Learning - Algorithms

0

0

0

0

14:35

02/02/2021

Adversarial Training Reduces Information and Improves Transferability

Matteo Terzi, Alessandro Achille, Marco Maggipinto, Gian Antonio Susto

Keywords Paper

0

0

0

0

19:54

06/12/2021

Algorithmic stability and generalization of an unsupervised feature selection algorithm

xinxing wu, Qiang Cheng

Keywords Paper

deep learning

0

0

0

0

12:41

18/07/2021

XOR-CD: Linearly Convergent Constrained Structure Generation

Fan Ding, Jianzhu Ma, Jinbo Xu, Yexiang Xue

Keywords Paper

Probabilistic Methods

0

0

0

0

5:14