Critical initialisation in continuous approximations of binary neural networks

Abstract: We investigate the training of stochastic neural network models with binary ($\pm1$) weights and activations via continuous surrogate networks. We obtain new surrogates through a novel derivation based on writing the stochastic neural network as a Markov chain; this derivation also encompasses existing variants of the surrogates presented in the literature. We then study the surrogates theoretically at initialisation. Using mean field theory, we derive a set of scalar equations describing how input signals propagate through the randomly initialised networks. The equations reveal whether so-called critical initialisations, at which the network can be trained to arbitrary depth, exist for each surrogate network. Moreover, we predict theoretically, and confirm numerically, that common weight initialisation schemes used in standard continuous networks yield poor training performance when applied to the mean values of the stochastic binary weights. This study shows that, contrary to common intuition, the means of the stochastic binary weights should be initialised close to $\pm 1$ for deeper networks to be trainable.
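
As a rough illustration of what "the means of the stochastic binary weights" refers to, the sketch below samples $\pm1$ weights from their means and compares two ways of initialising those means. It relies only on the elementary facts that a $\pm1$ variable with mean $m$ has $P(w=+1) = (1+m)/2$ and variance $1 - m^2$. The function names, layer shape, scale $1/\sqrt{N}$ and offset `eps` are assumptions made for this example; it does not reproduce the paper's surrogates or its mean field equations.

```python
import numpy as np


def sample_binary_weights(means, rng):
    """Sample w in {-1, +1} with E[w] = means, i.e. P(w = +1) = (1 + m) / 2."""
    p_plus = (1.0 + means) / 2.0
    return np.where(rng.random(means.shape) < p_plus, 1.0, -1.0)


def weight_variance(means):
    """Var[w] = E[w^2] - E[w]^2 = 1 - m^2 for a +/-1 valued variable."""
    return 1.0 - means ** 2


rng = np.random.default_rng(0)
shape = (256, 256)  # hypothetical layer size

# Assumed "standard-style" initialisation: small Gaussian means around zero.
small_means = np.clip(
    rng.normal(0.0, 1.0 / np.sqrt(shape[0]), size=shape), -1.0, 1.0
)

# Assumed alternative: means close to +/-1 with random signs.
eps = 0.01
signs = np.where(rng.random(shape) < 0.5, 1.0, -1.0)
near_one_means = signs * (1.0 - eps)

for name, means in [("small means", small_means), ("means near +/-1", near_one_means)]:
    w = sample_binary_weights(means, rng)
    print(name, "| average per-weight variance:", weight_variance(means).mean())
```

With means near zero each sampled weight is almost a fair coin flip, whereas with means near $\pm1$ the sampled weights are nearly deterministic. The example only makes that contrast concrete; it says nothing by itself about trainability.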

26/04/2020

George Stamatescu, Federica Gerace, Carlo Lucibello, Ian Fuss, Langford White

Similar Papers

Mixed Precision DNNs: All you need is a good parametrization

Stefan Uhlich, Lukas Mauch, Fabien Cardinaux, Kazuki Yoshiyama, Javier Alonso Garcia, Stephen Tiedemann, Thomas Kemp, Akira Nakamura

Keywords: Deep Neural Network Compression, Quantization, Straight through gradients

Adversarial Robustness via Runtime Masking and Cleansing

Yi-Hsuan Wu, Chia-Hung Yuan, Shan-Hung (Brandon) Wu

Keywords: Adversarial Examples

Adaptive Sampling for Minimax Fair Classification

Shubhanshu Shekhar, Greg Fields, Mohammad Ghavamzadeh, Tara Javidi

Keywords: deep learning, machine learning, fairness

Learning Adversarially Robust Representations via Worst-Case Mutual Information Maximization

Sicheng Zhu, Xiao Zhang, David Evans

Keywords: Adversarial Examples

Gating creates slow modes and controls phase-space complexity in GRUs and LSTMs

Tankut Can, Kamesh Krishnamurthy, David J. Schwab

Adversarial Self-Supervised Contrastive Learning

Minseon Kim, Jihoon Tack, Sung Ju Hwang

On Monotonic Linear Interpolation of Neural Network Parameters

James Lucas, Juhan Bae, Michael Zhang, Stanislav Fort, Richard Zemel, Roger Grosse

Keywords: Deep Learning, Others

Borrowing From the Future: An Attempt to Address Double Sampling

Yuhua Zhu, Lexing Ying

Bayesian Adaptation for Covariate Shift

Aurick Zhou, Sergey Levine

Keywords: deep learning, machine learning, robustness, vision, domain adaptation

Joint Inference for Neural Network Depth and Dropout Regularization

Kishan K C, Rui Li, MohammadMahdi Gilany

Keywords: deep learning, generative model, continual learning

Neural Jump Ordinary Differential Equations: Consistent Continuous-Time Prediction and Filtering

Calypso Herrera, Florian Krach, Josef Teichmann

Keywords: irregular-observed data modelling, conditional expectation, Neural ODE

Learning to defend by learning to attack

Haoming Jiang, Zhehui Chen, Yuyang Shi, Bo Dai, Tuo Zhao

Towards Better Robust Generalization with Shift Consistency Regularization

Shufei Zhang, Zhuang Qian, Kaizhu Huang, Qiufeng Wang, Rui Zhang, Xinping Yi

Keywords: Algorithms, Adversarial Examples

Estimating informativeness of samples with Smooth Unique Information

Hrayr Harutyunyan, Alessandro Achille, Giovanni Paolini, Orchid Majumder, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto

Keywords: dataset summarization, ntk, stability theory, sample information, information theory

A Differentiable Point Process with Its Application to Spiking Neural Networks

Hiroshi Kajino

Keywords: Reinforcement Learning and Planning, Applications, Neuroscience and Cognitive Science

RATT: Leveraging Unlabeled Data to Guarantee Generalization

Saurabh Garg, Sivaraman Balakrishnan, Zico Kolter, Zachary Lipton

Keywords: Probabilistic Methods, Graphical Models, Theory, Computational Complexity, Theory, Models of Learning and Generalization

Integrated Latent Heterogeneity and Invariance Learning in Kernel Space

Jiashuo Liu, Zheyuan Hu, Peng Cui, Bo Li, Zheyan Shen

Keywords: deep learning, reinforcement learning and planning, machine learning

Fixed-Point Back-Propagation Training

Xishan Zhang, Shaoli Liu, Rui Zhang, Chang Liu, Di Huang, Shiyi Zhou, Jiaming Guo, Qi Guo, Zidong Du, Tian Zhi, Yunji Chen

Keywords: network quantization, fixed-point training, deep learning, neural network

Influence Functions in Deep Learning Are Fragile

Samyadeep Basu, Phil Pope, Soheil Feizi

Keywords: Influence Functions, Interpretability

On Recovering from Modeling Errors Using Testing Bayesian Networks

Haiying Huang, Adnan Darwiche

Keywords: Probabilistic Methods, Graphical Models
