12/07/2020

Beyond Signal Propagation: Is Feature Diversity Necessary in Deep Neural Network Initialization?

Yaniv Blumenfeld, Dar Gilboa, Daniel Soudry

Keywords: Deep Learning - Theory

Abstract: Deep neural networks are typically initialized with random weights, with variances chosen to facilitate signal propagation and stable gradients. It is also believed that diversity of features is an important property of these initializations. We construct a deep convolutional network with identical features by initializing almost all the weights to zero. The architecture also enables perfect signal propagation and stable gradients, and achieves high accuracy on standard benchmarks. This indicates that random, diverse initializations are not necessary for training neural networks. An essential element in training this network is a mechanism of symmetry breaking; we study this phenomenon and find that standard GPU operations, which are non-deterministic, can serve as a sufficient source of symmetry breaking to enable training.
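The sketch below illustrates the core idea described in the abstract: a convolutional block whose weights are initialized to zero, so that at initialization the block acts as the identity (preserving signal propagation) while all features in the convolutional branch are identical. This is a minimal illustration, not the authors' exact architecture; the use of PyTorch, the residual-style block, and the cuDNN determinism flag are assumptions introduced here for concreteness.

```python
# Minimal sketch (assumed PyTorch; not the paper's exact architecture) of a
# residual-style block whose convolutional branch is initialized to zero.
import torch
import torch.nn as nn

def zero_init_conv(conv: nn.Conv2d) -> None:
    """Set all weights (and bias, if present) of a conv layer to zero."""
    nn.init.zeros_(conv.weight)
    if conv.bias is not None:
        nn.init.zeros_(conv.bias)

class ZeroInitBlock(nn.Module):
    """At initialization the conv branch outputs zero, so the block is the
    identity map: signal propagates perfectly, yet the branch carries no
    feature diversity."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        zero_init_conv(self.conv)

    def forward(self, x):
        return x + self.conv(x)

# The abstract notes that symmetry breaking is essential: identically
# initialized weights can receive identical updates and stay identical.
# Leaving GPU kernels non-deterministic (the PyTorch/cuDNN default, shown
# here as an assumption) lets small numerical differences break that symmetry.
torch.backends.cudnn.deterministic = False
```

As a usage note, stacking such blocks gives a deep network that is exactly the identity at initialization; whether training succeeds then hinges on some source of asymmetry, which is the phenomenon the paper studies.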

The talk and the paper were published at the ICML 2020 virtual conference.
