Implicit Bias of Linear RNNs

18/07/2021

Melika Emami, Moji Sahraee-Ardakan, Parthe Pandit, Sundeep Rangan, Alyson Fletcher

Keywords: Theory, Deep learning Theory

Abstract: Contemporary wisdom based on empirical studies suggests that standard recurrent neural networks (RNNs) do not perform well on tasks requiring long-term memory. However, RNNs' poor ability to capture long-term dependencies has not been fully understood. This paper provides a rigorous explanation of this property in the special case of linear RNNs. Although this work is limited to linear RNNs, even these systems have traditionally been difficult to analyze due to their non-linear parameterization. Using recently-developed kernel regime analysis, our main result shows that as the number of hidden units goes to infinity, linear RNNs learned from random initializations are functionally equivalent to a certain weighted 1D-convolutional network. Importantly, the weightings in the equivalent model cause an implicit bias to elements with smaller time lags in the convolution, and hence shorter memory. The degree of this bias depends on the variance of the transition matrix at initialization and is related to the classic exploding and vanishing gradients problem. The theory is validated with both synthetic and real data experiments.
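
The correspondence between a linear RNN and a 1D convolution can be illustrated with a small pure-Python sketch. It is not the paper's kernel-regime analysis or its experimental setup: it only checks that an untrained linear RNN h_t = W h_{t-1} + f x_t, y_t = c . h_t computes a causal convolution with taps L_k = c . (W^k f), and it estimates how the average tap energy changes with the lag k for different initialization variances. The function names, the scalar input and output, and the scalings W_ij ~ N(0, nu/n), f_i ~ N(0, 1), c_i ~ N(0, 1/n) are assumptions made for this illustration.

```python
import random


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def random_linear_rnn(n, nu, rng):
    """Draw (W, f, c) for h_t = W h_{t-1} + f x_t, y_t = c . h_t.

    Assumed scalings (illustrative, not taken from the paper):
    W_ij ~ N(0, nu / n), f_i ~ N(0, 1), c_i ~ N(0, 1 / n).
    """
    sw = (nu / n) ** 0.5
    W = [[rng.gauss(0.0, sw) for _ in range(n)] for _ in range(n)]
    f = [rng.gauss(0.0, 1.0) for _ in range(n)]
    c = [rng.gauss(0.0, n ** -0.5) for _ in range(n)]
    return W, f, c


def run_rnn(W, f, c, xs):
    """Unroll the recurrence from h = 0 and return the outputs y_t."""
    h = [0.0] * len(f)
    ys = []
    for x in xs:
        h = [wh + fi * x for wh, fi in zip(matvec(W, h), f)]
        ys.append(dot(c, h))
    return ys


def impulse_response(W, f, c, num_lags):
    """Return the taps L_k = c . (W^k f) for k = 0 .. num_lags - 1."""
    v, taps = list(f), []
    for _ in range(num_lags):
        taps.append(dot(c, v))
        v = matvec(W, v)
    return taps


def causal_conv(taps, xs):
    """y_t = sum_k taps[k] * x_{t-k}, a causal 1D convolution."""
    return [sum(taps[k] * xs[t - k] for k in range(min(t + 1, len(taps))))
            for t in range(len(xs))]


def mean_tap_energy(n, nu, num_lags, trials, seed=0):
    """Average L_k^2 over independent random initializations."""
    rng = random.Random(seed)
    totals = [0.0] * num_lags
    for _ in range(trials):
        taps = impulse_response(*random_linear_rnn(n, nu, rng), num_lags)
        totals = [s + t * t for s, t in zip(totals, taps)]
    return [s / trials for s in totals]


if __name__ == "__main__":
    rng = random.Random(1)
    W, f, c = random_linear_rnn(n=30, nu=0.5, rng=rng)
    xs = [rng.gauss(0.0, 1.0) for _ in range(12)]
    ys = run_rnn(W, f, c, xs)
    ys_conv = causal_conv(impulse_response(W, f, c, len(xs)), xs)
    print("max |rnn - conv| =", max(abs(a - b) for a, b in zip(ys, ys_conv)))
    for nu in (0.5, 1.0, 2.0):
        print("nu =", nu, [round(e, 3) for e in mean_tap_energy(30, nu, 6, 100)])
```

Under these assumed scalings the average tap energy at lag k should behave roughly like nu^k for large n, so nu < 1 shrinks the taps at long lags and nu > 1 inflates them. This mirrors, at initialization only, the abstract's link between the variance of the transition matrix and vanishing or exploding gradients; the weighting of the trained network in the paper may differ in its details.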

This is an embedded video. The talk and the corresponding paper were published at the ICML 2021 virtual conference.

Similar Papers

06/12/2021

Local Signal Adaptivity: Provable Feature Learning in Neural Networks Beyond Kernels

Stefani Karp, Ezra Winston, Yuanzhi Li, Aarti Singh

Keywords: theory, deep learning, optimization, machine learning, vision, kernel methods

13:22

03/05/2021

Activation-level uncertainty in deep neural networks

Pablo Morales-Alvarez, Daniel Hernández-Lobato, Rafael Molina, José Miguel Hernández Lobato

Keywords: Gaussian Processes, Bayesian Neural Networks, Deep Gaussian Processes, Uncertainty estimation

6:53

12/07/2020

Being Bayesian, Even Just a Bit, Fixes Overconfidence in ReLU Networks

Agustinus Kristiadi, Matthias Hein, Philipp Hennig

Keywords: Deep Learning - General

15:02

06/12/2020

On the Expressiveness of Approximate Inference in Bayesian Neural Networks

Andrew Foong, David Burt, Yingzhen Li, Richard Turner

3:23

02/02/2021

On the Softmax Bottleneck of Recurrent Language Models

Dwarak Govind Parthiban, Yongyi Mao, Diana Inkpen

19:58

12/07/2020

Two Routes to Scalable Credit Assignment without Weight Symmetry

Daniel Kunin, Aran Nayebi, Javier Sagastuy-Brena, Surya Ganguli, Jonathan Bloom, Daniel Yamins

Keywords: Applications - Neuroscience, Cognitive Science, Biology and Health

14:12

12/07/2020

Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift

Alexander Chan, Ahmed Alaa, Zhaozhi Qian, Mihaela van der Schaar

Keywords: Trustworthy Machine Learning

14:59

04/08/2021

Implicit Regularization in ReLU Networks with the Square Loss

Gal Vardi, Ohad Shamir

15:57

06/12/2020

The Surprising Simplicity of the Early-Time Learning Dynamics of Neural Networks

Wei Hu, Lechao Xiao, Ben Adlam, Jeffrey Pennington

3:20

06/12/2020

NeuMiss networks: differentiable programming for supervised learning with missing values.

Marine Le Morvan, Julie Josse, Thomas Moreau, Erwan Scornet, Gael Varoquaux

3:20

14/06/2020

Deep Learning for Handling Kernel/model Uncertainty in Image Deconvolution

Yuesong Nan, Hui Ji

Keywords: image deblurring, robust deblurring, error-in-variable model, deep learning, blur kernel correction, image restoration, image processing, low level vision

1:01

02/02/2021

Uncertainty Quantification in CNN Through the Bootstrap of Convex Neural Networks

Hongfei Du, Emre Barut, Fang Jin

13:45

02/02/2021

Vector Quantized Bayesian Neural Network Inference for Data Streams

Namuk Park, Taekyu Lee, Songkuk Kim

19:20

03/05/2021

Exploring the Uncertainty Properties of Neural Networks’ Implicit Priors in the Infinite-Width Limit

Ben Adlam, Jaehoon Lee, Lechao Xiao, Jeffrey Pennington, Jasper Snoek

Keywords: Deep Learning, Bayesian Neural Networks, Neural Network Gaussian Process, Infinite-Width Limit, Uncertainty, Gaussian Process

4:34

06/12/2021

An Empirical Investigation of Domain Generalization with Empirical Risk Minimizers

Ramakrishna Vedantam, David Lopez-Paz, David Schwab

Keywords: theory, deep learning, domain adaptation

10:46

19/08/2021

Learning Deeper Non-Monotonic Networks by Softly Transferring Solution Space

Zheng-Fan Wu, Hui Xue, Weimin Bai

Keywords: Machine Learning, Kernel Methods, Deep Learning, Classification

12:50

12/07/2020

Revisiting Spatial Invariance with Low-Rank Local Connectivity

Gamaleldin Elsayed, Prajit Ramachandran, Jon Shlens, Simon Kornblith

Keywords: Deep Learning - General

14:48

26/04/2020

Stable Rank Normalization for Improved Generalization in Neural Networks and GANs

Amartya Sanyal, Philip H. Torr, Puneet K. Dokania

Keywords: generalization, regularization, empirical Lipschitz

5:25

30/11/2020

Hyperparameter-Free Out-of-Distribution Detection Using Cosine Similarity

Engkarat Techapanurak, Masanori Suganuma, Takayuki Okatani

7:48

18/07/2021

What Are Bayesian Neural Network Posteriors Really Like?

Pavel Izmailov, Sharad Vikram, Matt Hoffman, Andrew Wilson

Keywords: Deep Learning, Bayesian Deep Learning

17:13

03/05/2021

Neural ODE Processes

Alexander Norcliffe, Cristian Bodnar, Ben Day, Jacob Moss, Pietro Liò

Keywords: neural processes, deep learning, neural ode, dynamics, differential equations

5:07

06/12/2020

Constant-Expansion Suffices for Compressed Sensing with Generative Priors

Constantinos Daskalakis, Dhruv Rohatgi, Emmanouil Zampetakis

3:13

26/08/2020

Uncertainty in Neural Networks: Approximately Bayesian Ensembling

Tim Pearce, Felix Leibfried, Alexandra Brintrup

16:03

14/06/2020

Uncertainty-Aware CNNs for Depth Completion: Uncertainty from Beginning to End

Abdelrahman Eldesokey, Michael Felsberg, Karl Holmquist, Michael Persson

Keywords: uncertainty, sparsity, depth completion, bayesian deep learning, normalized convolution, real-time

1:00

12/07/2020

Fast and Consistent Learning of Hidden Markov Models by Incorporating Non-Consecutive Correlations

Robert Mattila, Cristian Rojas, Eric Moulines, Vikram Krishnamurthy, Bo Wahlberg

Keywords: Sequential, Network, and Time-Series Modeling

13:37

14/06/2020

Closed-Loop Matters: Dual Regression Networks for Single Image Super-Resolution

Yong Guo, Jian Chen, Jingdong Wang, Qi Chen, Jiezhang Cao, Zeshuai Deng, Yanwu Xu, Mingkui Tan

Keywords: computer vision, image super-resolution, dual regression scheme, closed-loop

1:01

14/09/2020

Effective Version Space Reduction for Convolutional Neural Networks

Jiayu Liu, Ioannis Chiotellis, Rudolph Triebel, Daniel Cremers

Keywords: active learning, deep learning, version space, diameter reduction

14:45

12/07/2020

Efficient and Scalable Bayesian Neural Nets with Rank-1 Factors

Mike Dusenberry, Ghassen Jerfel, Yeming Wen, Yian Ma, Jasper Snoek, Katherine Heller, Balaji Lakshminarayanan, Dustin Tran

Keywords: Deep Learning - General

14:29

06/12/2021

An Infinite-Feature Extension for Bayesian ReLU Nets That Fixes Their Asymptotic Overconfidence

Agustinus Kristiadi, Matthias Hein, Philipp Hennig

Keywords: deep learning, kernel methods

10:57

06/12/2021

Locally Valid and Discriminative Prediction Intervals for Deep Learning Models

Zhen Lin, Shubhendu Trivedi, Jimeng Sun

Keywords: deep learning

12:05

06/12/2020

Estimation and Imputation in Probabilistic Principal Component Analysis with Missing Not At Random Data

Aude Sportisse, Claire Boyer, Julie Josse

Keywords: Algorithms -> Online Learning

3:20

06/12/2021

The Limitations of Large Width in Neural Networks: A Deep Gaussian Process Perspective

Geoff Pleiss, John Cunningham

Keywords: deep learning, kernel methods

6:59

26/04/2020

Improved memory in recurrent neural networks with sequential non-normal dynamics

Emin Orhan, Xaq Pitkow

Keywords: recurrent neural networks, memory, non-normal dynamics

4:53

06/12/2020

Triple descent and the two kinds of overfitting: where & why do they appear?

Stéphane d'Ascoli, Levent Sagun, Giulio Biroli

Keywords: Algorithms -> Active Learning; Algorithms -> Classification; Algorithms -> Ranking and Preference Learning; Theory -> Learning Theory

3:28

18/07/2021

Toward Better Generalization Bounds with Locally Elastic Stability

Zhun Deng, Hangfeng He, Weijie Su

Keywords: Theory, Computational Learning Theory

4:59

18/07/2021

Generative Particle Variational Inference via Estimation of Functional Gradients

Neale Ratzlaff, Jerry Bai, Fuxin Li, Wei Xu

Keywords: Deep Learning, Bayesian Deep Learning

5:11

18/07/2021

Crowdsourcing via Annotator Co-occurrence Imputation and Provable Symmetric Nonnegative Matrix Factorization

Shahana Ibrahim, Xiao Fu

Keywords: Algorithms, Crowdsourcing

15:55

06/12/2020

Efficient Low Rank Gaussian Variational Inference for Neural Networks

Marcin Tomczak, Siddharth Swaroop, Richard Turner

Keywords: Probabilistic Methods -> Latent Variable Models; Probabilistic Methods -> Topic Models

2:48

13/04/2021

Exponential convergence rates of classification errors on learning with SGD and random features

Shingo Yashima, Atsushi Nitanda, Taiji Suzuki

2:58

12/07/2020

Maximum-and-Concatenation Networks

Xingyu Xie, Hao Kong, Jianlong Wu, Wayne Zhang, Guangcan Liu, Zhouchen Lin

Keywords: Deep Learning - Theory

14:05