Robust Training and Initialization of Deep Neural Networks: An Adaptive Basis Viewpoint

Abstract: Motivated by the gap between theoretical optimal approximation rates of deep neural networks (DNNs) and the accuracy realized in practice, we seek to improve the training of DNNs. The adoption of an adaptive basis viewpoint of DNNs leads to novel initializations and a hybrid least squares/gradient descent optimizer. We provide analysis of these techniques and illustrate via numerical examples dramatic increases in accuracy and convergence rate for benchmarks characterizing scientific applications where DNNs are currently used, including regression problems and physics-informed neural networks for the solution of partial differential equations.

06/12/2020

Eric C. Cyr, Mamikon A. Gulian, Ravi G. Patel, Mauro Perego, Nathaniel A. Trask

Similar Papers

Benchmarking Deep Inverse Models over time, and the Neural-Adjoint method

Ben Ren, Willie Padilla, Jordan Malof

DDPNOpt: Differential Dynamic Programming Neural Optimizer

Guan-Horng Liu, Tianrong Chen, Evangelos Theodorou

Keywords: differential dynamic programming, trajectory optimization, deep learning training, optimal control

Parametric Complexity Bounds for Approximating PDEs with Neural Networks

Tanya Marwah, Zachary Lipton, Andrej Risteski

Keywords: theory, deep learning, optimization

A Trace-restricted Kronecker-Factored Approximation to Natural Gradient

Kaixin Gao, Xiaolei Liu, Zhenghai Huang, Min Wang, Zidong Wang, Dachuan Xu, Fan Yu

Delta-STN: Efficient Bilevel Optimization for Neural Networks using Structured Response Jacobians

Juhan Bae, Roger Grosse

MomentumRNN: Integrating Momentum into Recurrent Neural Networks

Tan Nguyen, Richard Baraniuk, Andrea Bertozzi, Stanley Osher, Bao Wang

A type of generalization error induced by initialization in deep neural networks

Yaoyu Zhang, Zhi-Qin John Xu, Tao Luo, Zheng Ma

Escaping Saddle Points Faster with Stochastic Momentum

Jun-Kun Wang, Chi-Heng Lin, Jacob Abernethy

Keywords: SGD, momentum, escaping saddle point

The Recurrent Neural Tangent Kernel

Sina Alemohammad, Jack Wang, Randall Balestriero, Richard Baraniuk

Keywords: Gaussian Process, Recurrent Neural Network, Neural Tangent Kernel, Overparameterization

DebiNet: Debiasing linear models with nonlinear overparameterized neural networks

Shiyun Xu, Zhiqi Bu

Understanding Why Neural Networks Generalize Well Through GSNR of Parameters

Jinlong Liu, Yunzhi Bai, Guoqing Jiang, Ting Chen, Huayan Wang

Keywords: DNN, generalization, GSNR, gradient descent

Empirical Studies on the Properties of Linear Regions in Deep Neural Networks

Xiao Zhang, Dongrui Wu

Keywords: deep learning, linear region, optimization

A Bayesian Perspective on Training Speed and Model Selection

Clare Lyle, Lisa Schut, Robin Ru, Yarin Gal, Mark van der Wilk

How Much Over-parameterization Is Sufficient to Learn Deep ReLU Networks?

Zixiang Chen, Yuan Cao, Difan Zou, Quanquan Gu

Keywords: classification, neural tangent kernel, generalization error, (stochastic) gradient descent, deep ReLU networks

Beyond Linearization: On Quadratic and Higher-Order Approximation of Wide Neural Networks

Yu Bai, Jason D. Lee

Keywords: Neural Tangent Kernels, over-parametrized neural networks, deep learning theory

Meta-learning to Improve Pre-training

Aniruddh Raghu, Jonathan Lorraine, Simon Kornblith, Matthew McDermott, David Duvenaud

Keywords: deep learning, optimization, graph learning, meta learning

Rational neural networks

Nicolas Boulle, Yuji Nakatsukasa, Alex J Townsend

On the Role of Optimization in Double Descent: A Least Squares Study

Ilja Kuzborskij, Csaba Szepesvari, Omar Rivasplata, Amal Rannen-Triki, Razvan Pascanu

Keywords: theory, deep learning, optimization

NTopo: Mesh-free Topology Optimization using Implicit Neural Representations

Jonas Zehnder, Yue Li, Stelian Coros, Bernhard Thomaszewski

Keywords: deep learning, optimization, machine learning, self-supervised learning, representation learning

Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks

Wei Hu, Lechao Xiao, Jeffrey Pennington

Keywords: deep learning theory, non-convex optimization, orthogonal initialization

Robust Implicit Networks via Non-Euclidean Contractions

Saber Jafarpour, Alexander Davydov, Anton Proskurnikov, Francesco Bullo

Keywords: theory, deep learning, machine learning, robustness, vision
