14/09/2020

A Principle of Least Action for the Training of Neural Networks

Skander Karkar, Ibrahim Ayed, Emmanuel de Bézenac, Patrick Gallinari

Keywords: deep learning, optimal transport, dynamical systems

Abstract: Neural networks achieve high generalization performance on many tasks despite being highly over-parameterized. Since classical statistical learning theory struggles to explain this behaviour, much recent effort has focused on uncovering the mechanisms behind it, in the hope of developing a more adequate theoretical framework and gaining better control over the trained models. In this work, we adopt an alternative perspective, viewing the neural network as a dynamical system that displaces input particles over time. We conduct a series of experiments and, by analyzing the network's behaviour through its displacements, we show the presence of a low kinetic energy bias in the transport map of the network, and link this bias to generalization performance. From this observation, we reformulate the learning problem as follows: find neural networks that solve the task while transporting the data as efficiently as possible. This novel formulation of the learning problem allows us to derive regularity results for the solution network, based on Optimal Transport theory. From a practical viewpoint, it also yields a new learning algorithm that automatically adapts to the complexity of the task and produces networks with high generalization ability even in low data regimes.
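
To make the reformulated objective concrete: interpreting a residual network x_{k+1} = x_k + f_k(x_k) as a discretized flow, the kinetic energy of its transport map is the accumulated squared norm of the per-block displacements, and training under a least-action principle penalizes this quantity alongside the task loss. Below is a minimal PyTorch sketch of this idea, assuming a ResNet-style architecture; the module names and the penalty weight lam are illustrative choices, not the authors' exact algorithm.

# Sketch: penalize the kinetic energy (sum of squared residual displacements)
# of a ResNet-style network during training. Assumption-laden illustration,
# not the paper's reference implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x):
        v = self.f(x)          # displacement applied to the input "particles"
        return x + v, v

class LeastActionNet(nn.Module):
    def __init__(self, dim, depth, n_classes):
        super().__init__()
        self.blocks = nn.ModuleList(ResidualBlock(dim) for _ in range(depth))
        self.head = nn.Linear(dim, n_classes)

    def forward(self, x):
        kinetic = 0.0
        for block in self.blocks:
            x, v = block(x)
            # discrete kinetic energy: mean squared displacement at this block
            kinetic = kinetic + v.pow(2).sum(dim=1).mean()
        return self.head(x), kinetic

def loss_fn(logits, y, kinetic, lam=0.1):
    # task loss + transport-cost penalty (principle of least action)
    return F.cross_entropy(logits, y) + lam * kinetic

In this sketch, lam trades off fitting the task against transporting the data efficiently; the paper's algorithm goes further by adapting this trade-off to the complexity of the task.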

Talk and paper published at the ECML PKDD 2020 virtual conference.
