Neural Execution Engines: Learning to Execute Subroutines

06/12/2020

Yujun Yan, Kevin Swersky, Danai Koutra, Parthasarathy Ranganathan, Milad Hashemi

Abstract: A significant effort has been made to train neural networks that replicate algorithmic reasoning, but they often fail to learn the abstract concepts underlying these algorithms. This is evidenced by their inability to generalize to data distributions that are outside of their restricted training sets, namely larger inputs and unseen data. We study these generalization issues at the level of numerical subroutines that comprise common algorithms like sorting, shortest paths, and minimum spanning trees. First, we observe that transformer-based sequence-to-sequence models can learn subroutines like sorting a list of numbers, but their performance rapidly degrades as the length of lists grows beyond those found in the training set. We demonstrate that this is due to attention weights that lose fidelity with longer sequences, particularly when the input numbers are numerically similar. To address the issue, we propose a learned conditional masking mechanism, which enables the model to strongly generalize far outside of its training range with near-perfect accuracy on a variety of algorithms. Second, to generalize to unseen data, we show that encoding numbers with a binary representation leads to embeddings with rich structure once trained on downstream tasks like addition or multiplication. This allows the embedding to handle missing data by faithfully interpolating numbers not seen during training.
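Two ideas from the abstract can be illustrated with a minimal hand-written sketch: encoding a number as a fixed-width bit vector, and a selection sort that tracks processed elements with a mask. This is not the authors' model, where the masking is learned. The function names `to_binary` and `masked_selection_sort` and the 8-bit default width are assumptions made for illustration only.

```python
def to_binary(n, bits=8):
    """Encode a non-negative integer as a fixed-width list of bits (MSB first).

    Hypothetical helper: the bit width is an illustrative assumption.
    """
    if not 0 <= n < 2 ** bits:
        raise ValueError(f"{n} does not fit in {bits} bits")
    return [(n >> i) & 1 for i in reversed(range(bits))]


def masked_selection_sort(values):
    """Selection sort driven by a mask over the input positions.

    At each step the smallest unmasked element is emitted and its position
    is masked out. Here the mask update is hard-coded; in the paper the
    corresponding update is described as learned.
    """
    mask = [False] * len(values)  # True means "already emitted"
    output = []
    for _ in values:
        candidates = [i for i in range(len(values)) if not mask[i]]
        idx = min(candidates, key=lambda i: values[i])
        output.append(values[idx])
        mask[idx] = True  # conditional mask update
    return output
```

Because each step only compares the elements that are still unmasked, the procedure does not depend on the length of the list, which is the property the abstract associates with generalizing to longer inputs.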

The talk and the corresponding paper were published at the NeurIPS 2020 virtual conference.

Similar Papers

03/05/2021

Discovering Non-monotonic Autoregressive Orderings with Variational Inference

Xuanlin Li, Brandon Trabucco, Dong Huk Park, Michael Luo, Sheng Shen, Trevor Darrell, Yang Gao

Keywords: reinforcement learning, computer vision, natural language processing, optimization, variational inference, unsupervised learning

4:56

16/11/2020

Consistency of a Recurrent Language Model With Respect to Incomplete Decoding

Sean Welleck, Ilia Kulikov, Jaedeok Kim, Richard Yuanzhe Pang, Kyunghyun Cho

Keywords: receiving sequences, neural models, recurrent model, common algorithms

9:58

06/12/2020

LoCo: Local Contrastive Representation Learning

Yuwen Xiong, Mengye Ren, Raquel Urtasun

3:18

06/12/2021

Structured Reordering for Modeling Latent Alignments in Sequence Transduction

Bailin Wang, Mirella Lapata, Ivan Titov

Keywords: language

15:00

06/12/2021

Statistically and Computationally Efficient Linear Meta-representation Learning

Kiran Thekumparampil, Prateek Jain, Praneeth Netrapalli, Sewoong Oh

Keywords: optimization, meta learning, representation learning, few shot learning

12:56

26/04/2020

Meta-Learning without Memorization

Mingzhang Yin, George Tucker, Mingyuan Zhou, Sergey Levine, Chelsea Finn

Keywords: meta-learning, memorization, regularization, overfitting, mutually-exclusive

5:09

03/05/2021

Revisiting Locally Supervised Learning: an Alternative to End-to-end Training

Yulin Wang, Zanlin Ni, Shiji Song, Le Yang, Gao Huang

Keywords: Deep learning, Locally supervised training

5:03

16/11/2020

Cold-Start and Interpretability: Turning Regular Expressions into Trainable Recurrent Neural Networks

Chengyue Jiang, Yinggong Zhao, Shanbo Chu, Libin Shen, Kewei Tu

Keywords: natural applications, training, text classification, neural networks

11:32

12/07/2020

It's Not What Machines Can Learn, It's What We Cannot Teach

Gal Yehuda, Moshe Gabel, Assaf Schuster

Keywords: Supervised Learning

10:41

02/02/2021

Learning Representations for Incomplete Time Series Clustering

Qianli Ma, Chuxin Chen, Sen Li, Garrison W. Cottrell

15:15

06/12/2020

Enabling certification of verification-agnostic networks via memory-efficient semidefinite programming

Sumanth Dathathri, Krishnamurthy Dvijotham, Alexey Kurakin, Aditi Raghunathan, Jonathan Uesato, Rudy Bunel, Shreya Shankar, Jacob Steinhardt, Ian Goodfellow, Percy Liang, Pushmeet Kohli

3:23

14/06/2020

DeepDeform: Learning Non-Rigid RGB-D Reconstruction With Semi-Supervised Data

Aljaž Božič, Michael Zollhöfer, Christian Theobalt, Matthias Nießner

Keywords: non-rigid reconstruction, non-rigid tracking, dataset, benchmark, correspondence prediction, heatmap network, rgb-d, single camera, least squares optimization

1:00

03/05/2021

SEDONA: Search for Decoupled Neural Networks toward Greedy Block-wise Learning

Myeongjang Pyeon, Jihwan Moon, Taeyoung Hahn, Gunhee Kim

Keywords: AutoML, Greedy Learning, Deep Learning, Neural Architecture Search

5:03

02/02/2021

Learning by Fixing: Solving Math Word Problems with Weak Supervision

Yining Hong, Qing Li, Daniel Ciao, Siyuan Huang, Song-Chun Zhu

13:50

06/12/2020

DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction

Aviral Kumar, Abhishek Gupta, Sergey Levine

3:25

19/04/2021

Neural data-to-text generation with LM-based text augmentation

Ernie Chang, Xiaoyu Shen, Dawei Zhu, Vera Demberg, Hui Su

7:32

22/06/2020

Efficiently learning structured distributions from untrusted batches

Sitan Chen, Jerry Li, Ankur Moitra

Keywords: sum-of-squares, federated learning, VC complexity, Robust statistics

24:38

02/02/2021

Recognizing and Verifying Mathematical Equations using Multiplicative Differential Neural Units

Ankur Mali, Alexander G. Ororbia, Daniel Kifer, C. Lee Giles

15:07

26/04/2020

Towards Verified Robustness under Text Deletion Interventions

Johannes Welbl, Po-Sen Huang, Robert Stanforth, Sven Gowal, Krishnamurthy (Dj) Dvijotham, Martin Szummer, Pushmeet Kohli

Keywords: natural language processing, specification, verification, model undersensitivity, adversarial, interval bound propagation

5:01

12/07/2020

Two Routes to Scalable Credit Assignment without Weight Symmetry

Daniel Kunin, Aran Nayebi, Javier Sagastuy-Brena, Surya Ganguli, Jonathan Bloom, Daniel Yamins

Keywords: Applications - Neuroscience, Cognitive Science, Biology and Health

14:12

26/04/2020

Meta-Learning Deep Energy-Based Memory Models

Sergey Bartunov, Jack Rae, Simon Osindero, Timothy Lillicrap

Keywords: associative memory, energy-based memory, meta-learning, compressive memory

4:59

02/02/2021

Knowledge-aware Leap-LSTM: Integrating Prior Knowledge into Leap-LSTM towards Faster Long Text Classification

Jinhua Du, Yan Huang, Karo Moilanen

19:11

26/04/2020

Neural Text Generation With Unlikelihood Training

Sean Welleck, Ilia Kulikov, Stephen Roller, Emily Dinan, Kyunghyun Cho, Jason Weston

Keywords: language modeling, machine learning

4:20

03/05/2021

Cut out the annotator, keep the cutout: better segmentation with weak supervision

Sarah Hooper, Michael Wornow, Ying Seah, Peter Kellman, Hui Xue, Frederic Sala, Curtis Langlotz, Christopher Re

Keywords: Weak supervision, medical imaging, latent variable, segmentation, CNN

5:07

22/06/2020

Learning Credal Sum-Product Networks

Amelie Levray, Vaishak Belle

Keywords: credal networks, imprecise probabilities, tractable learning

5:10

12/07/2020

Learning To Stop While Learning To Predict

Xinshi Chen, Hanjun Dai, Yu Li, Xin Gao, Le Song

Keywords: Transfer, Multitask and Meta-learning

14:33

06/12/2020

MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and Architectures

Jeong Un Ryu, JWoong Shin, Hae Beom Lee, Sung Ju Hwang

3:32

14/06/2020

On the Acceleration of Deep Learning Model Parallelism With Staleness

An Xu, Zhouyuan Huo, Heng Huang

Keywords: layer-wise staleness, asynchronous model parallelism, convolutional neural networks

1:01

06/12/2020

Kernelized information bottleneck leads to biologically plausible 3-factor Hebbian learning in deep networks

Roman Pogodin, Peter E Latham

Keywords: Deep Learning -> Adversarial Networks, Algorithms -> Semi-Supervised Learning

2:30

19/04/2021

Reanalyzing the most probable sentence problem: A case study in explicating the role of entropy in algorithmic complexity

Eric Corlett, Gerald Penn

11:08

06/12/2021

Intrinsic Dimension, Persistent Homology and Generalization in Neural Networks

Tolga Birdal, Aaron Lou, Leonidas Guibas, Umut Simsekli

Keywords: theory, deep learning, optimization

14:38

18/07/2021

On Monotonic Linear Interpolation of Neural Network Parameters

James Lucas, Juhan Bae, Michael Zhang, Stanislav Fort, Richard Zemel, Roger Grosse

Keywords: Deep Learning, Others

5:03

30/11/2020

Few-Shot Zero-Shot Learning: Knowledge Transfer with Less Supervision

Nanyi Fei, Jiechao Guan, Zhiwu Lu, Yizhao Gao

7:37

14/06/2020

Conditional Channel Gated Networks for Task-Aware Continual Learning

Davide Abati, Jakub Tomczak, Tijmen Blankevoort, Simone Calderara, Rita Cucchiara, Babak Ehteshami Bejnordi

Keywords: continual learning, channel gating, conditional computation, incremental learning, lifelong learning, hard attention

5:01

12/07/2020

Non-Stationary Bandits with Intermediate Observations

Claire Vernade, András György, Timothy Mann

Keywords: Online Learning, Active Learning, and Bandits

14:40

03/05/2021

Incremental few-shot learning via vector quantization in deep embedded space

Kuilin Chen, Chi-Guhn Lee

Keywords: incremental learning, vector quantization, few-shot

5:07

12/07/2020

Generalization Error of Generalized Linear Models in High Dimensions

Melikasadat Emami, Mojtaba Sahraee-Ardakan, Parthe Pandit, Sundeep Rangan, Alyson Fletcher

Keywords: Supervised Learning

15:08

14/06/2020

Uncertainty-Aware CNNs for Depth Completion: Uncertainty from Beginning to End

Abdelrahman Eldesokey, Michael Felsberg, Karl Holmquist, Michael Persson

Keywords: uncertainty, sparsity, depth completion, bayesian deep learning, normalized convolution, real-time

1:00

30/11/2020

dpVAEs: Fixing Sample Generation for Regularized VAEs

Riddhish Bhalodia, Iain Lee, Shireen Elhabian

7:54

03/05/2021

On the Bottleneck of Graph Neural Networks and its Practical Implications

Uri Alon, Eran Yahav

Keywords: GNNs, graphs, over-squashing, bottleneck, understanding, limitations

5:16