What they do when in doubt: a study of inductive biases in seq2seq learners

03/05/2021

Eugene Kharitonov, Rahma Chaabouni

Keywords: description length, inductive biases, sequence-to-sequence models

Abstract: Sequence-to-sequence (seq2seq) learners are widely used, but we still know little about which inductive biases shape the way they generalize. We address this by investigating how popular seq2seq learners generalize in tasks that have high ambiguity in the training data. We use four new tasks to study learners' preferences for memorization, arithmetic, hierarchical, and compositional reasoning. Further, we connect to Solomonoff's theory of induction and propose to use description length as a principled and sensitive measure of inductive biases. In our experimental study, we find that LSTM-based learners can learn to perform counting, addition, and multiplication by a constant from a single training example. Furthermore, Transformer and LSTM-based learners show a bias toward hierarchical induction over linear induction, while CNN-based learners prefer the opposite. CNN-based learners also show a bias toward compositional generalization over memorization. Finally, across all our experiments, description length proved to be a sensitive measure of inductive biases.
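
The abstract does not say how description length is estimated. The sketch below is only an illustration under stated assumptions: it uses a prequential (online) code, and a toy Laplace-smoothed symbol counter stands in for a seq2seq learner. The function name `prequential_code_length` and the two toy sequences are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def prequential_code_length(sequence, alphabet):
    """Bits needed to encode `sequence` online.

    Assumption: the predictor is a Laplace-smoothed unigram model,
    a toy stand-in for a trained seq2seq learner.
    """
    counts = Counter()
    total_bits = 0.0
    for seen, symbol in enumerate(sequence):
        # Predict the next symbol from the symbols seen so far only.
        prob = (counts[symbol] + 1) / (seen + len(alphabet))
        total_bits += -math.log2(prob)
        # Update the model after paying the coding cost.
        counts[symbol] += 1
    return total_bits


alphabet = ["a", "b"]
regular = ["a"] * 16      # data that matches the toy model's bias
mixed = ["a", "b"] * 8    # data that the toy model cannot compress

dl_regular = prequential_code_length(regular, alphabet)
dl_mixed = prequential_code_length(mixed, alphabet)
print(f"regular: {dl_regular:.2f} bits, mixed: {dl_mixed:.2f} bits")
```

Under these assumptions the constant sequence receives the shorter code. This is the sense in which a shorter description length indicates data that agrees with a learner's bias.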

The talk and the corresponding paper were published at the ICLR 2021 virtual conference.

Similar Papers

16/11/2020

The World is Not Binary: Learning to Rank with Grayscale Data for Dialogue Response Selection

Zibo Lin, Deng Cai, Yan Wang, Xiaojiang Liu, Haitao Zheng, Shuming Shi

Keywords: response selection, retrieval-based systems, learning-to-rank problem, learning-to-rank

12:03

03/05/2021

When Do Curricula Work?

Xiaoxia (Shirley) Wu, Ethan Dyer, Behnam Neyshabur

Keywords: Empirical Investigation, Understanding Deep Learning, Curriculum Learning

14:37

06/12/2021

No Fear of Heterogeneity: Classifier Calibration for Federated Learning with Non-IID Data

Mi Luo, Fei Chen, Dapeng Hu, Yifan Zhang, Jian Liang, Jiashi Feng

Keywords: optimization, machine learning, federated learning

3:27

06/12/2020

What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation

Vitaly Feldman, Chiyuan Zhang

3:22

06/12/2021

Grad2Task: Improved Few-shot Text Classification Using Gradients for Task Representation

Jixuan Wang, Kuan-Chieh Wang, Frank Rudzicz, Michael Brudno

Keywords: machine learning, transformers, meta learning, language, transfer learning

14:45

19/04/2021

Keep learning: Self-supervised meta-learning for learning from inference

Akhil Kedia, Sai Chetan Chinthakindi

11:27

04/07/2020

Intermediate-Task Transfer Learning with Pretrained Language Models: When and Why Does It Work?

Yada Pruksachatkun, Jason Phang, Haokun Liu, Phu Mon Htut, Xiaoyi Zhang, Richard Yuanzhe Pang, Clara Vania, Katharina Kann, Samuel R. Bowman

Keywords: Intermediate-Task Learning, natural tasks, data-rich task, intermediate-task training

14:47

03/05/2021

Contrastive Learning with Adversarial Perturbations for Conditional Text Generation

Seanie Lee, Dong Bok Lee, Sung Ju Hwang

Keywords: contrastive learning, conditional text generation

4:51

06/12/2021

Think Big, Teach Small: Do Language Models Distil Occam’s Razor?

Gonzalo Jaimovitch-Lopez, David Castellano Falcón, Cesar Ferri, José Hernández-Orallo

Keywords: machine learning, interpretability, few shot learning

12:12

04/07/2020

Coach: A Coarse-to-Fine Approach for Cross-domain Slot Filling

Zihan Liu, Genta Indra Winata, Peng Xu, Pascale Fung

Keywords: Cross-domain Filling, task-oriented systems, slot filling, data problem

6:59

19/04/2021

Progressively pretrained dense corpus index for open-domain question answering

Wenhan Xiong, Hong Wang, William Yang Wang

12:15

03/05/2021

Theoretical bounds on estimation error for meta-learning

James Lucas, Mengye Ren, Irene Raissa KAMENI KAMENI, Toniann Pitassi, Richard Zemel

Keywords: meta learning, minimax risk, few-shot, lower bounds, learning theory

4:46

06/12/2021

Deep Learning on a Data Diet: Finding Important Examples Early in Training

Mansheej Paul, Surya Ganguli, Gintare Karolina Dziugaite

Keywords: deep learning

10:18

22/11/2021

Few-shot Action Recognition with Prototype-centered Attentive Learning

Xiatian Zhu, Antoine S Toisoul, Juan-Manuel Perez-Rua, Li Zhang, Brais Martinez, Tao Xiang

Keywords: Few-shot learning, Video recognition, Action classification, Small training data, Model pre-training, Meta-learning, Transformer, Self-attention learning, Cross-attention learning, Prototype learning, Prototype-centered learning, Hybrid-attention learning

2:22

26/04/2020

Residual Energy-Based Models for Text Generation

Yuntian Deng, Anton Bakhtin, Myle Ott, Arthur Szlam, Marc'Aurelio Ranzato

Keywords: energy-based models, text generation

4:59

23/08/2020

Targeted data-driven regularization for out-of-distribution generalization

Mohammad Mahdi Kamani, Sadegh Farhang, Mehrdad Mahdavi, James Z. Wang

Keywords: data-driven regularization, out-of-distribution generalization, bilevel programming

6:36

04/07/2020

A Contextual Hierarchical Attention Network with Adaptive Objective for Dialogue State Tracking

Yong Shan, Zekang Li, Jinchao Zhang, Fandong Meng, Yang Feng, Cheng Niu, Jie Zhou

Keywords: Dialogue Tracking, slot problem, Contextual Network, DST

11:59

14/06/2020

TransMatch: A Transfer-Learning Scheme for Semi-Supervised Few-Shot Learning

Zhongjie Yu, Lin Chen, Zhongwei Cheng, Jiebo Luo

Keywords: few-shot learning, semi-supervised learning, meta-learning

1:01

06/12/2021

Tuning Mixed Input Hyperparameters on the Fly for Efficient Population Based AutoRL

Jack Parker-Holder, Vu Nguyen, Shaan Desai, Stephen J Roberts

Keywords: optimization, reinforcement learning and planning, bandits

14:41

03/05/2021

MixKD: Towards Efficient Distillation of Large-scale Language Models

Kevin Liang, Weituo Hao, Dinghan Shen, Yufan Zhou, Weizhu Chen, Changyou Chen, Lawrence Carin

Keywords: Representation Learning, Natural Language Processing

3:52

06/12/2020

DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction

Aviral Kumar, Abhishek Gupta, Sergey Levine

3:25

04/07/2020

Balancing Training for Multilingual Neural Machine Translation

Xinyi Wang, Yulia Tsvetkov, Graham Neubig

Keywords: Multilingual Translation, Balancing Training, multilingual models, heuristic baselines

10:22

03/05/2021

Incremental few-shot learning via vector quantization in deep embedded space

Kuilin Chen, Chi-Guhn Lee

Keywords: incremental learning, vector quantization, few-shot

5:07

16/11/2020

Combining Self-Training and Self-Supervised Learning for Unsupervised Disfluency Detection

Shaolei Wang, Zhongyuan Wang, Wanxiang Che, Ting Liu

Keywords: disfluency detection, self-supervised techniques, unsupervised paradigm, noisy training

9:57

04/07/2020

Multi-source Meta Transfer for Low Resource Multiple-Choice Question Answering

Ming Yan, Hao Zhang, Di Jin, Joey Tianyi Zhou

Keywords: Multi-source Transfer, Low Answering, Multiple-choice answering, machine comprehension

7:40

16/11/2020

TeaForN: Teacher-Forcing with N-grams

Sebastian Goodman, Nan Ding, Radu Soricut

Keywords: machine benchmark, news benchmarks, sequence models, teacher-forcing

12:02

06/12/2021

Structured Reordering for Modeling Latent Alignments in Sequence Transduction

Bailin Wang, Mirella Lapata, Ivan Titov

Keywords: language

15:00

06/12/2021

Multi-Objective Meta Learning

Feiyang YE, Baijiong Lin, Zhixiong Yue, Pengxin Guo, Qiao Xiao, Yu Zhang

Keywords: deep learning, optimization, meta learning, domain adaptation, few shot learning

12:27

03/05/2021

The geometry of integration in text classification RNNs

Kyle Aitken, Vinay Ramasesh, Ankush Garg, Yuan Cao, David Sussillo, Niru Maheswaranathan

Keywords: interpretability, dynamical systems, reverse engineering, document classification, Recurrent neural networks

5:13

26/04/2020

Synthesizing Programmatic Policies that Inductively Generalize

Jeevana Priya Inala, Osbert Bastani, Zenna Tavares, Armando Solar-Lezama

Keywords: Program synthesis, reinforcement learning, inductive generalization

4:42

05/01/2021

Unsupervised Multi-Target Domain Adaptation Through Knowledge Distillation

Le Thanh Nguyen-Meidine, Atif Belal, Madhu Kiran, Jose Dolz, Louis-Antoine Blais-Morin, Eric Granger

4:56

06/12/2020

Learning to summarize with human feedback

Nisan Stiennon, Long Ouyang, Jeffrey Wu, Daniel Ziegler, Ryan Lowe, Chelsea Voss, Alec Radford, Dario Amodei, Paul Christiano

3:17

02/02/2021

GLISTER: Generalization based Data Subset Selection for Efficient and Robust Learning

Krishnateja Killamsetty, Durga Sivasubramanian, Ganesh Ramakrishnan, Rishabh Iyer

19:14

30/11/2020

Large-Scale Cross-Domain Few-Shot Learning

Jiechao Guan, Manli Zhang, Zhiwu Lu

7:26

06/12/2021

Deep Reinforcement Learning at the Edge of the Statistical Precipice

Rishabh Agarwal, Max Schwarzer, Pablo Samuel Castro, Aaron Courville, Marc Bellemare

Keywords: reinforcement learning and planning

19:36

26/04/2020

Decoupling Representation and Classifier for Long-Tailed Recognition

Bingyi Kang, Saining Xie, Marcus Rohrbach, Zhicheng Yan, Albert Gordo, Jiashi Feng, Yannis Kalantidis

Keywords: long-tailed recognition, classification

5:00

03/05/2021

On Fast Adversarial Robustness Adaptation in Model-Agnostic Meta-Learning

Ren Wang, Kaidi Xu, Sijia Liu, Pin-Yu Chen, Lily Weng, Chuang Gan, Meng Wang

5:12

03/05/2021

Meta-learning with negative learning rates

Alberto Bernacchia

Keywords: Meta-learning

5:19

03/05/2021

Exploring Balanced Feature Spaces for Representation Learning

Bingyi Kang, Yu Li, Sain Xie, Zehuan Yuan, Jiashi Feng

Keywords: Representation Learning, Contrastive Learning, Long-Tailed Recognition

7:18

30/11/2020

Imbalance Robust Softmax for Deep Embedding Learning

Hao Zhu, Yang Yuan, Guosheng Hu, Xiang Wu, Neil Robertson

7:16