Understanding the Mechanics of SPIGOT: Surrogate Gradients for Latent Structure Learning

16/11/2020

Understanding the Mechanics of SPIGOT: Surrogate Gradients for Latent Structure Learning

Tsvetomila Mihaylova, Vlad Niculae, André F. T. Martins

Keywords: pipeline systems, ste, latent models, end-to-end training

Abstract Paper Similar Papers

Abstract: Latent structure models are a powerful tool for modeling language data: they can mitigate the error propagation and annotation bottleneck in pipeline systems, while simultaneously uncovering linguistic insights about the data. One challenge with end-to-end training of these models is the argmax operation, which has null gradient. In this paper, we focus on surrogate gradients, a popular strategy to deal with this problem. We explore latent structure learning through the angle of pulling back the downstream learning objective. In this paradigm, we discover a principled motivation for both the straight-through estimator (STE) as well as the recently-proposed SPIGOT -- a variant of STE for structured models. Our perspective leads to new algorithms in the same family. We empirically compare the known and the novel pulled-back estimators against the popular alternatives, yielding new insight for practitioners and revealing intriguing failure cases.

0

0

0

0

Share

This is an embedded video. Talk and the respective paper are published at EMNLP 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment

no comments yet

Similar Papers

19/04/2021

Keep learning: Self-supervised meta-learning for learning from inference

Akhil Kedia, Sai Chetan Chinthakindi

Keywords Paper

0

0

0

0

11:27

06/12/2020

Modeling and Optimization Trade-off in Meta-learning

Katelyn Gao, Ozan Sener

Keywords Paper

0

0

0

0

3:21

26/04/2020

Improving Neural Language Generation with Spectrum Control

Lingxiao Wang, Jing Huang, Kevin Huang and
Ziniu Hu, Guangtao Wang, Quanquan Gu

Keywords Paper

0

0

0

0

4:58

06/12/2021

Contrastively Disentangled Sequential Variational Autoencoder

Junwen Bai, Weiran Wang, Carla Gomes

Keywords Paper

self-supervised learning, generative model, contrastive learning, representation learning, interpretability

0

0

0

0

12:53

12/07/2020

Educating Text Autoencoders: Latent Representation Guidance via Denoising

Tianxiao Shen, Jonas Mueller, Regina Barzilay, Tommi Jaakkola

Keywords Paper

Deep Learning - Generative Models and Autoencoders

0

0

0

0

17:06

06/12/2021

Noether Networks: meta-learning useful conserved quantities

Ferran Alet, Dylan Doblar, Allan Zhou and
Josh Tenenbaum, Kenji Kawaguchi, Chelsea Finn

Keywords Paper

machine learning, vision, meta learning

0

0

0

0

11:18

03/05/2021

Auxiliary Task Update Decomposition: The Good, the Bad and the Neutral

Lucio Dery, Yann Dauphin, David Grangier

Keywords Paper

multitask learning, deeplearning, pre-training, gradient decomposition

0

0

0

0

5:22

18/07/2021

Improving Generalization in Meta-learning via Task Augmentation

Huaxiu Yao, Long-Kai Huang, Linjun Zhang and
Ying WEI, Li Tian, James Zou, Junzhou Huang, Zhenhui (Jessie) Li

Keywords Paper

Algorithms, Multitask, Transfer, and Meta Learning

0

0

0

0

8:27

06/12/2021

SSUL: Semantic Segmentation with Unknown Label for Exemplar-based Class-Incremental Learning

Sungmin Cha, beomyoung kim, YoungJoon Yoo, Taesup Moon

Keywords Paper

machine learning, vision

0

0

0

0

14:05

06/12/2020

Posterior Re-calibration for Imbalanced Datasets

Junjiao Tian, Yen-Cheng Liu, Nathaniel Glaser and
Yen-Chang Hsu, Zsolt Kira

Keywords Paper

Algorithms -> Few-Shot Learning, Applications -> Computer Vision

0

0

0

0

3:23

18/07/2021

Knowledge Enhanced Machine Learning Pipeline against Diverse Adversarial Attacks

Nezihe Merve Gürel, Xiangyu Qi, Luka Rimanic and
Ce Zhang, Bo Li

Keywords Paper

Algorithms, Adversarial Examples

0

0

0

0

5:46

12/07/2020

Unsupervised Transfer Learning for Spatiotemporal Predictive Networks

Zhiyu Yao, Yunbo Wang, Mingsheng Long, Jianmin Wang

Keywords Paper

Sequential, Network, and Time-Series Modeling

0

0

0

0

15:19

06/12/2021

Redesigning the Transformer Architecture with Insights from Multi-particle Dynamical Systems

Subhabrata Dutta, Tanya Gautam, Soumen Chakrabarti, Tanmoy Chakraborty

Keywords Paper

deep learning, transformers

0

0

0

0

11:54

16/11/2020

An Imitation Game for Learning Semantic Parsers from User Interaction

Ziyu Yao, Yiqi Tang, Wen-tau Yih and
Huan Sun, Yu Su

Keywords Paper

bootstrapping, fine-tuning parsers, theoretical analysis, text-to-sql problem

0

0

0

0

11:49

19/10/2020

MetaTPOT: Enhancing a tree-based pipeline optimization tool using meta-learning

Doron Laadan, Roman Vainshtein, Yarden Curiel and
Gilad Katz, Lior Rokach

Keywords Paper

tpot, meta-learning, genetic programming(gp), automl

0

0

0

0

6:41

13/04/2021

When MAML can adapt fast and how to assist when it cannot

Sébastien M. R. Arnold, Shariq Iqbal, Fei Sha

Keywords Paper

0

0

0

0

3:00

06/12/2021

A Framework to Learn with Interpretation

Jayneel Parekh, Pavlo Mozharovskyi, Florence d'Alché-Buc

Keywords Paper

deep learning, interpretability

0

0

0

0

14:05

02/02/2021

IsoBN: Fine-Tuning BERT with Isotropic Batch Normalization

Wenxuan Zhou, Bill Yuchen Lin, Xiang Ren

Keywords Paper

0

0

0

0

16:25

19/08/2021

A Structure Self-Aware Model for Discourse Parsing on Multi-Party Dialogues

Ante Wang, Linfeng Song, Hui Jiang and
Shaopeng Lai, Junfeng Yao, Min Zhang, Jinsong Su

Keywords Paper

Natural Language Processing, Dialogue, Discourse, Tagging, Chunking, and Parsing

0

0

0

0

8:33

18/07/2021

Self-Damaging Contrastive Learning

Ziyu Jiang, Tianlong Chen, Bobak Mortazavi, Zhangyang Wang

Keywords Paper

Algorithms, Unsupervised Learning

0

0

0

1

5:10

06/12/2021

Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-Regularized Fine-Tuning

Yifan Zhang, Bryan Hooi, Dapeng Hu and
Jian Liang, Jiashi Feng

Keywords Paper

optimization, machine learning, self-supervised learning, vision, contrastive learning, representation learning, transfer learning

0

0

0

0

14:34

06/12/2020

Functional Regularization for Representation Learning: A Unified Theoretical Perspective

Siddhant Garg, Yingyu Liang

Keywords Paper

0

0

0

0

3:19

16/11/2020

An Exploration of Arbitrary-Order Sequence Labeling via Energy-Based Inference Networks

Lifu Tu, Tianyu Liu, Kevin Gimpel

Keywords Paper

natural processing, sequence labeling, semantic labeling, parsing

0

0

0

0

10:07

04/07/2020

Paraphrase Generation by Learning How to Edit from Samples

Amirhossein Kazemnejad, Mohammadreza Salehi, Mahdieh Soleymani Baghshah

Keywords Paper

Paraphrase Generation, Neural sequence, sequence generation, retrieval-based method

0

0

0

0

12:20

06/12/2021

Improving Compositionality of Neural Networks by Decoding Representations to Inputs

Mike Wu, Noah Goodman, Stefano Ermon

Keywords Paper

deep learning, machine learning, adversarial robustness and security, generative model

0

0

0

0

12:36

18/07/2021

Finding Relevant Information via a Discrete Fourier Expansion

Mohsen Heidari, Jithin Sreedharan, Gil Shamir, Wojciech Szpankowski

Keywords Paper

, Theory, Theory, Statistical Learning Theory

0

0

0

0

5:25

04/07/2020

Multi-Domain Dialogue Acts and Response Co-Generation

Kai Wang, Junfeng Tian, Rui Wang and
Xiaojun Quan, Jianxing Yu

Keywords Paper

Generating responses, task-oriented systems, response generation, automatic evaluations

0

0

0

1

10:01

18/11/2020

CCA-flow: Deep multi-view subspace learning with inverse autoregressive flow

Jia He, Feiyang Pan, Fuzhen Zhuang, Qing He

Keywords Paper

0

0

0

0

11:33

03/05/2021

Improving Transformation Invariance in Contrastive Representation Learning

Adam Foster, Rattana Pukdee, Tom Rainforth

Keywords Paper

transformation invariance, contrastive learning, representation learning

0

0

0

0

5:23

06/12/2020

Joint Contrastive Learning with Infinite Possibilities

Qi Cai, Yu Wang, Yingwei Pan and
Ting Yao, Tao Mei

Keywords Paper

0

0

0

0

3:06

06/12/2021

Simple Stochastic and Online Gradient Descent Algorithms for Pairwise Learning

ZHENHUAN YANG, Yunwen Lei, Puyu Wang and
Tianbao Yang, Yiming Ying

Keywords Paper

optimization, machine learning, privacy

0

0

0

0

14:40

14/06/2020

McFlow: Monte Carlo Flow Models for Data Imputation

Trevor W. Richardson, Wencheng Wu, Lei Lin and
Beilei Xu, Edgar A. Bernal

Keywords Paper

data imputation, alternating learning, normalizing flow models, explicit and tractable generative models, deep unsupervised learning, conditional maximum likelihood estimation, partially observed data, learning to optimize, nonlinear independent component analysis, latent variable sampling

0

0

0

0

1:01

02/02/2021

Deep Semantic Dictionary Learning for Multi-label Image Classification

Fengtao Zhou, Sheng Huang, Yun Xing

Keywords Paper

0

0

0

0

15:06

26/08/2020

Towards Competitive N-gram Smoothing

Moein Falahatgar, Mesrob Ohannessian, Alon Orlitsky, Venkatadheeraj Pichapati

Keywords Paper

0

0

0

0

17:51

18/07/2021

The Impact of Record Linkage on Learning from Feature Partitioned Data

Richard Nock, Stephen J Hardy, Wilko Henecka and
Hamish Ivey-Law, Jakub Nabaglo, Giorgio Patrini, Guillaume Smith, Brian Thorne

Keywords Paper

Theory, Statistical Learning Theory

0

0

0

0

6:02

12/07/2020

Variable-Bitrate Neural Compression via Bayesian Arithmetic Coding

Yibo Yang, Robert Bamler, Stephan Mandt

Keywords Paper

Deep Learning - General

0

0

0

0

15:08

22/11/2021

SemGIF: A Semantics Guided Incremental Few-shot Learning Framework with Generative Replay

S Divakar Bhat, Biplab Banerjee, Subhasis Chaudhuri

Keywords Paper

Incremental few shot learning, Few shot learning, Incremental learning, feature augmentation, cross dataset, heterogenous

0

0

0

0

3:05

06/12/2021

Refining Language Models with Compositional Explanations

Huihan Yao, Ying Chen, Qinyuan Ye and
Xisen Jin, Xiang Ren

Keywords Paper

machine learning, fairness, language

0

0

0

0

13:17

06/12/2021

Differentiable Spline Approximations

Minsu Cho, Aditya Balu, Ameya Joshi and
Anjana Deva Prasad, Biswajit Khara, Soumik Sarkar, Baskar Ganapathysubramanian, Adarsh Krishnamurthy, Chinmay Hegde

Keywords Paper

optimization, machine learning

0

0

0

0

7:18

16/11/2020

A Diagnostic Study of Explainability Techniques for Text Classification

Pepa Atanasova, Jakob Grue Simonsen, Christina Lioma, Isabelle Augenstein

Keywords Paper

downstream tasks, machine learning, explainability techniques, diverse techniques

0

0

0

0

11:24