Abstract:
Transformer models have advanced the state of the art in many Natural Language Processing (NLP) tasks. In this paper, we present a new Transformer architecture, ``Extended Transformer Construction'' (ETC), that addresses two key challenges of standard Transformer architectures, namely scaling input length and encoding structured inputs. To scale attention to longer inputs, we introduce a novel global-local attention mechanism between global tokens and regular input tokens. We also show that combining global-local attention with relative position encodings and a ``Contrastive Predictive Coding'' (CPC) pre-training objective allows ETC to encode structured inputs. We achieve state-of-the-art results on four natural language datasets requiring long and/or structured inputs.
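To make the global-local attention idea concrete, the sketch below builds a boolean attention mask in which a small set of global tokens attends to the entire input, while the regular long-input tokens attend to all global tokens plus a fixed-radius local window. This is a minimal illustration under assumed conventions (token ordering, function name, and the `local_radius` parameter are hypothetical), not the authors' implementation, which is described in the body of the paper.

```python
import numpy as np

def global_local_attention_mask(num_global: int, num_long: int, local_radius: int) -> np.ndarray:
    """Illustrative sketch of an ETC-style global-local attention mask.

    Assumed token order: [global tokens | long-input tokens].
    - Global tokens attend to every token.
    - Long-input tokens attend to all global tokens and to long-input
      tokens within `local_radius` positions on either side.
    """
    n = num_global + num_long
    mask = np.zeros((n, n), dtype=bool)

    # Global tokens attend everywhere (global-to-global and global-to-long).
    mask[:num_global, :] = True

    # Every long-input token attends to every global token (long-to-global).
    mask[num_global:, :num_global] = True

    # Sliding-window attention over the long input (long-to-long).
    for i in range(num_long):
        lo = max(0, i - local_radius)
        hi = min(num_long, i + local_radius + 1)
        mask[num_global + i, num_global + lo:num_global + hi] = True

    return mask

# Example: 2 global tokens, 8 long-input tokens, local radius 1.
print(global_local_attention_mask(2, 8, 1).astype(int))
```

Because the long-to-long piece is restricted to a local window, its cost grows linearly with input length rather than quadratically; the small global memory is what lets distant positions still exchange information.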