On the Ability and Limitations of Transformers to Recognize Formal Languages

16/11/2020

Satwik Bhattamishra, Kabir Ahuja, Navin Goyal

Keywords: nlp tasks, construction, transformers, lstms

Abstract: Transformers have supplanted recurrent models in a large number of NLP tasks. However, the differences in their abilities to model different syntactic properties remain largely unknown. Past work suggests that LSTMs generalize very well on regular languages and have close connections with counter languages. In this work, we systematically study the ability of Transformers to model such languages as well as the role of their individual components in doing so. We first provide a construction of Transformers for a subclass of counter languages, including well-studied languages such as n-ary Boolean Expressions, Dyck-1, and its generalizations. In experiments, we find that Transformers do well on this subclass, and their learned mechanism strongly correlates with our construction. Perhaps surprisingly, in contrast to LSTMs, Transformers do well only on a subset of regular languages, with performance degrading as we make the languages more complex according to a well-known measure of complexity. Our analysis also provides insights on the role of the self-attention mechanism in modeling certain behaviors and on the influence of positional encoding schemes on the learning and generalization abilities of the model.
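To make the notion of a counter language concrete, here is a minimal sketch of recognizing Dyck-1, the language of balanced strings over a single bracket pair, using one integer counter. It is only an illustration of the language itself, not the authors' Transformer construction; the function name `is_dyck1` and the choice of `(` and `)` as the bracket symbols are assumptions made for this example.

```python
def is_dyck1(s: str) -> bool:
    """Return True if s is a balanced string over the single bracket pair '(' and ')'.

    Hypothetical helper for illustration: a single counter tracks the current
    nesting depth, which is all the memory Dyck-1 recognition needs.
    """
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1          # an opening bracket increases the depth
        elif ch == ")":
            depth -= 1          # a closing bracket decreases the depth
            if depth < 0:       # closing bracket with nothing to match
                return False
        else:
            raise ValueError(f"unexpected symbol: {ch!r}")
    return depth == 0           # every opening bracket must have been closed


if __name__ == "__main__":
    for example in ["", "()", "(()())", "())(", "(("]:
        print(repr(example), is_dyck1(example))
```

The counter never needs to hold anything other than the current depth, which is why Dyck-1 is a standard example of a language that a one-counter machine can recognize.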

This is an embedded video. The talk and the corresponding paper were published at the EMNLP 2020 virtual conference.

Similar Papers

16/11/2020

Attention is Not Only a Weight: Analyzing Transformers with Vector Norms

Goro Kobayashi, Tatsuki Kuribayashi, Sho Yokoi, Kentaro Inui

Keywords: natural processing, norm-based analyses, word alignment, transformers

11:51

18/07/2021

Evolving Attention with Residual Convolutions

Yujing Wang, Yaming Yang, Jiangang Bai, Mingliang Zhang, Jing Bai, Jing Yu, Ce Zhang, Gao Huang, Yunhai Tong

Keywords: Deep Learning, Architectures

4:36

04/07/2020

Improving Transformer Models by Reordering their Sublayers

Ofir Press, Noah A. Smith, Omer Levy

Keywords: task-specific reorderings, Transformer Models, Multilayer networks, randomly transformers

12:29

26/04/2020

Are Transformers universal approximators of sequence-to-sequence functions?

Chulhee Yun, Srinadh Bhojanapalli, Ankit Singh Rawat, Sashank Reddi, Sanjiv Kumar

Keywords: Transformer, universal approximation, contextual mapping, expressive power, permutation equivariance

4:55

02/02/2021

Nyströmformer: A Nyström-based Algorithm for Approximating Self-Attention

Yunyang Xiong, Zhanpeng Zeng, Rudrasis Chakraborty, Mingxing Tan, Glenn Fung, Yin Li, Vikas Singh

17:26

04/07/2020

Lipschitz Constrained Parameter Initialization for Deep Transformers

Hongfei Xu, Qiuhui Liu, Josef van Genabith, Deyi Xiong, Jingyi Zhang

Keywords: Lipschitz Initialization, Deep Transformers, Transformer model, layer normalization

4:54

19/04/2021

A neural few-shot text classification reality check

Thomas Dopierre, Christophe Gravier, Wilfried Logerais

9:47

06/12/2021

Going Beyond Linear Transformers with Recurrent Fast Weight Programmers

Kazuki Irie, Imanol Schlag, Róbert Csordás, Jürgen Schmidhuber

Keywords: deep learning, reinforcement learning and planning, transformers

7:23

06/12/2021

Pay Attention to MLPs

Hanxiao Liu, Zihang Dai, David So, Quoc V Le

Keywords: deep learning, transformers

1:43

19/04/2021

Adv-OLM: Generating textual adversaries via OLM

Vijit Malik, Ashwani Bhat, Ashutosh Modi

7:04

04/07/2020

Adaptive Transformers for Learning Multimodal Representations

Prajjwal Bhargava

Keywords: Multimodal Representations, vision tasks, Adaptive Transformers, transformers

14:09

04/07/2020

Theoretical Limitations of Self-Attention in Neural Sequence Models

Michael Hahn

Keywords: NLP, Self-Attention Models, Neural Models, Transformers

14:02

03/05/2021

Random Feature Attention

Hao Peng, Nikolaos Pappas, Dani Yogatama, Roy Schwartz, Noah Smith, Lingpeng Kong

Keywords: machine translation, transformers, language modeling, Attention

10:20

12/07/2020

Stabilizing Transformers for Reinforcement Learning

Emilio Parisotto, Francis Song, Jack Rae, Razvan Pascanu, Caglar Gulcehre, Siddhant Jayakumar, Max Jaderberg, Raphael Lopez Kaufman, Aidan Clark, Seb Noury, Matthew Botvinick, Nicolas Heess, Raia Hadsell

Keywords: Reinforcement Learning - Deep RL

14:20

06/12/2021

Are Transformers more robust than CNNs?

Yutong Bai, Jieru Mei, Alan Yuille, Cihang Xie

Keywords: deep learning, robustness, adversarial robustness and security, transformers

15:44

16/11/2020

Chaining Behaviors from Data with Model-Free Reinforcement Learning

Avi Singh, Albert Yu, Jonathan Yang, Jesse Zhang, Aviral Kumar, Sergey Levine

5:01

16/11/2020

Understanding the Difficulty of Training Transformers

Liyuan Liu, Xiaodong Liu, Jianfeng Gao, Weizhu Chen, Jiawei Han

Keywords: nlp tasks, training, transformer training, transformers

10:49

16/11/2020

What Do Position Embeddings Learn? An Empirical Study of Pre-Trained Language Model Positional Encoding

Yu-An Wang, Yun-Nung Chen

Keywords: nlp tasks, transformers, position transformers, iconic tasks

10:13

03/05/2021

Parameter Efficient Multimodal Transformers for Video Representation Learning

Sangho Lee, Youngjae Yu, Gunhee Kim, Thomas Breuel, Jan Kautz, Yale Song

Keywords: Self-supervised learning, audio-visual representation learning, video representation learning

5:02

06/12/2021

Combiner: Full Attention Transformer with Sparse Computation Cost

Hongyu Ren, Hanjun Dai, Zihang Dai, Mengjiao Yang, Jure Leskovec, Dale Schuurmans, Bo Dai

Keywords: transformers

14:31

16/11/2020

Towards General and Autonomous Learning of Core Skills: A Case Study in Locomotion

Roland Hafner, Tim Hertweck, Philipp Kloeppner, Michael Bloesch, Michael Neunert, Markus Wulfmeier, Saran Tunyasuvunakool, Nicolas Heess, Martin Riedmiller

5:24

16/11/2020

Hardware as Policy: Mechanical and Computational Co-Optimization using Deep Reinforcement Learning

Tianjian Chen, Zhanpeng He, Matei Ciocarlie

4:51

06/12/2021

Neural Circuit Synthesis from Specification Patterns

Frederik Schmitt, Christopher Hahn, Markus N Rabe, Bernd Finkbeiner

Keywords: machine learning, transformers, generative model

14:12

06/12/2020

O(n) Connections are Expressive Enough: Universal Approximability of Sparse Transformers

Chulhee Yun, Yin-Wen Chang, Srinadh Bhojanapalli, Ankit Singh Rawat, Sashank Reddi, Sanjiv Kumar

3:23

18/07/2021

Thinking Like Transformers

Gail Weiss, Yoav Goldberg, Eran Yahav

Keywords: Deep Learning, Others

5:15

06/12/2021

Stable, Fast and Accurate: Kernelized Attention with Relative Positional Encoding

Shengjie Luo, Shanda Li, Tianle Cai, Di He, Dinglan Peng, Shuxin Zheng, Guolin Ke, Liwei Wang, Tie-Yan Liu

Keywords: optimization, machine learning, transformers, vision

10:07

06/12/2021

Learning to Execute: Efficient Learning of Universal Plan-Conditioned Policies in Robotics

Ingmar Schubert, Danny Driess, Ozgur S. Oguz, Marc Toussaint

Keywords: reinforcement learning and planning

8:36

06/12/2021

TransGAN: Two Pure Transformers Can Make One Strong GAN, and That Can Scale Up

Yifan Jiang, Shiyu Chang, Zhangyang Wang

Keywords: machine learning, transformers, vision, generative model

3:44

19/10/2020

Deep multifaceted transformers for multi-objective ranking in large-scale e-commerce recommender systems

Yulong Gu, Zhuoye Ding, Shuaiqiang Wang, Lixin Zou, Yiding Liu, Dawei Yin

Keywords: click-through rate prediction, conversion rate prediction, recommender systems, e-commerce, multi-task learning

10:34

16/11/2020

Transformer Based Multi-Source Domain Adaptation

Dustin Wright, Isabelle Augenstein

Keywords: unsupervised adaptation, cnns, rnns, domain classifiers

11:30

06/12/2020

Deep Transformers with Latent Depth

Xian Li, Asa Cooper Stickland, Yuqing Tang, Xiang Kong

3:17

06/12/2021

ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias

Yufei Xu, Qiming Zhang, Jing Zhang, Dacheng Tao

Keywords: machine learning, transformers, vision

10:16

06/12/2021

XCiT: Cross-Covariance Image Transformers

Alaaeldin Ali, Hugo Touvron, Mathilde Caron, Piotr Bojanowski, Matthijs Douze, Armand Joulin, Ivan Laptev, Natalia Neverova, Gabriel Synnaeve, Jakob Verbeek, Herve Jegou

Keywords: deep learning, machine learning, transformers, vision, language

13:15

02/02/2021

*-CFQ: Analyzing the Scalability of Machine Learning on a Compositional Task

Dmitry Tsarkov, Tibor Tihon, Nathan Scales, Nikola Momchev, Danila Sinopalnikov, Nathanael Schärli

16:33

12/07/2020

Working Memory Graphs

Ricky Loynd, Roland Fernandez, Asli Celikyilmaz, Adith Swaminathan, Matthew Hausknecht

Keywords: Reinforcement Learning - Deep RL

15:36

18/07/2021

RRL: Resnet as representation for Reinforcement Learning

Rutav Shah, Vikash Kumar

Keywords: Applications, Applications, Computer Vision; Deep Learning, Deep Autoencoders; Deep Learning, Generative Models; Probabilistic Methods, Reinforcement Learning and Planning, Deep RL

5:13

26/04/2020

Reformer: The Efficient Transformer

Nikita Kitaev, Lukasz Kaiser, Anselm Levskaya

Keywords: attention, locality sensitive hashing, reversible layers

14:23

06/12/2021

Transformers Generalize DeepSets and Can be Extended to Graphs & Hypergraphs

Jinwoo Kim, Saeyoon Oh, Seunghoon Hong

Keywords: deep learning, transformers, graph learning

15:02

26/04/2020

Robustness Verification for Transformers

Zhouxing Shi, Huan Zhang, Kai-Wei Chang, Minlie Huang, Cho-Jui Hsieh

Keywords: Robustness, Verification, Transformers

5:46

06/12/2021

Accelerating Robotic Reinforcement Learning via Parameterized Action Primitives

Murtaza Dalal, Deepak Pathak, Russ Salakhutdinov

Keywords: optimization, reinforcement learning and planning

10:01