How LSTM Encodes Syntax: Exploring Context Vectors and Semi-Quantization on Natural Text

Abstract: Long Short-Term Memory recurrent neural network (LSTM) is widely used and known to capture informative long-term syntactic dependencies. However, how such information are reflected in its internal vectors for natural text has not yet been sufficiently investigated. We analyze them by learning a language model where syntactic structures are implicitly given. We empirically show that the context update vectors, i.e. outputs of internal gates, are approximately quantized to binary or ternary values to help the language model to count the depth of nesting accurately, as Suzgun et al. (2019) recently show for synthetic Dyck languages. For some dimensions in the context vector, we show that their activations are highly correlated with the depth of phrase structures, such as VP and NP. Moreover, with an L1 regularization, we also found that it can accurately predict whether a word is inside a phrase structure or not from a small number of components of the context vector. Even for the case of learning from raw text, context vectors are shown to still correlate well with the phrase structures. Finally, we show that natural clusters of the functional words and the part of speeches that trigger phrases are represented in a small but principal subspace of the context-update vector of LSTM.

06/12/2021

How LSTM Encodes Syntax: Exploring Context Vectors and Semi-Quantization on Natural Text

Chihiro Shibata, Kei Uchiumi, Daichi Mochihashi

Comments

Similar Papers

Why Do Pretrained Language Models Help in Downstream Tasks? An Analysis of Head and Prompt Tuning

Colin Wei, Sang Michael Xie, Tengyu Ma

Keywords Abstract Paper

theory, machine learning, self-supervised learning, generative model, representation learning, language

Residual Energy-Based Models for Text Generation

Yuntian Deng, Anton Bakhtin, Myle Ott and Arthur Szlam, Marc'Aurelio Ranzato

Keywords Abstract Paper

energy-based models, text generation

Probing Pretrained Language Models for Lexical Semantics

Ivan Vulić, Edoardo Maria Ponti, Robert Litschko and Goran Glavaš, Anna Korhonen

Keywords Abstract Paper

lexical tasks, pretrained models, lms, lexical strategies

IsoBN: Fine-Tuning BERT with Isotropic Batch Normalization

Wenxuan Zhou, Bill Yuchen Lin, Xiang Ren

Keywords Abstract Paper

Decision-Guided Weighted Automata Extraction from Recurrent Neural Networks

Xiyue Zhang, Xiaoning Du, Xiaofei Xie and Lei Ma, Yang Liu, Meng Sun

Keywords Abstract Paper

Syntactic Structure Distillation Pretraining for Bidirectional Encoders

Adhiguna Kuncoro, Lingpeng Kong, Daniel Fried and Dani Yogatama, Laura Rimell, Chris Dyer, Phil Blunsom

Keywords Abstract Paper

bert pretraining, structured tasks, natural understanding, textual learners

How does BERT's attention change when you fine-tune? An analysis methodology and a case study in negation scope

Yiyun Zhao, Steven Bethard

Keywords Abstract Paper

downstream task, NLP problems, knowledge-related tasks, downstream tasks

Do Response Selection Models Really Know What’s Next? Utterance Manipulation Strategies for Multi-turn Response Selection

Taesun Whang, Dongyub Lee, Dongsuk Oh and Chanhee Lee, Kijong Han, Dong-hun Lee, Saebyeok Lee

Keywords Abstract Paper

On the Linguistic Representational Power of Neural Machine Translation Models

Yonatan Belinkov, Nadir Durrani, Fahim Dalvi and Hassan Sajjad, James Glass

Keywords Abstract Paper

Linguistic Models, natural processing, artificial intelligence, translating languages

Cross-Lingual Unsupervised Sentiment Classification with Multi-View Transfer Learning

Hongliang Fei, Ping Li

Keywords Abstract Paper

Cross-Lingual Classification, sentiment classification, unsupervised system, classification

Low-Resource Generation of Multi-hop Reasoning Questions

Jianxing Yu, Wei Liu, Shuang Qiu and Qinliang Su, Kai Wang, Xiaojun Quan, Jian Yin

Keywords Abstract Paper

Low-Resource Questions, generating questions, machine comprehension, multi-hop model

Refining Language Models with Compositional Explanations

Huihan Yao, Ying Chen, Qinyuan Ye and Xisen Jin, Xiang Ren

Keywords Abstract Paper

machine learning, fairness, language

Logical Natural Language Generation from Open-Domain Tables

Wenhu Chen, Jianshu Chen, Yu Su and Zhiyu Chen, William Yang Wang

Keywords Abstract Paper

Logical Generation, neural NLG, surface-level realizations, logical inference

Multi-timescale Representation Learning in LSTM Language Models

Shivangi Mahto, Vy Vo, Javier Turek, Alexander Huth

Keywords Abstract Paper

LSTM, timescales, Language Model

With More Contexts Comes Better Performance: Contextualized Sense Embeddings for All-Round Word Sense Disambiguation

Bianca Scarlini, Tommaso Pasini, Roberto Navigli

Keywords Abstract Paper

natural processing, english task, word-in-context task, contextualized embeddings

A pairwise probe for understanding BERT fine-tuning on machine reading comprehension

Jie Cai, Zhengzhou Zhu, Ping Nie, Qian Liu

Keywords Abstract Paper

machine reading comprehension, pairwise, fine-tune, BERT

CopyBERT: A Unified Approach to Question Generation with Self-Attention

Stalin Varanasi, Saadullah Amin, Guenter Neumann

Keywords Abstract Paper

InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective

Boxin Wang, Shuohang Wang, Yu Cheng and Zhe Gan, Ruoxi Jia, Bo Li, Jingjing Liu

Keywords Abstract Paper

adversarial training, QA, NLI, BERT, information theory, adversarial robustness

Encoding word order in complex embeddings

Benyou Wang, Donghao Zhao, Christina Lioma and Qiuchi Li, Peng Zhang, Jakob Grue Simonsen

Keywords Abstract Paper

word embedding, complex-valued neural network, position embedding

A Framework to Learn with Interpretation

Jayneel Parekh, Pavlo Mozharovskyi, Florence d'Alché-Buc

Keywords Abstract Paper

deep learning, interpretability

Keywords Paper

Yuntian Deng, Anton Bakhtin, Myle Ott and
Arthur Szlam, Marc'Aurelio Ranzato

Keywords Paper

Ivan Vulić, Edoardo Maria Ponti, Robert Litschko and
Goran Glavaš, Anna Korhonen

Keywords Paper

Keywords Paper

Xiyue Zhang, Xiaoning Du, Xiaofei Xie and
Lei Ma, Yang Liu, Meng Sun

Keywords Paper

Adhiguna Kuncoro, Lingpeng Kong, Daniel Fried and
Dani Yogatama, Laura Rimell, Chris Dyer, Phil Blunsom

Keywords Paper

Keywords Paper

Taesun Whang, Dongyub Lee, Dongsuk Oh and
Chanhee Lee, Kijong Han, Dong-hun Lee, Saebyeok Lee

Keywords Paper

Yonatan Belinkov, Nadir Durrani, Fahim Dalvi and
Hassan Sajjad, James Glass

Keywords Paper

Keywords Paper

Jianxing Yu, Wei Liu, Shuang Qiu and
Qinliang Su, Kai Wang, Xiaojun Quan, Jian Yin

Keywords Paper

Huihan Yao, Ying Chen, Qinyuan Ye and
Xisen Jin, Xiang Ren

Keywords Paper

Wenhu Chen, Jianshu Chen, Yu Su and
Zhiyu Chen, William Yang Wang

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Boxin Wang, Shuohang Wang, Yu Cheng and
Zhe Gan, Ruoxi Jia, Bo Li, Jingjing Liu

Keywords Paper

Benyou Wang, Donghao Zhao, Christina Lioma and
Qiuchi Li, Peng Zhang, Jakob Grue Simonsen

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Hongfei Xu, Josef van Genabith, Deyi Xiong and
Qiuhui Liu, Jingyi Zhang

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Seung Jun Moon, Sangwoo Mo, Kimin Lee and
Jaeho Lee, Jinwoo Shin

Keywords Paper

Fabio Petroni, Patrick Lewis, Aleksandra Piktus and
Tim Rocktäschel, Yuxiang Wu, Alexander H. Miller, Sebastian Riedel

Keywords Paper

Keywords Paper

Wangchunshu Zhou, Dong-Ho Lee, Ravi Kiran Selvam and
Seyeon Lee, Xiang Ren

Keywords Paper

Zeqi Tan, Yongliang Shen, Shuai Zhang and
Weiming Lu, Yueting Zhuang

Keywords Paper

Kyle Aitken, Vinay Ramasesh, Ankush Garg and
Yuan Cao, David Sussillo, Niru Maheswaranathan

Keywords Paper

Keywords Paper

Jheng-Hong Yang, Sheng-Chieh Lin, Rodrigo Nogueira and
Ming-Feng Tsai, Chuan-Ju Wang, Jimmy Lin

Keywords Paper

Keywords Paper

Keywords Paper