Surprisal Predicts Code-Switching in Chinese-English Bilingual Text

16/11/2020

Jesús Calvillo, Le Fang, Jeremy Cole, David Reitter

Keywords: code-switching, inhibition language, computational model, surprisal

Abstract: Why do bilinguals switch languages within a sentence? The present observational study asks whether word surprisal and word entropy predict code-switching in bilingual written conversation. We describe and model a new dataset of Chinese-English text with 1476 clean code-switched sentences, translated back into Chinese. The model includes known control variables together with word surprisal and word entropy. We found that word surprisal, but not entropy, is a significant predictor that explains code-switching above and beyond other well-known predictors. We also found sentence length to be a significant predictor, which has been related to sentence complexity. We propose high cognitive effort as a reason for code-switching, as it leaves fewer resources for inhibition of the alternative language. We also corroborate previous findings, but this time using a computational model of surprisal, a new language pair, and doing so for written language.
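The two predictors named in the abstract have conventional information-theoretic definitions: surprisal is the negative log probability of a word given its context, and entropy is the expected surprisal over the distribution of possible next words. The Python sketch below illustrates only these standard definitions. The function names and the toy probabilities are assumptions made for illustration; they are not taken from the paper, whose language model is not described in this abstract.

```python
import math

def surprisal(dist, word):
    """Surprisal in bits of `word` under a next-word distribution `dist`."""
    return -math.log2(dist[word])

def entropy(dist):
    """Shannon entropy in bits of a next-word distribution `dist`."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# Hypothetical next-word distribution for some context (made-up numbers).
next_word = {"meeting": 0.5, "deadline": 0.25, "party": 0.25}

print(surprisal(next_word, "meeting"))  # 1.0 bit: the most expected word
print(surprisal(next_word, "party"))    # 2.0 bits: a less expected word
print(entropy(next_word))               # 1.5 bits: uncertainty before the word is seen
```

Surprisal is a property of the word that actually occurs, while entropy describes the uncertainty before it occurs, which is why the two can be entered as separate predictors in the same model.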

The talk and the respective paper were published at the EMNLP 2020 virtual conference.

Similar Papers

19/04/2021

Enriching non-autoregressive transformer with syntactic and semantic structures for neural machine translation

Ye Liu, Yao Wan, Jianguo Zhang, Wenting Zhao, Philip Yu

10:18

04/07/2020

Character-Level Translation with Self-attention

Yingqiang Gao, Nikola I. Nikolov, Yuhuang Hu, Richard H.R. Hahnloser

Keywords: Character-Level Translation, bilingual translation, self-attention models, transformer model

8:03

04/07/2020

A Graph-based Model for Joint Chinese Word Segmentation and Dependency Parsing

Hang Yan, Xipeng Qiu, Xuanjing Huang

Keywords: Joint Segmentation, Joint Parsing, Chinese segmentation, dependency parsing

8:15

01/07/2020

Robust Neural Machine Translation with ASR Errors

Haiyang Xue, Yang Feng, Shuhao Gu, Wei Chen

8:15

04/07/2020

GLUECoS: An Evaluation Benchmark for Code-Switched NLP

Simran Khanuja, Sandipan Dandapat, Anirudh Srinivasan, Sunayana Sitaram, Monojit Choudhury

Keywords: Code-Switched NLP, cross-lingual tasks, NLP tasks, Language Identification

12:08

01/07/2020

A Deep Reinforced Model for Zero-Shot Cross-Lingual Summarization with Bilingual Semantic Similarity Rewards

Zi-Yi Dou, Sachin Kumar, Yulia Tsvetkov

4:35

16/11/2020

Improving the Efficiency of Grammatical Error Correction with Erroneous Span Detection and Correction

Mengyun Chen, Tao Ge, Xingxing Zhang, Furu Wei, Ming Zhou

Keywords: erroneous detection, erroneous correction, inference, language-independent approach

6:27

08/12/2020

Synonym Knowledge Enhanced Reader for Chinese Idiom Reading Comprehension

Siyu Long, Ran Wang, Kun Tao, Jiali Zeng, Xinyu Dai

9:58

02/02/2021

LET: Linguistic Knowledge Enhanced Graph Transformer for Chinese Short Text Matching

Boer Lyu, Lu Chen, Su Zhu, Kai Yu

15:57

16/11/2020

A Joint Multiple Criteria Model in Transfer Learning for Cross-domain Chinese Word Segmentation

Kaiyu Huang, Degen Huang, Zhuang Liu, Fengran Mo

Keywords: natural, chinese segmentation, chinese, chinese tasks

10:49

04/07/2020

Pre-training via Leveraging Assisting Languages for Neural Machine Translation

Haiyue Song, Raj Dabre, Zhuoyuan Mao, Fei Cheng, Sadao Kurohashi, Eiichiro Sumita

Keywords: Neural Translation, S2S tasks, LOI, low-resource translation

12:04

07/06/2020

Learning Cross-Lingual Word Embeddings from Twitter via Distant Supervision

Jose Camacho-Collados, Yerai Doval Mosquera, Eugenio Martínez-Cámara, Luis Espinosa-Anke, Francesco Barbieri, Steven Schockaert

Keywords: embedding spaces, embeddings, languages, learning, performance, representations, shared, spaces, texts, twitter, word embeddings, words

10:39

05/12/2020

English-to-Chinese transliteration with phonetic auxiliary task

Yuan He, Shay B. Cohen

14:10

08/12/2020

Text Classification by Contrastive Learning and Cross-lingual Data Augmentation for Alzheimer’s Disease Detection

Zhiqiang Guo, Zhaoci Liu, Zhenhua Ling, Shijin Wang, Lingjing Jin, Yunxia Li

13:12

16/11/2020

Continuity of Topic, Interaction, and Query: Learning to Quote in Online Conversations

Lingzhi Wang, Jing Li, Xingshan Zeng, Haisong Zhang, Kam-Fai Wong

Keywords: persuasions, automatic generation, language generation, encoder-decoder framework

11:43

04/07/2020

Meta-Transfer Learning for Code-Switched Speech Recognition

Genta Indra Winata, Samuel Cahyawijaya, Zhaojiang Lin, Zihan Liu, Peng Xu, Pascale Fung

Keywords: Code-Switched Recognition, speech recognition, speech tasks, language tasks

6:07

06/12/2021

FastCorrect: Fast Error Correction with Edit Alignment for Automatic Speech Recognition

Yichong Leng, Xu Tan, Linchen Zhu, Jin Xu, Renqian Luo, Linquan Liu, Tao Qin, Xiangyang Li, Edward Lin, Tie-Yan Liu

13:44

16/11/2020

Zero-Shot Crosslingual Sentence Simplification

Jonathan Mallinson, Rico Sennrich, Mirella Lapata

Keywords: sentence simplification, translation, simplification, encoder-decoder models

10:34

04/07/2020

SpellGCN: Incorporating Phonological and Visual Similarities into Language Models for Chinese Spelling Check

Xingyi Cheng, Weidi Xu, Kunlong Chen, Shaohua Jiang, Feng Wang, Taifeng Wang, Wei Chu, Yuan Qi

Keywords: Chinese Check, spelling errors, spelling language, CSC

10:27

01/07/2020

How Self-Attention Improves Rare Class Performance in a Question-Answering Dialogue Agent

Adam Stiff, Qi Song, Eric Fosler-Lussier

7:55

04/07/2020

A Systematic Study of Inner-Attention-Based Sentence Representations in Multilingual Neural Machine Translation

Raúl Vázquez, Alessandro Raganato, Mathias Creutz, Jörg Tiedemann

Keywords: Multilingual Translation, Neural translation, transfer learning, translation

14:05

08/12/2020

SpanAlign: Sentence Alignment Method based on Cross-Language Span Prediction and ILP

Katsuki Chousa, Masaaki Nagata, Masaaki Nishino

14:39

04/07/2020

AdvAug: Robust Adversarial Augmentation for Neural Machine Translation

Yong Cheng, Lu Jiang, Wolfgang Macherey, Jacob Eisenstein

Keywords: Robust Augmentation, Neural Translation, Neural NMT, Neural

12:16

04/07/2020

Learning Source Phrase Representations for Neural Machine Translation

Hongfei Xu, Josef van Genabith, Deyi Xiong, Qiuhui Liu, Jingyi Zhang

Keywords: Neural Translation, WMT tasks, Learning Representations, Transformer model

7:18

04/07/2020

Exploiting Syntactic Structure for Better Language Modeling: A Syntactic Distance Approach

Wenyu Du, Zhouhan Lin, Yikang Shen, Timothy J. O'Donnell, Yoshua Bengio, Yue Zhang

Keywords: Language Modeling, Syntactic Approach, neural models, intermediate representation

10:52

16/11/2020

Sparse Text Generation

Pedro Henrique Martins, Zita Marinho, André F. T. Martins

Keywords: story completion, dialogue generation, text generators, language models

11:27

02/02/2021

Commonsense Knowledge Augmentation for Low-Resource Languages via Adversarial Learning

Bosung Kim, Juae Kim, Youngjoong Ko, Jungyun Seo

19:38

04/07/2020

The Cascade Transformer: an Application for Efficient Answer Sentence Selection

Luca Soldaini, Alessandro Moschitti

Keywords: Efficient Selection, Answer Selection, classification tasks, classification

13:39

19/08/2021

Focus on Interaction: A Novel Dynamic Graph Model for Joint Multiple Intent Detection and Slot Filling

Zeyuan Ding, Zhihao Yang, Hongfei Lin, Jian Wang

Keywords: Natural Language Processing, Dialogue

12:36

03/05/2021

A Mathematical Exploration of Why Language Models Help Solve Downstream Tasks

Nikunj Saunshi, Sadhika Malladi, Sanjeev Arora

Keywords: representation learning, self-supervised learning, language models, theory, transfer learning, natural language processing, unsupervised learning

5:16

02/02/2021

Bridging the Domain Gap: Improve Informal Language Translation via Counterfactual Domain Adaptation

Ke Wang, Guandan Chen, Zhongqiang Huang, Xiaojun Wan, Fei Huang

18:24

04/07/2020

Simplify the Usage of Lexicon in Chinese NER

Ruotian Ma, Minlong Peng, Qi Zhang, Zhongyu Wei, Xuanjing Huang

Keywords: Chinese recognition, NER, Lattice-LSTM, complex architecture

11:07

01/07/2020

Incorporating Uncertain Segmentation Information into Chinese NER for Social Media Text

Shengbin Jia, Ling Ding, Xiaojun Chen, Shijia E, Yang Xiang

18:51

16/11/2020

Learning Adaptive Segmentation Policy for Simultaneous Translation

Ruiqing Zhang, Chuanqiang Zhang, Zhongjun He, Hua Wu, Haifeng Wang

Keywords: simultaneous translation, translation, segmentation, chinese-english translation

11:43

16/11/2020

Detecting Fine-Grained Cross-Lingual Semantic Divergences without Supervision by Learning to Rank

Eleftheria Briakou, Marine Carpuat

Keywords: detecting content, cross-lingual nlp, machine problem, annotation

11:06

16/11/2020

Mind Your Inflections! Improving NLP for Non-Standard Englishes with Base-Inflection Encoding

Samson Tan, Shafiq Joty, Lav Varshney, Min-Yen Kan

Keywords: comprehension, fine-tuning models, downstream tasks, nlp systems

10:22

16/11/2020

Small but Mighty: New Benchmarks for Split and Rephrase

Li Zhang, Huaiyu Zhu, Siddhartha Brahma, Yunyao Li

Keywords: text task, fine-grained evaluation, automatic process, rule-based model

6:58

16/11/2020

A Supervised Word Alignment Method based on Cross-Language Span Prediction using Multilingual BERT

Masaaki Nagata, Katsuki Chousa, Masaaki Nishino

Keywords: cross-language prediction, word problem, squad task, alignment

11:13

16/11/2020

On Negative Interference in Multilingual Models: Findings and A Meta-Learning Treatment

Zirui Wang, Zachary C. Lipton, Yulia Tsvetkov

Keywords: multilingual models, meta-learning algorithm, multilingual representations, negative interference

12:03

05/12/2020

Mixed-lingual pre-training for cross-lingual summarization

Ruochen Xu, Chenguang Zhu, Yu Shi, Michael Zeng, Xuedong Huang

11:49