Encoder-Decoder Models Can Benefit from Pre-trained Masked Language Models in Grammatical Error Correction

Abstract: This paper investigates how to effectively incorporate a pre-trained masked language model (MLM), such as BERT, into an encoder-decoder (EncDec) model for grammatical error correction (GEC). The answer to this question is not as straightforward as one might expect because the previous common methods for incorporating an MLM into an EncDec model have potential drawbacks when applied to GEC. For example, the distribution of the inputs to a GEC model can be considerably different (erroneous, clumsy, etc.) from that of the corpora used for pre-training MLMs; however, this issue is not addressed in the previous methods. Our experiments show that our proposed method, where we first fine-tune an MLM with a given GEC corpus and then use the output of the fine-tuned MLM as additional features in the GEC model, maximizes the benefit of the MLM. The best-performing model achieves state-of-the-art performance on the BEA-2019 and CoNLL-2014 benchmarks. Our code is publicly available at: https://github.com/kanekomasahiro/bert-gec.
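The second step described in the abstract, using the MLM output as additional features in the GEC model, can be illustrated with a small sketch. The abstract does not say how the features are combined, so everything here is an assumption made for illustration: the names `attend` and `fuse_layer` are hypothetical, the fusion simply averages two attention contexts, and the encoder and MLM states are assumed to share one dimensionality so that no projection is needed. The first step (fine-tuning the MLM on a GEC corpus) is not shown; `mlm_states` stands in for the output of an already fine-tuned MLM.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention of one query vector over a sequence."""
    scale = math.sqrt(len(query))
    weights = softmax(
        [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    )
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def fuse_layer(encoder_states, mlm_states):
    """Hypothetical encoder sub-layer that uses MLM outputs as extra features.

    Each position attends to the encoder's own states and, separately, to the
    MLM's states; the two context vectors are averaged (an assumed fusion).
    """
    fused = []
    for h in encoder_states:
        self_ctx = attend(h, encoder_states, encoder_states)
        mlm_ctx = attend(h, mlm_states, mlm_states)
        fused.append([0.5 * (a + b) for a, b in zip(self_ctx, mlm_ctx)])
    return fused


# Toy inputs: three token positions, two-dimensional states (made-up values).
encoder_states = [[0.1, 0.3], [0.2, -0.1], [0.0, 0.5]]
mlm_states = [[0.4, 0.4], [-0.2, 0.1], [0.3, 0.0]]
fused = fuse_layer(encoder_states, mlm_states)
```

Averaging keeps the sketch short. A real system would use learned projections, multiple attention heads, and residual connections, and the MLM sequence may differ in length from the encoder sequence because of subword tokenization, which the attention step above already tolerates.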

06/12/2020

Masahiro Kaneko, Masato Mita, Shun Kiyono, Jun Suzuki, Kentaro Inui

Similar Papers

MPNet: Masked and Permuted Pre-training for Language Understanding

Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu

A pairwise probe for understanding BERT fine-tuning on machine reading comprehension

Jie Cai, Zhengzhou Zhu, Ping Nie, Qian Liu

Keywords: machine reading comprehension, pairwise, fine-tune, BERT

On the Sentence Embeddings from Pre-trained Language Models

Bohan Li, Hao Zhou, Junxian He, Mingxuan Wang, Yiming Yang, Lei Li

Keywords: natural processing, semantic task, semantic tasks, pre-trained representations

Improving AMR Parsing with Sequence-to-Sequence Pre-training

Dongqin Xu, Junhui Li, Muhua Zhu, Min Zhang, Guodong Zhou

Keywords: abstract parsing, amr parsing, sequence-to-sequence parsing, machine translation

Perturbed Masking: Parameter-free Probing for Analyzing and Interpreting BERT

Zhiyong Wu, Yun Chen, Ben Kao, Qun Liu

Keywords: Analyzing BERT, linguistic tasks, dependency parsing, probing tasks

Syntactic Data Augmentation Increases Robustness to Inference Heuristics

Junghyun Min, R. Thomas McCoy, Dipanjan Das, Emily Pitler, Tal Linzen

Keywords: Syntactic Augmentation, natural inference, natural NLI, NLI

Improving Disfluency Detection by Self-Training a Self-Attentive Model

Paria Jamshid Lou, Mark Johnson

Keywords: Disfluency Detection, joint parsing, Self-Attentive Model, Self-attentive parsers

Keep learning: Self-supervised meta-learning for learning from inference

Akhil Kedia, Sai Chetan Chinthakindi

How Context Affects Language Models' Factual Predictions

Fabio Petroni, Patrick Lewis, Aleksandra Piktus, Tim Rocktäschel, Yuxiang Wu, Alexander H. Miller, Sebastian Riedel

Incorporating BERT into Neural Machine Translation

Jinhua Zhu, Yingce Xia, Lijun Wu, Di He, Tao Qin, Wengang Zhou, Houqiang Li, Tieyan Liu

Keywords: BERT, Neural Machine Translation

FairFil: Contrastive Neural Debiasing Method for Pretrained Text Encoders

Pengyu Cheng, Weituo Hao, Siyang Yuan, Shijing Si, Lawrence Carin

Keywords: Mutual Information, Pretrained Text Encoders, Contrastive Learning, Fairness

Better neural machine translation by extracting linguistic information from BERT

Hassan S. Shavarani, Anoop Sarkar

Data Weighted Training Strategies for Grammatical Error Correction

Jared Lichtarge, Chris Alberti, Shankar Kumar

Keywords: neural nmt, neural, example scoring, gec

Pre-training Text-to-Text Transformers for Concept-centric Common Sense

Wangchunshu Zhou, Dong-Ho Lee, Ravi Kiran Selvam, Seyeon Lee, Xiang Ren

Keywords: Self-supervised Learning, Commonsense Reasoning, Language Model Pre-training

Self-Supervised Meta-Learning for Few-Shot Natural Language Classification Tasks

Trapit Bansal, Rishikesh Jha, Tsendsuren Munkhdalai, Andrew McCallum

Keywords: nlp applications, fine-tuning, meta-learning problem, supervised tasks

Recall and Learn: Fine-tuning Deep Pretrained Language Models with Less Forgetting

Sanyuan Chen, Yutai Hou, Yiming Cui, Wanxiang Che, Ting Liu, Xiangzhan Yu

Keywords: pretraining, pretraining tasks, learning tasks, fine-tuning bert-large

Syntactic Structure Distillation Pretraining for Bidirectional Encoders

Adhiguna Kuncoro, Lingpeng Kong, Daniel Fried, Dani Yogatama, Laura Rimell, Chris Dyer, Phil Blunsom

Keywords: bert pretraining, structured tasks, natural understanding, textual learners

Investigating learning dynamics of BERT fine-tuning

Yaru Hao, Li Dong, Furu Wei, Ke Xu

Pretrained Encyclopedia: Weakly Supervised Knowledge-Pretrained Language Model

Wenhan Xiong, Jingfei Du, William Yang Wang, Veselin Stoyanov

Refining Language Models with Compositional Explanations

Huihan Yao, Ying Chen, Qinyuan Ye, Xisen Jin, Xiang Ren

Keywords: machine learning, fairness, language

IsoBN: Fine-Tuning BERT with Isotropic Batch Normalization

Wenxuan Zhou, Bill Yuchen Lin, Xiang Ren
