Enhancing Pre-trained Chinese Character Representation with Word-aligned Attention

Abstract: Most Chinese pre-trained models take the character as the basic unit and learn representations from each character's external context, ignoring the semantics expressed at the word level, the smallest meaningful unit of utterance in Chinese. We therefore propose a novel word-aligned attention that exploits explicit word information and is complementary to various character-based Chinese pre-trained language models. Specifically, we devise a pooling mechanism to align character-level attention to the word level, and we propose to alleviate the potential issue of segmentation error propagation through multi-source information fusion. As a result, word and character information are explicitly integrated during fine-tuning. Experimental results on five Chinese NLP benchmark tasks demonstrate that our method achieves significant improvements over BERT, ERNIE and BERT-wwm.
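
The pooling step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes mean pooling (the abstract does not say which pooling operator is used), and the function name word_align_attention and the word_lengths argument are hypothetical names introduced here. The idea shown is that the attention weights a query character assigns to the characters of one word are pooled into a single value, which is then copied back to every character of that word. Multi-source fusion over several segmenters is not covered.

```python
from typing import List


def word_align_attention(attn: List[List[float]],
                         word_lengths: List[int]) -> List[List[float]]:
    """Pool character-level attention within each word span.

    attn[i][j] is the weight that character i assigns to character j.
    word_lengths gives the length of each segmented word, in order
    (hypothetical input; a real system would get it from a segmenter).
    Mean pooling is an assumption made for this sketch.
    """
    n = len(attn)
    if sum(word_lengths) != n or any(len(row) != n for row in attn):
        raise ValueError("segmentation must cover a square attention matrix")

    aligned = []
    for row in attn:
        new_row: List[float] = []
        start = 0
        for length in word_lengths:
            span = row[start:start + length]
            pooled = sum(span) / length        # mean over the word's characters
            new_row.extend([pooled] * length)  # copy back to character positions
            start += length
        aligned.append(new_row)
    return aligned


# Three characters segmented as one two-character word plus one single character.
attention = [
    [0.25, 0.5, 0.25],
    [0.5, 0.0, 0.5],
    [0.0, 1.0, 0.0],
]
print(word_align_attention(attention, [2, 1]))
```

With mean pooling each row keeps its original sum, so rows that were normalized stay normalized; other pooling choices would generally need renormalization.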

30/11/2020

Yanzeng Li, Bowen Yu, Xue Mengge, Tingwen Liu

Similar Papers

Self-supervised Learning of Orc-Bert Augmentator for Recognizing Few-Shot Oracle Characters

Wenhui Han, Xinlin Ren, Hangyu Lin, Yanwei Fu, Xiangyang Xue

Entity Enhanced BERT Pre-training for Chinese NER

Chen Jia, Yuefeng Shi, Qinrong Yang, Yue Zhang

Keywords: chinese ner, pre-training, ner fine-tuning, ner

Coupling Distant Annotation and Adversarial Training for Cross-Domain Chinese Word Segmentation

Ning Ding, Dingkun Long, Guangwei Xu, Muhua Zhu, Pengjun Xie, Xiaobin Wang, Haitao Zheng

Keywords: Coupling Annotation, Cross-Domain Segmentation, Chinese segmentation, Chinese CWS

Try to Substitute: An Unsupervised Chinese Word Sense Disambiguation Method Based on HowNet

Bairu Hou, Fanchao Qi, Yuan Zang, Xurui Zhang, Zhiyuan Liu, Maosong Sun

LET: Linguistic Knowledge Enhanced Graph Transformer for Chinese Short Text Matching

Boer Lyu, Lu Chen, Su Zhu, Kai Yu

Multi-grained Chinese Word Segmentation with Weakly Labeled Data

Chen Gong, Zhenghua Li, Bowei Zou, Min Zhang

Spelling Error Correction with Soft-Masked BERT

Shaohua Zhang, Haoran Huang, Jicong Liu, Hang Li

Keywords: Spelling Correction, Chinese correction, Chinese CSC, error detection

A Graph-based Model for Joint Chinese Word Segmentation and Dependency Parsing

Hang Yan, Xipeng Qiu, Xuanjing Huang

Keywords: Joint Segmentation, Joint Parsing, Chinese segmentation, dependency parsing

Handwritten Chinese Font Generation With Collaborative Stroke Refinement

Chuan Wen, Yujie Pan, Jie Chang, Ya Zhang, Siheng Chen, Yanfeng Wang, Mei Han, Qi Tian

A Complete Shift-Reduce Chinese Discourse Parser with Robust Dynamic Oracle

Shyh-Shiun Hung, Hen-Hsen Huang, Hsin-Hsi Chen

Keywords: Chinese parsing, rhetorical recognition, Shift-Reduce Parser, Robust Oracle

Joint Chinese Word Segmentation and Part-of-speech Tagging via Two-way Attentions of Auto-analyzed Knowledge

Yuanhe Tian, Yan Song, Xiang Ao, Fei Xia, Xiaojun Quan, Tong Zhang, Yonggang Wang

Keywords: Chinese Segmentation, Part-of-speech Tagging, Chinese processing, joint tagging

TED-CDB: A Large-Scale Chinese Discourse Relation Dataset on TED Talks

Wanqiu Long, Bonnie Webber, Deyi Xiong

Keywords: -way classification, same-language transfer, same-domain transfer, ted-cdb

CLiMP: A benchmark for Chinese language model evaluation

Beilei Xiang, Changbing Yang, Yu Li, Alex Warstadt, Katharina Kann

Simplify the Usage of Lexicon in Chinese NER

Ruotian Ma, Minlong Peng, Qi Zhang, Zhongyu Wei, Xuanjing Huang

Keywords: Chinese recognition, NER, Lattice-LSTM, complex architecture

Synonym Knowledge Enhanced Reader for Chinese Idiom Reading Comprehension

Siyu Long, Ran Wang, Kun Tao, Jiali Zeng, Xinyu Dai

UnihanLM: Coarse-to-fine Chinese-Japanese language model pretraining with the unihan database

Canwen Xu, Tao Ge, Chenliang Li, Furu Wei

Improving the Efficiency of Grammatical Error Correction with Erroneous Span Detection and Correction

Mengyun Chen, Tao Ge, Xingxing Zhang, Furu Wei, Ming Zhou

Keywords: erroneous detection, erroneous correction, inference, language-independent approach

Diverse and Informative Dialogue Generation with Context-Specific Commonsense Knowledge Awareness

Sixing Wu, Ying Li, Dawei Zhang, Yang Zhou, Zhonghai Wu

Keywords: Diverse Generation, dialogue generation, Context-Specific Awareness, Generative systems

Does Chinese BERT Encode Word Structure?

Yile Wang, Leyang Cui, Yue Zhang

Fine-grained Information Status Classification Using Discourse Context-Aware BERT

Yufang Hou

Joint Chinese Word Segmentation and Part-of-speech Tagging via Multi-channel Attention of Character N-grams

Yuanhe Tian, Yan Song, Fei Xia

Chinese document classification with bi-directional convolutional language model

Bin Liu, Guosheng Yin
