05/12/2020

Mixed-lingual pre-training for cross-lingual summarization

Ruochen Xu, Chenguang Zhu, Yu Shi, Michael Zeng, Xuedong Huang

Abstract: Cross-lingual Summarization (CLS) aims at producing a summary in the target language for an article in the source language. Traditional solutions employ a two-step approach, i.e., translate-then-summarize or summarize-then-translate. Recently, end-to-end models have achieved better results, but these approaches are mostly limited by their dependence on large-scale labeled data. We propose a solution based on mixed-lingual pre-training that leverages both cross-lingual tasks such as translation and monolingual tasks such as masked language modeling. Thus, our model can leverage massive monolingual data to enhance its language modeling. Moreover, the architecture has no task-specific components, which saves memory and increases optimization efficiency. We show in experiments that this pre-training scheme can effectively boost the performance of cross-lingual summarization. On the NCLS dataset, our model achieves improvements of 2.82 (English to Chinese) and 1.15 (Chinese to English) ROUGE-1 points over state-of-the-art results.
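
The abstract describes mixing monolingual denoising (masked language model) objectives with cross-lingual translation objectives in a single shared encoder-decoder. Below is a minimal illustrative sketch, not the authors' code, of how such mixed-lingual pre-training examples could be assembled; the MASK_TOKEN, mask_rate, task tags, and the mixing ratio are assumptions chosen only for illustration.

# Minimal sketch of mixed-lingual pre-training data construction:
# monolingual masked-LM examples and cross-lingual translation examples
# are interleaved into one stream for a single shared encoder-decoder,
# so no task-specific components are required.
# MASK_TOKEN, mask_rate, and the "task" tags are illustrative assumptions.

import random

MASK_TOKEN = "[MASK]"

def make_mlm_example(sentence, mask_rate=0.15, rng=random):
    """Monolingual denoising example: mask some tokens; target is the original."""
    tokens = sentence.split()
    masked = [MASK_TOKEN if rng.random() < mask_rate else t for t in tokens]
    return {"task": "mlm", "source": " ".join(masked), "target": sentence}

def make_translation_example(src_sentence, tgt_sentence):
    """Cross-lingual example: source-language input, target-language output."""
    return {"task": "translate", "source": src_sentence, "target": tgt_sentence}

def mixed_lingual_stream(mono_corpus, parallel_corpus, mlm_prob=0.5, rng=random):
    """Yield an endless mix of monolingual and cross-lingual training examples."""
    while True:
        if rng.random() < mlm_prob and mono_corpus:
            yield make_mlm_example(rng.choice(mono_corpus), rng=rng)
        elif parallel_corpus:
            yield make_translation_example(*rng.choice(parallel_corpus))

if __name__ == "__main__":
    mono = ["the quick brown fox jumps over the lazy dog"]
    parallel = [("the cat sits on the mat", "le chat est assis sur le tapis")]
    stream = mixed_lingual_stream(mono, parallel)
    for _ in range(3):
        print(next(stream))

Because both kinds of examples are plain source/target pairs, one model consumes them all, which is consistent with the paper's point that avoiding task-specific components saves memory and simplifies optimization.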
