MATINF: A Jointly Labeled Large-Scale Dataset for Classification, Question Answering and Summarization

04/07/2020

Canwen Xu, Jiaxin Pei, Hongtao Wu, Yiyu Liu, Chenliang Li

Keywords: Classification, Question Answering, Summarization, Natural Language Processing

Abstract: Recently, large-scale datasets have vastly facilitated the development in nearly all domains of Natural Language Processing. However, there is currently no cross-task dataset in NLP, which hinders the development of multi-task learning. We propose MATINF, the first jointly labeled large-scale dataset for classification, question answering and summarization. MATINF contains 1.07 million question-answer pairs with human-labeled categories and user-generated question descriptions. Based on such rich information, MATINF is applicable for three major NLP tasks, including classification, question answering, and summarization. We benchmark existing methods and a novel multi-task baseline over MATINF to inspire further research. Our comprehensive comparison and experiments over MATINF and other datasets demonstrate the merits held by MATINF.

The talk and the corresponding paper were published at the ACL 2020 virtual conference.

Similar Papers

04/07/2020

Cross-Lingual Semantic Role Labeling with High-Quality Translated Training Corpus

Hao Fei, Meishan Zhang, Donghong Ji

Keywords: Cross-Lingual Labeling, semantic labeling, natural understanding, model transferring

19/08/2021

Automatically Paraphrasing via Sentence Reconstruction and Round-trip Translation

Zilu Guo, Zhongqiang Huang, Kenny Q. Zhu, Guandan Chen, Kaibo Zhang, Boxing Chen, Fei Huang

Keywords: Natural Language Processing, Machine Translation, Natural Language Generation, NLP Applications and Tools

04/07/2020

Leveraging Pre-trained Checkpoints for Sequence Generation Tasks

Sascha Rothe, Shashi Narayan and Aliaksei Severyn

Keywords: Sequence Tasks, Natural Processing, Natural tasks, Sequence Generation

08/12/2020

Automatic Interlinear Glossing for Under-Resourced Languages Leveraging Translations

Xingyuan Zhao, Satoru Ozaki, Antonios Anastasopoulos, Graham Neubig, Lori Levin

26/04/2020

Neural Machine Translation with Universal Visual Representation

Zhuosheng Zhang, Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita, Zuchao Li, Hai Zhao

Keywords: Neural Machine Translation, Visual Representation, Multimodal Machine Translation, Language Representation

16/11/2020

End-to-End Slot Alignment and Recognition for Cross-Lingual NLU

Weijia Xu, Batool Haider, Saab Mansour

Keywords: natural understanding, natural, nlu, goal-oriented systems

26/04/2020

Are Pre-trained Language Models Aware of Phrases? Simple but Strong Baselines for Grammar Induction

Taeuk Kim, Jihun Choi, Daniel Edmiston, Sang-goo Lee

16/11/2020

Partially-Aligned Data-to-Text Generation with Distant Supervision

Zihao Fu, Bei Shi, Wai Lam, Lidong Bing, Zhiyuan Liu

Keywords: data-to-text task, generation task, dataset problem, over-generation problem

04/07/2020

Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data

Emily M. Bender, Alexander Koller

Keywords: NLP tasks, natural understanding, large models, NLU

19/08/2021

Ten Years of BabelNet: A Survey

Roberto Navigli, Michele Bevilacqua, Simone Conia, Dario Montagnini, Francesco Cecconi

Keywords: Natural language processing, General

19/04/2021

Bootstrapping relation extractors using syntactic search by examples

Matan Eyal, Asaf Amrami, Hillel Taub-Tabib, Yoav Goldberg

04/07/2020

Roles and Utilization of Attention Heads in Transformer-based Neural Language Models

Jae-young Jo, Sung-Hyon Myaeng

Keywords: Transformer-based Models, natural tasks, downstream tasks, probing tasks

16/11/2020

Asking without Telling: Exploring Latent Ontologies in Contextual Representations

Julian Michael, Jan A. Botha, Ian Tenney

Keywords: pretrained encoders, elmo, bert, latent learning

19/04/2021

Cross-lingual visual pre-training for multimodal machine translation

Ozan Caglayan, Menekse Kuyu, Mustafa Sercan Amac, Pranava Madhyastha, Erkut Erdem, Aykut Erdem, Lucia Specia

02/06/2020

VQuAnDa: Verbalization QUestion ANswering DAtaset

Endri Kacupaj, Hamid Zafar, Jens Lehmann, Maria Maleshkova

19/08/2021

Cross-Domain Slot Filling as Machine Reading Comprehension

Mengshi Yu, Jian Liu, Yufeng Chen, Jinan Xu, Yujie Zhang

Keywords: Natural Language Processing, Dialogue, Information Extraction

04/07/2020

The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding

Xiaodong Liu, Yu Wang, Jianshu Ji, Hao Cheng, Xueyun Zhu, Emmanuel Awa, Pengcheng He, Weizhu Chen, Hoifung Poon, Guihong Cao, Jianfeng Gao

Keywords: Natural Understanding, NLU tasks, classification, regression

16/11/2020

Multi-Stage Pre-training for Low-Resource Domain Adaptation

Rong Zhang, Revanth Gangi Reddy, Md Arafat Sultan, Vittorio Castelli, Anthony Ferritto, Radu Florian, Efsun Sarioglu Kayi, Salim Roukos, Avi Sil, Todd Ward

Keywords: nlp tasks, fine-tuning, auxiliary tasks, lm transfer

08/12/2020

SentiX: A Sentiment-Aware Pre-Trained Model for Cross-Domain Sentiment Analysis

Jie Zhou, Junfeng Tian, Rui Wang, Yuanbin Wu, Wenming Xiao, Liang He

26/04/2020

Pretrained Encyclopedia: Weakly Supervised Knowledge-Pretrained Language Model

Wenhan Xiong, Jingfei Du, William Yang Wang, Veselin Stoyanov

04/07/2020

PLATO: Pre-trained Dialogue Generation Model with Discrete Latent Variable

Siqi Bao, Huang He, Fan Wang, Hua Wu, Haifeng Wang

Keywords: natural tasks, conversational answering, language generation, one-to-many problem

04/07/2020

Curriculum Learning for Natural Language Understanding

Benfeng Xu, Licheng Zhang, Zhendong Mao, Quan Wang, Hongtao Xie, Yongdong Zhang

Keywords: Curriculum Learning, Natural Understanding, natural tasks, NLU tasks

19/08/2021

Exemplification Modeling: Can You Give Me an Example, Please?

Edoardo Barba, Luigi Procopio, Caterina Lacerra, Tommaso Pasini, Roberto Navigli

Keywords: Natural Language Processing, Natural Language Semantics, Resources and Evaluation

04/07/2020

Generalizing Natural Language Analysis through Span-relation Representations

Zhengbao Jiang, Wei Xu, Jun Araki, Graham Neubig

Keywords: Natural Analysis, Natural processing, dependency parsing, semantic labeling

04/07/2020

Multi-Domain Named Entity Recognition with Genre-Aware and Agnostic Inference

Jing Wang, Mayank Kulkarni, Daniel Preotiuc-Pietro

Keywords: Multi-Domain Recognition, Named recognition, domain models, NER

08/12/2020

Best Practices for Data-Efficient Modeling in NLG: How to Train Production-Ready Neural Models with Less Data

Ankit Arun, Soumya Batra, Vikas Bhardwaj, Ashwini Challa, Pinar Donmez, Peyman Heidari, Hakan Inan, Shashank Jain, Anuj Kumar, Shawn Mei, Karthik Mohan, Michael White

08/12/2020

Evaluating Cross-Lingual Transfer Learning Approaches in Multilingual Conversational Agent Models

Lizhen Tan, Olga Golovneva

19/10/2020

DeText: A deep text ranking framework with BERT

Weiwei Guo, Xiaowei Liu, Sida Wang, Huiji Gao, Ananth Sankar, Zimeng Yang, Qi Guo, Liang Zhang, Bo Long, Bee-Chung Chen, Deepak Agarwal

Keywords: ranking, deep language models, natural language processing

04/07/2020

Logical Natural Language Generation from Open-Domain Tables

Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen, William Yang Wang

Keywords: Logical Generation, neural NLG, surface-level realizations, logical inference

08/12/2020

Knowledge-Enhanced Natural Language Inference Based on Knowledge Graphs

Zikang Wang, Linjing Li, Daniel Zeng

04/07/2020

Would you Rather? A New Benchmark for Learning Machine Alignment with Cultural Values and Social Preferences

Yi Tay, Donovan Ong, Jie Fu, Alvin Chan, Nancy Chen, Anh Tuan Luu, Chris Pal

Keywords: Machine Alignment, Understanding preferences, natural understanding, natural task

08/12/2020

XplaiNLI: Explainable Natural Language Inference through Visual Analytics

Aikaterini-Lida Kalouli, Rita Sevastjanova, Valeria de Paiva, Richard Crouch, Mennatallah El-Assady

02/02/2021

Generate Your Counterfactuals: Towards Controlled Counterfactual Generation for Text

Nishtha Madaan, Inkit Padhi, Naveen Panwar, Diptikalyan Saha

16/11/2020

With More Contexts Comes Better Performance: Contextualized Sense Embeddings for All-Round Word Sense Disambiguation

Bianca Scarlini, Tommaso Pasini, Roberto Navigli

Keywords: natural processing, english task, word-in-context task, contextualized embeddings

16/11/2020

KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation

Wenhu Chen, Yu Su, Xifeng Yan, William Yang Wang

Keywords: data-to-text generation, data-to-text tasks, fully-supervised setting, pre-training learning

04/07/2020

Tabula nearly Rasa: Probing the linguistic knowledge of character-level neural language models trained on unsegmented text

Michael Hahn, Marco Baroni

Keywords: natural tasks, morphological tasks, language usage, Tabula

08/12/2020

Provenance for Linguistic Corpora through Nanopublications

Timo Lek, Anna de Groot, Tobias Kuhn, Roser Morante

08/12/2020

Designing Templates for Eliciting Commonsense Knowledge from Pretrained Sequence-to-Sequence Models

Jheng-Hong Yang, Sheng-Chieh Lin, Rodrigo Nogueira, Ming-Feng Tsai, Chuan-Ju Wang, Jimmy Lin

22/11/2021

Single-Modal Entropy based Active Learning for Visual Question Answering

Dong-Jin Kim, Jae Won Cho, Jinsoo Choi, Yunjae Jung, In So Kweon

Keywords: Visual Question Answering, Vision and Language, Active Learning

16/11/2020

Coreferential Reasoning Learning for Language Representation

Deming Ye, Yankai Lin, Jiaju Du, Zhenghao Liu, Peng Li, Maosong Sun, Zhiyuan Liu

Keywords: downstream tasks, coreferential reasoning, common tasks, language models