TACRED Revisited: A Thorough Evaluation of the TACRED Relation Extraction Task

04/07/2020

Christoph Alt, Aleksandra Gabryszak, Leonhard Hennig

Keywords: TACRED Task, Relation Extraction, Relation RE, unsupervised RE

Abstract: TACRED is one of the largest, most widely used crowdsourced datasets in Relation Extraction (RE). However, even with recent advances in unsupervised pre-training and knowledge-enhanced neural RE, models still show a high error rate. In this paper, we investigate two questions: Have we reached a performance ceiling, or is there still room for improvement? And how do crowd annotations, the dataset, and the models contribute to this error rate? To answer these questions, we first validate the most challenging 5K examples in the development and test sets using trained annotators. We find that label errors account for 8% absolute F1 test error, and that more than 50% of these examples need to be relabeled. On the relabeled test set, the average F1 score of a large baseline model set improves from 62.1 to 70.1. After validation, we analyze misclassifications on the challenging instances, categorize them into linguistically motivated error groups, and verify the resulting error hypotheses on three state-of-the-art RE models. We show that two groups of ambiguous relations are responsible for most of the remaining errors, and that models may adopt shallow heuristics on the dataset when entities are not masked.
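To make the effect of relabeling on the F1 score concrete, here is a minimal Python sketch. It assumes the commonly used TACRED convention of micro-averaged F1 over positive relations only, with `no_relation` treated as the negative class. The function name `micro_f1` and the toy labels are hypothetical illustrations, not the authors' evaluation code or data.

```python
from typing import Sequence

NO_RELATION = "no_relation"  # assumed name of the negative class


def micro_f1(gold: Sequence[str], pred: Sequence[str],
             negative: str = NO_RELATION) -> float:
    """Micro-averaged F1 that ignores the negative class (illustrative sketch)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    # A prediction counts as correct only if it is a positive relation
    # and matches the gold label exactly.
    correct = sum(1 for g, p in zip(gold, pred) if g == p and p != negative)
    predicted = sum(1 for p in pred if p != negative)  # positive predictions
    actual = sum(1 for g in gold if g != negative)     # positive gold labels
    if correct == 0:
        return 0.0
    precision = correct / predicted
    recall = correct / actual
    return 2 * precision * recall / (precision + recall)


# Toy example (made up): the last crowd label is wrong and is fixed on revision.
crowd_gold = ["per:title", "no_relation", "org:founded_by", "per:employee_of"]
revised_gold = ["per:title", "no_relation", "org:founded_by",
                "org:top_members/employees"]
predictions = ["per:title", "no_relation", "no_relation",
               "org:top_members/employees"]

f1_before = micro_f1(crowd_gold, predictions)   # scored against crowd labels
f1_after = micro_f1(revised_gold, predictions)  # scored against revised labels
print(f"F1 before relabeling: {f1_before:.2f}, after: {f1_after:.2f}")
```

In this toy case the model's predictions are unchanged; only the gold labels differ. That is the sense in which label errors, rather than model errors, can account for part of the measured F1 gap.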

The talk and the corresponding paper were published at the ACL 2020 virtual conference.

Similar Papers

02/02/2021

Re-TACRED: Addressing Shortcomings of the TACRED Dataset

George Stoica, Emmanouil Antonios Platanios, Barnabas Poczos

16:45

06/12/2021

Deep Reinforcement Learning at the Edge of the Statistical Precipice

Rishabh Agarwal, Max Schwarzer, Pablo Samuel Castro, Aaron Courville, Marc Bellemare

Keywords: reinforcement learning and planning

19:36

08/12/2020

Multilingual Epidemiological Text Classification: A Comparative Study

Stephen Mutuvi, Emanuela Boros, Antoine Doucet, Adam Jatowt, Gaël Lejeune, Moses Odeo

11:09

12/07/2020

Error-Bounded Correction of Noisy Labels

Songzhu Zheng, Pengxiang Wu, Aman Goswami, Mayank Goswami, Dimitris Metaxas, Chao Chen

Keywords: Deep Learning - Algorithms

12:34

16/11/2020

Improving Dialog Evaluation with a Multi-reference Adversarial Dataset and Large Scale Pretraining

Ananya Sai, Akash Mohan Kumar, Siddhartha Arora, Mitesh Khapra

Keywords: large pretraining, embedding metrics, n-gram metrics, deb

10:18

06/12/2021

What Matters for Adversarial Imitation Learning?

Manu Orsini, Anton Raichuk, Leonard Hussenot, Damien Vincent, Robert Dadashi, Sertan Girgin, Matthieu Geist, Olivier Bachem, Olivier Pietquin, Marcin Andrychowicz

Keywords: theory, reinforcement learning and planning

12:48

02/02/2021

How Robust are Model Rankings: A Leaderboard Customization Approach for Equitable Evaluation

Swaroop Mishra, Anjana Arunkumar

15:50

18/07/2021

Delving into Deep Imbalanced Regression

Yuzhe Yang, Kaiwen Zha, Yingcong Chen, Hao Wang, Dina Katabi

Keywords: Applications

16:37

12/07/2020

Confidence-Calibrated Adversarial Training: Generalizing to Unseen Attacks

David Stutz, Matthias Hein, Bernt Schiele

Keywords: Adversarial Examples

14:01

02/02/2021

Generating Natural Language Attacks in a Hard Label Black Box Setting

Rishabh Maheshwary, Saket Maheshwary, Vikram Pudi

18:48

06/12/2021

Improved Regularization and Robustness for Fine-tuning in Neural Networks

Dongyue Li, Hongyang Zhang

Keywords: deep learning, machine learning, robustness, vision, transfer learning

12:03

05/04/2021

Accounting for Variance in Machine Learning Benchmarks

Xavier Bouthillier, Pierre Delaunay, Mirko Bronzi, Assya Trofimov, Brennan Nichyporuk, Justin Szeto, Nazanin Mohammadi Sepahvand, Edward Raff, Kanika Madan, Vikram Voleti, Samira Ebrahimi Kahou, Vincent Michalski, Tal Arbel, Chris Pal, Gael Varoquaux, Pascal Vincent

19:40

05/04/2021

Accounting for Variance in Machine Learning Benchmarks

Xavier Bouthillier, Pierre Delaunay, Mirko Bronzi, Assya Trofimov, Brennan Nichyporuk, Justin Szeto, Nazanin Mohammadi Sepahvand, Edward Raff, Kanika Madan, Vikram Voleti, Samira Ebrahimi Kahou, Vincent Michalski, Tal Arbel, Chris Pal, Gael Varoquaux, Pascal Vincent

5:06

03/05/2021

What Matters for On-Policy Deep Actor-Critic Methods? A Large-Scale Study

Marcin Andrychowicz, Anton Raichuk, Piotr Stanczyk, Manu Orsini, Sertan Girgin, Raphaël Marinier, Leonard Hussenot-Desenonges, Matthieu Geist, Olivier Pietquin, Marcin Michalski, Sylvain Gelly, Olivier Bachem

Keywords: continuous control, reinforcement learning

15:34

26/04/2020

I Am Going MAD: Maximum Discrepancy Competition for Comparing Classifiers Adaptively

Haotao Wang, Tianlong Chen, Zhangyang Wang, Kede Ma

Keywords: model comparison

4:53

18/07/2021

Decoupling Representation Learning from Reinforcement Learning

Adam Stooke, Kimin Lee, Pieter Abbeel, Michael Laskin

Keywords: Optimization, Submodular Optimization, Algorithms, Bandit Algorithms; Algorithms, Online Learning, Deep Learning, Embedding and Representation learning

5:15

02/02/2021

A Systematic Evaluation of Object Detection Networks for Scientific Plots

Pritha Ganguly, Nitesh S Methani, Mitesh M. Khapra, Pratyush Kumar

18:00

19/10/2020

Exploiting class labels to boost performance on embedding-based text classification

Arkaitz Zubiaga

Keywords: embeddings, text classification, weighting schemes

4:56

03/05/2021

Bag of Tricks for Adversarial Training

Tianyu Pang, Xiao Yang, Yinpeng Dong, Hang Su, Jun Zhu

Keywords: Adversarial Examples, Adversarial Training, Robustness

3:11

06/12/2020

Multi-task Batch Reinforcement Learning with Metric Learning

Jiachen Li, Quan Vuong, Shuang Liu, Minghua Liu, Kamil Ciosek, Henrik Christensen, Hao Su

Keywords: Algorithms -> Multitask and Transfer Learning; Algorithms -> Representation Learning; Data, Challenges, Implementations, and So, Applications -> Natural Language Processing

3:15

06/12/2021

Learning Domain Invariant Representations in Goal-conditioned Block MDPs

Beining Han, Chongyi Zheng, Harris Chan, Keiran Paster, Michael Zhang, Jimmy Ba

Keywords: reinforcement learning and planning, domain adaptation, representation learning

9:31

09/07/2020

Noise-tolerant, Reliable Active Classification with Comparison Queries

Max Hopkins, Shachar Lovett, Daniel Kane, Gaurav Mahajan

Keywords: Active learning, Classification, Learning with algebraic or combinatorial structure, PAC learning

15:23

02/02/2021

Tackling Instance-Dependent Label Noise via a Universal Probabilistic Model

Qizhou Wang, Bo Han, Tongliang Liu, Gang Niu, Jian Yang, Chen Gong

14:56

06/12/2021

Improving Robustness using Generated Data

Sven Gowal, Sylvestre-Alvise Rebuffi, Olivia Wiles, Florian Stimberg, Dan Andrei Calian, Timothy A Mann

Keywords: machine learning, robustness, adversarial robustness and security, generative model

8:52

19/04/2021

Enhancing aspect-level sentiment analysis with word dependencies

Yuanhe Tian, Guimin Chen, Yan Song

11:46

05/01/2021

Efficient Video Annotation With Visual Interpolation and Frame Selection Guidance

Alina Kuznetsova, Aakrati Talati, Yiwen Luo, Keith Simmons, Vittorio Ferrari

5:02

26/04/2020

Neural Symbolic Reader: Scalable Integration of Distributed and Symbolic Representations for Reading Comprehension

Xinyun Chen, Chen Liang, Adams Wei Yu, Denny Zhou, Dawn Song, Quoc V. Le

Keywords: neural symbolic, reading comprehension, question answering

4:50

06/12/2020

Theoretical Insights Into Multiclass Classification: A High-dimensional Asymptotic View

Christos Thrampoulidis, Samet Oymak, Mahdi Soltanolkotabi

4:25

08/12/2020

An Analysis of Dataset Overlap on Winograd-Style Tasks

Ali Emami, Kaheer Suleman, Adam Trischler, Jackie Chi Kit Cheung

16:47

25/07/2020

Query-level early exit for additive learning-to-rank ensembles

Claudio Lucchese, Franco Maria Nardini, Salvatore Orlando, Raffaele Perego, Salvatore Trani

Keywords: efficiency/effectiveness trade-offs, query-level early exit, additive regression trees, learning to rank

9:36

05/01/2021

Facial Emotion Recognition With Noisy Multi-Task Annotations

Siwei Zhang, Zhiwu Huang, Danda Pani Paudel, Luc Van Gool

4:48

06/12/2021

Data Augmentation Can Improve Robustness

Sylvestre-Alvise Rebuffi, Sven Gowal, Dan Andrei Calian, Florian Stimberg, Olivia Wiles, Timothy A Mann

Keywords: robustness, adversarial robustness and security

8:06

16/11/2020

On the Role of Supervision in Unsupervised Constituency Parsing

Haoyue Shi, Karen Livescu, Kevin Gimpel

Keywords: few-shot parsing, unsupervised parsing, hyperparameter tuning, model selection

7:03

22/09/2020

Are we evaluating rigorously? Benchmarking recommendation for reproducible evaluation and fair comparison

Zhu Sun, Di Yu, Hui Fang, Jie Yang, Xinghua Qu, Jie Zhang, Cong Geng

Keywords: Benchmarks, Recommender Systems, Reproducible Evaluation

2:43

02/02/2021

EvaLDA: Efficient Evasion Attacks Towards Latent Dirichlet Allocation

Qi Zhou, Haipeng Chen, Yitao Zheng, Zhen Wang

19:28

04/07/2020

Logic-Guided Data Augmentation and Regularization for Consistent Question Answering

Akari Asai, Hannaneh Hajishirzi

Keywords: Logic-Guided Augmentation, Regularization, Consistent Answering, natural questions

7:14

05/01/2021

EvidentialMix: Learning With Combined Open-Set and Closed-Set Noisy Labels

Ragav Sachdeva, Filipe R. Cordeiro, Vasileios Belagiannis, Ian Reid, Gustavo Carneiro

4:58

13/04/2021

Good classifiers are abundant in the interpolating regime

Ryan Theisen, Jason Klusowski, Michael Mahoney

2:59

02/02/2021

Counting Maximal Satisfiable Subsets

Jaroslav Bendík, Kuldeep S. Meel

19:42

14/06/2020

Towards Transferable Targeted Attack

Maosen Li, Cheng Deng, Tengjiao Li, Junchi Yan, Xinbo Gao, Heng Huang

Keywords: adversarial, adversarial attacks, targeted adversarial attacks, black box adversarial attack, metric learning, poincare distance, imagenet, transfer learning

0:58