Toxicity Detection: Does Context Really Matter?

04/07/2020

John Pavlopoulos, Jeffrey Sorensen, Lucas Dixon, Nithum Thain, Ion Androutsopoulos

Keywords: Toxicity Detection, healthy discussions, toxicity systems, toxicity classifiers

Abstract: Moderation is crucial to promoting healthy online discussions. Although several ‘toxicity’ detection datasets and models have been published, most of them ignore the context of the posts, implicitly assuming that comments may be judged independently. We investigate this assumption by focusing on two questions: (a) does context affect human judgement, and (b) does conditioning on context improve the performance of toxicity detection systems? We experiment with Wikipedia conversations, limiting the notion of context to the previous post in the thread and the discussion title. We find that context can either amplify or mitigate the perceived toxicity of posts. Moreover, a small but significant subset of manually labeled posts (5% in one of our experiments) ends up with the opposite toxicity label if the annotators are not provided with context. Surprisingly, we also find no evidence that context actually improves the performance of toxicity classifiers, having tried a range of classifiers and mechanisms to make them context aware. This points to the need for larger datasets of comments annotated in context. We make our code and data publicly available.
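To make the two questions in the abstract concrete, here is a minimal sketch in Python. It is not the authors' code: the function names, the `[SEP]` separator, and the simple concatenation scheme are assumptions chosen for illustration. It shows one plausible way to prepend the discussion title and previous post to a comment before classification, and how the share of posts whose label changes between annotation settings could be computed.

```python
from typing import Optional, Sequence

# Hypothetical separator token; the paper does not prescribe one.
SEPARATOR = " [SEP] "


def build_input(comment: str,
                parent: Optional[str] = None,
                title: Optional[str] = None) -> str:
    """Return the classifier input, with any available context prepended.

    Context is limited to the discussion title and the previous post,
    mirroring the notion of context described in the abstract.
    """
    context = [part for part in (title, parent) if part]
    return SEPARATOR.join(context + [comment])


def flip_rate(out_of_context: Sequence[int],
              in_context: Sequence[int]) -> float:
    """Fraction of posts whose binary toxicity label differs between
    annotation without context and annotation with context."""
    if len(out_of_context) != len(in_context):
        raise ValueError("label sequences must have the same length")
    if not out_of_context:
        return 0.0
    flips = sum(a != b for a, b in zip(out_of_context, in_context))
    return flips / len(out_of_context)
```

With no context supplied, `build_input` returns the comment unchanged, so the same function can feed both a context-unaware and a context-aware classifier.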

The talk and the respective paper were published at the ACL 2020 virtual conference.

Similar Papers

07/06/2020

Feature-Based Explanations Don’t Help People Detect Misclassifications of Online Toxicity

Samuel Carton, Qiaozhu Mei, Paul Resnick

Keywords: behaviors, changes, humans, impact, learning, measures, performance, predictions, terms, toxic, toxicity

10:24

07/06/2021

On Positive Moderation Decisions

Mattia Samory

Keywords: Qualitative and quantitative studies of social media, Subjectivity in textual data, sentiment analysis, polarity/opinion identification and extraction, linguistic analyses of social media behavior, Text categorization, topic recognition, demographic/gender

7:37

07/06/2020

Quick, Community-Specific Learning: How Distinctive Toxicity Norms Are Maintained in Political Subreddits

Ashwin Rajadesingan, Paul Resnick, Ceren Budak

Keywords: behaviors, communities, interactions, learning, political, rates, reddit, sources, toxic, toxicity

9:26

12/08/2020

What Twitter Knows: Characterizing Ad Targeting Practices, User Perceptions, and Ad Explanations Through Users' Own Twitter Data

Miranda Wei, Madison Stamos, Sophie Veys, Nathan Reitinger, Justin Goodman, Margot Herman, Dorota Filipczuk, Ben Weinshel, Michelle L. Mazurek, Blase Ur

12:16

08/12/2020

Conversation-Aware Filtering of Online Patient Forum Messages

Anne Dirkson, Suzan Verberne, Wessel Kraaij

10:06

19/04/2021

Challenges in automated debiasing for toxic language detection

Xuhui Zhou, Maarten Sap, Swabha Swayamdipta, Yejin Choi, Noah Smith

11:54

07/06/2020

Empirical Analysis of Multi-Task Learning for Reducing Identity Bias in Toxic Comment Detection

Ameya Vaidya, Feng Mai, Yue Ning

Keywords: attention, bias, deep learning, detection, groups, identities, learning, sources, toxic, toxicity

9:59

22/09/2020

Revisiting adversarially learned injection attacks against recommender systems

Jiaxi Tang, Hongyi Wen, Ke Wang

Keywords: Recommender System, Security and Privacy, Adversarial Machine Learning

2:13

07/06/2021

The Effect of Moderation on Online Mental Health Conversations

David Wadden, Tal August, Qisheng Li, Tim Althoff

Keywords: Psychological, personality-based and ethnographic studies of social media, Subjectivity in textual data, sentiment analysis, polarity/opinion identification and extraction, linguistic analyses of social media behavior, Text categorization, topic recognitio

8:00

16/11/2020

With Little Power Comes Great Responsibility

Dallas Card, Peter Henderson, Urvashi Khandelwal, Robin Jia, Kyle Mahowald, Dan Jurafsky

Keywords: human studies, machine translation, power analysis, power analyses

11:51

22/09/2020

Long-tail session-based recommendation

Siyi Liu, Yujia Zheng

Keywords: Neural network, Session-based recommendation, Long-tail recommendation

2:19

25/04/2020

Keeping Community in the Loop: Understanding Wikipedia Stakeholder Values for Machine Learning-Based Systems

C. Estelle Smith, Bowen Yu, Anjali Srivastava, Aaron Halfaker, Loren Terveen, Haiyi Zhu

Keywords: wikipedia, peer production, value sensitive algorithm design, machine learning, ores, community values

12:05

03/05/2021

Mirostat: A Neural Text Decoding Algorithm That Directly Controls Perplexity

Sourya Basu, Govardana Sachithanandam Ramachandran, Nitish Shirish Keskar, Lav R Varshney

Keywords: cross-entropy, incoherence, repetitions, sampling algorithms, Neural text decoding

5:07

08/12/2020

Learning Domain Terms - Empirical Methods to Enhance Enterprise Text Analytics Performance

Gargi Roy, Lipika Dey, Mohammad Shakir, Tirthankar Dasgupta

14:36

08/12/2020

Towards Preemptive Detection of Depression and Anxiety in Twitter

David Owen, Jose Camacho-Collados, Luis Espinosa Anke

8:15

19/04/2021

From toxicity in online comments to incivility in American news: Proceed with caution

Anushree Hede, Oshin Agarwal, Linda Lu, Diana C. Mutz, Ani Nenkova

10:10

26/04/2020

NAS evaluation is frustratingly hard

Antoine Yang, Pedro M. Esperança, Fabio M. Carlucci

Keywords: neural architecture search, nas, benchmark, reproducibility, harking

4:56

05/12/2020

Rumor detection on Twitter using multiloss hierarchical BiLSTM with an attenuation factor

Yudianto Sujana, Jiawen Li, Hung-Yu Kao

13:00

25/07/2020

Search result explanations improve efficiency and trust

Jerome Ramos, Carsten Eickhoff

Keywords: bias, transparency, explainability, trustworthiness, fairness, exploratory search

6:38

22/09/2020

Personality bias of music recommendation algorithms

Alessandro B. Melchiorre, Eva Zangerle, Markus Schedl

Keywords: music recommender systems, neural networks, personality, bias, dataset

2:09

03/05/2021

Contemplating Real-World Object Classification

Ali Borji

Keywords: Robustness, object recognition, deep learning, ObjectNet

5:12

19/08/2021

Dialogue Disentanglement in Software Engineering: How Far are We?

Ziyou Jiang, Lin Shi, Celia Chen, Jun Hu, Qing Wang

Keywords: Natural Language Processing, Dialogue, NLP Applications and Tools, Resources and Evaluation

13:20

25/07/2020

A deep recurrent survival model for unbiased ranking

Jiarui Jin, Yuchen Fang, Weinan Zhang, Kan Ren, Guorui Zhou, Jian Xu, Yong Yu, Jun Wang, Xiaoqiang Zhu, Kun Gai

Keywords: cascade model, unbiased learning-to-rank, position bias

11:32

04/07/2020

What Was Written vs. Who Read It: News Media Profiling Using Text Analysis and Social Media Context

Ramy Baly, Georgi Karadzhov, Jisun An, Haewoon Kwak, Yoan Dinkov, Ahmed Ali, James Glass, Preslav Nakov

Keywords: News Profiling, media profiling, Text Analysis, political bias

12:07

12/08/2020

TextShield: Robust Text Classification Based on Multimodal Embedding and Neural Machine Translation

Jinfeng Li, Tianyu Du, Shouling Ji, Rong Zhang, Quan Lu, Min Yang, Ting Wang

11:32

02/02/2021

A Free Lunch for Unsupervised Domain Adaptive Object Detection without Source Data

Xianfeng Li, Weijie Chen, Di Xie, Shicai Yang, Peng Yuan, Shiliang Pu, Yueting Zhuang

19:06

16/11/2020

Design Challenges in Low-resource Cross-lingual Entity Linking

Xingyu Fu, Weijia Shi, Xiaodong Yu, Zian Zhao, Dan Roth

Keywords: cross-lingual linking, cross-lingual, xel, grounding entities

11:36

19/04/2021

Civil rephrases of toxic texts with self-supervised transformers

Léo Laugier, John Pavlopoulos, Jeffrey Sorensen, Lucas Dixon

11:26

04/07/2020

Generating Informative Conversational Response using Recurrent Knowledge-Interaction and Knowledge-Copy

Xiexiong Lin, Weiyu Jian, Jianshan He, Taifeng Wang, Wei Chu

Keywords: Generating Response, Knowledge-driven approaches, response steps, knowledge mechanism

14:28

02/02/2021

Communicative Message Passing for Inductive Relation Reasoning

Sijie Mai, Shuangjia Zheng, Yuedong Yang, Haifeng Hu

13:25

22/09/2020

TAFA: Two-headed attention fused autoencoder for context-aware recommendations

Jin Peng Zhou, Zhaoyue Cheng, Felipe Perez, Maksims Volkovs

Keywords: Deep Learning, Context-Aware Recommender Systems, Neural Attention Networks

2:06

13/04/2021

Learning user preferences in non-stationary environments

Wasim Huleihel, Soumyabrata Pal, Ofer Shayevitz

3:14

14/06/2020

Equalization Loss for Long-Tailed Object Recognition

Jingru Tan, Changbao Wang, Buyu Li, Quanquan Li, Wanli Ouyang, Changqing Yin, Junjie Yan

Keywords: long tail, object detection, lvis, object recognition

1:00

23/08/2020

TIMME: Twitter ideology-detection via multi-task multi-relational embedding

Zhiping Xiao, Weiping Song, Haoyan Xu, Zhicheng Ren, Yizhou Sun

Keywords: graph convolutional networks, social network analysis, ideology detection, heterogeneous information network, multi-task learning

17:22

08/12/2020

Variation in Coreference Strategies across Genres and Production Media

Berfin Aktaş, Manfred Stede

15:07

14/06/2020

Iteratively-Refined Interactive 3D Medical Image Segmentation With Multi-Agent Reinforcement Learning

Xuan Liao, Wenhao Li, Qisen Xu, Xiangfeng Wang, Bo Jin, Xiaoyun Zhang, Yanfeng Wang, Ya Zhang

Keywords: medical image segmentation, interactive image segmentation, reinforcement learning

1:00

04/07/2020

Generating Counter Narratives against Online Hate Speech: Data and Strategies

Serra Sinem Tekiroğlu, Yi-Ling Chung, Marco Guerini

Keywords: natural generation, generation data, data filtering, expert validation/post-editing

11:16

07/06/2020

Social Media Relevance Filtering Using Perplexity-Based Positive-Unlabelled Learning

Sunghwan Mac Kim, Stephen Wan, Cécile Paris, Andreas Duenser

Keywords: cases, events, languages, learning, performance, sources, topic, traditional, traditional sources, twitter

10:12

19/08/2021

User Retention: A Causal Approach with Triple Task Modeling

Yang Zhang, Dong Wang, Qiang Li, Yue Shen, Ziqi Liu, Xiaodong Zeng, Zhiqiang Zhang, Jinjie Gu, Derek F. Wong

Keywords: Machine Learning, Deep Learning, Applications of Supervised Learning, Recommender Systems

13:53

22/09/2020

A joint dynamic ranking system with DNN and vector-based clustering bandit

Yu Liu, Xiaoxiao Xu, Jincheng Wang, Yong Li, Changping Peng, Yongjun Bao, Weipeng P. Yan

Keywords: Multi-arm bandits, Learning to rank

2:21