Hate Speech Detection in Saudi Twittersphere: A Deep Learning Approach

08/12/2020

Raghad Alshaalan, Hend Al-Khalifa

Abstract: With the rise of hate speech on Twitter, significant research effort has gone into automatic hate speech detection, ranging from simple machine learning models to more complex deep neural network models. Despite this, research on hate speech detection in Arabic remains limited. This paper therefore investigates several neural network models based on the Convolutional Neural Network (CNN) and the Recurrent Neural Network (RNN) for detecting hate speech in Arabic tweets. It also evaluates the recent language representation model BERT on the task of Arabic hate speech detection. To conduct our experiments, we first built a new hate speech dataset that contains 9,316 annotated tweets. We then ran a set of experiments on two datasets to evaluate four models: CNN, GRU, CNN+GRU, and BERT. Our experimental results on our dataset and an out-of-domain dataset show that the CNN model gives the best performance, with an F1-score of 0.79 and an AUROC of 0.89.
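For readers unfamiliar with the two metrics reported in the abstract, the following is a minimal sketch of how an F1-score and an AUROC can be computed for a binary hate / non-hate classifier. It is not the authors' evaluation code: the function names, the toy labels, the scores, and the 0.5 decision threshold are all assumptions made for illustration, and the abstract does not say which F1 averaging scheme was used (the sketch computes F1 for the positive class only).

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive class (label 1): harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive example is scored
    higher than a random negative one (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = hate, 0 = non-hate; scores are model probabilities.
TOY_LABELS = [1, 0, 1, 1, 0, 0]
TOY_SCORES = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]
# Assumed decision threshold of 0.5 to turn scores into hard predictions.
TOY_PREDICTIONS = [1 if s >= 0.5 else 0 for s in TOY_SCORES]

if __name__ == "__main__":
    print(f"F1    = {f1_score(TOY_LABELS, TOY_PREDICTIONS):.3f}")
    print(f"AUROC = {auroc(TOY_LABELS, TOY_SCORES):.3f}")
```

F1 depends on the chosen threshold, while AUROC depends only on how the scores rank the examples, which is one reason the two are often reported together.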

The video of this talk cannot be embedded. You can watch it here:

https://underline.io/lecture/6531-hate-speech-detection-in-saudi-twittersphere-a-deep-learning-ap-proach

The talk and the corresponding paper were published at the COLING Workshops 2020 virtual conference.

Similar Papers

14/09/2020

A Deep Dive into Multilingual Hate Speech Classification

Sai Saketh Aluru, Binny Mathew, Punyajoy Saha, Animesh Mukherjee

Keywords: hate speech, multilingual, classification, bert, embeddings

14:20

16/11/2020

Hate-Speech and Offensive Language Detection in Roman Urdu

Hammad Rizwan, Muhammad Haroon Shakeel, Asim Karim

Keywords: automatic detection, hate-speech detection, language models, transfer learning

10:55

08/12/2020

Team Oulu at SemEval-2020 Task 12: Multilingual Identification of Offensive Language, Type and Target of Twitter Post Using Translated Datasets

Md Saroar Jahan

10:36

01/07/2020

Sarcasm Identification and Detection in Conversion Context using BERT

Kalaivani A., Thenmozhi D.

5:17

01/07/2020

A Transformer Approach to Contextual Sarcasm Detection in Twitter

Hunter Gregory, Steven Li, Pouya Mohammadi, Natalie Tarn, Rachel Draelos, Cynthia Rudin

4:56

08/12/2020

Towards Preemptive Detection of Depression and Anxiety in Twitter

David Owen, Jose Camacho-Collados, Luis Espinosa Anke

8:15

04/07/2020

Word-level Textual Adversarial Attacking as Combinatorial Optimization

Yuan Zang, Fanchao Qi, Chenghao Yang, Zhiyuan Liu, Meng Zhang, Qun Liu, Maosong Sun

Keywords: Textual attacking, Word-level attacking, combinatorial problem, Word-level Attacking

9:34

02/02/2021

Inferring Emotion from Large-scale Internet Voice Data: A Semi-supervised Curriculum Augmentation based Deep Learning Approach

Suping Zhou, Jia Jia, Zhiyong Wu, Zhihan Yang, Yanfeng Wang, Wei Chen, Fanbo Meng, Shuo Huang, Jialie Shen, Xiaochuan Wang

17:24

14/09/2020

PS3: Partition-based Skew-Specialized Sampling for Batch Mode Active Learning in Imbalanced Text Data

Ricky Fajri, Samaneh Khoshrou, Robert Peharz, Mykola Pechenizkiy

Keywords: batch-mode active learning, imbalance data, hate-speech recognition

15:16

19/04/2021

LESA: Linguistic encapsulation and semantic amalgamation based generalised claim detection from online content

Shreya Gupta, Parantak Singh, Megha Sundriyal, Md. Shad Akhtar, Tanmoy Chakraborty

9:51

08/12/2020

Is it Great or Terrible? Preserving Sentiment in Neural Machine Translation of Arabic Reviews

Hadeel Saadany, Constantin Orasan

14:35

05/12/2020

Toxic language detection in social media for Brazilian Portuguese: New dataset and multilingual analysis

João Augusto Leite, Diego Silva, Kalina Bontcheva, Carolina Scarton

14:38

08/12/2020

AraBench: Benchmarking Dialectal Arabic-English Machine Translation

Hassan Sajjad, Ahmed Abdelali, Nadir Durrani, Fahim Dalvi

13:45

04/07/2020

Cross-Lingual Unsupervised Sentiment Classification with Multi-View Transfer Learning

Hongliang Fei, Ping Li

Keywords: Cross-Lingual Classification, sentiment classification, unsupervised system, classification

12:23

08/12/2020

Empathy-driven Arabic Conversational Chatbot

Tarek Naous, Christian Hokayem, Hazem Hajj

14:49

19/04/2021

Frequency-guided word substitutions for detecting textual adversarial examples

Maximilian Mozes, Pontus Stenetorp, Bennett Kleinberg, Lewis Griffin

6:34

02/02/2021

Abusive Language Detection in Heterogeneous Contexts: Dataset Collection and the Role of Supervised Attention

Hongyu Gong, Alberto Valido, Katherine M. Ingram, Giulia Fanti, Suma Bhat, Dorothy L. Espelage

15:07

02/02/2021

Bigram and Unigram Based Text Attack via Adaptive Monotonic Heuristic Search

Xinghao Yang, Weifeng Liu, James Bailey, Dacheng Tao, Wei Liu

17:17

04/07/2020

uBLEU: Uncertainty-Aware Automatic Evaluation Method for Open-Domain Dialogue Systems

Tsuta Yuma, Naoki Yoshinaga, Masashi Toyoda

Keywords: Open-Domain Systems, uBLEU, Uncertainty-Aware Method, ΔBLEU

11:07

16/11/2020

Multilingual Offensive Language Identification with Cross-lingual Embeddings

Tharindu Ranasinghe, Marcos Zampieri

Keywords: bengali, cross-lingual embeddings, transfer learning, cyberaggression

7:00

05/12/2020

Rumor detection on Twitter using multiloss hierarchical BiLSTM with an attenuation factor

Yudianto Sujana, Jiawen Li, Hung-Yu Kao

13:00

08/12/2020

Multilingual Emoticon Prediction of Tweets about COVID-19

Stefanos Stoikos, Mike Izbicki

6:19

01/07/2020

Neural Sarcasm Detection using Conversation Context

Nikhil Jaiswal

4:35

25/07/2020

Think beyond the word: Understanding the implied textual meaning by digesting context, local, and noise

Guoxiu He, Zhe Gao, Zhuoren Jiang, Yangyang Kang, Changlong Sun, Xiaozhong Liu, Wei Lu

Keywords: deep neural networks, text classification, semantic representation, implied textual meaning

19:57

16/11/2020

Q-learning with Language Model for Edit-based Unsupervised Summarization

Ryosuke Kohita, Akifumi Wachi, Yang Zhao, Ryuki Tachibana

Keywords: abstractive text summarization, unsupervised summarization, unsupervised summarizers, unsupervised methods

12:32

07/06/2021

Discovering and Categorising Language Biases in Reddit

Xavier Ferrer, Tom Van Nuenen, Jose M. Such, Natalia Criado

Keywords: Qualitative and quantitative studies of social media, Social network analysis, communities identification, expertise and authority discovery, Subjectivity in textual data, sentiment analysis, polarity/opinion identification and extraction, linguistic analy

8:03

01/07/2020

Training and Inference Methods for High-Coverage Neural Machine Translation

Michael Yang, Yixin Liu, Rahul Mayuranath

7:17

23/08/2020

TIMME: Twitter ideology-detection via multi-task multi-relational embedding

Zhiping Xiao, Weiping Song, Haoyan Xu, Zhicheng Ren, Yizhou Sun

Keywords: graph convolutional networks, social network analysis, ideology detection, heterogeneous information network, multi-task learning

17:22

04/07/2020

Paraphrase Generation by Learning How to Edit from Samples

Amirhossein Kazemnejad, Mohammadreza Salehi, Mahdieh Soleymani Baghshah

Keywords: Paraphrase Generation, Neural sequence, sequence generation, retrieval-based method

12:20

04/07/2020

LINSPECTOR: Multilingual Probing Tasks for Word Representations

Gözde Gül Sahin, Clara Vania, Ilia Kuznetsov, Iryna Gurevych

Keywords: Word Representations, NLP, classification tasks, probing tasks

11:51

06/12/2020

Bayesian Multi-type Mean Field Multi-agent Imitation Learning

Fan Yang, Alina Vereshchaka, Changyou Chen, Wen Dong

3:23

07/06/2020

Mining Archive.org’s Twitter Stream Grab for Pharmacovigilance Research Gold

Ramya Tekumalla, Javad Rafiei Asl, Juan M. Banda

Keywords: building, learning, trends, tweets, twitter

3:07

19/04/2021

A few topical tweets are enough for effective user stance detection

Younes Samih, Kareem Darwish

11:27

07/06/2021

Exercise? I thought you said ‘Extra Fries’: Leveraging Sentence Demarcations and Multi-hop Attention for Meme Affect Analysis

Shraman Pramanick, Md Shad Akhtar, Tanmoy Chakraborty

Keywords: Subjectivity in textual data, sentiment analysis, polarity/opinion identification and extraction, linguistic analyses of social media behavior

7:57

08/12/2020

Detecting Urgency Status of Crisis Tweets: A Transfer Learning Approach for Low Resource Languages

Efsun Sarioglu Kayi, Linyong Nan, Bohan Qu, Mona Diab, Kathleen McKeown

14:37

19/04/2021

From toxicity in online comments to incivility in American news: Proceed with caution

Anushree Hede, Oshin Agarwal, Linda Lu, Diana C. Mutz, Ani Nenkova

10:10

16/11/2020

Training Question Answering Models From Synthetic Data

Raul Puri, Ryan Spring, Mohammad Shoeybi, Mostofa Patwary, Bryan Catanzaro

Keywords: question generation, squad task, em, data method

11:33

12/08/2020

Devil’s Whisper: A General Approach for Physical Adversarial Attacks against Commercial Black-box Speech Recognition Devices

Yuxuan Chen, Xuejing Yuan, Jiangshan Zhang, Yue Zhao, Shengzhi Zhang, Kai Chen, XiaoFeng Wang

12:44

02/02/2021

HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection

Binny Mathew, Punyajoy Saha, Seid Muhie Yimam, Chris Biemann, Pawan Goyal, Animesh Mukherjee

18:43

05/12/2020

A unified framework for multilingual and code-mixed visual question answering

Deepak Gupta, Pabitra Lenka, Asif Ekbal, Pushpak Bhattacharyya

11:48