A Dataset of State-Censored Tweets

07/06/2021

Tuğrulcan Elmas, Rebekah Overdorf, Karl Aberer

Keywords: Qualitative and quantitative studies of social media, Credibility of online content, Social network analysis, communities identification, expertise and authority discovery, Trust, reputation, recommendation systems

Abstract: Many governments impose traditional censorship methods on social media platforms. Instead of removing the censored content completely, many social media companies, including Twitter, only withhold it from the requesting country. This leaves the content accessible outside the censored region, which provides an excellent setting in which to study government censorship on social media. We mine such content using the Internet Archive's Twitter Stream Grab. We release a dataset of 583,437 tweets by 155,715 users that were censored between 2012 and July 2020. We also release 4,301 accounts that were censored in their entirety. Additionally, we release a set of 22,083,759 supplemental tweets made up of all tweets by users with at least one censored tweet, as well as instances of other users retweeting the censored user. We provide an exploratory analysis of this dataset. Our dataset will not only aid in the study of government censorship but will also aid in studying hate speech detection and the effect of censorship on social media users. The dataset is publicly available at https://doi.org/10.5281/zenodo.4439509
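As a rough illustration of the mining step described in the abstract, the sketch below scans newline-delimited tweet JSON and keeps records that are withheld in at least one country. It assumes the archive stores one JSON object per line and that withholding is signalled by the `withheld_in_countries` field that Twitter's v1.1 API attaches to tweet and user objects. The abstract does not say which fields the authors relied on, so this is an assumption rather than their method. The function names and the sample records are hypothetical.

```python
import json
from typing import Dict, Iterable, Iterator, List, Tuple


def withheld_scope(tweet: Dict) -> Tuple[List[str], List[str]]:
    """Return (tweet-level, account-level) country codes for one tweet object.

    Assumption: both levels use the v1.1 `withheld_in_countries` field.
    """
    tweet_level = tweet.get("withheld_in_countries") or []
    user = tweet.get("user") or {}
    user_level = user.get("withheld_in_countries") or []
    return list(tweet_level), list(user_level)


def filter_censored(lines: Iterable[str]) -> Iterator[Dict]:
    """Yield a small summary for every record withheld in at least one country."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed records instead of aborting the scan
        if not isinstance(obj, dict):
            continue
        tweet_level, user_level = withheld_scope(obj)
        if tweet_level or user_level:
            yield {
                "id": obj.get("id_str"),
                "tweet_withheld": tweet_level,
                "user_withheld": user_level,
            }


# Fabricated sample records, for illustration only (not taken from the dataset).
sample = [
    '{"id_str": "1", "text": "a", "user": {"id_str": "u1"}}',
    '{"id_str": "2", "text": "b", "withheld_in_countries": ["TR"], "user": {"id_str": "u2"}}',
    '{"id_str": "3", "text": "c", "user": {"id_str": "u3", "withheld_in_countries": ["DE", "FR"]}}',
    'not json',
    '{"delete": {"status": {"id_str": "4"}}}',
]

results = list(filter_censored(sample))
for row in results:
    print(row)
```

Keeping tweet-level and account-level withholding separate mirrors the abstract's distinction between censored tweets and accounts censored in their entirety.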

The talk and the respective paper are published at the ICWSM 2021 virtual conference.

Similar Papers

16/11/2020

Where Are the Facts? Searching for Fact-checked Information to Alleviate the Spread of Fake News

Nguyen Vo, Kyumin Lee

Keywords: fact-checking, fact-checking systems, fact-checked information, misinformation

11:43

07/06/2021

Political Bias and Factualness in News Sharing across more than 100,000 Online Communities

Galen Weld, Maria Glenski, Tim Althoff

Keywords: Qualitative and quantitative studies of social media, Credibility of online content

8:10

07/06/2020

Behind the Mask: A Computational Study of Anonymous' Presence on Twitter

Keenan Jones, Jason R. C. Nurse, Shujun Li

Keywords: accounts, claims, discussions, groups, influences, large_scale, learning, measures, networks, similarity, sites, topic, tweets, twitter

9:59

07/06/2021

Posting Bot Detection on Blockchain-based Social Media Platform using Machine Learning Techniques

Taehyun Kim, Hyomin Shin, Hyung Ju Hwang, Seungwon Jeong

Keywords: Credibility of online content, Social network analysis, communities identification, expertise and authority discovery, Trust, reputation, recommendation systems

6:32

07/06/2020

Towards Measuring Adversarial Twitter Interactions against Candidates in the US Midterm Elections

Yiqing Hua, Thomas Ristenpart, Mor Naaman

Keywords: contexts, discussions, elections, identities, impact, interactions, measures, political, political candidates, terms, toxic, tweets, twitter, violence

9:01

07/06/2020

Two Computational Models for Analyzing Political Attention in Social Media

Libby Hemphill, Angela M. Schöpke-Gonzalez

Keywords: attention, classifiers, political, political rhetoric, services, tools, topic, tweets, twitter

8:50

07/06/2020

Characterizing the Use of Images in State-Sponsored Information Warfare Operations by Russian Trolls on Twitter

Savvas Zannettou, Tristan Caulfield, Barry Bradlyn, Emiliano De Cristofaro, Gianluca Stringhini, Jeremy Blackburn

Keywords: 4chan, accounts, communities, engagement, events, images, images shared, influences, memes, networks, politically incorrect, reddit, shared, twitter, twitter reddit

9:18

04/07/2020

Prta: A System to Support the Analysis of Propaganda Techniques in the News

Giovanni Da San Martino, Shaden Shaar, Yifan Zhang, Seunghak Yu, Alberto Barrón-Cedeño, Preslav Nakov

Keywords: online disinformation, fact-checking detection, disinformation detection, media thinking

11:46

07/06/2021

VoterFraud2020: a Multi-modal Dataset of Election Fraud Claims on Twitter

Anton Abilov, Yiqing Hua, Hana Matatov, Ofra Amir, Mor Naaman

Keywords: Qualitative and quantitative studies of social media, Social network analysis, communities identification, expertise and authority discovery

2:52

11/08/2020

Padding Ain't Enough: Assessing the Privacy Guarantees of Encrypted DNS

Jonas Bushart, Christian Rossow

11:05

03/05/2021

LowKey: Leveraging Adversarial Attacks to Protect Social Media Users from Facial Recognition

Valeria Cherepanova, Micah Goldblum, Harrison Foley, Shiyuan Duan, John P Dickerson, Gavin Taylor, Tom Goldstein

Keywords: facial recognition, adversarial attacks

4:49

07/06/2020

Mining Archive.org’s Twitter Stream Grab for Pharmacovigilance Research Gold

Ramya Tekumalla, Javad Rafiei Asl, Juan M. Banda

Keywords: building, learning, trends, tweets, twitter

3:07

07/06/2021

How-to Present News on Social Media: A Causal Analysis of Editing News Headlines for Boosting User Engagement

Kunwoo Park, Haewoon Kwak, Jisun An, Sanjay Chawla

Keywords: Analysis of the relationship between social media and mainstream media, Credibility of online content, Text categorization, topic recognition, demographic/gender/age identification, Engagement, motivations, incentives, and gamification.

7:06

12/08/2020

What Twitter Knows: Characterizing Ad Targeting Practices, User Perceptions, and Ad Explanations Through Users' Own Twitter Data

Miranda Wei, Madison Stamos, Sophie Veys, Nathan Reitinger, Justin Goodman, Margot Herman, Dorota Filipczuk, Ben Weinshel, Michelle L. Mazurek, Blase Ur

12:16

07/06/2020

Variation across Scales: Measurement Fidelity under Twitter Data Sampling

Siqi Wu, Marian-Andrei Rizoiu, Lexing Xie

Keywords: attention, bias, cascades, changes, collection, graphs, influences, measures, networks, rates, structure, terms, tweets, twitter

9:59

02/02/2021

Segmentation of Tweets with URLs and its Applications to Sentiment Analysis

Abdullah Aljebreen, Weiyi Meng, Eduard Dragut

15:57

11/08/2020

A Comprehensive Study of DNS-over-HTTPS Downgrade Attack

Qing Huang, Deliang Chang, Zhou Li

11:02

02/02/2021

Embracing Domain Differences in Fake News: Cross-domain Fake News Detection using Multi-modal Data

Amila Silva, Ling Luo, Shanika Karunasekera, Christopher Leckie

16:13

04/07/2020

What Was Written vs. Who Read It: News Media Profiling Using Text Analysis and Social Media Context

Ramy Baly, Georgi Karadzhov, Jisun An, Haewoon Kwak, Yoan Dinkov, Ahmed Ali, James Glass, Preslav Nakov

Keywords: News Profiling, media profiling, Text Analysis, political bias

12:07

19/10/2020

On-demand influencer discovery on social media

Cheng Zheng, Qin Zhang, Sean Young, Wei Wang

Keywords: rare topics, influence convolution, topic-specific influencers

7:12

23/08/2020

TIMME: Twitter ideology-detection via multi-task multi-relational embedding

Zhiping Xiao, Weiping Song, Haoyan Xu, Zhicheng Ren, Yizhou Sun

Keywords: graph convolutional networks, social network analysis, ideology detection, heterogeneous information network, multi-task learning

17:22

05/12/2020

Rumor detection on Twitter using multiloss hierarchical BiLSTM with an attenuation factor

Yudianto Sujana, Jiawen Li, Hung-Yu Kao

13:00

04/07/2020

Toxicity Detection: Does Context Really Matter?

John Pavlopoulos, Jeffrey Sorensen, Lucas Dixon, Nithum Thain, Ion Androutsopoulos

Keywords: Toxicity Detection, healthy discussions, toxicity systems, toxicity classifiers

12:00

11/08/2020

Triplet Censors: Demystifying Great Firewall’s DNS Censorship Behavior

Anonymous, Arian Akhavan Niaki, Nguyen Phong Hoang, Phillipa Gill, Amir Houmansadr

11:51

04/07/2020

#NotAWhore! A Computational Linguistic Perspective of Rape Culture and Victimization on Social Media

Ashima Suvarna, Grusha Bhalla

Keywords: sexual survivors, victim blaming, computationally methods, transfer-learning method

10:47

16/11/2020

On the Reliability and Validity of Detecting Approval of Political Actors in Tweets

Indira Sen, Fabian Flöck, Claudia Wagner

Keywords: social-media-based estimates, labeling approval, sentiment methods, stance methods

14:17

12/08/2020

Poison Over Troubled Forwarders: A Cache Poisoning Attack Targeting DNS Forwarding Devices

Xiaofeng Zheng, Chaoyi Lu, Jian Peng, Qiushi Yang, Dongjie Zhou, Baojun Liu, Keyu Man, Shuang Hao, Haixin Duan, Zhiyun Qian

10:05

25/04/2020

Effects of Credibility Indicators on Social Media News Sharing Intent

Waheeb Yaqub, Otari Kakhidze, Morgan Brockman, Nasir Memon, Sameer Patil

Keywords: fake news, misinformation, disinformation, news headlines, news sharing, fact-check indicators, social media, facebook

12:32

12/08/2020

DatashareNetwork: A Decentralized Privacy-Preserving Search Engine for Investigative Journalists

Kasra Edalatnejad, Wouter Lueks, Julien Pierre Martin, Soline Ledésert, Anne L'Hôte, Bruno Thomas, Laurent Girod, Carmela Troncoso

12:19

07/06/2020

Analysing the Extent of Misinformation in Cancer Related Tweets

Rakesh Bal, Sayan Sinha, Swastika Dutta, Risabh Joshi, Sayan Ghosh, Ritam Dutt

Keywords: cancer, claims, deep learning, detection, learning, linguistic, misinformation, spread, texts, tweets, twitter

3:03

04/07/2020

Generating Counter Narratives against Online Hate Speech: Data and Strategies

Serra Sinem Tekiroğlu, Yi-Ling Chung, Marco Guerini

Keywords: natural generation, generation data, data filtering, expert validation/post-editing

11:16

12/08/2020

Shim Shimmeny: Evaluating the Security and Privacy Contributions of Link Shimming in the Modern Web

Frank Li

12:24

07/06/2021

A Large Open Dataset from the Parler Social Network

Max Aliapoulios, Emmi Bevensee, Jeremy Blackburn, Barry Bradlyn, Emiliano De Cristofaro, Gianluca Stringhini, Savvas Zannettou

Keywords: Qualitative and quantitative studies of social media, Social network analysis, communities identification, expertise and authority discovery

2:47

07/06/2021

X-Posts Explained: Analyzing and Predicting Controversial Contributions in Thematically Diverse Reddit Forums

Anna Guimarães, Gerhard Weikum

Keywords: Qualitative and quantitative studies of social media, Centrality/influence of social media publications and authors, Subjectivity in textual data, sentiment analysis, polarity/opinion identification and extraction, linguistic analyses of social media behav

7:46

07/06/2021

On Positive Moderation Decisions

Mattia Samory

Keywords: Qualitative and quantitative studies of social media, Subjectivity in textual data, sentiment analysis, polarity/opinion identification and extraction, linguistic analyses of social media behavior, Text categorization, topic recognition, demographic/gender

7:37

19/10/2020

ReCOVery: A multimodal repository for COVID-19 news credibility research

Xinyi Zhou, Apurva Mulay, Emilio Ferrara, Reza Zafarani

Keywords: coronavirus, covid-19, fake news, infodemic, information credibility, multimodal, repository, pandemic, social media

9:59

26/08/2020

Federated Heavy Hitters Discovery with Differential Privacy

Wennan Zhu, Peter Kairouz, Brendan McMahan, Haicheng Sun, Wei Li

14:08

07/06/2021

Network Inference from a Mixture of Diffusion Models for Fake News Mitigation

Karishma Sharma, Xinran He, Sungyong Seo, Yan Liu

Keywords: Credibility of online content, Social network analysis, communities identification, expertise and authority discovery

8:57

02/02/2021

Initiative Defense against Facial Manipulation

Qidong Huang, Jie Zhang, Wenbo Zhou, Weiming Zhang, Nenghai Yu

11:00

25/04/2020

"I Just Want to Hack Myself to Not Get Distracted": Evaluating Design Interventions for Self-Control on Facebook

Ulrik Lyngs, Kai Lukoff, Petr Slovak, William Seymour, Helena Webb, Marina Jirotka, Jun Zhao, Max Van Kleek, Nigel Shadbolt

Keywords: facebook, problematic use, self-control, distraction, ict non-use, addiction, focus, interruptions

15:02