Variation across Scales: Measurement Fidelity under Twitter Data Sampling

07/06/2020

Siqi Wu, Marian-Andrei Rizoiu, Lexing Xie

Keywords: attention, bias, cascades, changes, collection, graphs, influences, measures, networks, rates, structure, terms, tweets, twitter

Abstract: A comprehensive understanding of data bias is the cornerstone of mitigating biases in social media research. This paper presents in-depth measurements of the effects of Twitter data sampling across different timescales and different subjects (entities, networks, and cascades). By constructing two complete tweet streams, we show that the Twitter rate limit message is an accurate measure of the volume of missing tweets. Although sampling rates show clear temporal variations, we find that a Bernoulli process with a uniform rate approximates Twitter data sampling well, and that it allows the ground-truth entity frequency and ranking to be estimated from the observed sample data. In terms of network analysis, we observe significant structural changes in both the user-hashtag bipartite graph and the retweet network. Finally, we measure the retweet cascades and identify risks for information diffusion models that rely on tweet inter-arrival times and user influence. This work calls attention to the social data bias caused by data collection, and proposes methods to measure the systematic biases introduced by sampling.
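The Bernoulli approximation mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code, and the function names (`sampling_rate`, `bernoulli_sample`, `estimate_true_counts`) are hypothetical. It assumes that every tweet is kept independently with the same probability, and it uses a simple scale-up estimator (observed count divided by the sampling rate), which may differ from the estimators used in the paper.

```python
import random
from collections import Counter


def sampling_rate(n_collected, n_missing):
    """Overall sampling rate, given the number of collected tweets and the
    number of missing tweets (e.g. as reported by rate limit messages)."""
    return n_collected / (n_collected + n_missing)


def bernoulli_sample(items, rate, seed=0):
    """Keep each item independently with probability `rate` (assumed uniform)."""
    rng = random.Random(seed)
    return [item for item in items if rng.random() < rate]


def estimate_true_counts(observed, rate):
    """Scale observed entity counts by 1/rate to estimate complete-stream counts."""
    return {entity: count / rate for entity, count in Counter(observed).items()}


if __name__ == "__main__":
    # Hypothetical complete stream: one hashtag per tweet.
    complete = ["#a"] * 5000 + ["#b"] * 3000 + ["#c"] * 500
    rate = sampling_rate(n_collected=900, n_missing=100)
    sample = bernoulli_sample(complete, rate)
    estimates = estimate_true_counts(sample, rate)
    for entity, estimate in sorted(estimates.items()):
        print(entity, round(estimate))
```

Under this assumption the scale-up estimate is unbiased for each entity, but its relative error grows for rare entities, which is one reason rankings in the tail can change under sampling.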

The talk and the corresponding paper were published at the ICWSM 2020 virtual conference.

Similar Papers

02/02/2021

Twitter Event Summarization by Exploiting Semantic Terms and Graph Network

Quanzhi Li, Qiong Zhang

15:58

07/06/2020

An Experimental Study of Structural Diversity in Social Networks

Jessica Su, Krishna Kamath, Aneesh Sharma, Johan Ugander, Sharad Goel

Keywords: cases, causal, changes, common, engagement, groups, large_scale, networks, rates, relationships, retention rates, twitter

8:44

23/08/2020

SimClusters: Community-based representations for heterogeneous recommendations at Twitter

Venu Satuluri, Yao Wu, Xun Zheng, Yilei Qian, Brian Wichers, Qieyun Dai, Gui Ming Tang, Jerry Jiang, Jimmy Lin

Keywords: community detection, personalization, recommender systems

4:55

23/08/2020

TIMME: Twitter ideology-detection via multi-task multi-relational embedding

Zhiping Xiao, Weiping Song, Haoyan Xu, Zhicheng Ren, Yizhou Sun

Keywords: graph convolutional networks, social network analysis, ideology detection, heterogeneous information network, multi-task learning

17:22

07/06/2020

Two Computational Models for Analyzing Political Attention in Social Media

Libby Hemphill, Angela M. Schöpke-Gonzalez

Keywords: attention, classifiers, political, political rhetoric, services, tools, topic, tweets, twitter

8:50

04/07/2020

Relational Graph Attention Network for Aspect-based Sentiment Analysis

Kai Wang, Weizhou Shen, Yunyi Yang, Xiaojun Quan, Rui Wang

Keywords: Aspect-based Analysis, encoding information, sentiment prediction, Relational Network

6:56

16/11/2020

Cross-Media Keyphrase Prediction: A Unified Framework with Multi-Modality Multi-Head Attention and Image Wordings

Yue Wang, Jing Li, Michael Lyu, Irwin King

Keywords: keyphrase prediction, text modeling, classification, generation

11:55

04/07/2020

Cross-Lingual Disaster-related Multi-label Tweet Classification with Manifold Mixup

Jishnu Ray Chowdhury, Cornelia Caragea, Doina Caragea

Keywords: Cross-Lingual Classification, Distinguishing messages, disaster management, multi-label tweets

12:49

02/02/2021

Segmentation of Tweets with URLs and its Applications to Sentiment Analysis

Abdullah Aljebreen, Weiyi Meng, Eduard Dragut

15:57

07/06/2020

Social Media Relevance Filtering Using Perplexity-Based Positive-Unlabelled Learning

Sunghwan Mac Kim, Stephen Wan, Cécile Paris, Andreas Duenser

Keywords: cases, events, languages, learning, performance, sources, topic, traditional, traditional sources, twitter

10:12

25/04/2020

Synthesized Social Signals: Computationally-Derived Social Signals from Account Histories

Jane Im, Sonali Tandon, Eshwar Chandrasekharan, Taylor Denby, Eric Gilbert

Keywords: social computing, social signals, social platform, social media

14:11

07/06/2021

How-to Present News on Social Media: A Causal Analysis of Editing News Headlines for Boosting User Engagement

Kunwoo Park, Haewoon Kwak, Jisun An, Sanjay Chawla

Keywords: Analysis of the relationship between social media and mainstream media, Credibility of online content, Text categorization, topic recognition, demographic/gender/age identification, Engagement, motivations, incentives, and gamification

7:06

05/12/2020

Rumor detection on Twitter using multiloss hierarchical BiLSTM with an attenuation factor

Yudianto Sujana, Jiawen Li, Hung-Yu Kao

13:00

26/04/2020

Provable Filter Pruning for Efficient Neural Networks

Lucas Liebenwein, Cenk Baykal, Harry Lang, Dan Feldman, Daniela Rus

Keywords: theory, compression, filter pruning, neural networks

5:22

06/12/2021

VigDet: Knowledge Informed Neural Temporal Point Process for Coordination Detection on Social Media

Yizhou Zhang, Karishma Sharma, Yan Liu

Keywords: generative model

14:35

07/06/2021

On Predicting Personal Values of Social Media Users using Community-Specific Language Features and Personal Value Correlation

Amila Silva, Pei-Chi Lo, Ee Peng Lim

Keywords: Psychological, personality-based and ethnographic studies of social media, Qualitative and quantitative studies of social media, Subjectivity in textual data, sentiment analysis, polarity/opinion identification and extraction, linguistic analyses of social

8:07

04/07/2020

Predicting the Topical Stance and Political Leaning of Media using Tweets

Peter Stefanov, Kareem Darwish, Atanas Atanasov, Preslav Nakov

Keywords: supervised solutions, cascaded method, unsupervised learning, supervised learning

7:16

06/12/2021

Distilling Meta Knowledge on Heterogeneous Graph for Illicit Drug Trafficker Detection on Social Media

Yiyue Qian, Yiming Zhang, Yanfang Ye, Chuxu Zhang

Keywords: deep learning, optimization, graph learning, meta learning, representation learning, few shot learning

14:10

07/06/2021

VoterFraud2020: a Multi-modal Dataset of Election Fraud Claims on Twitter

Anton Abilov, Yiqing Hua, Hana Matatov, Ofra Amir, Mor Naaman

Keywords: Qualitative and quantitative studies of social media, Social network analysis, communities identification, expertise and authority discovery

2:52

23/08/2020

Scaling choice models of relational social data

Jan Overgoor, George Pakapol Supaniratisai, Johan Ugander

Keywords: network formation, choice models, social networks

17:51

08/12/2020

Misspelling Detection from Noisy Product Images

Varun Nagaraj Rao, Mingwei Shen

10:59

19/10/2020

On-demand influencer discovery on social media

Cheng Zheng, Qin Zhang, Sean Young, Wei Wang

Keywords: rare topics, influence convolution, topic-specific influencers

7:12

07/06/2021

Network Inference from a Mixture of Diffusion Models for Fake News Mitigation

Karishma Sharma, Xinran He, Sungyong Seo, Yan Liu

Keywords: Credibility of online content, Social network analysis, communities identification, expertise and authority discovery

8:57

07/06/2020

Characterizing the Social Media News Sphere through User Co-Sharing Practices

Mattia Samory, Vartan Kesiz Abnousi, Tanushree Mitra

Keywords: articles, communities, groups, linguistic, measures, misinformation, news, news articles, news sources, political, shared, sources, structure, twitter

10:07

19/04/2021

Semantic oppositeness assisted deep contextual modeling for automatic rumor detection in social networks

Nisansa Silva, Dejing Dou

12:00

03/08/2020

Regret Analysis of Bandit Problems with Causal Background Knowledge

Yangyi Lu, Amirhossein Meisami, Ambuj Tewari, William Yan

7:32

26/08/2020

Federated Heavy Hitters Discovery with Differential Privacy

Wennan Zhu, Peter Kairouz, Brendan McMahan, Haicheng Sun, Wei Li

14:08

07/06/2020

Behind the Mask: A Computational Study of Anonymous' Presence on Twitter

Keenan Jones, Jason R. C. Nurse, Shujun Li

Keywords: accounts, claims, discussions, groups, influences, large_scale, learning, measures, networks, similarity, sites, topic, tweets, twitter

9:59

12/08/2020

Shim Shimmeny: Evaluating the Security and Privacy Contributions of Link Shimming in the Modern Web

Frank Li

12:24

04/07/2020

GCAN: Graph-aware Co-Attention Networks for Explainable Fake News Detection on Social Media

Yi-Ju Lu, Cheng-Te Li

Keywords: Explainable Detection, fake problem, GCAN, Graph-aware Networks

10:48

05/12/2020

SentiRec: Sentiment diversity-aware neural news recommendation

Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang

14:58

18/07/2021

Correcting Exposure Bias for Link Recommendation

Shantanu Gupta, Hao Wang, Zachary Lipton, Bernie Wang

Keywords: Social Aspects of Machine Learning, Fairness, Accountability, and Transparency

5:04

04/07/2020

Exploring the Role of Context to Distinguish Rhetorical and Information-Seeking Questions

Yuan Zhuang, Ellen Riloff

Keywords: distinguishing questions, classification models, Rhetorical Questions, features

11:32

06/12/2021

Metadata-based Multi-Task Bandits with Bayesian Hierarchical Models

Runzhe Wan, Lin Ge, Rui Song

Keywords: meta learning, bandits, transfer learning

13:18

07/06/2020

Empirical Analysis of Multi-Task Learning for Reducing Identity Bias in Toxic Comment Detection

Ameya Vaidya, Feng Mai, Yue Ning

Keywords: attention, bias, deep learning, detection, groups, identities, learning, sources, toxic, toxicity

9:59

18/07/2021

Regularized Online Allocation Problems: Fairness and Beyond

Santiago Balseiro, Haihao Lu, Vahab Mirrokni

Keywords: Algorithms, Online Learning Algorithms

5:23

04/07/2020

Fine-grained Interest Matching for Neural News Recommendation

Heyuan Wang, Fangzhao Wu, Zheng Liu, Xing Xie

Keywords: Fine-grained Matching, Neural Recommendation, Personalized recommendation, news recommendation

10:06

12/08/2020

What Twitter Knows: Characterizing Ad Targeting Practices, User Perceptions, and Ad Explanations Through Users' Own Twitter Data

Miranda Wei, Madison Stamos, Sophie Veys, Nathan Reitinger, Justin Goodman, Margot Herman, Dorota Filipczuk, Ben Weinshel, Michelle L. Mazurek, Blase Ur

12:16

12/07/2020

Message Passing Least Squares: A Unified Framework for Fast and Robust Group Synchronization

Yunpeng Shi, Gilad Lerman

Keywords: Applications - Computer Vision

14:35

13/04/2021

Decision making problems with funnel structure: A multi-task learning approach with application to email marketing campaigns

Ziping Xu, Amirhossein Meisami, Ambuj Tewari

3:01