Abstract:
In this work, we empirically test three common assumptions in building political media bias datasets: (i) labelers' political leanings do not affect labeling tasks; (ii) news articles follow their source outlet's political leaning; and (iii) the political leaning of a news outlet is stable across different topics. We build a ground-truth dataset of manually annotated article-level political leaning and use it to test the three assumptions. Our findings warn that the three assumptions could be invalid even for a small dataset. We hope that our work calls attention to the (in)validity of common assumptions in building political media bias datasets.