Abstract:
Portmanteaus are new words formed by combining the sounds and meanings of two words. Given their sticky nature, portmanteaus are often used in political and personal attacks that combine a target entity with a derogatory term, and they can then be spread online to promote hate speech and defamation. In this paper, we present a framework for decomposing political portmanteaus used online into their component words. Using our annotated dataset of political portmanteaus, we train a system that correctly decomposes 76.2% of the political portmanteaus into their component words. Furthermore, for 93.4% of the political portmanteaus, the correct component words appear among the system's top 10 results, suggesting that better ranking methods could lead to stronger results. This work provides a framework both for understanding an intriguing linguistic phenomenon and for building hate-speech filters that could catch novel words that would otherwise bypass traditional hate-speech detection approaches.