12/08/2020

TextShield: Robust Text Classification Based on Multimodal Embedding and Neural Machine Translation

Jinfeng Li, Tianyu Du, Shouling Ji, Rong Zhang, Quan Lu, Min Yang, Ting Wang

Keywords:

Abstract: Text-based toxic content detection is an important tool for reducing harmful interactions in online social media environments. Yet, its underlying mechanism, deep learning-based text classification (DLTC), is inherently vulnerable to maliciously crafted adversarial texts. To mitigate such vulnerabilities, intensive research has been conducted on strengthening English-based DLTC models. However, the existing defenses are not effective for Chinese-based DLTC models, due to the unique sparseness, diversity, and variation of the Chinese language. In this paper, we bridge this striking gap by presenting TextShield, a new adversarial defense framework specifically designed for Chinese-based DLTC models. TextShield differs from previous work in several key aspects: (i) – it applies to Chinese-based DLTC models without requiring re-training; (ii) – it significantly reduces the attack success rate even under the setting of adaptive attacks; and (iii) – it has little impact on the performance of DLTC models over legitimate inputs. Extensive evaluations show that it outperforms both existing methods and the industry-leading platforms. Future work will explore its applicability in broader practical tasks.

 0
 0
 0
 0
This is an embedded video. Talk and the respective paper are published at USENIX Security 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment
no comments yet
code of conduct: tbd Characters remaining: 140

Similar Papers