12/07/2020

Robustness to Programmable String Transformations via Augmented Abstract Training

Yuhao Zhang, Aws Albarghouthi, Loris D'Antoni

Keywords: Adversarial Examples

Abstract: Deep neural networks for natural language processing tasks are vulnerable to adversarial input perturbations. Existing work has proposed improving robustness against specific perturbations (e.g., token substitutions), but does not consider more general perturbations such as token insertions, deletions, and swaps. To fill this gap, we present a technique for training models that are robust to user-defined string transformations. Our technique combines data augmentation---to detect worst-case transformed inputs---with verifiable training using abstract interpretation---to further increase the robustness of the model on those worst-case inputs. We use our technique to train models on the AG and SST2 datasets and show that the resulting models are robust to combinations of user-defined transformations that mimic spelling mistakes and other meaning-preserving transformations.
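To make the idea of user-defined string transformations and worst-case augmentation concrete, here is a minimal, hypothetical sketch (not the authors' implementation): transformations such as adjacent-token swaps and token deletions are enumerated up to a budget of k applications, and the augmentation step picks the perturbed input that maximizes an assumed loss function.

```python
# Hypothetical illustration of user-defined string transformations and
# worst-case data augmentation; names and the loss function are assumptions,
# not the paper's actual API.

def swap_adjacent(tokens):
    """Yield every variant with one pair of adjacent tokens swapped."""
    for i in range(len(tokens) - 1):
        t = list(tokens)
        t[i], t[i + 1] = t[i + 1], t[i]
        yield t

def delete_one(tokens):
    """Yield every variant with one token deleted."""
    for i in range(len(tokens)):
        yield list(tokens[:i]) + list(tokens[i + 1:])

def perturbations(tokens, transforms, k=1):
    """All inputs reachable by applying at most k transformations."""
    seen = {tuple(tokens)}
    frontier = set(seen)
    for _ in range(k):
        nxt = set()
        for t in frontier:
            for transform in transforms:
                for variant in transform(list(t)):
                    tv = tuple(variant)
                    if tv not in seen:
                        seen.add(tv)
                        nxt.add(tv)
        frontier = nxt
    return seen

def worst_case(tokens, loss, transforms, k=1):
    """Augmentation step: return the perturbed input maximizing the loss."""
    return max(perturbations(tokens, transforms, k), key=loss)
```

A training loop would add `worst_case(x, loss, transforms)` to each batch; the paper further couples this with abstract interpretation to certify robustness over the whole perturbation space rather than a single sampled point.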

The talk and the respective paper were published at the ICML 2020 virtual conference.
