22/11/2021

FAR: A General Framework for Attributional Robustness

Adam Ivankay, Ivan Girardi, Chiara Marchiori, Pascal Frossard

Keywords: robustness, attribution robustness, adversarial attacks, explainability, attribution maps

Abstract: Attribution maps are popular tools for explaining neural networks' predictions. By assigning an importance value to each input dimension that represents its impact on the outcome, they give an intuitive explanation of the decision process. However, recent work has revealed the vulnerability of these maps to imperceptible adversarial changes, which can prove critical in safety-relevant domains such as healthcare. We therefore define a novel generic framework for attributional robustness (FAR) as a general problem formulation for training models with robust attributions. This framework consists of a generic regularization term and training objective that minimize the maximal dissimilarity of attribution maps in a local neighbourhood of the input. We show that FAR is a generalized, less constrained formulation of existing training methods. We then propose two new instantiations of this framework, AAT and AdvAAT, that directly optimize for both robust attributions and predictions. Experiments on widely used vision datasets show that our methods perform better than or comparably to current ones in terms of attributional robustness while being more generally applicable. We finally show that our methods mitigate undesired dependencies between attributional robustness and some training and estimation parameters, which appear to critically affect competing methods.
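
As the abstract describes, the FAR objective augments the standard prediction loss with the worst-case dissimilarity between attribution maps inside a small neighbourhood of the input. The sketch below illustrates one way such an objective could look in PyTorch, using a plain gradient (saliency) attribution, cosine dissimilarity, and a PGD-style inner search; the function names, the choice of attribution and dissimilarity, and the hyperparameters (eps, steps, lam) are illustrative assumptions, not the paper's exact AAT/AdvAAT formulation.

```python
# Illustrative sketch of a FAR-style training loss (assumed, simplified form).
import torch
import torch.nn.functional as F

def saliency(model, x, y, create_graph=False):
    # Gradient of the true-class logit with respect to the input.
    if not x.requires_grad:
        x = x.detach().requires_grad_(True)
    logit = model(x).gather(1, y.unsqueeze(1)).sum()
    grad, = torch.autograd.grad(logit, x, create_graph=create_graph)
    return grad

def attr_dissimilarity(a, b):
    # 1 - cosine similarity between flattened attribution maps, batch-averaged.
    return 1.0 - F.cosine_similarity(a.flatten(1), b.flatten(1), dim=1).mean()

def far_loss(model, x, y, eps=8 / 255, steps=5, lam=1.0):
    # Attribution map at the clean input, used as a fixed reference here.
    base_attr = saliency(model, x, y).detach()

    # Inner maximisation: search the l_inf ball of radius eps around x for a
    # perturbation that maximally changes the attribution map.
    delta = torch.empty_like(x).uniform_(-eps, eps)
    for _ in range(steps):
        delta.requires_grad_(True)
        attr = saliency(model, x + delta, y, create_graph=True)
        g, = torch.autograd.grad(attr_dissimilarity(attr, base_attr), delta)
        delta = (delta + (eps / steps) * g.sign()).clamp(-eps, eps).detach()

    # Outer minimisation: prediction loss plus the worst-case dissimilarity.
    attr_adv = saliency(model, x + delta, y, create_graph=True)
    return F.cross_entropy(model(x), y) + lam * attr_dissimilarity(attr_adv, base_attr)
```

In a training loop one would then compute loss = far_loss(model, images, labels), call loss.backward() and step the optimizer, so that gradients of the regularizer (which involve second-order derivatives through the attribution) update the model together with the prediction loss.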

The talk and the corresponding paper are published at the BMVC 2021 virtual conference.
