Bayes-TrEx: a Bayesian Sampling Approach to Model Transparency by Example

Abstract: Post-hoc explanation methods are gaining popularity for interpreting, understanding, and debugging neural networks. Most analyses using such methods explain decisions in response to inputs drawn from the test set. However, the test set may have few examples that trigger some model behaviors, such as high-confidence failures or ambiguous classifications. To address these challenges, we introduce a flexible model inspection framework: Bayes-TrEx. Given a data distribution, Bayes-TrEx finds in-distribution examples which trigger a specified prediction confidence. We demonstrate several use cases of Bayes-TrEx, including revealing highly confident (mis)classifications, visualizing class boundaries via ambiguous examples, understanding novel-class extrapolation behavior, and exposing neural network overconfidence. We use Bayes-TrEx to study classifiers trained on CLEVR, MNIST, and Fashion-MNIST, and we show that this framework enables more flexible holistic model analysis than just inspecting the test set. Code and supplemental material are available at https://github.com/serenabooth/Bayes-TrEx.
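
As a rough illustration of the idea in the abstract (sampling in-distribution inputs that trigger a chosen prediction confidence), the sketch below draws samples from a toy posterior. It is not the authors' implementation. The logistic stand-in classifier, the 2-D standard normal data distribution, the Gaussian relaxation of the confidence constraint with width `sigma`, and the random-walk Metropolis sampler are all assumptions chosen to keep the example small.

```python
import numpy as np


def classifier_confidence(x, w=np.array([2.0, -1.0]), b=0.5):
    """Hypothetical stand-in classifier: P(class 1 | x) of a logistic model."""
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))


def log_posterior(x, target, sigma):
    # Assumed data distribution: standard normal in 2-D (log-density up to a constant).
    log_prior = -0.5 * np.sum(x ** 2)
    # Assumed relaxation: confidence should be near `target`, scored by a Gaussian.
    log_lik = -0.5 * ((classifier_confidence(x) - target) / sigma) ** 2
    return log_prior + log_lik


def sample_examples(target, n_steps=20000, burn_in=5000, thin=10,
                    step=0.3, sigma=0.05, seed=0):
    """Random-walk Metropolis over inputs x, targeting the relaxed posterior."""
    rng = np.random.default_rng(seed)
    x = np.zeros(2)
    logp = log_posterior(x, target, sigma)
    samples = []
    for i in range(n_steps):
        proposal = x + step * rng.standard_normal(2)
        logp_new = log_posterior(proposal, target, sigma)
        # Accept with probability min(1, p(proposal) / p(x)).
        if np.log(rng.uniform()) < logp_new - logp:
            x, logp = proposal, logp_new
        # Keep every `thin`-th state after the burn-in period.
        if i >= burn_in and (i - burn_in) % thin == 0:
            samples.append(x.copy())
    return np.array(samples)


# Ask for ambiguous examples: inputs the toy classifier scores near 0.5.
ambiguous = sample_examples(target=0.5)
confidences = classifier_confidence(ambiguous)
print(confidences.mean(), confidences.std())
```

Setting `target` close to 1.0 instead would request highly confident examples. In a realistic setting the Gaussian prior would be replaced by a learned generative model of the data, which this sketch does not attempt.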

26/08/2020

Serena Booth, Yilun Zhou, Ankit Shah, Julie Shah

Similar Papers

Non-Parametric Calibration for Classification

Jonathan Wenger, Hedvig Kjellström, Rudolph Triebel

Making neural networks interpretable with attribution: Application to implicit signals prediction

Darius Afchar, Romain Hennequin

Keywords: Implicit Recommender System, Interpretable machine learning

Improving Deep Learning Interpretability by Saliency Guided Training

Aya Abdelsalam Ismail, Hector Corrada Bravo, Soheil Feizi

Keywords: deep learning, transformers, vision, language, interpretability

An Exploration of Arbitrary-Order Sequence Labeling via Energy-Based Inference Networks

Lifu Tu, Tianyu Liu, Kevin Gimpel

Keywords: natural language processing, sequence labeling, semantic labeling, parsing

Decoupling Representation and Classifier for Long-Tailed Recognition

Bingyi Kang, Saining Xie, Marcus Rohrbach, Zhicheng Yan, Albert Gordo, Jiashi Feng, Yannis Kalantidis

Keywords: long-tailed recognition, classification

Posterior Re-calibration for Imbalanced Datasets

Junjiao Tian, Yen-Cheng Liu, Nathaniel Glaser, Yen-Chang Hsu, Zsolt Kira

Keywords: Algorithms -> Few-Shot Learning, Applications -> Computer Vision

Improving Compositionality of Neural Networks by Decoding Representations to Inputs

Mike Wu, Noah Goodman, Stefano Ermon

Keywords: deep learning, machine learning, adversarial robustness and security, generative model

A Framework to Learn with Interpretation

Jayneel Parekh, Pavlo Mozharovskyi, Florence d'Alché-Buc

Keywords: deep learning, interpretability

Active Bayesian Assessment of Black-Box Classifiers

Disi Ji, Robert L. Logan, Padhraic Smyth, Mark Steyvers

Nonparametric estimation of heterogeneous treatment effects: From theory to learning algorithms

Alicia Curth, Mihaela van der Schaar

Influence decompositions for neural network attribution

Kyle Reing, Greg Ver Steeg, Aram Galstyan

Neural Topic Modeling with Cycle-Consistent Adversarial Training

Xuemeng Hu, Rui Wang, Deyu Zhou, Yuxuan Xiong

Keywords: neural modeling, deep models, adversarial-neural model, adversarially network

Location Attention for Extrapolation to Longer Sequences

Yann Dubois, Gautier Dagan, Dieuwke Hupkes, Elia Bruni

Keywords: Extrapolation, natural language processing, generalization, Lookup task

Estimating informativeness of samples with Smooth Unique Information

Hrayr Harutyunyan, Alessandro Achille, Giovanni Paolini, Orchid Majumder, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto

Keywords: dataset summarization, ntk, stability theory, sample information, information theory

Learning perturbation sets for robust machine learning

Eric Wong, Zico Kolter

Keywords: conditional variational autoencoder, adversarial examples, perturbation sets, robust machine learning

Explanation-based Data Augmentation for Image Classification

Sandareka Wickramanayake, Wynne Hsu, Mong Li Lee

Keywords: deep learning, machine learning, vision, interpretability

Optimization and Analysis of the pAp@k Metric for Recommender Systems

Gaurush Hiranandani, Warut Vijitbenjaronk, Sanmi Koyejo, Prateek Jain

Keywords: Learning Theory

Model Composition: Can Multiple Neural Networks Be Combined into a Single Network Using Only Unlabeled Data?

Amin Banitalebi-Dehkordi, Xinyu Kang, Yong Zhang

Keywords: Model Composition, Combining Neural Networks, Pseudo Label, Self Training, Label Aggregation, Combining Models

Meta-Cal: Well-controlled Post-hoc Calibration by Ranking

Xingchen Ma, Matthew B Blaschko

Keywords: Algorithms, Supervised Learning

Learning from Context or Names? An Empirical Study on Neural Relation Extraction

Hao Peng, Tianyu Gao, Xu Han, Yankai Lin, Peng Li, Zhiyuan Liu, Maosong Sun, Jie Zhou

Keywords: relation benchmarks, re scenarios, neural models, re models

