Abstract:
Data containing human or social features may over- or under-represent groups with respect to salient social attributes such as gender or race, which can lead to biases in downstream applications. Prior approaches to preprocessing data to mitigate such biases either reweigh the points in the dataset or set up a constrained optimization problem over the domain to minimize a metric of bias. However, the former do not learn a distribution over the entire domain, and the latter do not scale well with the domain size. This paper presents an optimization framework that can be used as a data preprocessing method for mitigating bias: it can learn distributions over large domains and controllably adjust the representation rates of protected groups and/or achieve target fairness metrics such as statistical parity, while remaining close to the empirical distribution induced by the given dataset. Our approach appeals to the principle of maximum entropy, which states that, among all distributions satisfying a given set of constraints, one should choose the distribution closest in KL-divergence to a given prior. While maximum entropy distributions can succinctly encode distributions over large domains, they can be difficult to compute. Our main technical contribution is an instantiation of the maximum entropy framework, for constraints and priors that encode our bias-mitigation goals, that runs in time polynomial in the dimension of the data. Empirically, we observe that samples from the learned distribution have the desired representation and statistical rates, and that a classifier trained on these samples incurs only a slight loss in accuracy while maintaining fairness properties.
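For concreteness, the optimization described above can be sketched as the following KL-projection. The symbols here are illustrative placeholders rather than the paper's notation: $\Omega$ denotes the (possibly large, discrete) data domain, $q$ a prior derived from the empirical data, $\phi$ constraint features such as protected-group indicators, and $\theta$ target statistics such as desired representation rates.

% A minimal sketch of the maximum-entropy (minimum relative-entropy) program;
% all symbols below are assumptions for illustration, not the paper's notation.
\[
  \min_{p \in \Delta(\Omega)} \; \mathrm{KL}\!\left(p \,\middle\|\, q\right)
  \quad \text{subject to} \quad \mathbb{E}_{x \sim p}\!\left[\phi(x)\right] = \theta,
\]
% where \Delta(\Omega) is the probability simplex over \Omega. A standard fact
% about such programs is that the minimizer has the exponential-family form
% p^\star(x) \propto q(x) \exp(\langle \lambda, \phi(x) \rangle) for dual
% variables \lambda, which is what allows the learned distribution to be
% encoded succinctly even when \Omega is too large to enumerate.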