Abstract:
Although AI systems archive a great success in various societal fields, there still exists a challengeable issue of outputting discriminatory results with respect to protected attributes (e.g., gender and age). The popular approach to solving the issue is to remove protected attribute information in the decision process. However, this approach has a limitation that beneficial information for target tasks may also be eliminated. To overcome the limitation, we propose Fairness-aware Disentangling Variational Auto-Encoder (FD-VAE) that disentangles data representation into three subspaces: 1) Target Attribute Latent (TAL), 2) Protected Attribute Latent (PAL), 3) Mutual Attribute Latent (MAL). On top of that, we propose a decorrelation loss that aligns the overall information into each subspace, instead of removing the protected attribute information. After learning the representation, we re-encode MAL to include only target information and combine it with TAL to perform downstream tasks. In our experiments on CelebA and UTK Face datasets, we show that the proposed method mitigates unfairness in facial attribute classification tasks with respect to gender and age. Ours outperforms previous methods by large margins on two standard fairness metrics, equal opportunity and equalized odds.