Stereotype and skew: Quantifying gender bias in pre-trained and fine-tuned language models

Abstract: This paper proposes two intuitive metrics, skew and stereotype, that quantify and analyse the gender bias present in contextual language models when tackling the WinoBias pronoun resolution task. We find evidence that gender stereotype correlates approximately negatively with gender skew in out-of-the-box models, suggesting that there is a trade-off between these two forms of bias. We investigate two methods to mitigate bias. The first approach is an online method which is effective at removing skew at the expense of stereotype. The second, inspired by previous work on ELMo, involves the fine-tuning of BERT using an augmented gender-balanced dataset. We show that this reduces both skew and stereotype relative to its unaugmented fine-tuned counterpart. However, we find that existing gender bias benchmarks do not fully probe professional bias as pronoun resolution may be obfuscated by cross-correlations from other manifestations of gender prejudice.

04/07/2020

Subjectivity in textual data, sentiment analysis, polarity/opinion identification and extraction, linguistic analyses of social media behavior, Text categorization, topic recognition, demographic/gender/age identification

08/12/2020

Daniel Vassimon Manela, David Errington, Thomas Fisher, Boris Breugel, Pasquale Minervini

Similar Papers

Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation

Tianlu Wang, Xi Victoria Lin, Nazneen Fatema Rajani, Bryan McCann, Vicente Ordonez, Caiming Xiong

Tailoring Embeddings, Gender Mitigation, Double-Hard Debias, downstream models

Adapting Text Embeddings for Causal Inference

Victor Veitch, Dhanya Sridhar, David Blei

Queens are Powerful too: Mitigating Gender Bias in Dialogue Generation

Emily Dinan, Angela Fan, Adina Williams, Jack Urbanek, Douwe Kiela, Jason Weston

counterfactual augmentation, targeted collection, bias training, generative models

Detecting Independent Pronoun Bias with Partially-Synthetic Data Generation

Robert Munro, Alex (Carmen) Morrison

measuring models, parsers, language models, machine models

Robustness and reliability of gender bias assessment in word embeddings: The role of base pairs

Haiyang Zhang, Alison Sneyd, Mark Stevenson

Explaining the Efficacy of Counterfactually Augmented Data

Divyansh Kaushik, Amrith Setlur, Eduard H Hovy, Zachary Lipton

sentiment analysis, text classification, natural language inference, annotation artifacts, humans in the loop

Machine translationese: Effects of algorithmic bias on linguistic complexity in machine translation

Eva Vanmassenhove, Dimitar Shterionov, Matthew Gwilliam

Fairness without Demographics through Adversarially Reweighted Learning

Preethi Lahoti, Alex Beutel, Jilin Chen, Kang Lee, Flavien Prost, Nithum Thain, Xuezhi Wang, Ed Chi

Measuring Societal Biases from Text Corpora with Smoothed First-Order Co-occurrence

Navid Rekabsaz, Robert West, James Henderson, Allan Hanbury

Subjectivity in textual data, sentiment analysis, polarity/opinion identification and extraction, linguistic analyses of social media behavior, Text categorization, topic recognition, demographic/gender/age identification

Assessing Polyseme Sense Similarity through Co-predication Acceptability and Contextualised Embedding Distance

Janosch Haber, Massimo Poesio

Empirical Analysis of Unlabeled Entity Problem in Named Entity Recognition

Yangming Li, Lemao Liu, Shuming Shi

Negative Sampling, Unlabeled Entity Problem, Named Entity Recognition

Contextualizing Hate Speech Classifiers with Post-hoc Explanation

Brendan Kennedy, Xisen Jin, Aida Mostafazadeh Davani, Morteza Dehghani, Xiang Ren

Contextualizing Classifiers, Post-hoc Explanation, Hate classifiers, fine-tuned classifiers

Nurse is Closer to Woman than Surgeon? Mitigating Gender-Biased Proximities in Word Embeddings

Vaibhav Kumar, Tenzin Bhotia, Vaibhav Kumar, Tanmoy Chakraborty

word embeddings, semantic words, coreference resolution, post-processing methods

UNION: An Unreferenced Metric for Evaluating Open-ended Story Generation

Jian Guan, Minlie Huang

open-ended generation, story generation, evaluating generation, constructing samples

Learning Disentangled Representation for Fair Facial Attribute Classification via Fairness-aware Information Alignment

Sungho Park, Sunhee Hwang, Dohyung Kim, Hyeran Byun

How Does Selective Mechanism Improve Self-Attention Networks?

Xinwei Geng, Longyue Wang, Xing Wang, Bing Qin, Ting Liu, Zhaopeng Tu

NLP tasks, natural inference, semantic labelling, machine translation

Reducing Gender Bias in Neural Machine Translation as a Domain Adaptation Problem

Danielle Saunders, Bill Byrne

Reducing Bias, Neural Translation, Domain Problem, NLP tasks

FairFaceGAN: Fairness-aware Facial Image-to-Image Translation

Sunhee Hwang, Sungho Park, Dohyung Kim, Mirae Do, Hyeran Byun

fairness in computer vision, image-to-image translation, equality of opportunity, equalized odds

LIREx: Augmenting Language Inference with Relevant Explanations

Xinyan Zhao, V.G. Vinod Vydiswaran

Have We Solved The Hard Problem? It’s Not Easy! Contextual Lexical Contrast as a Means to Probe Neural Coherence

Wenqiang Lei, Yisong Miao, Runpeng Xie, Bonnie Webber, Meichun Liu, Tat-Seng Chua, Nancy F. Chen

WER-BERT: Automatic WER estimation with BERT in a balanced ordinal classification paradigm

Akshay Krishna Sheshadri, Anvesh Rao Vijjini, Sukhdeep Kharbanda

Mitigating Bias in Face Recognition Using Skewness-Aware Reinforcement Learning
