Why resampling outperforms reweighting for correcting sampling bias with stochastic gradients

03/05/2021

Jing An, Lexing Ying, Yuhua Zhu

Keywords: stability, stochastic asymptotics, resampling, reweighting, biased sampling

Abstract: A data set sampled from a certain population is biased if the subgroups of the population are sampled at proportions that are significantly different from their underlying proportions. Training machine learning models on biased data sets requires correction techniques to compensate for the bias. We consider two commonly used techniques, resampling and reweighting, that rebalance the proportions of the subgroups to maintain the desired objective function. Though the two are statistically equivalent, it has been observed that resampling outperforms reweighting when combined with stochastic gradient algorithms. By analyzing illustrative examples, we explain the reason behind this phenomenon using tools from dynamical stability and stochastic asymptotics. We also present experiments from regression, classification, and off-policy prediction to demonstrate that this is a general phenomenon. We argue that it is imperative to consider the objective function design and the optimization algorithm together when addressing sampling bias.
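The contrast between the two corrections can be sketched on a toy problem. The following is a minimal illustration, not the authors' analysis: the two-group setup, the quadratic group losses, and every name and number (`A`, `F`, `X`, `reweight_step`, `resample_step`) are assumptions made for the example. It shows that the two stochastic gradients share the same expectation (the "statistically equivalent" part of the abstract) while their per-step variance can differ. The paper itself explains the gap through dynamical stability and stochastic asymptotics, which this sketch does not reproduce.

```python
import random

# Hypothetical two-group setup; all values are illustrative assumptions.
A = (0.5, 0.5)    # desired (population) proportions a_1, a_2
F = (0.9, 0.1)    # biased proportions f_1, f_2 observed in the data set
X = (-1.0, 1.0)   # group centers; group loss V_i(theta) = 0.5 * (theta - x_i)**2


def grad(theta, i):
    """Gradient of the assumed group loss V_i at theta."""
    return theta - X[i]


def reweight_step(theta, lr, rng):
    """Reweighting: draw a group at the biased rate f_i, scale its gradient by a_i / f_i."""
    i = 0 if rng.random() < F[0] else 1
    return theta - lr * (A[i] / F[i]) * grad(theta, i)


def resample_step(theta, lr, rng):
    """Resampling: draw a group at the desired rate a_i, use its gradient unweighted."""
    i = 0 if rng.random() < A[0] else 1
    return theta - lr * grad(theta, i)


def mean_grad_reweight(theta):
    """Exact expectation of the reweighted stochastic gradient."""
    return sum(F[i] * (A[i] / F[i]) * grad(theta, i) for i in range(2))


def mean_grad_resample(theta):
    """Exact expectation of the resampled stochastic gradient."""
    return sum(A[i] * grad(theta, i) for i in range(2))


def var_grad_reweight(theta):
    """Exact variance of the reweighted stochastic gradient."""
    m = mean_grad_reweight(theta)
    return sum(F[i] * ((A[i] / F[i]) * grad(theta, i) - m) ** 2 for i in range(2))


def var_grad_resample(theta):
    """Exact variance of the resampled stochastic gradient."""
    m = mean_grad_resample(theta)
    return sum(A[i] * (grad(theta, i) - m) ** 2 for i in range(2))


if __name__ == "__main__":
    rng = random.Random(0)
    theta_rw = theta_rs = 0.5
    for _ in range(1000):
        theta_rw = reweight_step(theta_rw, 0.1, rng)
        theta_rs = resample_step(theta_rs, 0.1, rng)
    print("reweighting iterate:", theta_rw)
    print("resampling iterate: ", theta_rs)
    print("gradient variance at 0:", var_grad_reweight(0.0), var_grad_resample(0.0))
```

In this assumed setup both schemes target the same minimizer of a_1 V_1 + a_2 V_2, but the reweighted gradient occasionally takes a step scaled by a_2 / f_2 = 5, which is where its larger variance comes from. How this carries over to the regression, classification, and off-policy settings in the paper is not something the sketch addresses.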

The talk and the corresponding paper were published at the ICLR 2021 virtual conference.

Similar Papers

06/12/2021

Unbiased Classification through Bias-Contrastive and Bias-Balanced Learning

Youngkyu Hong, Eunho Yang

Keywords: machine learning, contrastive learning, fairness

11:29

13/04/2021

Comparing the value of labeled and unlabeled data in method-of-moments latent variable estimation

Mayee Chen, Benjamin Cohen-Wang, Stephen Mussmann, Frederic Sala, Christopher Ré

3:04

25/07/2020

Asymmetric tri-training for debiasing missing-not-at-random explicit feedback

Yuta Saito

Keywords: recommender systems, unsupervised domain adaptation, missing-not-at-random, matrix factorization, selection bias, explicit feedback

18:03

14/09/2020

Learning Gradient Boosted Multi-label Classification Rules

Michael Rapp, Eneldo Loza Mencía, Johannes Fürnkranz, Vu-Linh Nguyen, Eyke Hüllermeier

Keywords: multi-label classification, gradient boosting, rule learning

15:45

06/12/2021

Determinantal point processes based on orthogonal polynomials for sampling minibatches in SGD

Rémi Bardenet, Subhroshekhar Ghosh, Meixia Lin

Keywords: optimization, machine learning

14:51

14/06/2020

NestedVAE: Isolating Common Factors via Weak Supervision

Matthew J. Vowels, Necati Cihan Camgöz, Richard Bowden

Keywords: fairness, bias, representation learning, invariance, vae, variational, weakly supervised, information bottleneck

1:00

02/02/2021

Learning to Reweight Imaginary Transitions for Model-Based Reinforcement Learning

Wenzhen Huang, Qiyue Yin, Junge Zhang, Kaiqi Huang

14:38

12/07/2020

Meta-learning with Stochastic Linear Bandits

Leonardo Cella, Alessandro Lazaric, Massimiliano Pontil

Keywords: Transfer, Multitask and Meta-learning

13:17

26/04/2020

Adaptive Correlated Monte Carlo for Contextual Categorical Sequence Generation

Xinjie Fan, Yizhe Zhang, Zhendong Wang, Mingyuan Zhou

Keywords: binary softmax, discrete variables, policy gradient, pseudo actions, reinforcement learning, variance reduction

4:59

06/12/2021

Out-of-Distribution Generalization in Kernel Regression

Abdulkadir Canatar, Blake Bordelon, Cengiz Pehlevan

Keywords: theory, deep learning, machine learning

15:07

12/07/2020

Explaining Groups of Points in Low-Dimensional Representations

Gregory Plumb, Jonathan Terhorst, Sriram Sankararaman, Ameet Talwalkar

Keywords: Accountability, Transparency and Interpretability

12:07

19/08/2021

Variational Model-based Policy Optimization

Yinlam Chow, Brandon Cui, Moonkyung Ryu, Mohammad Ghavamzadeh

Keywords: Machine Learning, Reinforcement Learning

15:31

12/07/2020

Data preprocessing to mitigate bias: A maximum entropy based approach

Elisa Celis, Vijay Keswani, Nisheeth Vishnoi

Keywords: Fairness, Equity, Justice, and Safety

14:52

18/07/2021

Examining and Combating Spurious Features under Distribution Shift

Chunting Zhou, Xuezhe Ma, Paul Michel, Graham Neubig

Keywords: Deep Learning, Embedding and Representation learning

5:53

02/02/2021

Group Fairness by Probabilistic Modeling with Latent Fair Decisions

YooJung Choi, Meihua Dang, Guy Van den Broeck

19:30

06/12/2020

Distributionally Robust Federated Averaging

Yuyang Deng, Mohammad Mahdi Kamani, Mehrdad Mahdavi

3:11

06/12/2021

Local policy search with Bayesian optimization

Sarah Müller, Alexander von Rohr, Sebastian Trimpe

Keywords: theory, optimization, reinforcement learning and planning, active learning

11:42

06/12/2021

Adaptive Ensemble Q-learning: Minimizing Estimation Bias via Error Feedback

Hang Wang, Sen Lin, Junshan Zhang

11:19

18/07/2021

Dynamic Balancing for Model Selection in Bandits and RL

Ashok Cutkosky, Christoph Dann, Abhimanyu Das, Claudio Gentile, Aldo Pacchiano, Manish Purohit

Keywords: Reinforcement Learning and Planning, Bandits

5:18

18/07/2021

Directional Bias Amplification

Angelina Wang, Olga Russakovsky

Keywords: Reinforcement Learning and Planning, Exploration, Algorithms, Bandit Algorithms; Reinforcement Learning and Planning, Reinforcement Learning; Theory, Learning Theory, Social Aspects of Machine Learning, Fairness, Accountability, and Transparency

5:54

06/12/2020

The Advantage of Conditional Meta-Learning for Biased Regularization and Fine Tuning

Giulia Denevi, Massimiliano Pontil, Carlo Ciliberto

3:24

26/04/2020

Empirical Bayes Transductive Meta-Learning with Synthetic Gradients

Shell Xu Hu, Pablo Moreno, Yang Xiao, Xi Shen, Guillaume Obozinski, Neil Lawrence, Andreas Damianou

Keywords: Meta-learning, Empirical Bayes, Synthetic Gradient, Information Bottleneck

4:47

06/12/2020

Model-based Adversarial Meta-Reinforcement Learning

Zichuan Lin, Garrett Thomas, Guangwen Yang, Tengyu Ma

3:31

23/08/2020

Targeted data-driven regularization for out-of-distribution generalization

Mohammad Mahdi Kamani, Sadegh Farhang, Mehrdad Mahdavi, James Z. Wang

Keywords: data-driven regularization, out-of-distribution generalization, bilevel programming

6:36

03/05/2021

Understanding the failure modes of out-of-distribution generalization

Vaishnavh Nagarajan, Anders J Andreassen, Behnam Neyshabur

Keywords: theoretical study, spurious correlations, out-of-distribution generalization, empirical risk minimization

5:12

06/12/2020

DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction

Aviral Kumar, Abhishek Gupta, Sergey Levine

3:25

26/08/2020

Identifying and Correcting Label Bias in Machine Learning

Heinrich Jiang, Ofir Nachum

12:42

06/12/2021

Certifying Robustness to Programmable Data Bias in Decision Trees

Anna Meyer, Aws Albarghouthi, Loris D'Antoni

Keywords: robustness, fairness

13:03

03/05/2021

Blending MPC & Value Function Approximation for Efficient Reinforcement Learning

Mohak Bhardwaj, Sanjiban Choudhury, Byron Boots

Keywords: reinforcement learning, model-predictive control

5:09

13/04/2021

On multilevel Monte Carlo unbiased gradient estimation for deep latent variable models

Yuyang Shi, Rob Cornish

3:06

18/07/2021

Sequential Domain Adaptation by Synthesizing Distributionally Robust Experts

Bahar Taskesen, Man Chung Yue, Jose Blanchet, Daniel Kuhn, Viet Anh Nguyen

Keywords: Optimization, Convex Optimization, Theory, Regularization

17:53

06/12/2020

Trade-offs and Guarantees of Adversarial Representation Learning for Information Obfuscation

Han Zhao, Jianfeng Chi, Yuan Tian, Geoffrey Gordon

3:17

18/07/2021

Sparse Feature Selection Makes Batch Reinforcement Learning More Sample Efficient

Botao Hao, Yaqi Duan, Tor Lattimore, Csaba Szepesvari, Mengdi Wang

Keywords: Theory, Statistical Learning Theory

5:20

06/12/2020

Modeling and Optimization Trade-off in Meta-learning

Katelyn Gao, Ozan Sener

3:21

06/12/2021

Variance-Aware Off-Policy Evaluation with Linear Function Approximation

Yifei Min, Tianhao Wang, Dongruo Zhou, Quanquan Gu

Keywords: theory, reinforcement learning and planning

12:17

18/07/2021

Understanding and Mitigating Accuracy Disparity in Regression

Jianfeng Chi, Yuan Tian, Geoff Gordon, Han Zhao

Keywords: Social Aspects of Machine Learning, Fairness, Accountability, and Transparency

5:17

12/07/2020

Double Trouble in Double Descent: Bias and Variance(s) in the Lazy Regime

Stéphane d'Ascoli, Maria Refinetti, Giulio Biroli, Florent Krzakala

Keywords: Deep Learning - Theory

15:11

18/07/2021

Finding Relevant Information via a Discrete Fourier Expansion

Mohsen Heidari, Jithin Sreedharan, Gil Shamir, Wojciech Szpankowski

Keywords: Theory, Statistical Learning Theory

5:25

18/07/2021

Online A-Optimal Design and Active Linear Regression

Xavier Fontaine, Pierre Perrault, Michal Valko, Vianney Perchet

Keywords: Algorithms, Online Learning Algorithms

5:21

06/12/2021

Robust Generalization despite Distribution Shift via Minimum Discriminating Information

Tobias Sutter, Andreas Krause, Daniel Kuhn

Keywords: optimization, machine learning

15:05