We need to talk about random splits

19/04/2021

Anders Søgaard, Sebastian Ebert, Jasmijn Bastings, Katja Filippova

Abstract: (CITATION) argued for using random splits rather than standard splits in NLP experiments. We argue that random splits, like standard splits, lead to overly optimistic performance estimates. We can also split data in biased or adversarial ways, e.g., training on short sentences and evaluating on long ones. Biased sampling has been used in domain adaptation to simulate real-world drift; this is known as the covariate shift assumption. In NLP, however, even worst-case splits that maximize bias often underestimate the error observed on new samples of in-domain data, i.e., the data that models should minimally generalize to at test time. This invalidates the covariate shift assumption. Rather than using multiple random splits, future benchmarks should ideally include multiple independent test sets; if this is infeasible, we argue that multiple biased splits lead to more realistic performance estimates than multiple random splits.
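The difference between a random split and a biased split by sentence length, as mentioned in the abstract, can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' experimental code: the function names `random_split` and `length_biased_split`, the 20% test fraction, whitespace tokenization, and the toy corpus are all assumptions made for this example.

```python
import random


def random_split(sentences, test_fraction=0.2, seed=0):
    """Shuffle with a fixed seed, then hold out the last test_fraction."""
    shuffled = list(sentences)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[:-n_test], shuffled[-n_test:]


def length_biased_split(sentences, test_fraction=0.2):
    """Train on the shortest sentences and evaluate on the longest ones."""
    # Length is measured in whitespace-separated tokens (an assumption).
    by_length = sorted(sentences, key=lambda s: len(s.split()))
    n_test = max(1, int(len(by_length) * test_fraction))
    return by_length[:-n_test], by_length[-n_test:]


# Hypothetical toy corpus: ten "sentences" with 1 to 10 tokens each.
corpus = [" ".join(["tok"] * n) for n in range(1, 11)]

rand_train, rand_test = random_split(corpus, seed=0)
bias_train, bias_test = length_biased_split(corpus)
```

Calling `random_split` with several different seeds gives multiple random splits, whereas `length_biased_split` deliberately places every long sentence in the test portion, so no training sentence is longer than any test sentence.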

The talk and the corresponding paper were published at the EACL 2021 virtual conference.

Similar Papers

06/12/2020

Recursive Inference for Variational Autoencoders

Minyoung Kim, Vladimir Pavlovic

3:24

13/04/2021

On multilevel Monte Carlo unbiased gradient estimation for deep latent variable models

Yuyang Shi, Rob Cornish

3:06

06/12/2020

Minimax Value Interval for Off-Policy Evaluation and Policy Optimization

Nan Jiang, Jiawei Huang

Keywords: Algorithms -> Classification, Algorithms -> Semi-Supervised Learning

2:56

26/04/2020

SUMO: Unbiased Estimation of Log Marginal Probability for Latent Variable Models

Yucen Luo, Alex Beatson, Mohammad Norouzi, Jun Zhu, David Duvenaud, Ryan P. Adams, Ricky T. Q. Chen

5:14

06/12/2021

Differentiable Annealed Importance Sampling and the Perils of Gradient Noise

Guodong Zhang, Kyle Hsu, Jianing Li, Chelsea Finn, Roger Grosse

Keywords: optimization, generative model

15:30

03/08/2020

On Counterfactual Explanations under Predictive Multiplicity

Martin Pawelczyk, Klaus Broelemann, Gjergji Kasneci

8:03

03/08/2020

Flexible Approximate Inference via Stratified Normalizing Flows

Chris Cundy, Stefano Ermon

7:33

19/08/2021

Independence-aware Advantage Estimation

Pushi Zhang, Li Zhao, Guoqing Liu, Jiang Bian, Minlie Huang, Tao Qin, Tie-Yan Liu

Keywords: Machine Learning, Reinforcement Learning, Deep Reinforcement Learning

14:58

06/12/2021

Shared Independent Component Analysis for Multi-Subject Neuroimaging

Hugo Richard, Pierre Ablin, Bertrand Thirion, Alexandre Gramfort, Aapo Hyvarinen

Keywords: representation learning

14:21

12/07/2020

Parametric Gaussian Process Regressors

Martin Jankowiak, Geoff Pleiss, Jacob Gardner

Keywords: Gaussian Processes

12:05

06/12/2021

Towards Calibrated Model for Long-Tailed Visual Recognition from Prior Perspective

Zhengzhuo Xu, Zenghao Chai, Chun Yuan

Keywords: theory, machine learning

4:23

19/08/2021

Likelihood-free Out-of-Distribution Detection with Invertible Generative Models

Amirhossein Ahmadian, Fredrik Lindsten

Keywords: Machine Learning, Deep Learning, Uncertainty Representations, Anomaly/Outlier Detection

15:18

06/12/2021

Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism

Paria Rashidinejad, Banghua Zhu, Cong Ma, Jiantao Jiao, Stuart Russell

Keywords: theory, reinforcement learning and planning, bandits

12:21

06/12/2020

Provably Good Batch Off-Policy Reinforcement Learning Without Great Exploration

Yao Liu, Adith Swaminathan, Alekh Agarwal, Emma Brunskill

3:17

12/07/2020

Doubly robust off-policy evaluation with shrinkage

Yi Su, Maria Dimakopoulou, Akshay Krishnamurthy, Miroslav Dudik

Keywords: Online Learning, Active Learning, and Bandits

15:08

19/08/2021

Regularizing Variational Autoencoder with Diversity and Uncertainty Awareness

Dazhong Shen, Chuan Qin, Chao Wang, Hengshu Zhu, Enhong Chen, Hui Xiong

Keywords: Machine Learning, Bayesian Learning, Probabilistic Machine Learning, Unsupervised Learning

13:04

14/06/2020

Effectively Unbiased FID and Inception Score and Where to Find Them

Min Jin Chong, David Forsyth

Keywords: fid, inception score, evaluation, generative models, gans, sobol sequence

1:01

13/04/2021

Comparing the value of labeled and unlabeled data in method-of-moments latent variable estimation

Mayee Chen, Benjamin Cohen-Wang, Stephen Mussmann, Frederic Sala, Christopher Re

3:04

18/07/2021

Online A-Optimal Design and Active Linear Regression

Xavier Fontaine, Pierre Perrault, Michal Valko, Vianney Perchet

Keywords: Algorithms, Online Learning Algorithms

5:21

18/07/2021

Understanding Failures in Out-of-Distribution Detection with Deep Generative Models

Lily Zhang, Mark Goldstein, Rajesh Ranganath

Keywords: Deep Learning, Generative Models

4:54

23/08/2020

On sampled metrics for item recommendation

Walid Krichene, Steffen Rendle

Keywords: item recommendation, sampled metric, evaluation, metrics

16:46

22/11/2021

Deep Least Squares Alignment for Unsupervised Domain Adaptation

Youshan Zhang, Brian D. Davison

Keywords: Unsupervised Domain Adaptation, Least Squares, Distribution Alignment

9:59

03/05/2021

When does preconditioning help or hurt generalization?

Shun-ichi Amari, Jimmy Ba, Roger Grosse, Chen Li, Atsushi Nitanda, Taiji Suzuki, Denny Wu, Ji Xu

Keywords: high-dimensional asymptotics, generalization, second-order optimization, natural gradient descent

5:21

13/04/2021

CONTRA: Contrarian statistics for controlled variable selection

Mukund Sudarshan, Aahlad Puli, Lakshmi Subramanian, Sriram Sankararaman, Rajesh Ranganath

3:33

25/07/2020

Asymmetric tri-training for debiasing missing-not-at-random explicit feedback

Yuta Saito

Keywords: recommender systems, unsupervised domain adaptation, missing-not-at-random, matrix factorization, selection bias, explicit feedback

18:03

06/12/2020

Decision-Making with Auto-Encoding Variational Bayes

Romain Lopez, Pierre Boyeau, Nir Yosef, Michael Jordan, Jeff Regier

3:21

18/07/2021

T-SCI: A Two-Stage Conformal Inference Algorithm with Guaranteed Coverage for Cox-MLP

Jiaye Teng, Zeren Tan, Yang Yuan

Keywords: Algorithms, Others

4:43

14/06/2020

Revisiting Saliency Metrics: Farthest-Neighbor Area Under Curve

Sen Jia, Neil D. B. Bruce

Keywords: visual saliency, saliency metric, center bias, area under curve

4:50

06/12/2020

Trust the Model When It Is Confident: Masked Model-based Actor-Critic

Feiyang Pan, Jia He, Dandan Tu, Qing He

2:57

23/08/2020

Imputing various incomplete attributes via distance likelihood maximization

Shaoxu Song, Yu Sun

Keywords: distance likelihood, incomplete data, data imputation

11:45

26/08/2020

More Powerful Selective Kernel Tests for Feature Selection

Jen Ning Lim, Makoto Yamada, Wittawat Jitkrittum, Yoshikazu Terada, Shigeyuki Matsui, Hidetoshi Shimodaira

9:25

26/08/2020

A Unified Statistically Efficient Estimation Framework for Unnormalized Models

Masatoshi Uehara, Takafumi Kanamori, Takashi Takenouchi, Takeru Matsuda

13:58

03/05/2021

Combining Ensembles and Data Augmentation Can Harm Your Calibration

Yeming Wen, Ghassen Jerfel, Rafael Müller, Michael W Dusenberry, Jasper Snoek, Balaji Lakshminarayanan, Dustin Tran

Keywords: Uncertainty estimates, Ensembles, Calibration

6:10

12/07/2020

When are Non-Parametric Methods Robust?

Robi Bhattacharjee, Kamalika Chaudhuri

Keywords: Learning Theory

15:17

06/12/2021

For high-dimensional hierarchical models, consider exchangeability of effects across covariates instead of across datasets

Brian Trippe, Hilary Finucane, Tamara Broderick

Keywords: theory, machine learning

14:02

04/07/2020

Neural Mixed Counting Models for Dispersed Topic Discovery

Jiemin Wu, Yanghui Rao, Zusheng Zhang, Haoran Xie, Qing Li, Fu Lee Wang, Ziye Chen

Keywords: Dispersed Discovery, mining topics, Neural Models, Mixed models

10:29

06/12/2021

SOPE: Spectrum of Off-Policy Estimators

Christina Yuan, Yash Chandak, Stephen Giguere, Philip S. Thomas, Scott Niekum

Keywords: reinforcement learning and planning

13:41

03/05/2021

Selective Classification Can Magnify Disparities Across Groups

Erik Jones, Shiori Sagawa, Pang Wei Koh, Ananya Kumar, Percy Liang

Keywords: log-concavity, group disparities, selective classification, robustness

5:24

14/06/2020

Stochastic Classifiers for Unsupervised Domain Adaptation

Zhihe Lu, Yongxin Yang, Xiatian Zhu, Cong Liu, Yi-Zhe Song, Tao Xiang

Keywords: unsupervised domain adaptation, stochastic classifiers, adversarial learning, local alignment, multi-head network, object classification, semantic segmentation

1:00

06/12/2020

Understanding Double Descent Requires A Fine-Grained Bias-Variance Decomposition

Ben Adlam, Jeffrey Pennington

3:30