Showing Your Work Doesn't Always Work

Abstract: In natural language processing, a recently popular line of work explores how to best report the experimental results of neural networks. One exemplar publication, titled "Show Your Work: Improved Reporting of Experimental Results" (Dodge et al., 2019), advocates for reporting the expected validation effectiveness of the best-tuned model, with respect to the computational budget. In the present work, we critically examine this paper. As far as statistical generalizability is concerned, we find unspoken pitfalls and caveats with this approach. We analytically show that their estimator is biased and uses error-prone assumptions. We find that the estimator favors negative errors and yields poor bootstrapped confidence intervals. We derive an unbiased alternative and bolster our claims with empirical evidence from statistical simulation. Our codebase is at https://github.com/castorini/meanmax.
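
To make the contrast in the abstract concrete, the following is a minimal sketch of the two kinds of estimator under discussion, assuming N i.i.d. validation scores and a budget of n hyperparameter trials. The first function is the plug-in estimate built from the empirical distribution, which is our reading of the estimator in Dodge et al. (2019); the second averages the maximum over all size-n subsets of the sample, a standard unbiased U-statistic for the expected maximum when n <= N. The function names are hypothetical and are not taken from the meanmax codebase.

```python
from math import comb


def plugin_expected_max(scores, n):
    """Plug-in estimate of E[max of n draws] using the empirical CDF.

    Weights the i-th smallest score by (i/N)^n - ((i-1)/N)^n.
    Assumption: this mirrors the estimator of Dodge et al. (2019).
    """
    xs = sorted(scores)
    N = len(xs)
    if N == 0 or n < 1:
        raise ValueError("need at least one score and n >= 1")
    return sum(
        x * ((i / N) ** n - ((i - 1) / N) ** n)
        for i, x in enumerate(xs, start=1)
    )


def unbiased_expected_max(scores, n):
    """Average of the maximum over every size-n subset of the scores.

    The i-th smallest score is the maximum of C(i-1, n-1) subsets,
    so no explicit enumeration is needed. Requires 1 <= n <= N.
    """
    xs = sorted(scores)
    N = len(xs)
    if not 1 <= n <= N:
        raise ValueError("n must satisfy 1 <= n <= len(scores)")
    total = sum(comb(i - 1, n - 1) * x for i, x in enumerate(xs, start=1))
    return total / comb(N, n)


if __name__ == "__main__":
    demo_scores = [0.1, 0.2, 0.3, 0.4]  # illustrative values, not real results
    print(plugin_expected_max(demo_scores, 2))
    print(unbiased_expected_max(demo_scores, 2))
```

On the illustrative scores above with n = 2, the plug-in value is about 0.3125 and the subset average is about 0.3333, so the plug-in estimate sits below the subset average, which is the direction of error the abstract describes. With n = 1 both reduce to the sample mean.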

12/07/2020

Raphael Tang, Jaejun Lee, Ji Xin, Xinyu Liu, Yaoliang Yu, Jimmy Lin

Similar Papers

Bayesian Experimental Design for Implicit Models by Mutual Information Neural Estimation

Steven Kleinegesse, Michael Gutmann

Bayesian Adaptation for Covariate Shift

Aurick Zhou, Sergey Levine

deep learning, machine learning, robustness, vision, domain adaptation

Benchmarking simulation-based inference

Jan-Matthis Lueckmann, Jan Boelts, David Greenberg, Pedro Goncalves, Jakob Macke

Integrated Latent Heterogeneity and Invariance Learning in Kernel Space

Jiashuo Liu, Zheyuan Hu, Peng Cui, Bo Li, Zheyan Shen

deep learning, reinforcement learning and planning, machine learning

On the Role of Optimization in Double Descent: A Least Squares Study

Ilja Kuzborskij, Csaba Szepesvari, Omar Rivasplata, Amal Rannen-Triki, Razvan Pascanu

theory, deep learning, optimization

Finite-sample regret bound for distributionally robust offline tabular reinforcement learning

Zhengqing Zhou, Zhengyuan Zhou, Qinxun Bai, Linhai Qiu, Jose Blanchet, Peter Glynn

Towards a better understanding of label smoothing in neural machine translation

Yingbo Gao, Weiyue Wang, Christian Herold, Zijian Yang, Hermann Ney

Adaptive, Distribution-Free Prediction Intervals for Deep Networks

Danijel Kivaranovic, Kory D. Johnson, Hannes Leeb

Adaptive Sampling for Minimax Fair Classification

Shubhanshu Shekhar, Greg Fields, Mohammad Ghavamzadeh, Tara Javidi

deep learning, machine learning, fairness

Instabilities of Offline RL with Pre-Trained Neural Representation

Ruosong Wang, Yifan Wu, Russ Salakhutdinov, Sham Kakade

Dissecting Non-Vacuous Generalization Bounds based on the Mean-Field Approximation

Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift

Alexander Chan, Ahmed Alaa, Zhaozhi Qian, Mihaela van der Schaar

Bridging Adversarial and Statistical Domain Transfer via Spectral Adaptation Networks

Christoph Raab, Philipp Väth, Peter Meier, Frank-Michael Schleif

Calibration of Neural Networks using Splines

Kartik Gupta, Amir Rahimi, Thalaiyasingam Ajanthan, Thomas Mensink, Cristian Sminchisescu, Richard Hartley

uncertainty, calibration measure, neural network calibration

Efficient Statistical Tests: A Neural Tangent Kernel Approach

Sheng Jia, Ehsan Nezhadarya, Yuhuai Wu, Jimmy Ba

Making Sense of Reinforcement Learning and Probabilistic Inference

Brendan O'Donoghue, Ian Osband, Catalin Ionescu

Reinforcement learning, Bayesian inference, Exploration

Effective Estimation of Deep Generative Language Models

Tom Pelsmaeker, Wilker Aziz

Estimation Models, parameterisation models, posterior collapse, language modelling

QEBA: Query-Efficient Boundary-Based Blackbox Attack

Huichen Li, Xiaojun Xu, Xiaolu Zhang, Shuang Yang, Bo Li

adversarial machine learning, black-box attack, boundary-based attack, attacking public api

The Causal-Neural Connection: Expressiveness, Learnability, and Inference

Kevin Xia, Kai-Zhan Lee, Yoshua Bengio, Elias Bareinboim

deep learning, causality

Enhancing Simple Models by Exploiting What They Already Know

Amit Dhurandhar, Karthikeyan Shanmugam, Ronny Luss

Joint Inference for Neural Network Depth and Dropout Regularization

Kishan K C, Rui Li, MohammadMahdi Gilany

deep learning, generative model, continual learning

In search of robust measures of generalization

Gintare Karolina Dziugaite, Alexandre Drouin, Brady Neal, Nitarshan Rajkumar, Ethan Caballero, Linbo Wang, Ioannis Mitliagkas, Dan Roy

Towards Deeper Deep Reinforcement Learning with Spectral Normalization
