16/11/2020

Ensemble Distillation for Structured Prediction: Calibrated, Accurate, Fast—Choose Three

Steven Reich, David Mueller, Nicholas Andrews

Keywords: structured prediction, named-entity recognition, machine translation, ensemble distillation

Abstract: Modern neural networks do not always produce well-calibrated predictions, even when trained with a proper scoring function such as cross-entropy. In classification settings, simple methods such as isotonic regression or temperature scaling may be used in conjunction with a held-out dataset to calibrate model outputs. However, extending these methods to structured prediction is not always straightforward or effective; furthermore, a held-out calibration set may not always be available. In this paper, we study ensemble distillation as a general framework for producing well-calibrated structured prediction models while avoiding the prohibitive inference-time cost of ensembles. We validate this framework on two tasks: named-entity recognition and machine translation. We find that, across both tasks, ensemble distillation produces models which retain much of, and occasionally improve upon, the performance and calibration benefits of ensembles, while only requiring a single model at test time.
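To make the distillation objective sketched in the abstract concrete, below is a minimal illustration, assuming a PyTorch-style interface: the student is trained to match the ensemble's averaged per-token predictive distribution via a KL-divergence loss. The function and variable names are ours for illustration, not the authors' released code.

```python
import torch
import torch.nn.functional as F

def ensemble_distillation_loss(student_logits, ensemble_logits, temperature=1.0):
    """KL(teacher || student), where the teacher is the mean of the
    ensemble members' per-token softmax distributions.

    student_logits:  (batch, seq_len, vocab) logits from the single student model.
    ensemble_logits: (n_models, batch, seq_len, vocab) logits from the ensemble.
    """
    # Average the ensemble members' predictive distributions (not their logits).
    teacher_probs = F.softmax(ensemble_logits / temperature, dim=-1).mean(dim=0)
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # Sum the KL divergence over the vocabulary, then average over all tokens.
    kl = F.kl_div(student_log_probs, teacher_probs, reduction="none").sum(dim=-1)
    return kl.mean()
```

In this sketch the structured task (NER tagging or translation) is reduced to a sequence of per-position distributions, so the same loss applies to both settings; the temperature is an optional knob and is not claimed to match the paper's exact configuration.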

