Generalization Bounds in the Presence of Outliers: a Median-of-Means Study

18/07/2021

Pierre Laforgue, Guillaume Staerman, Stephan Clémençon

Keywords: Theory, Statistical Learning Theory

Abstract: Unlike the empirical mean, the Median-of-Means (MoM) is an estimator of the mean θ of a square-integrable random variable Z around which accurate nonasymptotic confidence bounds can be built, even when Z does not exhibit sub-Gaussian tail behavior. Thanks to the high confidence it achieves on heavy-tailed data, MoM has found various applications in machine learning, where it is used to design training procedures that are not sensitive to atypical observations. More recently, a new line of work has sought to characterize and leverage MoM’s ability to deal with corrupted data. In this context, the present work proposes a general study of MoM’s concentration properties under the contamination regime, providing a clear understanding of the impact of the outlier proportion and of the number of blocks chosen. The analysis is extended to (multisample) U-statistics, i.e., averages over tuples of observations, which raise additional challenges due to the dependence induced. Finally, we show that the latter bounds can be used in a straightforward fashion to derive generalization guarantees for pairwise learning in a contaminated setting, and propose an algorithm to compute provably reliable decision functions.
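
To make the estimator discussed in the abstract concrete, here is a minimal sketch of the standard Median-of-Means construction: shuffle the sample, split it into disjoint blocks, average each block, and take the median of the block means. The function name `median_of_means`, its parameters `n_blocks` and `seed`, and the toy contaminated sample are illustrative assumptions, not code or experiments from the paper.

```python
import random
import statistics


def median_of_means(sample, n_blocks, seed=0):
    """Median-of-Means estimate of the mean of `sample` (illustrative sketch).

    The sample is shuffled, split into `n_blocks` disjoint blocks whose
    sizes differ by at most one, and the median of the block means is returned.
    """
    if not 1 <= n_blocks <= len(sample):
        raise ValueError("n_blocks must be between 1 and len(sample)")
    rng = random.Random(seed)
    shuffled = list(sample)
    rng.shuffle(shuffled)
    # Block k receives every n_blocks-th point starting at index k.
    blocks = [shuffled[k::n_blocks] for k in range(n_blocks)]
    block_means = [statistics.fmean(block) for block in blocks]
    return statistics.median(block_means)


# Hypothetical toy data: 95 inliers with mean 1.0 plus 5 gross outliers.
rng = random.Random(42)
inliers = [rng.gauss(1.0, 1.0) for _ in range(95)]
outliers = [1000.0] * 5
data = inliers + outliers

empirical_mean = statistics.fmean(data)            # dragged far from 1.0
mom_estimate = median_of_means(data, n_blocks=15)  # stays close to 1.0
print(f"empirical mean: {empirical_mean:.2f}, MoM: {mom_estimate:.2f}")
```

In this toy setup the 5 outliers can contaminate at most 5 of the 15 blocks, so more than half of the block means are computed from inliers only and the median is necessarily one of those clean means. This is the intuition behind the trade-off between the outlier proportion and the number of blocks mentioned in the abstract: more blocks tolerate more outliers, but each block mean is then averaged over fewer points.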

The talk and the respective paper are published at the ICML 2021 virtual conference.

Similar Papers

06/12/2020

Robust compressed sensing using generative models

Ajil Jalal, Liu Liu, Alex Dimakis, Constantine Caramanis

Keywords: Neuroscience and Cognitive Science -> Neuroscience, Neuroscience and Cognitive Science -> Neural Coding

3:19

13/04/2021

When OT meets MoM: Robust estimation of Wasserstein distance

Guillaume Staerman, Pierre Laforgue, Pavlo Mozharovskyi, Florence d’Alché-Buc

2:43

06/12/2021

Uniform Concentration Bounds toward a Unified Framework for Robust Clustering

Debolina Paul, Saptarshi Chakraborty, Swagatam Das, Jason Xu

Keywords: optimization, clustering

15:22

02/02/2021

Infinite Gaussian Mixture Modeling with an Improved Estimation of the Number of Clusters

Avi Matza, Yuval Bistritz

20:14

18/07/2021

RATT: Leveraging Unlabeled Data to Guarantee Generalization

Saurabh Garg, Sivaraman Balakrishnan, Zico Kolter, Zachary Lipton

Keywords: Probabilistic Methods, Graphical Models, Theory, Computational Complexity, Theory, Models of Learning and Generalization

17:27

13/04/2021

Adversarially robust estimate and risk analysis in linear regression

Yue Xing, Ruizhi Zhang, Guang Cheng

3:03

26/04/2020

Mixup Inference: Better Exploiting Mixup to Defend Adversarial Attacks

Tianyu Pang, Kun Xu, Jun Zhu

Keywords: Trustworthy Machine Learning, Adversarial Robustness, Inference Principle, Mixup

4:59

14/06/2020

Auxiliary Training: Towards Accurate and Robust Models

Linfeng Zhang, Muzhou Yu, Tong Chen, Zuoqiang Shi, Chenglong Bao, Kaisheng Ma

Keywords: model robustness, data augmentation, adversarial attack, training method, classification

0:56

06/12/2021

Multi-Objective Meta Learning

Feiyang Ye, Baijiong Lin, Zhixiong Yue, Pengxin Guo, Qiao Xiao, Yu Zhang

Keywords: deep learning, optimization, meta learning, domain adaptation, few shot learning

12:27

06/12/2020

NeuMiss networks: differentiable programming for supervised learning with missing values.

Marine Le Morvan, Julie Josse, Thomas Moreau, Erwan Scornet, Gael Varoquaux

3:20

18/07/2021

Federated Deep AUC Maximization for Hetergeneous Data with a Constant Communication Complexity

Zhuoning Yuan, Zhishuai Guo, Yi Xu, Yiming Ying, Tianbao Yang

Keywords: Optimization, Distributed and Parallel Optimization

5:04

04/07/2020

Generative Semantic Hashing Enhanced via Boltzmann Machines

Lin Zheng, Qinliang Su, Dinghan Shen, Changyou Chen

Keywords: Generative Hashing, large-scale retrieval, training, Boltzmann Machines

11:26

12/07/2020

Training Binary Neural Networks through Learning with Noisy Supervision

Kai Han, Yunhe Wang, Yixing Xu, Chunjing Xu, Enhua Wu, Chang Xu

Keywords: Unsupervised and Semi-Supervised Learning

12:34

03/05/2021

On Fast Adversarial Robustness Adaptation in Model-Agnostic Meta-Learning

Ren Wang, Kaidi Xu, Sijia Liu, Pin-Yu Chen, Lily Weng, Chuang Gan, Meng Wang

5:12

06/12/2021

ResNEsts and DenseNEsts: Block-based DNN Models with Improved Representation Guarantees

Kuan-Lin Chen, Ching-Hua Lee, Harinath Garudadri, Bhaskar D Rao

Keywords: optimization, vision

13:27

05/12/2020

Towards a better understanding of label smoothing in neural machine translation

Yingbo Gao, Weiyue Wang, Christian Herold, Zijian Yang, Hermann Ney

13:37

14/06/2020

Single-Step Adversarial Training With Dropout Scheduling

Vivek B.S., R. Venkatesh Babu

Keywords: adversarial training, robustness, efficient training, representation learning, generalization, supervised learning, recognition, classification, neural networks, deep learning

1:01

06/12/2020

MMA Regularization: Decorrelating Weights of Neural Networks by Maximizing the Minimal Angles

Zhennan Wang, Canqun Xiang, Wenbin Zou, Chen Xu

3:23

02/02/2021

IsoBN: Fine-Tuning BERT with Isotropic Batch Normalization

Wenxuan Zhou, Bill Yuchen Lin, Xiang Ren

16:25

06/12/2020

Task-Robust Model-Agnostic Meta-Learning

Liam Collins, Aryan Mokhtari, Sanjay Shakkottai

3:17

06/12/2021

SmoothMix: Training Confidence-calibrated Smoothed Classifiers for Certified Robustness

Jongheon Jeong, Sejun Park, Minkyu Kim, Heung-Chang Lee, Do-Guk Kim, Jinwoo Shin

Keywords: deep learning, machine learning, robustness, adversarial robustness and security

12:23

19/04/2021

Keep learning: Self-supervised meta-learning for learning from inference

Akhil Kedia, Sai Chetan Chinthakindi

11:27

16/11/2020

Effective Unsupervised Domain Adaptation with Adversarially Trained Language Models

Thuy-Trang Vu, Dinh Phung, Gholamreza Haffari

Keywords: combinatorial problem, unsupervised tasks, named recognition, broad-coverage models

11:57

26/08/2020

Adversarial Robustness of Flow-Based Generative Models

Phillip Pope, Yogesh Balaji, Soheil Feizi

12:24

06/12/2020

Improving Generalization in Reinforcement Learning with Mixture Regularization

Kaixin Wang, Bingyi Kang, Jie Shao, Jiashi Feng

3:14

06/12/2021

Evaluating model performance under worst-case subpopulations

Mike Li, Hongseok Namkoong, Shangzhou Xia

Keywords: robustness, fairness

5:45

07/09/2020

Transferring Pretrained Networks to Small Data via Category Decorrelation

Ying Jin, Zhangjie Cao, Mingsheng Long, Jianmin Wang

Keywords: Category Decorrelation, Under Transfer

8:39

06/12/2020

High-recall causal discovery for autocorrelated time series with latent confounders

Andreas Gerhardus, Jakob Runge

3:22

26/04/2020

On Generalization Error Bounds of Noisy Gradient Methods for Non-Convex Learning

Jian Li, Xuanyuan Luo, Mingda Qiao

Keywords: learning theory, generalization, nonconvex learning, stochastic gradient descent, Langevin dynamics

4:50

06/12/2021

Time-series Generation by Contrastive Imitation

Daniel Jarrett, Ioana Bica, Mihaela van der Schaar

Keywords: generative model

8:47

04/07/2020

Learning to Faithfully Rationalize by Construction

Sarthak Jain, Sarah Wiegreffe, Yuval Pinter, Byron C. Wallace

Keywords: NLP, neural classification, training, automatic evaluations

11:55

06/12/2021

Universal Approximation Using Well-Conditioned Normalizing Flows

Holden Lee, Chirag Pabbaraju, Anish Prasad Sevekari, Andrej Risteski

Keywords: theory, deep learning, generative model

11:46

13/04/2021

Automatic differentiation variational inference with mixtures

Warren Morningstar, Sharad Vikram, Cusuh Ham, Andrew Gallagher, Joshua Dillon

3:05

06/12/2020

Stochastic Normalization

Zhi Kou, Kaichao You, Mingsheng Long, Jianmin Wang

3:13

18/07/2021

Uniform Convergence, Adversarial Spheres and a Simple Remedy

Gregor Bachmann, Seyed Moosavi, Thomas Hofmann

Keywords: Theory, Deep learning Theory

5:52

03/05/2021

Supervised Contrastive Learning for Pre-trained Language Model Fine-tuning

Beliz Gunel, Jingfei Du, Alexis Conneau, Veselin Stoyanov

Keywords: supervised contrastive learning, pre-trained language model fine-tuning, natural language understanding, generalization, few-shot learning, robustness

4:44

17/08/2020

Learning temporal coherence via self-supervision for GAN-based video generation

Mengyu Chu, You Xie, Jonas Mayer, Laura Leal-Taixé, Nils Thuerey

Keywords: self-supervision, temporal cycle-consistency, video super-resolution, generative adversarial network, unpaired video translation

16:59

18/07/2021

Delving into Deep Imbalanced Regression

Yuzhe Yang, Kaiwen Zha, Yingcong Chen, Hao Wang, Dina Katabi

Keywords: Applications

16:37

06/12/2020

Multi-task Additive Models for Robust Estimation and Automatic Structure Discovery

Yingjie Wang, Hong Chen, Feng Zheng, Chen Xu, Tieliang Gong, Yanhong Chen

Keywords: Applications -> Time Series Analysis; Probabilistic Methods -> Variational Inference, Probabilistic Methods -> Causal Inference

3:00

18/07/2021

Knowledge Enhanced Machine Learning Pipeline against Diverse Adversarial Attacks

Nezihe Merve Gürel, Xiangyu Qi, Luka Rimanic, Ce Zhang, Bo Li

Keywords: Algorithms, Adversarial Examples

5:46