Abstract:
Data valuation is a non-trivial challenge in use cases such as collaborative data sharing and data markets. The value of data is often tied to the learning performance (e.g., validation accuracy) of a model trained on the data. This intuitive methodology tightly couples data valuation to validation, which may be undesirable: a validation set may not be available in practice, and data providers may find it difficult to agree on the choice of the validation set. A separate but practical issue is data replication: given the value of some data points, a dishonest data provider may replicate them to exploit the valuation for a higher reward or payment. We observe that the diversity of the data points is an inherent property of the dataset that is independent of validation. We formalize diversity via the volume of the data matrix (the determinant of its left Gram matrix), which allows us to formally connect the diversity of data to learning performance without requiring validation. Furthermore, following the intuition that copying the same data points does not increase the diversity of the data, we propose a robust volume with theoretical replication robustness guarantees. We perform extensive experiments to demonstrate its consistency and practical advantages over existing baselines, and show that our method is model- and task-agnostic and flexibly adaptable to various neural networks.
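As a minimal sketch of the quantity the abstract describes, the snippet below computes the volume of a data matrix as the (square root of the) determinant of its left Gram matrix, and illustrates the replication issue that motivates the robust variant: duplicating rows inflates the plain volume. The exact definition (square root of the Gram determinant) is an assumption inferred from the abstract, not taken verbatim from it.

```python
import numpy as np

def volume(X: np.ndarray) -> float:
    # Volume of an n x d data matrix X (n >= d), defined here as
    # sqrt(det(X^T X)), where X^T X is the left Gram matrix.
    # (Assumed formulation; the abstract only names the Gram determinant.)
    return float(np.sqrt(np.linalg.det(X.T @ X)))

rng = np.random.default_rng(0)
X = rng.standard_normal((10, 3))  # 10 points in 3 dimensions

# Replicating existing rows strictly increases the plain volume,
# which is the exploit that replication robustness must prevent.
X_rep = np.vstack([X, X[:2]])
print(volume(X), volume(X_rep))
```

Each duplicated row adds a positive-semidefinite rank-one term to the Gram matrix, so the determinant (and hence the valuation) grows even though no new diversity was contributed.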