Naive Feature Selection: Sparsity in Naive Bayes

Abstract: Due to its linear complexity, naive Bayes classification remains an attractive supervised learning method, especially in very large-scale settings. We propose a sparse version of naive Bayes, which can be used for feature selection. This leads to a combinatorial maximum-likelihood problem, for which we provide an exact solution in the case of binary data, or a bound in the multinomial case. We prove that our bound becomes tight as the marginal contribution of additional features decreases. Both binary and multinomial sparse models are solvable in time almost linear in problem size, representing a very small extra relative cost compared to the classical naive Bayes. Numerical experiments on text data show that the naive Bayes feature selection method is as statistically effective as state-of-the-art feature selection methods such as recursive feature elimination, l_1-penalized logistic regression and LASSO, while being orders of magnitude faster. For a large data set, having more than with 1.6 million training points and about 12 million features, and with a non-optimized CPU implementation, our sparse naive Bayes model can be trained in less than 15 seconds.

06/12/2021

Naive Feature Selection: Sparsity in Naive Bayes

Armin Askari, Alexandre d'Aspremont, Laurent El Ghaoui

Comments

Similar Papers

RETRIEVE: Coreset Selection for Efficient and Robust Semi-Supervised Learning

Krishnateja Killamsetty, Xujiang Zhao, Feng Chen, Rishabh Iyer

Keywords Abstract Paper

optimization, semi-supervised learning

Minimizing FLOPs to Learn Efficient Sparse Representations

Biswajit Paria, Chih-Kuan Yeh, Ian E.H. Yen and Ning Xu, Pradeep Ravikumar, Barnabás Póczos

Keywords Abstract Paper

sparse embeddings, deep representations, metric learning, regularization

Perturb-and-max-product: Sampling and learning in discrete energy-based models

Miguel Lazaro-Gredilla, Antoine Dedieu, Dileep George

Keywords Abstract Paper

generative model, graph learning

Gradient descent in RKHS with importance labeling

Tomoya Murata, Taiji Suzuki

Keywords Abstract Paper

Exponential convergence rates of classification errors on learning with SGD and random features

Shingo Yashima, Atsushi Nitanda, Taiji Suzuki

Keywords Abstract Paper

Shapeshifter: a Parameter-efficient Transformer using Factorized Reshaped Matrices

Aliakbar Panahi, Seyran Saeedi, Tom Arodz

Keywords Abstract Paper

transformers

FedDR – Randomized Douglas-Rachford Splitting Algorithms for Nonconvex Federated Composite Optimization

Quoc Tran Dinh, Nhan H Pham, Dzung Phan, Lam Nguyen

Keywords Abstract Paper

optimization, federated learning

Efficient First-Order Contextual Bandits: Prediction, Allocation, and Triangular Discrimination

Dylan J Foster, Akshay Krishnamurthy

Keywords Abstract Paper

theory, reinforcement learning and planning, bandits, online learning

Estimating decision tree learnability with polylogarithmic sample complexity

Guy Blanc, Neha Gupta, Jane Lange, Li-Yang Tan

Keywords Abstract Paper

Learned Robust PCA: A Scalable Deep Unfolding Approach for High-Dimensional Outlier Detection

HanQin Cai, Jialin Liu, Wotao Yin

Keywords Abstract Paper

deep learning, machine learning

One-round communication efficient distributed m-estimation

Yajie Bao, Weijia Xiong

Keywords Abstract Paper

SAT-based Decision Tree Learning for Large Data Sets

Andre Schidler, Stefan Szeider

Keywords Abstract Paper

Robust Unsupervised Learning via L-statistic Minimization

Andreas Maurer, Daniela Angela Parletta, Andrea Paudice, Massimiliano Pontil

Keywords Abstract Paper

Theory, Statistical Learning Theory

Memory and Computation-Efficient Kernel SVM via Binary Embedding and Ternary Model Coefficients

Zijian Lei, Liang Lan

Keywords Abstract Paper

Robust Density Estimation from Batches: The Best Things in Life are (Nearly) Free

Ayush Jain, Alon Orlitsky

Keywords Abstract Paper

Theory, Statistical Learning Theory

A Novel Sequential Coreset Method for Gradient Descent Algorithms

Jiawei Huang, Ruomin Huang, wenjie liu and Nikolaos Freris, Hu Ding

Keywords Abstract Paper

Optimization

Fast Multi-label Learning

Xiuwen Gong, Dong Yuan, Wei Bao

Keywords Abstract Paper

Machine Learning, Multi-instance; Multi-label; Multi-view learning

Squeezing Correlated Neurons for Resource-Efficient Deep Neural Networks

Elbruz Ozen, Alex Orailoglu

Keywords Abstract Paper

deep learning, information redundancy, pruning

Deciphering and Optimizing Multi-Task Learning: a Random Matrix Approach

Malik Tiomoko, Hafiz Tiomoko Ali, Romain Couillet

Keywords Abstract Paper

Transfer Learning, Random Matrix Theory, Multi Task Learning

Learning Compositional Sparse Gaussian Processes with a Shrinkage Prior

Anh Tong, Toan M Tran, Hung Bui, Jaesik Choi

Keywords Abstract Paper

Structured Prediction with Partial Labelling through the Infimum Loss

Vivien Cabannnes, Francis Bach, Alessandro Rudi

Keywords Abstract Paper

Keywords Paper

Biswajit Paria, Chih-Kuan Yeh, Ian E.H. Yen and
Ning Xu, Pradeep Ravikumar, Barnabás Póczos

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Jiawei Huang, Ruomin Huang, wenjie liu and
Nikolaos Freris, Hu Ding

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Beidi Chen, Tri Dao, Eric Winsor and
Zhao Song, Atri Rudra, Christopher Ré

Keywords Paper

Yue Sun, Adhyyan Narang, Ibrahim Gulluk and
Samet Oymak, Maryam Fazel

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper