12/07/2020

Feature Selection using Stochastic Gates

Yutaro Yamada, Ofir Lindenbaum, Sahand Negahban, Yuval Kluger

Keywords: Supervised Learning

Abstract: Feature selection problems have been extensively studied in the setting of linear estimation, for instance with the LASSO, but less emphasis has been placed on feature selection for neural networks. In this study, we propose a method for feature selection in non-linear function estimation problems. The new procedure is based on directly penalizing the $\ell_0$ norm of the features, that is, the number of selected features. Our $\ell_0$-based regularization relies on a continuous relaxation of the Bernoulli distribution, which allows our model to learn the parameters of the approximate Bernoulli distributions via gradient descent. The proposed framework simultaneously learns a non-linear regression or classification function while selecting a small subset of features. We provide an information-theoretic justification for incorporating the Bernoulli distribution into feature selection. Furthermore, we evaluate our method on synthetic and real-life data and demonstrate that our approach outperforms other commonly used methods in terms of both predictive performance and feature selection.
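To make the idea of a relaxed Bernoulli gate concrete, here is a minimal sketch in plain Python. The abstract does not spell out the relaxation, so the specific form used here is an assumption: each gate is a clipped, mean-shifted Gaussian, `clip(mu_d + eps_d, 0, 1)` with `eps_d ~ N(0, sigma^2)`, and the $\ell_0$ count is replaced by the sum of probabilities that each gate is non-zero, $\sum_d \Phi(\mu_d / \sigma)$. The names `sample_gates`, `expected_active_gates`, and `gated_input`, as well as the values of `mu` and `sigma`, are hypothetical and chosen only for illustration.

```python
import math
import random


def gaussian_cdf(x):
    """Standard normal CDF, Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def sample_gates(mu, sigma, rng):
    """Draw one relaxed gate per feature.

    Assumed relaxation: clip(mu_d + eps_d, 0, 1) with eps_d ~ N(0, sigma^2).
    """
    return [min(1.0, max(0.0, m + rng.gauss(0.0, sigma))) for m in mu]


def expected_active_gates(mu, sigma):
    """Smooth surrogate for the l0 count.

    A clipped gate is non-zero exactly when mu_d + eps_d > 0, which happens
    with probability Phi(mu_d / sigma); summing over features gives the
    expected number of active gates.
    """
    return sum(gaussian_cdf(m / sigma) for m in mu)


def gated_input(x, gates):
    """Multiply each feature by its gate before it enters the model."""
    return [xi * zi for xi, zi in zip(x, gates)]


# Hypothetical gate parameters: one clearly open, one undecided, one closed.
rng = random.Random(0)
mu = [2.0, 0.5, -2.0]
sigma = 0.5

gates = sample_gates(mu, sigma, rng)
penalty = expected_active_gates(mu, sigma)
masked = gated_input([1.0, 1.0, 1.0], gates)
```

In a training loop, a term such as `lam * penalty` (with `lam` a hypothetical regularization weight) would be added to the prediction loss. Because the surrogate is differentiable in `mu`, the gate parameters can be updated by gradient descent together with the network weights.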

The talk and the accompanying paper were published at the ICML 2020 virtual conference.
