Sparse Learning with CART

Abstract: Decision trees with binary splits are commonly constructed using the Classification and Regression Trees (CART) methodology. For regression models, this approach recursively divides the data into two near-homogeneous daughter nodes according to a split point that maximizes the reduction in sum-of-squares error (the impurity) along a particular variable. This paper studies the statistical properties of regression trees constructed with CART. In doing so, we find that the training error is governed by the Pearson correlation between the optimal decision stump and the response data in each node, which we bound by constructing a prior distribution on the split points and solving a nonlinear optimization problem. We leverage this connection between the training error and the Pearson correlation to show that CART with cost-complexity pruning achieves an optimal complexity/goodness-of-fit tradeoff when the depth scales with the logarithm of the sample size. Data-dependent quantities, which adapt to the dimensionality and latent structure of the regression model, are seen to govern the rates of convergence of the prediction error.
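The splitting rule described in the abstract can be illustrated with a minimal sketch. It is not the paper's code: the function names (`best_split`, `stump_predictions`, `pearson`) and the toy data are assumptions made for illustration. The sketch finds the split point on a single variable that maximizes the reduction in sum-of-squares error, then checks a standard least-squares identity: the reduction equals the squared Pearson correlation between the fitted decision stump and the response, multiplied by the total sum of squares in the node.

```python
from math import sqrt


def best_split(x, y):
    """Return (split_point, sse_reduction) for the best single split on x."""
    n = len(y)
    order = sorted(range(n), key=lambda i: x[i])
    xs = [x[i] for i in order]
    ys = [y[i] for i in order]
    total = sum(ys)
    mean = total / n
    best_point, best_gain = None, 0.0
    left_sum = 0.0
    for k in range(1, n):
        left_sum += ys[k - 1]
        if xs[k] == xs[k - 1]:
            continue  # no split point exists between equal values
        right_sum = total - left_sum
        # Reduction in SSE when the node mean is replaced by two daughter means.
        gain = (k * (left_sum / k - mean) ** 2
                + (n - k) * (right_sum / (n - k) - mean) ** 2)
        if gain > best_gain:
            best_point, best_gain = (xs[k - 1] + xs[k]) / 2, gain
    return best_point, best_gain


def stump_predictions(x, y, split_point):
    """Predict each response by the mean of its daughter node."""
    left = [yi for xi, yi in zip(x, y) if xi <= split_point]
    right = [yi for xi, yi in zip(x, y) if xi > split_point]
    mean_l, mean_r = sum(left) / len(left), sum(right) / len(right)
    return [mean_l if xi <= split_point else mean_r for xi in x]


def pearson(a, b):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / sqrt(va * vb)


# Toy data (hypothetical): the response jumps between x = 0.5 and x = 0.7.
x = [0.1, 0.4, 0.5, 0.7, 0.9, 1.2]
y = [1.0, 1.2, 0.9, 3.1, 2.9, 3.3]

split, gain = best_split(x, y)
pred = stump_predictions(x, y, split)
rho = pearson(y, pred)
total_ss = sum((yi - sum(y) / len(y)) ** 2 for yi in y)

print("split point:", split)
print("SSE reduction:", gain)
print("rho^2 * total SS:", rho ** 2 * total_ss)
```

In this sketch the last two printed values agree up to floating-point error, which is the single-node version of the link between training error and Pearson correlation that the abstract refers to. A full CART procedure would repeat the search over every variable and recurse into each daughter node.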

06/12/2020

Jason Klusowski

Similar Papers

Learning by Minimizing the Sum of Ranked Range

Shu Hu, Yiming Ying, Xin Wang, Siwei Lyu

Algorithms -> Sparsity and Compressed Sensing, Theory -> Frequentist Statistics

Decision Trees for Decision-Making under the Predict-then-Optimize Framework

Adam Elmachtoub, Jason Cheuk Nam Liang, Ryan McNellis

Supervised Learning

A Wasserstein Minimax Framework for Mixed Linear Regression

Theo Diamandis, Yonina Eldar, Alireza Fallah, Farzan Farnia, Asuman Ozdaglar

Algorithms, Multimodal Learning

Optimal Decision Trees for Nonlinear Metrics

Emir Demirović, Peter J. Stuckey

Localization, Convexity, and Star Aggregation

Suhas Vijaykumar

theory, online learning

Multi-level Gaussian Graphical Models Conditional on Covariates

Gi Bum Kim, Seyoung Kim

Distributionally Robust Federated Averaging

Yuyang Deng, Mohammad Mahdi Kamani, Mehrdad Mahdavi

Novel Upper Bounds for the Constrained Most Probable Explanation Task

Tahrima Rahman, Sara Rouhani, Vibhav Gogate

optimization, adversarial robustness and security, graph learning, interpretability

Fast and Accurate Ranking Regression

Ilkay Yildiz, Jennifer Dy, Deniz Erdogmus, Jayashree Kalpathy-Cramer, Susan Ostmo, J. Peter Campbell, Michael F. Chiang, Stratis Ioannidis

Convex Polytope Trees and its Application to VAE

Mohammadreza Armandpour, Ali Sadeghian, Mingyuan Zhou

machine learning

Information-Theoretic Generalization Bounds for Stochastic Gradient Descent

Gergely Neu, Gintare Karolina Dziugaite, Mahdi Haghifam, Daniel M. Roy

Combining Preference Elicitation with Local Search and Greedy Search for Matroid Optimization

Nawal Benabbou, Cassandre Leroy, Thibaut Lust, Patrice Perny

On Multilevel Monte Carlo Unbiased Gradient Estimation for Deep Latent Variable Models

Yuyang Shi, Rob Cornish

Expert Learning through Generalized Inverse Multiobjective Optimization: Models, Insights and Algorithms

Chaosheng Dong, Bo Zeng

Learning Theory

Meta-learning with Stochastic Linear Bandits

Leonardo Cella, Alessandro Lazaric, Massimiliano Pontil

Transfer, Multitask and Meta-learning

Asynchronous Decentralized Optimization With Implicit Stochastic Variance Reduction

Kenta Niwa, Guoqiang Zhang, W. Bastiaan Kleijn, Noboru Harada, Hiroshi Sawada, Akinori Fujino

Optimization, Distributed and Parallel Optimization

Zeroth-Order Non-Convex Learning via Hierarchical Dual Averaging

Amélie Héliou, Matthieu Martin, Panayotis Mertikopoulos, Thibaud J Rahier

Optimization, Non-Convex Optimization

The Last-Iterate Convergence Rate of Optimistic Mirror Descent in Stochastic Variational Inequalities

Waïss Azizian, Franck Iutzeler, Jérôme Malick, Panayotis Mertikopoulos

Greedy-GQ with Variance Reduction: Finite-time Analysis and Improved Complexity

Shaocong Ma, Ziyi Chen, Yi Zhou, Shaofeng Zou

Machine Learning, Reinforcement Learning, Optimization

Model Selection for Bayesian Autoencoders

Ba-Hien Tran, Simone Rossi, Dimitrios Milios, Pietro Michiardi, Edwin Bonilla, Maurizio Filippone

optimization, self-supervised learning, generative model, representation learning

Learning Binary Decision Trees by Argmin Differentiation

Valentina Zantedeschi, Matt J. Kusner, Vlad Niculae

Deep Learning
