09/07/2020

Estimation and Inference with Trees and Forests in High Dimensions

Vasilis Syrgkanis, Emmanouil Zampetakis

Keywords: High-dimensional statistics, Excess risk bounds and generalization error bounds, Regression

Abstract: We analyze the finite sample mean squared error (MSE) performance of regression trees and forests in the high dimensional regime with binary features, under a sparsity constraint. We prove that if only $r$ of the $d$ features are relevant for the mean outcome function, then shallow trees built greedily via the CART empirical MSE criterion achieve MSE rates that depend only logarithmically on the ambient dimension $d$. We prove upper bounds whose exact dependence on the number of relevant variables $r$ is determined by the correlation among the features and by the degree of relevance. For strongly relevant features, we also show that fully grown honest forests achieve fast MSE rates and that their predictions are asymptotically normal, enabling asymptotically valid inference that adapts to the sparsity of the regression function.
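To make the setting concrete, below is a minimal illustrative sketch (not the authors' implementation) of greedily growing a shallow tree on binary features with the CART empirical MSE criterion; the function names, depth choice, and toy data with $r = 2$ relevant features out of $d = 20$ are our own assumptions.

```python
import numpy as np

def node_mse(y):
    """Empirical MSE of predicting the within-node mean of y."""
    return float(np.var(y)) if len(y) > 0 else 0.0

def best_split(X, y):
    """Pick the binary feature whose split most reduces empirical MSE (CART criterion)."""
    n, d = X.shape
    parent = node_mse(y)
    best_j, best_gain = None, 0.0
    for j in range(d):
        left, right = y[X[:, j] == 0], y[X[:, j] == 1]
        if len(left) == 0 or len(right) == 0:
            continue  # split does not separate the data
        child = (len(left) * node_mse(left) + len(right) * node_mse(right)) / n
        if parent - child > best_gain:
            best_j, best_gain = j, parent - child
    return best_j

def build_tree(X, y, depth):
    """Greedily grow a depth-limited (shallow) tree; leaves predict the local mean outcome."""
    j = best_split(X, y) if depth > 0 else None
    if j is None:
        return ("leaf", float(y.mean()))
    mask = X[:, j] == 1
    return ("node", j,
            build_tree(X[~mask], y[~mask], depth - 1),
            build_tree(X[mask], y[mask], depth - 1))

def predict(tree, x):
    """Route a single binary feature vector x to its leaf prediction."""
    if tree[0] == "leaf":
        return tree[1]
    _, j, left, right = tree
    return predict(right, x) if x[j] == 1 else predict(left, x)

# Toy data (hypothetical): only r = 2 of d = 20 binary features affect the outcome.
rng = np.random.default_rng(0)
d, n = 20, 2000
X = rng.integers(0, 2, size=(n, d))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 0.1 * rng.standard_normal(n)
tree = build_tree(X, y, depth=3)  # shallow tree: depth grows with r, not with d
print(predict(tree, X[0]), y[0])
```

In this sparse setup the greedy MSE criterion tends to select the two relevant coordinates in the first splits, which is the intuition behind rates that scale with $r$ and only logarithmically with $d$.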

Published at COLT 2020 (virtual conference).
