13/04/2021

Sharp analysis of a simple model for random forests

Jason Klusowski

Keywords:

Abstract: Random forests have become an important tool for improving accuracy in regression and classification problems since their inception by Leo Breiman in 2001. In this paper, we revisit a historically important random forest model, called centered random forests, originally proposed by Breiman in 2004 and later studied by Gérard Biau in 2012, in which a feature is selected at random and the split occurs at the midpoint of the node along the chosen feature. If the regression function is $d$-dimensional and Lipschitz, we show that, given access to $n$ observations, the mean-squared prediction error is $O\big((n(\log n)^{(d-1)/2})^{-\frac{1}{d\log 2+1}}\big)$. This positively answers an outstanding question of Biau about whether the rate of convergence for this random forest model could be improved beyond $O\big(n^{-\frac{1}{(4/3)d\log 2+1}}\big)$. Furthermore, by a refined analysis of the approximation and estimation errors for linear models, we show that our new rate cannot be improved in general. Finally, we generalize our analysis and improve current prediction error bounds for another random forest model, called median random forests, in which each tree is constructed from subsampled data and the splits are performed at the empirical median along a chosen feature.
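To make the splitting rule concrete, below is a minimal sketch (not the author's code) of a centered random forest regressor, assuming the features lie in $[0,1]^d$; the function names build_centered_tree and predict_forest and all parameter defaults are illustrative. Each tree recursively picks a feature uniformly at random and splits the current cell at its midpoint along that feature, and the forest prediction averages the leaf means over independently grown trees. The median random forest variant would instead grow each tree on a subsample and split at the empirical median of the node's data along the chosen feature.

import numpy as np

def build_centered_tree(low, high, depth, rng):
    # Recursively partition the axis-aligned cell [low, high].
    # At each node, a feature is chosen uniformly at random and the cell
    # is split at the midpoint of its side along that feature.
    if depth == 0:
        return [(low, high)]
    j = rng.integers(len(low))                      # random split feature
    mid = 0.5 * (low[j] + high[j])                  # midpoint of the node
    left_high, right_low = high.copy(), low.copy()
    left_high[j], right_low[j] = mid, mid
    return (build_centered_tree(low, left_high, depth - 1, rng)
            + build_centered_tree(right_low, high, depth - 1, rng))

def predict_forest(X_train, y_train, X_test, n_trees=50, depth=8, seed=0):
    # Average the leaf means of n_trees independently grown centered trees.
    rng = np.random.default_rng(seed)
    d = X_train.shape[1]
    preds = np.zeros(len(X_test))
    for _ in range(n_trees):
        leaves = build_centered_tree(np.zeros(d), np.ones(d), depth, rng)
        for low, high in leaves:
            # points with a coordinate exactly equal to 1 are ignored here
            in_tr = np.all((X_train >= low) & (X_train < high), axis=1)
            in_te = np.all((X_test >= low) & (X_test < high), axis=1)
            if in_tr.any():                         # empty leaves predict 0
                preds[in_te] += y_train[in_tr].mean()
    return preds / n_trees

The depth parameter governs the bias-variance trade-off; in the paper's analysis it is tuned as a function of n and d, and the default above is arbitrary.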

Talk and paper published at the AISTATS 2021 virtual conference.

