SAT-based Decision Tree Learning for Large Data Sets

Abstract: Decision trees of low depth are beneficial for understanding and interpreting the data they represent. Unfortunately, finding a decision tree of lowest depth that correctly represents given data is NP-hard. Hence known algorithms either (i) utilize heuristics that do not optimize the depth or (ii) are exact but scale only to small or medium-sized instances. We propose a new hybrid approach to decision tree learning, combining heuristic and exact methods in a novel way. More specifically, we repeatedly apply SAT encodings to local parts of a decision tree provided by a standard heuristic, leading to a global depth improvement. This allows us to scale the power of exact SAT-based methods to almost arbitrarily large data sets. We evaluate our new approach experimentally on a range of real-world instances that contain up to several thousand samples. In almost all cases, our method successfully decreases the depth of the initial decision tree; often, the decrease is significant.
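
The hybrid idea in the abstract can be illustrated with a small, hedged sketch. It is not the authors' implementation. An exhaustive search (`exact_tree`) stands in for the SAT call, the heuristic is deliberately naive, and the local part is simplified to a whole subtree reached by at most `budget` samples. All names (`heuristic_tree`, `exact_tree`, `improve`, `budget`) are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass
from itertools import product
from typing import Optional, Union

# A sample is (binary feature vector, class label).
Sample = tuple[tuple[int, ...], int]

@dataclass
class Leaf:
    label: int

@dataclass
class Node:
    feature: int
    low: "Tree"   # subtree for feature value 0
    high: "Tree"  # subtree for feature value 1

Tree = Union[Leaf, Node]

def depth(t: Tree) -> int:
    return 0 if isinstance(t, Leaf) else 1 + max(depth(t.low), depth(t.high))

def classify(t: Tree, x: tuple[int, ...]) -> int:
    while isinstance(t, Node):
        t = t.high if x[t.feature] else t.low
    return t.label

def split(samples: list[Sample], f: int) -> tuple[list[Sample], list[Sample]]:
    low = [s for s in samples if s[0][f] == 0]
    high = [s for s in samples if s[0][f] == 1]
    return low, high

def heuristic_tree(samples: list[Sample]) -> Tree:
    """Naive heuristic (assumption): split on the first feature that separates samples."""
    labels = {y for _, y in samples}
    if len(labels) <= 1:
        return Leaf(next(iter(labels), 0))
    for f in range(len(samples[0][0])):
        low, high = split(samples, f)
        if low and high:
            return Node(f, heuristic_tree(low), heuristic_tree(high))
    raise ValueError("samples with equal features but different labels")

def exact_tree(samples: list[Sample], max_depth: int) -> Optional[Tree]:
    """Exhaustive search standing in for a SAT call: return a tree of depth
    at most max_depth that classifies all samples correctly, or None."""
    labels = {y for _, y in samples}
    if len(labels) <= 1:
        return Leaf(next(iter(labels), 0))
    if max_depth == 0:
        return None
    for f in range(len(samples[0][0])):
        low, high = split(samples, f)
        if not low or not high:
            continue  # a test that sends every sample one way never helps
        low_tree = exact_tree(low, max_depth - 1)
        if low_tree is None:
            continue
        high_tree = exact_tree(high, max_depth - 1)
        if high_tree is not None:
            return Node(f, low_tree, high_tree)
    return None

def improve(t: Tree, samples: list[Sample], budget: int) -> Tree:
    """Replace subtrees reached by at most `budget` samples with a shallower
    exact tree when one exists; otherwise keep the heuristic structure."""
    if isinstance(t, Leaf):
        return t
    if len(samples) <= budget:
        for d in range(depth(t)):  # only strictly shallower replacements
            better = exact_tree(samples, d)
            if better is not None:
                return better
        return t
    low, high = split(samples, t.feature)
    return Node(t.feature, improve(t.low, low, budget), improve(t.high, high, budget))

# Toy data: label is XOR of features 2 and 3; features 0 and 1 are irrelevant.
samples = [(x, x[2] ^ x[3]) for x in product((0, 1), repeat=4)]
initial = heuristic_tree(samples)
local = improve(initial, samples, budget=8)
full = improve(initial, samples, budget=16)
print(depth(initial), depth(local), depth(full))
```

On this toy data the naive heuristic yields depth 4, local improvement with a budget of 8 samples yields depth 3, and a budget covering all 16 samples yields depth 2. The budget parameter mimics the trade-off the abstract points to: the exact method is only ever run on parts small enough to solve, yet the overall tree still gets shallower.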

06/12/2020

Andre Schidler, Stefan Szeider

Similar Papers

Estimating decision tree learnability with polylogarithmic sample complexity

Guy Blanc, Neha Gupta, Jane Lange, Li-Yang Tan

Low-Variance and Zero-Variance Baselines for Extensive-Form Games

Trevor Davis, Martin Schmid, Michael Bowling

Keywords: Planning, Control, and Multiagent Learning

Naive Feature Selection: Sparsity in Naive Bayes

Armin Askari, Alexandre d'Aspremont, Laurent El Ghaoui

Statistically and Computationally Efficient Linear Meta-representation Learning

Kiran Thekumparampil, Prateek Jain, Praneeth Netrapalli, Sewoong Oh

Keywords: optimization, meta learning, representation learning, few shot learning

Deciphering and Optimizing Multi-Task Learning: a Random Matrix Approach

Malik Tiomoko, Hafiz Tiomoko Ali, Romain Couillet

Keywords: Transfer Learning, Random Matrix Theory, Multi Task Learning

Adaptive Reduced Rank Regression

Qiong Wu, Felix MF Wong, Yanhua Li, Zhenming Liu, Varun Kanade

FedDR – Randomized Douglas-Rachford Splitting Algorithms for Nonconvex Federated Composite Optimization

Quoc Tran Dinh, Nhan H Pham, Dzung Phan, Lam Nguyen

Keywords: optimization, federated learning

Exponential convergence rates of classification errors on learning with SGD and random features

Shingo Yashima, Atsushi Nitanda, Taiji Suzuki

Optimization Methods for Interpretable Differentiable Decision Trees Applied to Reinforcement Learning

Andrew Silva, Matthew Gombolay, Taylor Killian, Ivan Jimenez, Sung-Hyun Son

Minimizing FLOPs to Learn Efficient Sparse Representations

Biswajit Paria, Chih-Kuan Yeh, Ian E.H. Yen, Ning Xu, Pradeep Ravikumar, Barnabás Póczos

Keywords: sparse embeddings, deep representations, metric learning, regularization

Nonparametric variable screening with optimal decision stumps

Jason Klusowski, Peter Tian

NoisyCUR: An algorithm for two-cost budgeted matrix completion

Dong Hu, Alex Gittens, Malik Magdon-Ismail

Keywords: matrix completion, low-rank approximation, nuclear norm minimization

Learning Compositional Sparse Gaussian Processes with a Shrinkage Prior

Anh Tong, Toan M Tran, Hung Bui, Jaesik Choi

Optimal Decision Trees for Nonlinear Metrics

Emir Demirović, Peter J. Stuckey

Interpretable random forests via rule extraction

Clément Bénard, Gérard Biau, Sébastien Veiga, Erwan Scornet

Realistic evaluation of transductive few-shot learning

Olivier Veilleux, Malik Boudiaf, Pablo Piantanida, Ismail Ben Ayed

Keywords: optimization, machine learning, few shot learning

Upper bounds for Model-Free Row-Sparse Principal Component Analysis

Guanyi Wang, Santanu Dey

Keywords: Optimization - General

Overcoming the curse of dimensionality with Laplacian regularization in semi-supervised learning

Vivien Cabannes, Loucas Pillaud-Vivien, Francis Bach, Alessandro Rudi

Keywords: machine learning, kernel methods, semi-supervised learning

Locally-Adaptive Nonparametric Online Learning

Ilja Kuzborskij, Nicolò Cesa-Bianchi

Keywords: Algorithms -> Kernel Methods, Algorithms -> Metric Learning

Linearly Convergent Frank-Wolfe without Line-Search

Fabian Pedregosa, Geoffrey Negiar, Armin Askari, Martin Jaggi

One-round communication efficient distributed m-estimation

Yajie Bao, Weijia Xiong

Learning Online Algorithms with Distributional Advice

Ilias Diakonikolas, Vasilis Kontonis, Christos Tzamos, Ali Vakilian, Nikos Zarifis
