Method: pruning with L1-regularization
Conclusion and perspectives
Results
Supervised learning is often applied on high-dimensional data:
3D Image segmentation (Kinect) Stability prediction in electrical network Genome studies
...
A robust supervised learning method: ensemble of randomized
decision trees (eg., extremely randomized trees , Geurts et al., 2006). Each decision tree is a hierarchical set of questions which leads
to a prediction.
High dimensional data often requires
huge ensembles of randomized trees
to achieve good performance, i.e.
- a high number of trees,
- each tree is fully developed (no pruning)
.
Huge amount of storage
Variable selection is performed by applying L1-norm regularization (LASSO) on the new training samples :
Regularization preserves
accuracy or even can
improve it.
Drastic pruning.
Our experiments show that it is possible to drastically prune ensembles of randomized trees while preserving or even
improving their accuracy.
Interested? Email me at [email protected] or go on www.ajoly.org
Original forest has 30000
test nodes
Robust with respect to the number of trees and the pre-pruning parameter
Each tree node is associated with a prediction weight and a node characteristic function equals to 1 if the sample reaches the node, 0 otherwise.
1
2 3
4 5
Example for one tree and one sample:
The solution leads to prune and to reweight the
randomized tree ensemble. A test node can be deleted if the weights of all its descendants are equal to zero.
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.91 1.1 0 0.5 1 1.5 2 2.5 3 3.5 Quadratic error t
Friedman1 problem M=100 nmin=1 K=10 Regularised ET ET 0 100 200 300 400 500 600 700 800 900 0 0.5 1 1.5 2 2.5 3 3.5 Number of test no des t
Friedman1 problem M=100 nmin=1 K=10 Regularised ET 10 100 1000 10000 100000 1 10 100 1000
Friedman1 problem nmin=10 K=10
Regularised ET ET 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 1 10 100 1000
Friedman1 problem nmin=10 K=10
Regularised ET ET 0.150.2 0.250.3 0.350.4 0.450.5 0.550.6 0.65 1 10 100 1000 Friedman1 problem M=100 K=10 Regularized ET ET E rr o r Nu mber o f t est n odes
Forest size Pre-pruning effect
Pruning randomized trees
with L1-norm regularization
Arnaud Joly, François Schnitzler, Pierre Geurts, Louis Wehenkel
Systems and modeling, Department of EE and CS, University of Liège
Motivation
Goal of Supervised learning: from a dataset of input-output pairs to learn a function to predict the output for any (new) input .
The function thus embeds the original input space into a new space with the total number of tree nodes.
Future research:
- how to avoid the generation of a huge randomized tree ensemble prior to regularization?
- Study the links with compressed sensing
10 100 1000 10000 100000 1 10 100 1000 Friedman1 problem M=100 K=10 Regularized ET ET