• Aucun résultat trouvé

Pruning randomized trees with L1-norm regularization

N/A
N/A
Protected

Academic year: 2021

Partager "Pruning randomized trees with L1-norm regularization"

Copied!
1
0
0

Texte intégral

(1)

Method: pruning with L1-regularization

Conclusion and perspectives

Results

Supervised learning is often applied on high-dimensional data:

3D Image segmentation (Kinect) Stability prediction in electrical network Genome studies

...

A robust supervised learning method: ensemble of randomized

decision trees (eg., extremely randomized trees , Geurts et al., 2006). Each decision tree is a hierarchical set of questions which leads

to a prediction.

High dimensional data often requires

huge ensembles of randomized trees

to achieve good performance, i.e.

- a high number of trees,

- each tree is fully developed (no pruning)

.

Huge amount of storage

Variable selection is performed by applying L1-norm regularization (LASSO) on the new training samples :

Regularization preserves

accuracy or even can

improve it.

Drastic pruning.

Our experiments show that it is possible to drastically prune ensembles of randomized trees while preserving or even

improving their accuracy.

Interested? Email me at [email protected] or go on www.ajoly.org

Original forest has 30000

test nodes

Robust with respect to the number of trees and the pre-pruning parameter

Each tree node is associated with a prediction weight and a node characteristic function equals to 1 if the sample reaches the node, 0 otherwise.

1

2 3

4 5

Example for one tree and one sample:

The solution leads to prune and to reweight the

randomized tree ensemble. A test node can be deleted if the weights of all its descendants are equal to zero.

0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.91 1.1 0 0.5 1 1.5 2 2.5 3 3.5 Quadratic error t

Friedman1 problem M=100 nmin=1 K=10 Regularised ET ET 0 100 200 300 400 500 600 700 800 900 0 0.5 1 1.5 2 2.5 3 3.5 Number of test no des t

Friedman1 problem M=100 nmin=1 K=10 Regularised ET 10 100 1000 10000 100000 1 10 100 1000

Friedman1 problem nmin=10 K=10

Regularised ET ET 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 1 10 100 1000

Friedman1 problem nmin=10 K=10

Regularised ET ET 0.150.2 0.250.3 0.350.4 0.450.5 0.550.6 0.65 1 10 100 1000 Friedman1 problem M=100 K=10 Regularized ET ET E rr o r Nu mber o f t est n odes

Forest size Pre-pruning effect

Pruning randomized trees

with L1-norm regularization

Arnaud Joly, François Schnitzler, Pierre Geurts, Louis Wehenkel

Systems and modeling, Department of EE and CS, University of Liège

Motivation

Goal of Supervised learning: from a dataset of input-output pairs to learn a function to predict the output for any (new) input .

The function thus embeds the original input space into a new space with the total number of tree nodes.

Future research:

- how to avoid the generation of a huge randomized tree ensemble prior to regularization?

- Study the links with compressed sensing

10 100 1000 10000 100000 1 10 100 1000 Friedman1 problem M=100 K=10 Regularized ET ET

Références

Documents relatifs

[r]

[r]

 Les éléments du namespace xsl sont des instructions qui copient des données du document source dans le résultat.  Les autres éléments sont inclus tels quels dans

For instance, the nodes in a social network can be labeled by the individual’s memberships to a given social group ; the node of protein interaction networks can be labeled with

We also introduce an approximation algorithm for the problem variant with unit connection costs and unit attack costs, and a specific integer linear model for the case where all

Figure 2 (a) in Section 4 provides an example il- lustrating the correlations between the degree de-coupled PageR- ank (D2PR) scores and external evidence for different values of p

If other nodes (or any Linked Data application) browses the DSSN the activity streams, data & media artefacts and any RDF resource can be retrieved via standard

Our method is independent of the classification method (Support Vector Machine, K-Nearest Neighbours, Neural Network, etc.), however, in any case we need to transform each