Francisco J. Ferrer Troyano
Jesús S. Aguilar, José C. Riquelme University of Sevilla
FACIL:
Incremental Rule Learning from Data Streams
Contents
Motivation
FACIL
The hybrid knowledge model
Three user parameters
IIL updating
The moderate growth distance
Implicit and explicit forgetting heuristics
The multi-strategy classification method
Experiments
Future work
Motivation
Why Classify? Why Rules?
Heterogeneous Data Sources ÆNoise, Missing Values, Inconsistency High-Speed DataÆFast algorithms, approximate answers Open-Ended Data ÆOn-line learning… complex models?
FACIL – The hybrid knowledge model
Pres.
0 122
mm / Hg
25 35 75 90
Ritmo
50 210
puls/min
60 80 130 150
e1 a1 e2
a2 a3
e3 Frontera de Decisión
Primer enemigo recibido Amigo obtenido para el 2º enemigo
60 - 150
25-90
Ejemplos Enemigos Ejemplos Amigos
Sop-Pos:96 % Sop.Rel.=20% Prec.=98.5%
Edad 12 - 20
56 Intervalos
Ritmo
50 210
puls/min
e3
a2 a1
Pres.
0 122
mm / Hg
e1 e1
a3
a1 - a3 e1 - e3 21 - 29 30 - 38 39 - 47 48 - 56
e1 a1 e3 a3 a2 e2 12
Border examples inside rules
FACIL – The Algorithm
Three user parameters
Maximum number of rules / label
Æ necessary to bound the computational complexity
Minimum accuracy (purity) / rule: p/(p+n) Æ anytime,approximate answers
Maximum growth / dimension
FACIL – IIL updating
IIL: Every example Æ 1/3 cases:
1. Success 2. Anomaly 3. Novelty
General IL Approach
Successive Episodes ÆUpdating: Mt= f(Vt,Mt-1) Window of examples ÆVt: New [and] past examples
Éxito
0.7 0.6 0.1
0.4 0.5 0.7
A
A
1.0 0.2
a A
e a
FACIL – IIL / Success
0.7 0.6 0.1
0.4 0.5 0.7
1.0 0.2
X
a1 e1
a3
a2
e3
e2
a4
FACIL – IIL / Anomaly
0.7 0.6 0.1
0.4 0.5 0.7
1.0 0.2
Anomaly
FACIL – Learning Approach / Case 3
Selection:
The rule implying the minimum growth FACIL – IIL / Novelty: The GrowthDistance
0.7 0.9
0.1 0.1 0.5 0.9
1.0
R
X
1.0
d(R,x) = 0.2+0.4
B C
A 0.1 0.4 0.6
D
X
1.0
R R
d(R,x) = 0.25+0.2
0.7 0.6 0.1
0.4 0.5 0.7
1.0 0.2
0.7 0.6 0.1
0.4 0.5 0.7
R
R
1.0 0.2
Max. Dif. = 15%
G(R
a,x)= 0.2 G(R
b,x)= 0.2
XFACIL – IIL / Novelty: The Moderate Growth Distance
FACIL – Implicit and Explicit Forgetting Heuristics
reglas en t=16
ejemplos t=35
B
A
A B
reglas en t=34
B
A
e16 B e16, e29
e19, e28
e34
último ejemplo
e28 e19
e16 e16
e34
t=[17,33]
e29
e16 e34
multiple windows
A
A
A B B
A
FACIL – Multi-strategy Classification
?
A? ?
AB B
B
Experiments – General Purpose Classifier Evaluation
FACIL as a UCI Repository
10 Folds Cross Validation, 10 times, t-student (0.05)
Implicit Forgetting Heuristic
Synthetic Data Streams
Change Magnitude = ±0.1 for 40% attributes every 104examples INPUT: 105examples - 5% class noise: 900 training / 100 testing Both Implicit and Explicit Forgetting Heuristics
UCI Repository – Accuracy
50 60 70 80 90 100
Balance Breast C, Glass Heart S, Ionosphere Iris Diabetes Sonar Vehicle Vowel Waveform Wine C4.5Rules FACIL
Numerical Attributes
Num. & Symbolic Atts.
40 50 60 70 80 90 100
Anneal Autos
Clevelan d
Credit Rating German Credit
Hep atitis
Horse Colic Zoo
Audoilogy Breast C
ancer Mushroom
Primary Tumor
Soybean Vote C4.5Rules FACIL
UCI Repository – Accuracy
0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90
Balance Breast C, Glass Heart S, Ionosphere Iris Diabetes Sonar Vehicle Vowel Waveform Wine C4.5Rules FACIL
Numerical Attributes
UCI Repository – Number of rules
0 10 20 30 40 50 60 70 80
Anneal Autos Cleveland
Credit Rating German Credit
Hepat itis
Horse Colic Zoo Audoi
logy
Breast Ca ncer
Mushroom Primary Tum
or
Soybean Vote
C4.5Rules FACIL Num. & Simb. Atts.
UCI Repository – Number of rules
0,05 0,1 0,15 0,2 0,25 0,3 0,35 0,4 0,45
10 20 30 40 50 60 70 80 90 100
Valores límite para los tres parámetros
En función del máximo número de reglas En función del máximo crecimiento permitido En función del umbral de mínima pureza
Time (seconds)
UCI Repository – Parametric Sensitivity
81 81,5 82 82,5 83 83,5 84 84,5 85 85,5 86 86,5 87
10 20 30 40 50 60 70 80 90 100
Valores límite para los tres parámetros
En función del máximo número de reglas En función del máximo crecimiento permitido En función del umbral de mínima pureza
Accuracy (%)
UCI Repository – Parametric Sensitivity
24 29 34 39 44 49 54 59 64 69 74
10 20 30 40 50 60 70 80 90 100
Valores límite para los tres parámetros
En función del máximo número de reglas En función del máximo crecimiento permitido En función del umbral de mínima pureza
Number of rules
UCI Repository – Parametric Sensitivity
Máx. Growth = 50% Máx. Growth = 75%
Nº Attributes Accuracy Time Nº Rules ≤25 Accuracy Time Nº Rules ≤25
10 96,36 14 10 98.32 20 7
20 93,25 63 9 94.57 35 12
30 91,04 124 3 89.26 53 10
40 89,7 221 2 89.19 67 7
50 83,88 277 4 87.72 77 10
Moving Hyperplane – Accuracy, time, and number of rules
Î>180 examples / second
Î>3500 examples / second Î>2500 examples / second
Î>650 examples / second
0 50 100 150 200 250 300 350 400 450 500 550 600
10 20 30 40 50
Max Crec. = 50 %, Max Nº Reglas = 50 Max Crec. = 75 %, Max Nº Reglas = 100
Time
Moving Hyperplane – Sensitivity to the number of attributes
73 78 83 88 93 98
1000 10000 20000 30000 40000 50000
Max Crec. = 75 %, Max Nº Reglas = 25 Max Crec. = 100 %, Max Nº Reglas = 10
Accuracy
Moving Hyperplane – Sensitivity to the number of examples
Sensitivity to attribute
Removing / Recovering attributes To adapt the Growth distance:
Changing attribute weights Æimprove the selection of rules
Reordering attributes according influence Æspeed-up the updating process
Processing streams with a variable number of attributes
To evaluate alternative distances between a rule and an example
PreProcessing streams? ÆMultiple Classification Future Work
Francisco Ferrer
LSI - University of Seville [email protected]