Conclusion

• Tackling a challenging detonation type classification task

• Proposed methodology to properly train and evaluate classifiers

• Application to SVMs and DRBMs

• DRBMs slightly superior overall, and less sensitive to the choice of preprocessing hyperparameters than SVMs.

References

• Bengio, Y. (2009). Learning Deep Architectures for AI. Foundations and Trends in Machine Learning, to appear.

• Cortes, C. and V. Vapnik (1995). Support-Vector Networks. Machine Learning 20(3), 273–297.

• Larochelle, H. and Y. Bengio (2008). Classification using discriminative restricted Boltzmann machines. In A. McCallum and S. Roweis (Eds.), Proceedings of the 25th Annual International Conference on Machine Learning (ICML 2008), pp. 536–543. Omnipress.

Sensitivity to Preprocessing Choice

• Performance can vary substantially depending on various parameters governing data preprocessing, e.g. the type of segmentation and the number and size of signal windows for the Fast Fourier Transform; a sketch of one such preprocessing variant follows.
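As a minimal illustration of these preprocessing hyperparameters, the sketch below segments a recording and stacks (optionally log-scaled) FFT magnitudes; the function name fft_features and the log1p choice are our illustrative assumptions, not the authors' pipeline.

```python
import numpy as np

def fft_features(signal, n_windows=4, use_log=True):
    """Split a recording into n_windows segments and concatenate the
    (optionally log-scaled) FFT magnitude of each segment; n_windows
    and use_log mirror the 'FFT Windows' and 'Log' settings varied in
    the figure below."""
    segments = np.array_split(np.asarray(signal, dtype=float), n_windows)
    spectra = [np.abs(np.fft.rfft(seg)) for seg in segments]
    if use_log:
        spectra = [np.log1p(s) for s in spectra]
    return np.concatenate(spectra)
```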

[Figure: "Effect of Preprocessing on Accuracy (SVMs, Realistic-by-Day)". Density plots of accuracy (x-axis 0.4 to 0.8), one panel per combination of Log ∈ {False, True} and FFT Windows ∈ {1, 2, 4, 6}; segmentation types: ApPeakNoTrunc, ARLTruncated.]

Example of the SVM accuracy distribution in the REALISTIC-BY-DAY setting when varying some preprocessing parameters.

• The accuracy residual measures the amount of variation in accuracy that is due to varying the preprocessing parameters (type of segmentation, number and size of windows for the Fast Fourier Transform, ...) while keeping the model hyperparameters fixed; a minimal computation sketch follows this list.

• We plot the distribution of residuals for two sets of experiments: on the left, in the REALISTIC-BY-DAY setting and for a single type of segmentation (called ARLTruncated); on the right, in the REALISTIC-BY-RANGE setting and for all kinds of segmentations being tried.
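A minimal sketch of how such residuals could be computed, under our reading (not a formula stated on the poster) that a run's residual is its accuracy minus the mean accuracy over all preprocessing configurations sharing the same model hyperparameters:

```python
import numpy as np

def accuracy_residuals(runs):
    """runs: list of (hyperparams, preprocessing, accuracy) triples,
    where hyperparams is hashable (e.g. a tuple). Returns each run's
    accuracy minus the mean over runs with identical model
    hyperparameters, isolating preprocessing-driven variation."""
    by_hp = {}
    for hp, _, acc in runs:
        by_hp.setdefault(hp, []).append(acc)
    means = {hp: np.mean(accs) for hp, accs in by_hp.items()}
    return [acc - means[hp] for hp, _, acc in runs]
```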

[Figure: "Model Sensitivity to Preprocessing Variations (Realistic-by-Day, ARLTruncated)". Density plots of the accuracy residual (x-axis -0.10 to 0.05), one panel per model type (SVM, DRBM).]

[Figure: "Model Sensitivity to Preprocessing Variations (Realistic-by-Day and -Range, all segmentations)". Density plots of the accuracy residual (x-axis -0.1 to 0.1), one panel per model type (SVM, DRBM).]

• Residuals of SVMs exhibit a greater variance than those of DRBMs (the difference is statistically significant)

• Lower variability ⇒ performance is more reliably estimated ⇒ DRBMs are particularly useful when not much data is available

Experimental Results: REALISTIC-BY-RANGE

[Figure: box plot of accuracy (0.70 to 0.80) by model type (SVM, DRBM).]

Box plot over statistically-indistinguishable hyperparameter values for each model type. DRBMs are superior to SVMs both in mean performance (statistically significant) and in robustness (lower variance).

Experimental Results: REALISTIC-BY-DAY

[Figure: box plot of accuracy (0.75 to 0.80) by model type (SVM, DRBM).]

Box plot over statistically-indistinguishable hyperparameter values for each model type. DRBMs are superior to SVMs both in mean performance (statistically significant) and in robustness (lower variance).

Experimental Setting

• 5 repetitions of 5-fold cross-validation w.r.t. the split constraints defined by the REALISTIC-BY-DAY and REALISTIC-BY-RANGE settings

• Computation of the normalized classification accuracy to compensate for class imbalance (a minimal sketch follows this list)

• Trying a wide range of hyperparameter values, and reporting results for those leading to performance statistically indistinguishable from the best
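A minimal sketch of the normalized accuracy, assuming (the poster does not spell out the formula) that it is the unweighted mean of per-class accuracies, so the 90% MORTAR majority cannot dominate the score:

```python
import numpy as np

def normalized_accuracy(y_true, y_pred):
    """Mean of per-class accuracies (a.k.a. balanced accuracy)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(recalls))

# e.g. normalized_accuracy(["MORTAR", "MORTAR", "RPG"],
#                          ["MORTAR", "RPG", "RPG"]) == 0.75
```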

Classification Algorithm 2: DRBMs

[Figure: diagram of a Restricted Boltzmann Machine with a hidden layer h connected to the input z through weights W and to the one-hot class vector ~y, e.g. (0 0 1 0), through weights U.]

Restricted Boltzmann Machine modeling the joint distribution of inputs z and target class y.

A Discriminative Restricted Boltzmann Machine combines generative and discriminative training criteria by minimizing

$$- \sum_{z^{(i)} \in T} \log p\big(y^{(i)}, z^{(i)}\big) \;-\; \lambda \sum_{z^{(i)} \in T} \log p\big(y^{(i)} \mid z^{(i)}\big)$$

where the model's joint probability (marginalizing over the hidden units $h$, as in Larochelle and Bengio, 2008) is

$$p(y, z) \propto \sum_h \exp\big(h' W z + b' z + c' h + d' \vec{y} + h' U \vec{y}\big)$$

with $\vec{y}$ the one-hot encoding of the class $y$.

Training is performed by contrastive divergence for the generative part and stochastic gradient descent for the discriminative part, as sketched below.
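A non-authoritative sketch of this hybrid update, assuming binary inputs, a one-hot class vector, and a single CD-1 step; the DRBM class, its dimensions, and the abbreviated discriminative gradient are our illustrative assumptions, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

class DRBM:
    def __init__(self, n_visible, n_classes, n_hidden, lam=1.0, lr=0.01):
        self.W = 0.01 * rng.standard_normal((n_hidden, n_visible))  # hidden-input weights
        self.U = 0.01 * rng.standard_normal((n_hidden, n_classes))  # hidden-class weights
        self.b = np.zeros(n_visible)  # input biases
        self.c = np.zeros(n_hidden)   # hidden biases
        self.d = np.zeros(n_classes)  # class biases
        self.lam, self.lr = lam, lr

    def p_y_given_z(self, z):
        # Closed form: softmax over per-class free energies, with
        # softplus(a) = log(1 + e^a) computed as np.logaddexp(0, a).
        act = self.c[:, None] + (self.W @ z)[:, None] + self.U
        log_num = self.d + np.logaddexp(0.0, act).sum(axis=0)
        p = np.exp(log_num - log_num.max())
        return p / p.sum()

    def update(self, z, y):
        onehot = np.eye(len(self.d))[y]
        # Discriminative part: SGD on -lambda * log p(y|z); for brevity only
        # the class-bias gradient is shown, the other parameters are analogous.
        self.d += self.lr * self.lam * (onehot - self.p_y_given_z(z))
        # Generative part: one CD-1 step on the joint (z, y), with y clamped.
        h0 = sigmoid(self.c + self.W @ z + self.U @ onehot)
        h_sample = (rng.random(h0.shape) < h0).astype(float)
        z1 = sigmoid(self.b + self.W.T @ h_sample)
        h1 = sigmoid(self.c + self.W @ z1 + self.U @ onehot)
        self.W += self.lr * (np.outer(h0, z) - np.outer(h1, z1))
        self.U += self.lr * np.outer(h0 - h1, onehot)
        self.b += self.lr * (z - z1)
        self.c += self.lr * (h0 - h1)
```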

Classification Algorithm 1: SVMs

Find the function f in the reproducing kernel Hilbert space $\mathcal{H}_K$ associated to kernel K by:

$$\operatorname*{argmin}_{f \in \mathcal{H}_K} \;\; \underbrace{C \sum_{z^{(i)} \in T} \Big[ 1 - y^{(i)} f\big(z^{(i)}\big) \Big]_+}_{\text{hinge loss (bias)}} \;+\; \underbrace{\tfrac{1}{2}\, \|f\|^2}_{\text{margin (variance)}}$$

Kernels being used:

• Linear: $K(z^{(1)}, z^{(2)}) = z^{(1)\prime} z^{(2)}$

• Polynomial: $K(z^{(1)}, z^{(2)}) = \big(r + z^{(1)\prime} z^{(2)}\big)^p$

• Radial Basis Function (RBF): $K(z^{(1)}, z^{(2)}) = e^{-\gamma \|z^{(1)} - z^{(2)}\|^2}$

Hyperparameters specific to each kernel are automatically chosen based on an internal three-fold cross-validation on the training set.
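A hedged sketch of such an internal selection, using scikit-learn's GridSearchCV as a stand-in (the poster does not name an implementation); the grids are illustrative, with SVC's degree and coef0 playing the roles of p and r.

```python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# One sub-grid per kernel family from the poster.
param_grid = [
    {"kernel": ["linear"], "C": [0.1, 1, 10]},
    {"kernel": ["poly"], "C": [0.1, 1, 10], "degree": [2, 3], "coef0": [0.0, 1.0]},
    {"kernel": ["rbf"], "C": [0.1, 1, 10], "gamma": [1e-3, 1e-2, 1e-1]},
]
# cv=3 reproduces the internal three-fold cross-validation on the training
# set; balanced accuracy matches the normalized accuracy described above.
search = GridSearchCV(SVC(), param_grid, scoring="balanced_accuracy", cv=3)
# search.fit(Z_train, y_train)  # Z_train: preprocessed features, y_train: labels
```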

Methodological Issues

Data splits should mimic the kinds of variations expected between training data and test field data. Two settings considered:

REALISTIC-BY-DAY: Test and training recordings cannot come from the same date

REALISTIC-BY-RANGE: Test and training recordings cannot come from the same sensor array

Number of groups of recordings defined by these constraints (randomly assigned to the training or test data by repeated 5-fold cross-validation; see the sketch after the table):

Split Method          MORTAR   RPG   ROCKET
REALISTIC-BY-DAY           5     7        2
REALISTIC-BY-RANGE        11     5        3
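A minimal sketch of this group-constrained splitting, under our assumptions about the repeated 5-fold procedure (the helper realistic_splits is hypothetical):

```python
import numpy as np

def realistic_splits(groups, n_splits=5, n_repeats=5, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated group-constrained
    k-fold CV: all recordings of a group (same date for REALISTIC-BY-DAY,
    same sensor array for REALISTIC-BY-RANGE) land entirely in train or
    entirely in test."""
    groups = np.asarray(groups)
    uniq = np.unique(groups)
    rng = np.random.default_rng(seed)
    for _ in range(n_repeats):
        for fold in np.array_split(rng.permutation(uniq), n_splits):
            test_mask = np.isin(groups, fold)
            yield np.where(~test_mask)[0], np.where(test_mask)[0]
```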

Hyperparameters cannot be reliably estimated through double cross-validation (low amount of data and splitting constraints) ⇒

1. ANalysis Of VAriance (ANOVA) is performed to determine statistically significant main effects and interactions (see the sketch after this list)

2. Hyperparameter values leading to performance not statistically significantly different from the best are kept

3. Models are compared by the distribution of their performance over the hyperparameters being kept
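For step 1, a hedged sketch of testing one main effect with a one-way ANOVA; the poster's full analysis also covers interactions, which would need a factorial ANOVA (e.g. statsmodels' anova_lm) rather than scipy's f_oneway shown here.

```python
from scipy.stats import f_oneway

def main_effect_significant(scores_by_level, alpha=0.05):
    """scores_by_level: {hyperparameter value -> list of CV accuracies}.
    Returns True if the hyperparameter has a statistically significant
    main effect on accuracy at level alpha."""
    stat, p_value = f_oneway(*scores_by_level.values())
    return p_value < alpha
```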

Summary of Available Data

Proving Ground   MORTAR   RPG   ROCKET   Total
APG                 197    28       31     256
Dahlgren              0     7        0       7
Yuma                373     0        0     373
Total               570    35       31     636

• Class imbalance (90% MORTAR )

• Low variability in recording conditions (e.g. all ROCKETs are from APG)

The Task

• Use machine learning and advanced signal processing algorithms to discriminate launch signals from three weapon classes: MORTAR, ROCKET, and rocket-propelled grenades (RPGs)

• Many factors affect signal propagation: (1) the distance between receiver and source, (2) the presence of obstacles on the terrain and the nature of the ground, (3) the amplitude of the source, (4) the time of day, and (5) the meteorological conditions (cloud cover, wind, and humidity)

• Comparison of two classifiers: Support Vector Machine (SVM: a classical non-parametric discriminant classifier) and Discriminative Restricted Boltzmann Machine (DRBM: a recently proposed hybrid that combines a discriminant criterion and a generative criterion)

• Need a carefully-designed experimental setting in order to properly evaluate the generalization performance of classifiers as if they were deployed in the field

Abstract

Machine learning classification algorithms are relevant to a large number of Army classification problems, including the determination of a weapon class from the acoustic signature of a transient. However, much of this work has focused on the classification of events from small weapons used for asymmetric warfare, which have been of importance in recent years. In this work we consider the classification of very different weapon classes, such as mortars, rockets, and RPGs, which are difficult to reliably classify with standard techniques since they tend to have similar acoustic signatures. To address this problem, we compare two recently-introduced state-of-the-art machine learning algorithms, Support Vector Machines and Discriminative Restricted Boltzmann Machines, and show how to use them to solve this difficult acoustic classification task. We obtain classification accuracy results that could make these techniques suitable for fielding on autonomous devices. Discriminative Restricted Boltzmann Machines appear to yield slightly better accuracy than Support Vector Machines, and are less sensitive to the choice of signal preprocessing and model hyperparameters. Importantly, we also address methodological issues that one faces in order to rigorously compare several classifiers on limited data collected from field trials; these questions are of significance to any application of machine learning methods to Army problems.

Nicolas Chapados and Olivier Delalleau
ApSTAT Technologies, 4200 Boul. St-Laurent, suite 408, Montreal, QC, H2W 2R2, Canada

Yoshua Bengio
University of Montreal, P.O. Box 6128, succ. Centre-Ville, Montreal, QC, H3C 3J7, Canada

Vincent Mirelli and Stephen Tenney
U.S. Army Research Laboratory, 2800 Powder Mill Road, Adelphi, MD 20783-1197
