• Aucun résultat trouvé

The Statistical Learning Theory in Practical Problems

N/A
N/A
Protected

Academic year: 2022

Partager "The Statistical Learning Theory in Practical Problems"

Copied!
25
0
0

Texte intégral

(1)

DIG Seminar. September 6

th

, 2018

The Statistical Learning Theory in Practical Problems

Rodrigo Fernandes de Mello Universidade de São Paulo

Instituto de Ciências Matemáticas e de Computação Departamento de Ciências de Computação

http://www.icmc.usp.br/~mello mello@icmc.usp.br

(2)

Theoretical Aspects

Interested in introducing my research interests

(3)

Theoretical Aspects

Interested in introducing my research interests

What a better way than the Statistical Learning Theory?

(4)

Statistical Learning Theory

MLP

SVM

ID3 KNN

J48

C4.5

DWNN CNN

TensorFlow

So many classification algorithms:

How can we assess any of those?

(5)

Statistical Learning Theory

MLP

SVM

ID3 KNN

J48

C4.5

DWNN CNN

TensorFlow

So many classification algorithms:

How can we assess any of those?

K-fold cross validation, leave-one-out, ...

How can we prove any of those learn?

(6)

Statistical Learning Theory

Vapnik proposed the Statistical Learning Theory

Defined in the context of supervised learning

Learning guarantees and conditions

What is the main call?

Sample error Expected error

This is the concept of Generalization

(7)

Statistical Learning Theory

So what is generalization?

Consider someone has given you a textbook

(8)

Statistical Learning Theory: Bias-Variance Dilemma

An example based on regression:

Figure from Luxburg and Scholkopf, Statistical Learning Theory: Models, Concepts, and Results

Which function is the best?

To answer it we need to

assess their Expected Risks

(9)

Statistical Learning Theory: Bias-Variance Dilemma

The dichotomy associated to the Bias-Variance Dilemma

Fall

F

Fall

F

(10)

Distance-Weighted Nearest Neighbors

Based on the same principles as the k-Nearest Neighbors

Attribute 1 Attribute 2

(11)

Distance-Weighted Nearest Neighbors

Based on the same principles as the k-Nearest Neighbors

New instance

Attribute 1 Attribute 2

(12)

Distance-Weighted Nearest Neighbors

Based on the same principles as the k-Nearest Neighbors

New instance

Setting K=3

Setting K=5 Setting K=6

(13)

Distance-Weighted Nearest Neighbors

It is based on Radial functions centered at the new instance a.k.a. query point

Classification output:

Given the weight function:

(14)

Distance-Weighted Nearest Neighbors

After implementing, test it on this simple example of an identity function:

Two main questions:

- What happpens if sigma is too big?

- What happens if sigma is too small?

So, how can we define the best value for sigma?

(15)

Distance-Weighted Nearest Neighbors

When sigma tends to infinity

Fall The space of admissible

functions (bias) will contain a single function

In this case, the average function

(16)

Distance-Weighted Nearest Neighbors

When sigma tends to infinity

(17)

Distance-Weighted Nearest Neighbors

When sigma tends to zero

Fall The space of admissible

functions (bias) will tend to the whole space

What is the problem with that?

It will most probably contain at least one memory-based

classifier

(18)

Distance-Weighted Nearest Neighbors

When sigma tends to zero

(19)

Theoretical Aspects

Putting in simple terms:

(20)

Theoretical Aspects

Putting in simple terms:

Interested in proving learning in different scenarios

(21)

Theoretical Aspects

Putting in simple terms:

Interested in proving learning in different scenarios

Building up a theoretical framework to prove time series modeling

Idea: use it to ensure concept drift detection

(22)

Theoretical Aspects

Putting in simple terms:

Interested in proving learning in different scenarios

Building up a theoretical framework to prove time series modeling

Idea: use it to ensure concept drift detection

Theoretical Framework to design Deep Learning architectures

(23)

Living in the Past...

Some practical results:

Bionic hand

Hardware under development at the northeast of Brazil

Mainly to support people living in extreme poverty

Time Series Visualization tool

www.tsviz.com.br

Cover song identification

Supported the ECAD system, Brazil

Copyright office of songs in Brazil

Go Digital, Brazil – incorporated by Acxiom, USA

Machine Learning for geopositioning systems

Marketing based on ML

(24)

References

Vapnik, V. The Nature of the Statistical Learning, Springer, 2011

Luxburg and Scholkopf, Statistical Learning Theory: Models, Concepts, and Results. Handbook of the History of Logic.

Volume 10: Inductive Logic. Volume Editors: Dov M. Gabbay, Stephan Hartmann and John Woods, Elsevier, 2009

Schölkopf, B., Smola, A. J., Learning With Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, MIT, 2002

(25)

References

Références

Documents relatifs

Theorem 5, another negative result about bloat, and shows that a relevant choice of VC-dimension depending upon the sample size ensures Universal Consistency, but leads (for

I Specialized artificial intelligences based on Deep Learning have started to outperform humans on certain tasks.. I Training time can accelerated using GPUs 4 5 or

Le Cam’s theory has found applications in several problem in statistical decision theory and it has been developed, for example, for non- parametric regression, nonparametric

a) In the present study, prosody has not been measured. Although, the participants in the experimental group are taught about the role of using the prosodic

We conclude the section with a general lower bound, via a reduction to PAC learning, relating the query class’s VC dimension, attack accuracy, and number of queries needed for

Here, the linear regression suggests that on the average, the salary increases with the number of years since Ph.D:.. • the slope of 985.3 indicates that each additional year since

For heavy babies (and also light babies), the weight of the babies is relatively lower on the borders of the mother’s age distribution... Quantile

Automata and formal languages, logic in computer science, compu- tational complexity, infinite words, ω -languages, 1-counter automaton, 2-tape automaton, cardinality problems,