Fouille de données et Apprentissage statistique

(1)

HAL Id: hal-00869038

https://hal.inria.fr/hal-00869038

Submitted on 2 Oct 2013

HAL is a multi-disciplinary open access

archive for the deposit and dissemination of sci-entific research documents, whether they are pub-lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

Fouille de données et Apprentissage statistique

Frédéric Precioso

To cite this version:

Frédéric Precioso. Fouille de données et Apprentissage statistique. E-Santé de Proximité (ESP 2013), May 2013, Roquefort-Les-Pins, France. �hal-00869038�

(2)

PRÉSENTATION ESP

LABORATOIRE I3S –

EQUIPE KEIA

F. PRECIOSO

Fouille de données et Apprentissage statistique

(3)

Laboratoire I3S - Equipe KEIA



Enseignants-chercheurs



Nicolas Pasquier



Célia da Costa Pereira



Christel Dartigues-Pallez



Denis Pallez



Frédéric Precioso



Doctorants



Kartick Chandra MONDAL, EMMA (Erasmus Mundus Mobility for

Asia), Data mining algorithms for bioinformatics, 2009-2013



Somsack INTHASONE, EMMA (Erasmus Mundus Mobility for Asia),

Knowledge extraction techniques for biodiversity studies,

2012-2015



Romaric Pighetti, MESR, Recherche d’information par le contenu en

combinant algorithmes évolutionnaires et SVM, 2012-2015

(4)

Activités



Problèmes

 Clustering : regroupement de données “similaires”

 Classification : construction de modèles explicites ou implicites de classes  Associations : relations de co-occurrence

 Décision pour des données dynamiques et Planning  Apprentissage Interactif



Méthodes

 Algorithmes Evolutionaires

 Machines à Vecteurs de Support  Systèmes Multi-Agents

 Boosting

 Réseaux de Neurones artificiels

 Arbres de décision et Forêts aléatoires  Treillis de Galois et closure

 Inférence Bayésienne Naïve  …

(5)

Information retrieval from The Big Data

(6)

CBIR systems: bridging the “semantic gap”

(7)

Information mining in The Big Data

 Increasing multimedia information stream…  Mega-databases:

 YouTube = May 2010, more videos uploaded in 60 days than filmed by the 3 biggest US majors in 60 years [1]; 20 h of video uploaded per minute in May 2009, 24 h per minute in March 2010, to reach 1 h per minute few months ago [1, 4]

 Flickr = 3000 images uploaded every minute to reach 5 billions in September 2010 [1]

 The Digital Universe was about 281 ExaBytes (281 Billions of GigaBytes) in 2007; it

increased 10 times since then [2]

 Images and videos, captured by several billions of devices (cameras, webcams,

phones) represent the biggest part of this incredibly big amount of data [3]

 [1] website-monitoring.com

 [2] “digital universe.”, http://eon.businesswire.com/releases/information/digital/prweb509640.htm

 [3] “major source.” , http ://www.emc.com/collateral/analyst-reports/diverse-exploding-digitaluniverse.pdf  [4] http://www.onehourpersecond.com/

(8)

CBIR systems

7

Search Engine

(9)

Information mining in The Big Data



Text Mining in short messages (tweets, SMS,…)



Human action recognition



Large-scale image and video retrieval

(10)

Information mining in The Big Data

9

(11)

e-Health

(12)

Health-Autonomy

(13)

Health 2.0

Home Patient Sensors Analysis (KB system) Actuators Storage (Cloud) Home Patient Sensors Actuators Hospital / Nursery Patient Sensors Actuators Patient Sensors Actuators Physician Patient Sensors Actuators 12

Ontologies describing application-specific knowledge

Large heterogeneous sensor data Patient/context data

(14)

Laboratoire I3S - Equipe KEIA



Nicolas Pasquier



Célia da Costa Pereira



Christel Dartigues-Pallez



Denis Pallez



Fouille de données et Apprentissage statistique

PRÉSENTATION ESP

LABORATOIRE I3S –

EQUIPE KEIA

F. PRECIOSO

Fouille de données et Apprentissage statistique

Laboratoire I3S - Equipe KEIA

Enseignants-chercheurs

Nicolas Pasquier

Célia da Costa Pereira

Christel Dartigues-Pallez

Denis Pallez

Frédéric Precioso

Doctorants

Kartick Chandra MONDAL, EMMA (Erasmus Mundus Mobility for

Asia), Data mining algorithms for bioinformatics, 2009-2013

Somsack INTHASONE, EMMA (Erasmus Mundus Mobility for Asia),

Knowledge extraction techniques for biodiversity studies,

2012-2015

Romaric Pighetti, MESR, Recherche d’information par le contenu en

combinant algorithmes évolutionnaires et SVM, 2012-2015

Activités

Problèmes

Méthodes

Information retrieval from The Big Data

CBIR systems: bridging the “semantic gap”

Information mining in The Big Data

CBIR systems

Information mining in The Big Data

Text Mining in short messages (tweets, SMS,…)

Human action recognition

Large-scale image and video retrieval

Information mining in The Big Data

e-Health

Health-Autonomy

Health 2.0

Laboratoire I3S - Equipe KEIA

Nicolas Pasquier

Célia da Costa Pereira

Christel Dartigues-Pallez

Denis Pallez

Frédéric Precioso

Merci !

Avez-vous des questions ?