Récemment recherché

Aucun résultat trouvé

Étiquettes

Aucun résultat trouvé

Document

Aucun résultat trouvé

Accueil Écoles Thèmes

Connexion

Application of sequential pattern mining to the analysis of visitor trajectories

Partager "Application of sequential pattern mining to the analysis of visitor trajectories"

N/A

N/A

Protected

Année scolaire: 2021

Info

Protected

Academic year: 2021

Partager "Application of sequential pattern mining to the analysis of visitor trajectories"

Copied!

2

0

0

2

0

0

Chargement.... (Voir le texte intégral maintenant)

Télécharger maintenant ( 2 Page )

Texte intégral

(1)

HAL Id: hal-01667442

https://hal.inria.fr/hal-01667442

Submitted on 19 Dec 2017

HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

Application of sequential pattern mining to the analysis of visitor trajectories

Nyoman Juniarta, Amedeo Napoli

To cite this version:

Nyoman Juniarta, Amedeo Napoli. Application of sequential pattern mining to the analysis of visitor trajectories. BDA 2017 - 33ème conférence sur la Gestion de Données - Principes, Technologies et Applications, Nov 2017, Nancy, France. �hal-01667442�

(2)

Sequence clustering

In order to cluster a set of sequences, a similarity measure must be defined. Distance between any two sequences can be measured by sim _ACS [2], that counts all their common subsequences.

φ _D (s) is the number of all distinct subsequences of s, and φ _A (s ₁ , s ₂ ) is the number of all common subsequences between s ₁ and s ₂ .

Result

Using 15 clusters as hclust’s dendrogram cut, we obtained 4 big clusters and 11 small clusters, as shown in Figure 3.

Application of sequential pattern mining to the analysis of visitor trajectories

Acknowledgments

The thesis of Nyoman Juniarta is financed by the Région Grand-Est and the European project CrossCult.

Nyoman Juniarta, Amedeo Napoli

nyoman.juniarta@loria.fr, amedeo.napoli@loria.fr

Introduction

In this work, we demonstrate the proof of concept of clustering 254

visitors based on their trajectories in a museum. We used a real dataset from Haifa Museum (Figure 1), where each trajectory is treated as a

sequence of itemsets. We applied sim _ACS as a similarity measure between any two sequences.

Figure 1. An example of raw dataset from one visitor.

References

[1] Massimo Zancarano et al. 2007. Analyzing museum visitors’ behavior patterns. In International Conference on User Modeling. Springer, 238–246.

[2] Elias Egho et al. 2015. On measuring similarity for sequences of itemsets.

Data Mining and Knowledge Discovery 29, 3, 732–764.

Sequence

Sequence is an ordered list of itemsets. A sequence s = ⟨s ₁ , s ₂ , … s _m ⟩ is a subsequence of s’ = ⟨ s ₁ , s ₂ , … s _n ⟩ if there exist indices 1 ≤ i ₁ < i ₂ < i _m ≤ n such that s _j ⊆ s’ _ij for all j = 1…m and m ≤ n.

S = ⟨ {a, b}, {c, d}, {a, c, d} ⟩

Clustering of visitors

In order to group visitors according to their trajectories, each

trajectories was converted into a sequence of itemsets. An itemset corresponds to an artifact, and it possesses the information about

duration and distance between previous artifact. These two elements were discretized such that duration has values short and long, while distance has near and far. Example:

v ₁ = ⟨…, {short, near}, {short, far}, {short, near}, …⟩

v ₂ = ⟨…, {long, far}, {long, far}, {short, near}, …⟩

…

Having the sequences of itemsets and non-Euclidean similarity

measure, we applied hierarchical clustering over 254 sequences, using hclust from R software.

Figure 3. Result of hierarchical clustering over 254 visitors.

Behavior of visitors

According to [1], a visitor can be grouped as one of four defined

behavior: ant, grasshopper, butterfly, and fish, as illustrated in Figure 2.

These behaviors are described by the movement of a visitor around the artifacts in a museum.

Not subsequences

⟨ {b,c} ⟩

⟨ {d}, {a,b} ⟩

… Subsequences

⟨{a, b}, {d}⟩

⟨{c}, {a,c}⟩

…

ANT GRASSHOPPER BUTTERFLY FISH

START END ARTIFACT

… … …

12:47:23 12:47:29 GlassOvenVessels 12:48:34 12:48:39 WoodenTools

12:48:51 12:49:46 ReligionAndCult

… … …

Figure 2. An example of raw dataset from one visitor.

We calculated the presence of four possible itemsets in each cluster to map them into four visiting patterns. An ant corresponds to {long,

near} itemset, while fish contains {short, near}. The {long, far} itemset can be correlated with both butterfly and grasshopper, but the latter has relatively short sequence.

Furthermore, within the 11 small clusters, the four possible itemsets

have relatively the same support. This suggests that they correspond to visitors who frequently change their behavior while visiting the

museum.

Références

Télécharger maintenant ( PDF - 2 Page - 895.70 KB )

Documents relatifs

Sequential Pattern Mining using FCA and Pattern Structures for Analyzing Visitor Trajectories in a Museum

the mining of complex sequential data and dynamics in adapting two algorithms based on pattern structures, the analysis of the trajectories based on jumping emerging patterns

Sequential Pattern Mining using FCA and Pattern Structures for Analyzing Visitor Trajectories in a Museum

the mining of complex sequential data and dynamics in adapting two algorithms based on pattern structures, the analysis of the trajectories based on jumping emerging patterns

Sequential Pattern Mining within Formal Concept Analysis for Analyzing Visitor Trajectories

Several challenges are faced in this research work in the FCA framework: the mining of complex sequential data and dynamics in adapting two algorithms based on pattern structures,

On Projections of Sequential Pattern Structures (with an application on care trajectories)

{h[CH, {c, d}]i} is an alphabet-projected pattern for the pattern {ss 2 }, where alphabet lattice projection is given by projecting every element with medical pro- cedure b to

Contextual Sequential Pattern Mining

Indeed, contextual sequential pattern mining does not only aim to mine sequences that are specific to one class, but to extract all sequential patterns in a whole sequence database,

Efficiency Analysis of ASP Encodings for Sequential Pattern Mining Tasks

ASP has also been used for sequential pattern mining [Gebser et al., 2016, Guyet et al., 2014]. Gebser et al. [Gebser et al., 2016] have proposed, firstly, an efficient encoding

Declarative Sequential Pattern Mining of Care Pathways

For each patient, the S rules generate two sequences made of deliveries within respectively the 3 months before the first seizure (positive sequence) and the 3 to 6 months before

Sequential pattern mining for analyzing visitor trajectories

Nyoman Juniarta, Miguel Couceiro, Amedeo Napoli, and Chedy Raïssi Université de Lorraine, CNRS, Inria, LORIA, F-54000 Nancy,

Téléchargez tous les documents en téléchargeant vos documents d'étude.

Votre document sera enrichi, partagé sur 123dok FR pour vous aider à étudier.

Documents relatifs

Du Système du monde à l'histoire de la terre

Du Système du monde à l'histoire de la terre

7

0

0

L'art en personne. Pour une histoire sociale du portrait

L'art en personne. Pour une histoire sociale du portrait

14

0

0

A Gabriel Theorem for Coherent Twisted Sheaves and Picard Group and 2-factoriality of O'Grady's Examples of Irreducible Symplectic Varieties

A Gabriel Theorem for Coherent Twisted Sheaves and Picard Group and 2-factoriality of O'Grady's Examples of Irreducible Symplectic Varieties

145

0

0

Improving Requirement Boilerplates Using Sequential Pattern Mining

Improving Requirement Boilerplates Using Sequential Pattern Mining

10

0

0

Automatic Validation of Terminology by Means of Formal Concept Analysis

Automatic Validation of Terminology by Means of Formal Concept Analysis

17

0

0

Sequence Mining within Formal Concept Analysis for Analyzing Visitor Trajectories

Sequence Mining within Formal Concept Analysis for Analyzing Visitor Trajectories

7

0

0

ARTheque - STEF - ENS Cachan | FISHERTECHNIK

ARTheque - STEF - ENS Cachan | FISHERTECHNIK

1

0

0

Vocabulaire des techniques de l'information et de la communication (TIC)

Vocabulaire des techniques de l'information et de la communication (TIC)

488

0

0