• Aucun résultat trouvé

Multiple-Resolution Stream Clustering Using Graph Maintenance

N/A
N/A
Protected

Academic year: 2022

Partager "Multiple-Resolution Stream Clustering Using Graph Maintenance"

Copied!
1
0
0

Texte intégral

(1)

Multiple-Resolution Stream Clustering Using Graph Maintenance

Marwan Hassani, Pascal Spaus, and Thomas Seidl

Data Management and Data Exploration Group RWTH Aachen University, Germany {hassani,spaus,seidl}@cs.rwth-aachen.de

Abstract. Challenges for clustering streaming data are getting contin- uously more sophisticated. They are driven by stream properties such as the continuous data arrival, the time-critical processing of objects, the evolution of the data streams, the presence of outliers and the vary- ing densities of the data. Due to the continuously evolving nature of the stream, it is crucial that stream clustering algorithms autonomously detect clusters whose number, shapes and densities vary as the stream flows. We present the first hierarchical density-based stream clustering algorithm based on cluster stability, calledHASTREAM [2] which is able to meet the above mentioned requirements.

We show additionally, that HASTREAM inherited efficiency issues as the main drawback of density-based hierarchical clustering algorithms, as these were not the scope of its contribution. We present then I-HASTREAM [1], a first density-based hierarchical clustering algorithm that has considerably less computational time compared to the first pre- sented algorithm. I-HASTREAM utilizes and introduces techniques from the graph theory domain to devise an incremental update of the under- lying model instead of repeatedly performing the expensive calculations of the huge graph. Specifically the Prim’s algorithm for constructing the minimal spanning tree is adopted by introducing novel, incremental maintenance of the tree by vertex and edge insertion and deletion. The extensive experimental evaluation study on real world datasets shows that I-HASTREAM is considerably faster than HASTREAM while de- livering almost the same clustering quality.

References

1. Marwan Hassani, Pascal Spaus, Alfredo Cuzzocrea, and Thomas Seidl. Adap- tive stream clustering using incremental graph maintenance. In BigMine 2015 at KDD’15, pages 49–64, 2015.

2. Marwan Hassani, Pascal Spaus, and Thomas Seidl. Adaptive multiple-resolution stream clustering. InMLDM ’14, pages 134–148, 2014.

Copyright c 2015 by the paper’s authors. Copying permitted only for private and academic purposes. In: R. Bergmann, S. Görg, G. Müller (Eds.): Proceedings of the LWA 2015 Workshops: KDML, FGWM, IR, and FGDB. Trier, Germany, 7.-9.

October 2015, published at http://ceur-ws.org

32

Références

Documents relatifs

If we choose a loss function, such as the biweight loss, that is bounded, then this will impose a minimum segment length on the segmentations that we infer using the penalized

Keywords Exploratory factor analysis; analytical decision making; data screening; factor extraction; factor rotation; number of factors; R statistical environment..

This section presents what are provably the most generic attacks against XSYND. We will only address the hardness of inverting the

This specific processing includes the inversion of the equation of motion (the gravity meter records the time and the position of the falling object during its fall; g is

While some efficient algorithms are known for processing interval sensor data in the presence of outliers, in general, the presence of outliers often turns easy-to-solve

Figure 10 represents a test to measure the execution time in the wide tree data set using negative queries.. We noticed that execution time increases with the increasing of the data

The offline stage analyses the distribution of µCs and creates the final clusters by a density based approach, that is, dense micro cluster that are close enough(connected) are said

Hôpital Européen Georges Pompidou, APHP, Paris, France; 2 INSERM, UMRS 1138, CRC, Équipe 22, Paris, France; 3 Université Paris Descartes, Sorbonne Paris Cité, Paris, France; 4