Data Characteristics and their Relation to Closed Patterns Discovery Algorithms

Academic year: 2022

Engelbert Mephu Nguifo

LIMOS, Clermont University, Blaise Pascal University and CNRS Clermont-Ferrand, France

mephu@isima.fr

Abstract. Discovering closed frequent patterns remains a challenge in data mining. Over the last decade, many studies of data mining algorithms have based their performance evaluation on a single dataset characteristic: its density (or, conversely, its sparseness). The emergence of massive datasets in many applications underlines the importance of designing efficient algorithms. Density measurement has been shown to be one direction for reaching this goal, especially when dealing with the formal contexts of concept lattices. This talk will discuss this notion and describe some metrics defined to characterize dataset density for pattern discovery purposes.
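As a point of reference for the notion of density, the simplest measure one might compute for a formal context is its fill ratio: the number of (object, attribute) pairs in the relation divided by the number of possible pairs. The sketch below is illustrative only; the function name `context_density` and the toy context are assumptions, and the metrics discussed in the talk are not necessarily this one.

```python
from typing import Dict, Set


def context_density(context: Dict[str, Set[str]]) -> float:
    """Return the fill ratio of a formal context.

    `context` maps each object (transaction) to its set of attributes (items).
    The ratio is |I| / (|O| * |A|), where I is the incidence relation.
    This is a hypothetical helper, not a metric taken from the abstract.
    """
    # Attribute universe: every attribute that appears in at least one object.
    attributes: Set[str] = set().union(*context.values())
    cells = len(context) * len(attributes)
    if cells == 0:
        return 0.0  # empty context: define density as 0
    filled = sum(len(items) for items in context.values())
    return filled / cells


# Toy context (assumed data): 4 objects over 3 attributes, 9 filled cells.
toy = {
    "t1": {"a", "b", "c"},
    "t2": {"a", "c"},
    "t3": {"b"},
    "t4": {"a", "b", "c"},
}
print(context_density(toy))
```

A value near 1 indicates a dense context and a value near 0 a sparse one. A single fill ratio says nothing about how the filled cells are distributed, which is one reason finer characterizations may be needed.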

References

1. Yahia, S.B., Hamrouni, T., Nguifo, E.M.: Frequent closed itemset based algorithms: a thorough structural and analytical survey. SIGKDD Explor. Newsl. 8(1) (June 2006) 93–104

2. Boley, M., Grosskreutz, H.: Approximating the number of frequent sets in dense data. Knowl. Inf. Syst. 21(1) (October 2009) 65–89

3. Emilion, R., Lévy, G.: Size of random Galois lattices and number of closed frequent itemsets. Discrete Appl. Math. 157(13) (July 2009) 2945–2957

4. Flouvat, F., Marchi, F., Petit, J.M.: A new classification of datasets for frequent itemsets. J. Intell. Inf. Syst. 34(1) (February 2010) 1–19

5. Kuznetsov, S.O., Obiedkov, S.A.: Comparing performance of algorithms for generating concept lattices. Journal of Experimental & Theoretical Artificial Intelligence 14(2-3) (2002) 189–216

6. Hamrouni, T., Yahia, S.B., Nguifo, E.M.: Looking for a structural characterization of the sparseness measure of (frequent closed) itemset contexts. Information Sciences (0) (2012) –

7. Négrevergne, B., Termier, A., Méhaut, J.F., Uno, T.: Discovering closed frequent itemsets on multicore: Parallelizing computations and optimizing memory accesses. In: HPCS. (2010) 521–528

8. Uno, T., Asai, T., Uchida, Y., Arimura, H.: An efficient algorithm for enumerating closed patterns in transaction databases. In: Proc. DS ’04, LNAI 3245. (2004) 16–31
