Evaluating the Privacy Implications of Frequent Itemset Disclosure
Texte intégral
Documents relatifs
The most time-consuming step of all frequent itemset mining algorithms is to decide the frequency of itemsets, i.e., whether an itemset is supported by at least a certain number
In this work we provide a study of metrics for the char- acterization of transactional databases commonly used for the benchmarking of itemset mining algorithms.. We argue that
A new concise repre- sentation, called compressed transaction-ids set (compressed tidset), and a single pass algorithm, called TR-CT (Top-k Regular frequent itemset mining based
While frequent itemsets will be guaranteed to include all source itemsets re- coverable at the minimum support threshold, they will also include all of their subsets, and
In this paper, we provide an experimental comparison of 10 algorithms on real-world data sets whose implementations are publicly available, two of them compute formal concepts (FCbO
Table 1 shows the different relations together with the size of their extension, the precision yielded by a manual evaluation of a sample of 100 tuples of each relation (P manual ),
The workshop proceedings contains 15 papers de- scribing 18 different algorithms that solve the frequent itemset mining problems. The source code of the im- plementations of
We have presented the Partial Counting algorithm for mining frequent item- sets from data streams with a space complexity proportional to the number of frequent itemsets generated