• Aucun résultat trouvé

ROCsearch --- An ROC-Guided Search Strategy for Subgroup Discovery

N/A
N/A
Protected

Academic year: 2022

Partager "ROCsearch --- An ROC-Guided Search Strategy for Subgroup Discovery"

Copied!
1
0
0

Texte intégral

(1)

ROCsearch — An ROC-guided Search Strategy for Subgroup Discovery

Marvin Meeng1, Wouter Duivesteijn2, and Arno Knobbe1

1 LIACS, Leiden University,{m.meeng,a.j.knobbe}@liacs.leidenuniv.nl

2 Fakult¨at f¨ur Informatik, LS VIII, Technische Universit¨at Dortmund, [email protected]

Subgroup Discovery (SD) aims to find coherent, easy-to-interpret subsets of the dataset at hand, where something exceptional is going on. Since the resulting subgroups are defined in terms of conditions on attributes of the dataset, this data mining task is ideally suited to be used by non-expert analysts. The typical SD approach uses a heuristic beam search, involving parameters that strongly influence the outcome. Unfortunately, these parameters are often hard to set properly for someone who is not a data mining expert; correct settings depend on properties of the dataset, and on the resulting search landscape. To remove this potential obstacle for casual SD users, we introduceROCsearch[1], a new ROC-based beam search variant for Subgroup Discovery.

On each search level of the beam search,ROCsearchanalyzes the interme- diate results in ROC space to automatically determine a sensible search width for the next search level. Thus, beam search parameter setting is taken out of the do- main expert’s hands, lowering the threshold for using Subgroup Discovery. Also, ROCsearchautomatically adapts its search behavior to the properties and re- sulting search landscape of the dataset at hand. Aside from these advantages, we also show that ROCsearch is an order of magnitude more efficient than traditional beam search, while its results are equivalent and on large datasets even better than traditional beam search results.

Acknowledgment This research is supported in part by the Deutsche Forschungs- gemeinschaft (DFG) within the Collaborative Research Center SFB 876 “Pro- viding Information by Resource-Constrained Analysis”, project C1.

References

1. M. Meeng, W. Duivesteijn, A. Knobbe, ROCsearch – An ROC-guided Search Strat- egy for Subgroup Discovery, Proc. SDM, pp. 704–712, 2014.

Copyright c 2014by the paper’s authors. Copying permitted only for private and academic purposes. In: T. Seidl, M. Hassani, C. Beecks (Eds.): Proceedings of the LWA 2014 Workshops: KDML, IR, FGWM, Aachen, Germany, 8-10 September 2014, published at http://ceur-ws.org

180

Références

Documents relatifs

Paul usually has cereals and coffee for breakfast ……….. They usually go out in the

A metamodel that contains those useful concepts for representing models with information about data mining experiments: data’s sources metadata, results of data mining algorithms,

From scientific publications, classical methods of expert finding described in the literature are based on information extraction methods [10] such as meta- data (title,

In this section, we briefly summarize necessary notation and basic concepts on subgroup discovery and the proposed approach for ontology validation using test ontologies.. 2.1

It is strongly NP -hard to compute a minimizer for the problem (k-CoSP p ), even under the input data restric- tions specified in Theorem 1.. Clearly, if a minimizer was known, we

In the light of the state of the art, we propose an original approach for expert finding consisting in combining text mining, more precisely machine learning algorithms applied on

Distributions of the optimal statistic S/N for HD (blue), monopole (orange), and dipole (green) spatial correlations, as induced by the posterior probability distri- butions

The tool can be used to support the DL users in extracting useful metadata involving descriptive statistics about the content of the datasets stored, can help data wranglers