Word representations: A simple and general method for semi-supervised learning

Related documents

, we now know that Manichaean theses, often formulated on the basis of fragmentary documentation, caricature reality. We also take

The use of the KD-tree data structure enables efficient computation of the k-nearest neighbours (k-NN) of a pattern point, particularly for large data. Experimental results

These stages include: (1) document extraction (Reuters and non-Reuters articles) from our news repository; (2) local clustering based on duplicate document detection of identical

7.1 Details of inducing word representations. The Brown clusters took roughly 3 days to induce, when we induced 1000 clusters, the baseline in prior work (Koo et al., 2008; Ratinov

Complete author clustering: We do a detailed analysis, where we need to identify the number k of different authors (clusters) in a collection and assign each document to exactly

Our main goal is to investigate whether word embeddings could perform well on a multi-topic author attribution task. The semantic information in word embeddings has been shown

We test two ways of measuring clusterability: (1) existing measures from the machine learning literature that aim to measure the goodness of optimal k-means clusterings, and (2)

In [13] the authors show that O(Kn) similarity queries are both necessary and sufficient to achieve exact reconstruction of an arbitrary clustering with K clusters on n items. This