Corpus-Based Methods for Short Text Similarity
Related documents
In this paper, a method for pairwise text similarity on massive datasets, using the Cosine Similarity metric and the tf-idf (Term Frequency-Inverse
Many experiments were conducted to find the best configuration for query expansion. Table 2 lists two of the best runs with the proposed approach for the same queries. Run 4
Therefore, this article proposes a method for determining information proximity in large arrays of text information, the distinctive feature of which is the use
For semantic cluster similarity, we use an adapted form of a similarity measure from related work, where we combine idf scoring with word-by-word cosine similarity to calculate
Based on the obtained results, we can conclude that considering the complete document does not always increase the classification accuracy. Instead, the accuracy depends on the nature
In this work, we address the question of which expression elements should be available in a query language to support users with different knowledge and experience in retrieving
The differences in classification performance of three common classifiers, across message lengths and across enhancement methods, as measured by the F1 score for accuracy
Lexical features are extracted from all the legal documents, and the similarity between each current case document and all the prior case documents is determined using