• Aucun résultat trouvé

News Representation with Multi-Word Features

N/A
N/A
Protected

Academic year: 2021

Partager "News Representation with Multi-Word Features"

Copied!
1
0
0

Texte intégral

(1)

News Representation

with Multi-Word Features

Mihail Minev1and Christoph Schommer2

1 University of [email protected]

2 University of [email protected]

Abstract. Information is commonly reflected in news articles. However, texts are unstructured and thus demanding to analyze automatically. To identify and cap- ture the facts in a news story we propose a novel approach, which utilizes natural language engineering. A combination of selected linguistic and statistical criteria enables the identification of grammatical units such as noun, verb, adjective, and adverb phrases. In literature, these entities are presumed to carry the meaning and the information expressed in English texts. In our study, we focus on determining multi-word features in articles related to the monetary policy conducted by the cen- tral bank in the USA, FED. The features are composed as attribute-value pairs, where the attributes represent grammatical units, which quantify the major event characteristics. The corresponding values are conditional expressions, which vary over time as facts evolve. The final set is aggregated over the corpus by the ap- plication of heuristic and syntax-based rules. Financial experts contributed to the project by providing expertise for the document interpretation.

References

LI, X. (2010): Understanding the semantic structure of noun phrase queries. InPro- ceedings of the 48th Annual Meeting of the Association for Computational Lin- guistics, ACL ’10, pages 1337–1345, Stroudsburg, PA, USA, 2010. Association for Computational Linguistics.

SHAFIEI, M. and WANG, S. and ZHANG, R. and MILIOS, E. and TANG, B. and TOUGAS, J. and SPITERI, R. (2007): Document representation and dimension reduction for text clustering. InData Engineering Workshop, 2007 IEEE 23rd International Conference on, pages 770 –779.

Keywords

INFORMATION EXTRACTION, FEATURE REPRESENTATION

Références

Documents relatifs

The result of this paper is to establish the mechanism of mutual transformation between programming language and natural language through the research and exploration of the

From Koizumi1 (a) to Noda (i), the networks with the Louvain communities highlighted – red and blue links depending on the PM’s and OL’s affiliation (respectively DPJ and LDP)

(E)—Detail of D showing the etched enamel; (F)—Polished and etched section showing the uniserial pattern of the inner enamel sublayer; (G)—Etched enamel of an in situ

Con- sequently, the documents serve as input for training vector representations, commonly termed as embed- dings, of data elements (e.g. entire relational tuples or

The best result for task 2 is obtained with word embedding trained on the same data, but the best result for China data is ob- tained when we used word embedding from another

In this paper we present what is fake news, importance of fake news, overall impact of fake news on different areas, different ways to detect fake news on social media,

Yet, whereas many different tools by different vendors promise companies to guarantee their compliance to GDPR in terms of consent management and keeping track of the personal data

In case the killer sequence recommender had fewer than request- ed recommendations, we did not add results from the baseline recommender.. This might explain why