• Aucun résultat trouvé

An Environment for Relation Mining over Richly Annotated Corpora: the case of GENIA

N/A
N/A
Protected

Academic year: 2022

Partager "An Environment for Relation Mining over Richly Annotated Corpora: the case of GENIA"

Copied!
1
0
0

Texte intégral

(1)

An Environment for Relation Mining over Richly Annotated Corpora:

the case of GENIA

Fabio Rinaldi

, Gerold Schneider

, Kaarel Kaljurand

, Michael Hess

, Martin Romacker

∗∗

Abstract

The biomedical domain is witnessing a rapid growth of the amount of published scientific results, which makes it increasingly difficult to filter the core information. There is a real need for support tools that ‘digest’ the published re- sults and extract the most important informa- tion.

We describe and evaluate an environment sup- porting the extraction of domain-specific re- lations, such as protein-protein interactions, from a richly-annotated corpus. We use full, deep-linguistic parsing and manually created, versatile patterns, expressing a large set of syn- tactic alternations, plus semantic ontology in- formation.

Institute of Computational Linguistics, IFI, University of Zurich, Switzerland, {rinaldi, gschneid, kalju, hess}@ifi.unizh.ch

∗∗Novartis Pharma AG, Basel, Switzerland martin.romacker@novartis.com

Références

Documents relatifs

In this paper, we present NIPS4Bplus, the first richly annotated birdsong audio dataset, that is comprised of recordings containing bird vocalisations along with their active

We present here the choices which were made within the framework of three oral corpora projects : Socio-linguistics studies on Orleans (ESLO), Phonology of the Contemporary

Recently, have been developed projects such as “Science Environment for Ecological Knowledge” (SEEK) and “Seman- tic Prototypes in Research Ecoinformatics” (SPIRE), which apply

PMLAB is an interactive programming environment for (exploratory) process mining computing and/or research on top of a process-oriented language..

This is specific to the way annotation is performed in the MIR field: while in other domains, con- cepts are first defined, and then used for the annotation of items, in MIR,

Different types of information, such as pattern-based representations and word embeddings, are used as input to the classification of the entity couples according to the

We revise ex- isting related data-sets and provide an analysis of how they could be augmented in order to facilitate machine- learning techniques to assess privacy policies with

Existing corpora of minority languages of Russia can be split into two groups: relatively small (almost always under 100,000 tokens) manually annotated collections,