
A Powerful Web-Based Client

From the document Data Mining in Proteomics (pages 126-130)

114 Jones

The design of the DAS system has largely focused on developing standardised XML formats for data exchange from servers to clients. For example, each individual annotation (“feature”) includes a feature type, a feature id, a label, the start and stop coordinates of the feature, and a score, together with other optional fields. Standardising this format is essential if clients are to query multiple DAS servers; however, it does not solve the problem that separate organisations use different terminology to describe feature types, which makes it very difficult to compare the feature types served by different institutions. This problem has been recognised and addressed through the development of the Protein Feature Ontology (10) by the BioSapiens Network of Excellence (11) and through the use of the Evidence Codes Ontology. See Note 3 for a very brief explanation of ontologies. The DAS servers provided by members of the BioSapiens Network are among the first to take advantage of this standardisation of terminology.
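To make the fields listed above concrete, the following minimal sketch parses a small features document with Python's standard library. The XML is a hand-written sample modelled on the general shape of a DAS features response; the accession `X00000`, the ids, and all values are placeholders rather than output from a real server, and the `Feature` class and `parse_features` function are hypothetical names introduced only for this illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List, Optional

# Hand-written sample (assumption): element names follow the general layout
# of a DAS features response, but every value here is a placeholder.
SAMPLE = """<DASGFF>
  <GFF version="1.0" href="http://example.invalid/das/demo/features?segment=X00000">
    <SEGMENT id="X00000" start="1" stop="120">
      <FEATURE id="feat1" label="Signal peptide">
        <TYPE id="placeholder_type_1">signal peptide</TYPE>
        <METHOD id="placeholder_evidence_1">automatic</METHOD>
        <START>1</START>
        <END>22</END>
        <SCORE>0.97</SCORE>
      </FEATURE>
      <FEATURE id="feat2" label="Helix">
        <TYPE id="placeholder_type_2">helix</TYPE>
        <METHOD id="placeholder_evidence_2">experimental</METHOD>
        <START>40</START>
        <END>58</END>
        <SCORE>-</SCORE>
      </FEATURE>
    </SEGMENT>
  </GFF>
</DASGFF>"""


@dataclass
class Feature:
    feature_id: str
    label: str
    type_id: str
    start: int
    stop: int
    score: Optional[float]  # None when the server gives no numeric score


def parse_features(xml_text: str) -> List[Feature]:
    """Extract the core fields of every FEATURE element in the document."""
    root = ET.fromstring(xml_text)
    features = []
    for node in root.iter("FEATURE"):
        type_node = node.find("TYPE")
        type_id = type_node.get("id", "") if type_node is not None else ""
        score_text = (node.findtext("SCORE") or "").strip()
        try:
            score = float(score_text)
        except ValueError:
            score = None  # missing or non-numeric score such as "-"
        features.append(
            Feature(
                feature_id=node.get("id", ""),
                label=node.get("label", ""),
                type_id=type_id,
                start=int(node.findtext("START")),
                stop=int(node.findtext("END")),
                score=score,
            )
        )
    return features


if __name__ == "__main__":
    for feature in parse_features(SAMPLE):
        print(feature)
```

Because every server returns the same element structure, one parser of this kind can serve all of them; what still differs between servers is the vocabulary used in the type and evidence fields, which is the gap the ontologies are meant to close.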

The Dasty2 DAS client (also funded by the BioSapiens Network) incorporates the use of these ontologies in the interface, providing links to term definitions and allowing filtering, sorting, and ordering by both feature type and evidence code ontology terms.

The following steps describe how to query Dasty2 for a specific protein and how to manipulate the user interface to focus on the annotations of interest to the researcher.

Note that by default, Dasty2 retrieves annotation from DAS servers that are registered as being associated with the BioSapiens Network. It is possible to extend your search to include DAS servers from outside this network, so long as they accept UniProtKB protein accessions. If you use non-BioSapiens DAS servers, there is no guarantee that the DAS server will make use of the Protein Feature Ontology or the Evidence Codes Ontology for feature annotation.
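Outside the browser, the same idea (one accession sent to several DAS servers, with each answer handled as soon as it arrives and unresponsive servers tolerated) can be sketched as below. This is only an illustration under stated assumptions, not how Dasty2 itself is implemented: the URL pattern follows the usual DAS convention of a features request with a `segment` parameter, the server addresses in the usage note are placeholders, and `features_url`, `fetch_text`, and `query_servers` are hypothetical helper names.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from urllib.parse import quote
from urllib.request import urlopen


def features_url(server_base: str, accession: str) -> str:
    """Build a features request URL for one DAS data source (assumed pattern)."""
    return f"{server_base.rstrip('/')}/features?segment={quote(accession)}"


def fetch_text(url: str, timeout: float = 10.0) -> str:
    """Download one response body; raises an exception if the server is unavailable."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def query_servers(servers, accession, fetch=fetch_text):
    """Yield (server_name, body, error) tuples in the order the servers answer.

    `servers` maps a display name to the base URL of a DAS data source.
    A failing server produces (name, None, exception) and does not stop
    the remaining requests.
    """
    if not servers:
        return
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        pending = {
            pool.submit(fetch, features_url(base, accession)): name
            for name, base in servers.items()
        }
        for future in as_completed(pending):
            name = pending[future]
            try:
                yield name, future.result(), None
            except Exception as exc:  # inactive or misbehaving server
                yield name, None, exc
```

A call such as `query_servers({"demo": "http://example.invalid/das/demo"}, "X00000")` would then yield one tuple per server. Passing a different `fetch` function makes it possible to try the logic without network access.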

1. Visit http://www.ebi.ac.uk/dasty/ using an Internet browser (see Fig. 3).

2. Enter the UniProtKB protein accession that you are interested in into the “Protein ID” text field and click “GO.”

Note that there are several example accessions given below the text field.

3. The Dasty2 client will immediately start to query all of the available DAS servers with the protein accession that you have entered. The queries run in parallel, and the results from each server are displayed as soon as they arrive. Some of the registered DAS servers may hold no annotation for the requested protein or may be inactive for another reason (maintenance downtime, for example). This does not impede Dasty2 in any way; however, if you are expecting or looking for annotation from a specific DAS server, it is wise to check that the service is responding correctly (see Note 4).

4. Scroll down to the section “Positional Features.” This section displays all of the feature annotations that were loaded in the previous step. Features of the same type may be grouped together onto one row, either using information from the server or according to the configuration of Dasty2. The table includes the following columns:

(a) “Feature Type”, which displays (where provided) the Protein Feature ontology term categorising the features displayed on that row.

(b) “Labels”, which provides a simple, non-standardised label for the feature type.

(c) “Feature Annotations” displays the position of the feature relative to the sequence. This is an interactive display with the ability to zoom (grab and slide the red “handles” at the top of the “Positional Features” section and then click the grey “Zoom” button). You can also hover over or click any of the features displayed for more complete information about the feature. If you click on a feature, you can view details of the sequence in this region at the bottom of the Dasty2 interface.

Fig. 3. The Dasty2 DAS client in action. This view shows part of the Dasty2 user interface displaying the DAS tracks from ten different DAS annotation servers for a single protein. In this case, Dasty2 has requested annotation from a total of 33 separate DAS servers.

(d) “Server Name” indicates which DAS server the annotation has been retrieved from.

(e) “Evidence (Category)” displays the Evidence Codes Ontology term that the features on that row are annotated with. It typically distinguishes annotations supported by direct experimental evidence from annotations inferred by a human curator and from annotations derived by automatic means, for example, pattern matching or the use of hidden Markov models.

Note that it is possible to add columns (including “Score” and “Feature ID”) or remove columns by expanding the “Manipulation Options (Positional Features)” section.

5. For some highly annotated proteins, there may be many rows of annotation displayed, much of which may be irrelevant to the research problem being addressed. The Dasty2 interface provides several mechanisms to allow you to manage this:

(a) You may reorder the DAS tracks on the screen by holding down the primary mouse button¹ on a DAS track that you wish to relocate and dragging it up or down to a new position. This is useful for placing a selection of DAS tracks that you wish to compare directly next to each other.

(b) You can filter the DAS tracks displayed. Scroll up to the “Filtering By” section and expand the section by clicking on the heading. In this section, you can filter by feature type, DAS server name, and evidence code. (The feature type and evidence code filters make use of the Protein Feature Ontology and Evidence Codes Ontology, respectively.)

(c) You may modify the order of the DAS tracks by clicking on one of the column headings. Repeatedly clicking a heading reverses the sort order.

6. Some annotations may refer to the entire molecule rather than just a section of the sequence, for example, the list of literature citations associated with the molecule. These features can be viewed in Dasty2 by expanding the “Non Positional Features” section. If the DAS server has provided one or more hyperlinks to external sources, these are represented by a purple “i” icon to the right of the notes section.
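The filtering and column sorting described in step 5 can be mimicked on annotation that has already been retrieved. The sketch below is an illustration only and does not reflect Dasty2's own code: the `Track` class, the `filter_tracks` and `sort_tracks` functions, and the plain-string evidence categories are assumptions made for the example, whereas real tracks would carry Protein Feature Ontology and Evidence Codes Ontology terms.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Track:
    """One row of the positional features table (hypothetical model)."""
    feature_type: str
    label: str
    server: str
    evidence: str


def filter_tracks(
    tracks: Iterable[Track],
    feature_types: Optional[Set[str]] = None,
    servers: Optional[Set[str]] = None,
    evidence: Optional[Set[str]] = None,
) -> List[Track]:
    """Keep tracks that match every filter supplied; None disables a filter."""
    def keep(track: Track) -> bool:
        return (
            (feature_types is None or track.feature_type in feature_types)
            and (servers is None or track.server in servers)
            and (evidence is None or track.evidence in evidence)
        )
    return [track for track in tracks if keep(track)]


def sort_tracks(tracks: Iterable[Track], column: str, descending: bool = False) -> List[Track]:
    """Sort by one column, case-insensitively; `descending` reverses the order."""
    return sorted(
        tracks,
        key=lambda track: getattr(track, column).lower(),
        reverse=descending,
    )
```

For instance, `filter_tracks(tracks, evidence={"experimental"})` keeps only rows labelled as experimentally supported, and calling `sort_tracks` on the same column with `descending` toggled corresponds to clicking a column heading a second time.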

¹ Left mouse button on a Microsoft Windows or Linux PC.

117 Analysing Proteomics Identifications in the Context of Functional and Structural Protein

This section has described only some of the features of the Dasty2 DAS client. Its capabilities extend further, including, for example, the ability to display the protein structure by employing the structure extensions to DAS. Besides Dasty2, other high-quality DAS clients exist, including the DAS client built into Ensembl (http://www.ensembl.org/) and the powerful Spice DAS client (http://www.efamily.org.uk/software/dasclients/spice/) (8), which provides a sophisticated protein structure viewer onto which DAS annotation can be projected.

DAS provides a very powerful way of accessing integrated data from many disparate sources. It has the restriction, however, that the available clients all focus on one protein at a time. If the researcher wishes to collate annotation for large sets of proteins in a single step for further analysis, BioMart may offer a more suitable alternative, as described below.
