Files and databanks

Academic year: 2022

A. Search for all amino acid sequences of glutamate dehydrogenase (GDH)

1. Go to the NCBI website: http://www.ncbi.nlm.nih.gov/

2. Type: "glutamate dehydrogenase OR GDH"

3. Select "Protein": 3807 hits (03/2007)

4. Choose the option "Preview/Index". This allows the use of keywords with the option "Add Term(s) to Query or View Index" and a Boolean search ("AND" / "OR" / "NOT").

The number of hits is shown on the right of the screen (3807). To see the results ("Summary"), click on this number.

1. In the "Add Term(s)" field, choose the option "All fields"

2. Type the following terms (copy and paste them; warning: this list has not been updated):

peptide NOT partial NOT mutant NOT phenylalanine NOT leucine NOT synthase NOT putative NOT caspase NOT quinoprotein NOT membrane NOT decarboxylase NOT topoisomerase NOT monooxygenase NOT transaminase NOT kinase NOT oxidase NOT thioredoxin NOT glycerate NOT glyoxylate NOT glucose NOT glucose-1-dehydrogenase NOT glutamate dehydrogenase-related NOT glutamine NOT glutamyl NOT glucarate NOT glycerol NOT proline NOT valine NOT semialdehyde NOT aldehyde NOT glyceraldehyde NOT dihydropyrimidine NOT formyltetrahydrofolate NOT fatty NOT isocitrate NOT saccharopine NOT methylmalonate NOT probable NOT possible NOT related NOT similarity NOT similar NOT homolog NOT homologue NOT synthetic NOT unknown NOT hypothetical NOT patent NOT transcriptional NOT thymidine NOT reductase NOT resolvase NOT regulatory NOT zinc NOT 3-dehydroquinate NOT adhesion NOT ammonium

3. Click on the boolean "NOT". All keywords and booleans are written in the main field (top of the page)

4. Click on "Preview": 290 hits (03/2007)
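The query assembled in the steps above can also be written out as a plain string, which makes the effect of each boolean easier to inspect. The following is a minimal sketch and not part of the original exercise: the function name `build_query` is hypothetical, the exclusion list is shortened for readability, and putting the "OR" alternatives in parentheses is a choice made here.

```python
def build_query(include_terms, exclude_terms):
    """Join alternatives with OR, then append one NOT clause per excluded keyword."""

    def quote(term):
        # Quote multi-word terms so they are kept together as a phrase.
        return f'"{term}"' if " " in term else term

    included = " OR ".join(quote(t) for t in include_terms)
    query = f"({included})"
    for term in exclude_terms:
        query += f" NOT {quote(term)}"
    return query


# Shortened, illustrative subset of the exclusion list given above.
excluded = ["peptide", "partial", "mutant", "synthase", "putative"]
query = build_query(["glutamate dehydrogenase", "GDH"], excluded)
print(query)
```

Each "OR" widens the set of hits, while each "NOT" removes every record containing that keyword, which is how the count drops from 3807 to 290 in the steps above.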

What is the goal of this selection?

What are the consequences of the booleans "OR" and "NOT"?

Some of the hits are not GDH. Which ones?

B. Removing redundant sequences

This part is the most tedious and time-consuming one. It can be done using "Multalin" from the Institut National de la Recherche Agronomique (INRA).

What type of file could be used to find the name of the organism?

Why are there multiple files for the same protein from the same organism?

What kind of information are the various accession numbers in those files linked to?

See the answers

1. In the "Add Term(s)" field, choose the option "Organism"

2. Type the chosen name: for example "Homo sapiens"

3. Click on the boolean "AND"

4. Click on "Preview": 8 hits (03/2007) for this organism

5. Click on this number (8). The "Summary" files are returned

6. In the "Display" field, choose the option "Fasta". This is one of the data formats used by sequence alignment algorithms

7. Click on "Display". The files in FASTA format are returned

8. In the "Send to" field, choose "Text": a new HTML page is returned. Copy the data
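The same search and FASTA retrieval could in principle be scripted instead of clicked through. The sketch below only builds request URLs for the NCBI E-utilities `esearch` and `efetch` endpoints and does not contact the server. Using E-utilities at all is an assumption (the exercise itself relies on the web interface), and the helper names and the placeholder identifiers are hypothetical.

```python
from urllib.parse import urlencode

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term, retmax=20):
    """URL that searches the protein database and returns matching record IDs."""
    params = {"db": "protein", "term": term, "retmax": retmax}
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"


def efetch_fasta_url(ids):
    """URL that retrieves the given protein records as FASTA text."""
    params = {"db": "protein", "id": ",".join(ids), "rettype": "fasta", "retmode": "text"}
    return f"{EUTILS}/efetch.fcgi?{urlencode(params)}"


# Restricting the search to one organism, as in steps 1 to 3 above.
term = '("glutamate dehydrogenase" OR GDH) AND "Homo sapiens"[Organism]'
print(esearch_url(term))

# Placeholder identifiers only; real ones would come from the esearch result.
print(efetch_fasta_url(["ID_1", "ID_2"]))
```

The `[Organism]` field tag plays the role of the "Organism" option in step 1, and `rettype=fasta` corresponds to choosing "Fasta" in the "Display" field.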

See screenshots

9. Open a new browser window and go to the Web interface "Multalin" (Florence CORPET) for multiple alignment:

http://prodes.toulouse.inra.fr/multalin/multalin.html

10. Paste the data and adjust the parameters

11. Start the software. The multiple alignment shows which sequences are the same, and therefore redundant. Note their accession numbers.
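Strictly identical sequences can also be spotted without an alignment, by grouping the FASTA records on their sequence string. The sketch below is a simplified stand-in and does not reproduce what Multalin does: it only detects exact duplicates, not near-identical sequences or fragments. The function names are hypothetical and the three toy records are invented for illustration.

```python
from collections import defaultdict


def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def group_identical(records):
    """Map each distinct sequence to the headers of all records that carry it."""
    groups = defaultdict(list)
    for header, sequence in records:
        groups[sequence].append(header)
    return dict(groups)


# Invented toy records; the names are placeholders, not real database entries.
example = """>entry_A toy record
MYRYLGEALL
LSRAGPAAL
>entry_B toy record
MYRYLGEALLLSRAGPAAL
>entry_C toy record
MYRYLAEALL
"""

for sequence, headers in group_identical(parse_fasta(example)).items():
    if len(headers) > 1:
        print("identical:", headers)
```

Here `entry_A` and `entry_B` carry the same sequence once line breaks are ignored, so they are reported together, while `entry_C` differs and stays on its own.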

What can you conclude?

See screenshots

Go back to the NCBI. Remove the false-positive hits (CRYL1) and the sequences shorter than 20 amino acids.

Remark: the nomenclature for a range of sequence lengths is, for example, 3000:4000[SLEN] (see the HELP of "Entrez")

Redo the selection for "Homo sapiens".

Make a new multiple alignment.

Do those 5 sequences correspond to 5 different proteins?
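As a small illustration of the length-range notation, the sketch below formats a `min:max[SLEN]` term and, separately, filters already downloaded records by length. The helper names are hypothetical, and writing the exclusion of short sequences as `NOT 1:19[SLEN]` is one possible phrasing assumed here, not something stated in the exercise.

```python
def slen_term(min_length, max_length):
    """Format a sequence-length range in the min:max[SLEN] notation."""
    if min_length > max_length:
        raise ValueError("min_length must not exceed max_length")
    return f"{min_length}:{max_length}[SLEN]"


def keep_long_enough(records, min_length=20):
    """Keep (header, sequence) pairs whose sequence has at least min_length residues."""
    return [(h, s) for h, s in records if len(s) >= min_length]


# The example range from the remark.
print(slen_term(3000, 4000))

# One possible way (assumed) to exclude sequences shorter than 20 residues.
print("NOT " + slen_term(1, 19))
```

The local filter `keep_long_enough` gives the same cut-off on a list of `(header, sequence)` pairs when the records have already been saved in FASTA format.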
