Choice of the best matrix and parameters for GDH sequence alignment
The alignments are made using ClustalW at EBI (Cambridge).
Reference: Thompson, J. D., Higgins, D. G., & Gibson, T. J. (1994) "CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice" Nucleic Acids Res. 22, 4673-4680
Example with the subset H :
GDH NOT EC classified Viridiplantae
length range [411 - 470]
Get the files (ZIP compressed) in FASTA format: Subset H
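For readers who want to repeat this kind of scan with a local ClustalW installation instead of the EBI web form, the following is a minimal sketch that only builds a command line. The executable name `clustalw2`, the input file name `subset_H.fasta` and the helper `clustalw_command` are assumptions made for illustration; the default values are the ones quoted on this page, and the "Endgap" setting of the web form is left out because its command-line equivalent is not given here.

```python
import shlex

# Matrix series accepted by ClustalW's -MATRIX option.
MATRIX_SERIES = {"BLOSUM", "PAM", "GONNET", "ID"}


def clustalw_command(infile, matrix="GONNET", gapopen=10, gapext=0.2,
                     gapdist=4, outfile=None, executable="clustalw2"):
    """Build (but do not run) a ClustalW multiple-alignment command.

    The executable name and file names are hypothetical; the default
    penalties mirror the values quoted on this page.
    """
    matrix = matrix.upper()
    if matrix not in MATRIX_SERIES:
        raise ValueError(f"unknown matrix series: {matrix}")
    cmd = [
        executable,
        f"-INFILE={infile}",
        "-ALIGN",
        "-TYPE=PROTEIN",
        f"-MATRIX={matrix}",
        f"-GAPOPEN={gapopen}",
        f"-GAPEXT={gapext}",
        f"-GAPDIST={gapdist}",
    ]
    if outfile is not None:
        cmd.append(f"-OUTFILE={outfile}")
    return cmd


if __name__ == "__main__":
    # One command per matrix series, with the gap opening penalty set to 1.
    for series in sorted(MATRIX_SERIES):
        print(shlex.join(clustalw_command("subset_H.fasta", series, gapopen=1)))
```

The commands can then be passed to `subprocess.run` once the executable and the input file actually exist on the machine.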
Matrix                                                PAM    BLOSUM  GONNET  Id
Alignment score (default values for all parameters)   25246  25366   25471   25487
Parameter                Value  Score PAM  Score BLOSUM  Score GONNET  Score Id
Gapopen (default = 10)   1      25818      26193         26243         26720
                         2      25783      25783         26044         26224
                         100    21890      22150         22636         22636
Endgap                   10     25246      25366         25471         25487
                         20     25246      25366         25471         25487
Gapext (default = 0.2)   0.05   25246      25366         25471         25487
                         0.5    25246      25366         25471         25487
                         5      25246      25366         25471         25471
                         10     25246      25366         25471         25471
Gapdist (default = 4)    10     25246      25366         25471         25487
                         5      25246      25366         25471         25487
                         1      25246      25366         25471         25487
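The parameter scan above can be summarized programmatically. The sketch below transcribes the scores from the table and reports, for each setting, the change relative to the all-default run, together with the highest-scoring combination. The names `SCAN`, `deltas`, `sensitive_settings` and `best_setting` are introduced here for illustration only.

```python
# Scores transcribed from the parameter scan above.
# Column order: PAM, BLOSUM, GONNET, Id.
MATRICES = ("PAM", "BLOSUM", "GONNET", "Id")
DEFAULT = (25246, 25366, 25471, 25487)  # all parameters at default values

SCAN = {
    ("Gapopen", "1"): (25818, 26193, 26243, 26720),
    ("Gapopen", "2"): (25783, 25783, 26044, 26224),
    ("Gapopen", "100"): (21890, 22150, 22636, 22636),
    ("Endgap", "10"): (25246, 25366, 25471, 25487),
    ("Endgap", "20"): (25246, 25366, 25471, 25487),
    ("Gapext", "0.05"): (25246, 25366, 25471, 25487),
    ("Gapext", "0.5"): (25246, 25366, 25471, 25487),
    ("Gapext", "5"): (25246, 25366, 25471, 25471),
    ("Gapext", "10"): (25246, 25366, 25471, 25471),
    ("Gapdist", "10"): (25246, 25366, 25471, 25487),
    ("Gapdist", "5"): (25246, 25366, 25471, 25487),
    ("Gapdist", "1"): (25246, 25366, 25471, 25487),
}


def deltas(scores):
    """Score change relative to the all-default run, per matrix."""
    return tuple(s - d for s, d in zip(scores, DEFAULT))


def sensitive_settings():
    """Settings whose score differs from the default for at least one matrix."""
    return {key: deltas(row) for key, row in SCAN.items() if any(deltas(row))}


def best_setting():
    """(score, matrix, parameter, value) with the highest score in the scan."""
    return max(
        (score, matrix, param, value)
        for (param, value), row in SCAN.items()
        for matrix, score in zip(MATRICES, row)
    )


if __name__ == "__main__":
    for (param, value), change in sensitive_settings().items():
        print(param, value, dict(zip(MATRICES, change)))
    print("best:", best_setting())
```

On these numbers, only the Gapopen rows and the two largest Gapext values (for the Id matrix) differ from the default scores, and the highest score in the scan is the Id matrix with Gapopen = 1.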
Matrix  Gapopen       Endgap       Gapext          Score
Gonnet  default = 10  default = ?  default = 0.05  33558
PAM     default       default      default         33207
Gonnet  1             default      default         35545 (Highest score)
PAM     1             default      default         35003
Gonnet  1             10           default         35545
Gonnet  1             20           default         35545
Gonnet  1             default      0.05            35545
Gonnet  1             default      0.5             35511
Gonnet  1             default      5               35076
Gonnet  default       default      5               33298
Gonnet  1             default      10              34737
Gonnet  1             default      1               35503 (Best alignment)
Finally, visual ("eye") inspection of the alignments favors the Gonnet matrix, although the highest score is obtained with the Id matrix.
1. Id matrix : Gapopen = 1 - Other default values : SCORE = 26720
gi|15240793|ref|NP_196361.1|  ILG-LDSKI----ERSLMI-PFREIKVECTIPKDDGTLVSYIGFRVQHDN 60
gi|15004984|dbj|BAB62170.1|   ILG-LDSKI----EKSLMI-PFREIKVECTIPKDDGTLVSYVGFRVQHDN 60
gi|28269441|gb|AAO37984.1|    LLG-LDSKL----EKSLLI-PFREIKVECTIPKDDGTLASYVGFRVQHDN 60
gi|7431768|pir||T16982        LLG-LDSKL----EQCLLI-PFREIKVECTIPKDDGSLATFIGFRVQHDN 60
gi|15054452|dbj|BAB62312.1|   -LAVLD--LPPAMEK-IVITPQREMTVELIINRDDGKPESFMGYRVQHDN 94
gi|15054450|dbj|BAB62311.1|   -LAVLD--LPPAMEK-IVITPQREMTVELIINRDDGKPESFMGYRVQHDN 94
                               *. **  :    *: ::* * **:.**  * :***.  :::*:******
2. GONNET matrix : Gapopen = 1 - Other default values : SCORE = 26243
gi|15240793|ref|NP_196361.1|  ILGLDSKIERSLMIPFREIKVECTIPKDDGTLVSYIGFRVQHDNARGPMK 66
gi|15004984|dbj|BAB62170.1|   ILGLDSKIEKSLMIPFREIKVECTIPKDDGTLVSYVGFRVQHDNARGPMK 66
gi|28269441|gb|AAO37984.1|    LLGLDSKLEKSLLIPFREIKVECTIPKDDGTLASYVGFRVQHDNARGPMK 66
gi|7431768|pir||T16982        LLGLDSKLEQCLLIPFREIKVECTIPKDDGSLATFIGFRVQHDNARGPMK 66
gi|15054452|dbj|BAB62312.1|   VLDLPPAMEKIVITPQREMTVELIINRDDGKPESFMGYRVQHDNARGPFK 100
gi|15054450|dbj|BAB62311.1|   VLDLPPAMEKIVITPQREMTVELIINRDDGKPESFMGYRVQHDNARGPFK 100
                              :*.* . :*: :: * **:.**  * :***.  :::*:**********:*
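One way to put a number on what the eye sees in these two excerpts is to count the gaps each matrix introduces. This is only a sketch: it covers just the 50-column blocks shown above, not the full alignments, and gap counting is an assumed stand-in for the visual criterion, which this page does not spell out. The helper name `gap_stats` is introduced here for illustration.

```python
# The two 50-column blocks shown above, in the same sequence order.
ID_BLOCK = [
    "ILG-LDSKI----ERSLMI-PFREIKVECTIPKDDGTLVSYIGFRVQHDN",
    "ILG-LDSKI----EKSLMI-PFREIKVECTIPKDDGTLVSYVGFRVQHDN",
    "LLG-LDSKL----EKSLLI-PFREIKVECTIPKDDGTLASYVGFRVQHDN",
    "LLG-LDSKL----EQCLLI-PFREIKVECTIPKDDGSLATFIGFRVQHDN",
    "-LAVLD--LPPAMEK-IVITPQREMTVELIINRDDGKPESFMGYRVQHDN",
    "-LAVLD--LPPAMEK-IVITPQREMTVELIINRDDGKPESFMGYRVQHDN",
]
GONNET_BLOCK = [
    "ILGLDSKIERSLMIPFREIKVECTIPKDDGTLVSYIGFRVQHDNARGPMK",
    "ILGLDSKIEKSLMIPFREIKVECTIPKDDGTLVSYVGFRVQHDNARGPMK",
    "LLGLDSKLEKSLLIPFREIKVECTIPKDDGTLASYVGFRVQHDNARGPMK",
    "LLGLDSKLEQCLLIPFREIKVECTIPKDDGSLATFIGFRVQHDNARGPMK",
    "VLDLPPAMEKIVITPQREMTVELIINRDDGKPESFMGYRVQHDNARGPFK",
    "VLDLPPAMEKIVITPQREMTVELIINRDDGKPESFMGYRVQHDNARGPFK",
]


def gap_stats(block):
    """Return (total gap characters, number of columns containing a gap)."""
    if len({len(row) for row in block}) != 1:
        raise ValueError("all rows of an alignment block must have equal length")
    total_gaps = sum(row.count("-") for row in block)
    gapped_columns = sum("-" in column for column in zip(*block))
    return total_gaps, gapped_columns


if __name__ == "__main__":
    print("Id     :", gap_stats(ID_BLOCK))
    print("GONNET :", gap_stats(GONNET_BLOCK))
```

In this block the Id alignment carries 32 gap characters spread over 10 columns, while the Gonnet alignment has none.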