HAL Id: hal-00893828
https://hal.archives-ouvertes.fr/hal-00893828
Submitted on 1 Jan 1990
HAL is a multi-disciplinary open access
archive for the deposit and dissemination of
sci-entific research documents, whether they are
pub-lished or not. The documents may come from
teaching and research institutions in France or
abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire HAL, est
destinée au dépôt et à la diffusion de documents
scientifiques de niveau recherche, publiés ou non,
émanant des établissements d’enseignement et de
recherche français ou étrangers, des laboratoires
publics ou privés.
Molecular cloning and partial sequence characterization
of the duplicated amylase genes from Drosophila erecta
L Bally-Cuif, V Payant, S Abukashawa, Bf Benkel, Da Hickey
To cite this version:
Original
article
Molecular
cloning
and
partial
sequence
characterization of the
duplicated
amylase
genes from
Drosophila
erecta
L
Bally-Cuif,
V
Payant,
S Abukashawa BF Benkel DA
Hickey
Department of Biology University of Ottawa, Ottawa, Ontario KlN 6N5, Canada
(Received
16 May 1989; accepted 15 November1989)
Summary - Previous studies have shown that the a-amylase gene is duplicated in each of the 8 species
comprising
the Drosophila melanogaster speciessubgroup.
In this study,a genomic
library
inphage
lambda was prepared from the DNA of 1 species member ofthis subgroup, D erecta. This gene library was screened with an amylase-specific probe
from the closely-related species D melanogaster. One of the cross-hybridizing phage clones
was isolated, insert
fragments
were sub-cloned into plasmid vectors, and theidentity
ofthe cloned locus was verified by DNA sequencing. The results confirm that this species
contains 2 very similar copies of the
amylase-coding
sequence, as does D medanogaster.Within the amylase-coding region, there is a 3% sequence divergence between the 2 species;
this value rises to 6% if we consider only silent sites within the
coding
region. Sequencedivergence
between D erecta and D melanogaster in the non-codingintergenic
region isapproximately 15%.
These results constitute the first molecular sequence data available for protein-coding
genes of D erecta. The data set allows us to calculate an estimated divergence time of more than 10 million years between the 2 species. This is consistent with the results of
previous
phylogenetic
studies.Drosophila erecta / a-amylase / molecular evolution / gene conversion
Résumé - Clonage et caractérisation d’une séquence partielle des gènes dupliqués de l’amylase chez Drosophila erecta - Des études antérieures ont montré que le
gène
a-amylase était dupliqué chez chacune des 8 espèces du sous-groupe
Drosophila melanogaster.
Dans la présente étude, une banque génomique a été préparée dans le phage lambda à partir de l’une des espèces membres du sous-groupe; D erecta. Cette banque a été criblée avec une sonde, provenant de l’espèce voisine D melanogaster, et contenant les
séquences codant pour
l’amylase.
Un des cloneshybridant
à la sonde a été isolé, etses fragments de restriction sous-clonés dans des plasmides. La nature des gènes a été
vérifiée par séquençage. Les résultats obtenus confirment que D erecta contient, comme
D
melanogaster,
2 copies très semblables de la région codante du gène a-amylase. Danscette
région,
la divergence de séquence entre les 2 espèces est de3%,
et atteint 6% si l’onne considère que les positions silencieuses des codons. Dans la région intergénique
(non
codante),
la divergence est d’environ 15%. Ces résultats constituent les premières données de séquence pour des gènes codants pour une protéine chez D erecta. Ils nous permettent*
d’estimer un temps de divergence de plus de 10 millions d’années entre les L espèces, ce
qui est compatible avec les résultats des études phylogénétiques antérieures.
Drosophila erecta / !Y-amylase / évolution moléculaire / conversion génique
INTRODUCTION
Drosophila a-amylase
genes havealready
beenextensively
studied(Levy
etal, 1985;
Gemmil et
al,
1986;
Doane etal,
1987;
Langley
etal,
1988;
Hickey
etal,
1989b),
andthey
haveproved
to beparticular
interest,
mainly
because of their numerous encodedallozymes
(Singh
etal, 1982;
Dainou etal,
1987),
theircomplex
patternof
regulation, including glucose repressibility
(Benkel
andHickey,
1987),
and theirduplicated
structure. Within naturalpopulations
of Dmedanogaster,
the gene isduplicated
(Levy
etal, 1985;
Gemmill etal,
1986),
with the 2copies,
each 1.4 kbin
length, being separated by approximately
4.8 kb of sequence.They
are present inopposite
orientation(Levy
etal,
1985;
Benkel etal,
1987)
and aredivergently
transcribed(Boer
andHickey,
1986).
Asimplified
map of this locus ispresented
inFig 1.
The number and pattern of
electrophoretic
variants within eachspecies
andin
interspecific hybrids
(Dainou
etal,
1987),
coupled
with the recentanalyses
of restriction maps
(Payant
etal,
1988),
have shown that theduplication
is notrestricted to D
melanogaster,
but is also present in the 7 otherspecies
of themelanogaster subgroup.
Thea-amylase
gene thus appears as a smallmultigene
family, containing only
2 members. Because of the presence of thisduplication
in all thespecies
of thesubgroup,
it can be considered as anevolutionarily
ancient trait.The 2
copies
of thea-amylase
gene from a strain of Dmelanogaster
havealready
beenpartly sequenced
(Boer
andHickey,
1986),
and the 2coding
regions
showed ahigh degree
of sequencesimilarity.
This,
in addition to the closelinkage
of the2
copies,
raised thepossibility
of sequence recombinations between the 2copies
of the gene, even
though
theiropposite
orientation is considered as a more stable arrangement than tandemduplication.
The construction of a restriction map for thea-amylase
locus in each of the 8species
of thesubgroup
(Payant
etal,
1988)
proved
gene in both D erecta and D teissieri
(Fig 1).
If the mutationeliminating
this BamHI site
proved
to be the same in the 2copies
of eachspecies,
this would support thehypothesis
of recombination eventsoccuring
between the 2copies
of the gene within a locus.We cloned both
copies
of theamylase
gene from D erecta andsequenced
part of thecoding region
(containing
the Bam HIlocation)
for each gene copy. Wethen
performed
sequencecomparisons
both,
within thea-amylase
locus in theD erecta
strain,
and between the twospecies,
D erecta and Dmelanogaster.
This allowed us to assess sequencesimilarity
between the 2 genecopies
in D erectaand, also,
toquantify
thedegree
of sequence ofdivergence
between D erecta andD
melanogaster.
These sequence data are discussed in relation to the results obtained inprevious
phylogenetic
studies.MATERIAL AND METHODS
The D erecta strain
220S,
collected in 1980 in theIvory Coast,
WestAfrica,
andprovided by
Dr JDavid,
was used in the presentstudy.
Twelve flies of that strain weresampled
and tested for theira-amylase allozymic
patternby polyacrylamide
gel electrophoresis. Having
verified theidentity
of thestrain,
genomic
DNA was isolated from thedescendants
of these fliesusing
aprocedure
described in detail elsewhere(Payant
etal,
1988).
Afterdigestion
of the DNA with Sau3A,
agenomic
library
was constructed in lambdaphage
EMBL3,
according
to the manufacturer’s instructions(Stratagene).
Plaque
lifts wereprepared
as describedby
Benton and Davis(1977),
using nylon
membranes(Biotrans;
ICNBiomedicals).
The membraneswere
hybridized
to thepOR-M7
(coding region)
Dmelanogaster amylase
cDNA32
p-labelled
probe
(Benkel
etal,
1987);
thisprobe
is indicatedby
the bar labelledpm7
inFig
1. The filters were then washed under conditions whereonly
the recombinantphages,
with at least 80% sequencesimilarity
to the aamylase
gene, would remain bound to theprobe
(for
detailedprocedure,
seePayant
etal,
1988).
After elimination of theradioactivity,
the sameprocedure
was used tohybridize
the membrane to the CS1.4a D
melanogaster amylase
32
p-labelled
probe
(Hickey
et
al,
1988;
and seeFig
1).
Five cloneshybridizing
to bothprobes
were identified.Their DNA was isolated and
digested
with Sal 1. A southern blot(Southern,
1975)
of thedigested
DNAs wasmade,
andprobed
with thepOR-M7
and CS.4aprobes,
as described above. A 6.4 kb Sal1 fragment hybridized
to the 2probes.
Based on itslength
andhybridization
pattern, it was concluded that thisfragment
contained the5-prime region
of bothcopies
of thea-amylase
gene. This 6.4 kbfragment (see
Fig
2)
was thenisolated, purified,
and subcloned into the Sal site of thepUC-based
vector,pIBI
21. The orientation of the subclones was determinedby
an Eco Rldigestion.
Each subclone was thendigested
with Sal 1 and HindIII,
and the Sal1/Hind
IIIfragments
were further subcloned andinserted,
in bothorientations,
intopIBI
24 andpIBI
25plasmid
vectors. Theresulting
subclones arepresented
in
Fig
2(inserts
of 4 kb and 2.4kb,
respectively,
for theproximal
and the distalgenes).
These subclones were used forsequencing.
se-quences were assembled and
analysed using
a sonicdigitizer
and theMicrogenie
(Beckman)
computer programs(Queen
andKorn,
1984).
RESULTS AND DISCUSSION
For both
amylase
genecopies
of D erecta, 500bp
ofcoding region
weresequenced,
along
with a shortportion
(200
bp)
of thenon-coding intergenic region.
Theseregions
are indicated inFig
2,
and thesequencing
results arepresented
inFig
3.Sequence similarity
between the two D erecta genecopies
We found that the 2 gene
copies
in D erecta were 100% similar for theregion
studied. Forinstance,
the mutation of the Bam HI site in D erecta is due to an identical nucleotide substitution in the 2 genecopies.
Several other mismatchesseparated
Dmedanogaster
and D erecta at the sequencelevel,
but all of thesepolymorphisms
were sharedby
both genecopies
of D erecta.Thus,
for thatregion,
all the nucleotide substitutions that have occured since the D erecta and Dmedarcogaster species
divergence,
have beenincorporated
into both genecopies
of D erecta. Out of the 15 5nucleotide substitutions that
separated
Dmelanogaster
and D erecta in thatregion,
9 affected the 3rd base of a
codon,
and did result in any amino-acid substitutionbetween the derived
proteins.
It is worthnoting
that even the silent substitutions were identical in bothcopies
of the D erectacoding
sequence.Thus,
thenear-identity
of the two D erectacoding regions
cannot beexplained solely
as result ofstrong selection
acting
in favour of the occurence of certain new amino acids in theamylase
of D erecta(compared
to that of Dmelanogaster),
andthus,
favouring
theincorporation
of thecorresponding
new nucleotides into bothcopies
of the gene.There are several
possible explanations
for the conservation of sequencesimilarity
between the members of a genefamily.
In this case, neither very recentduplication,
norretrotransposition
areplausible explanations,
since(i),
as was mentionedbefore,
theduplication
has been shown to be very ancient(Dainou
etal,
1987;
Payant
etal,
1988),
and(ii)
acomparison
of the upstreamregions
of the 2 genescopies
in Dmelanogaster
(Boer
andHickey,
1986)
showed that the sequenceidentity
exdented further than thecoding regions. Thus,
the 2a-amylase
genecopies
within theseSequence
divergence
between D erecta and Dmelanogaster
An
alignment
of the D erectaamylase
sequences with theproximal amylase
of the Dmelanogaster
strainOregon
R ispresented
inFig
3. The time ofdivergence
between the 2species,
D erecta and Dmelanogaster,
can be estimated from these data. The short segment from theintergenic region
(see
Fig
2 andFig
3A)
showsapproximately
15%divergence
between the 2species.
This sequence is notsubject
to the constraints of natural selectionacting
on itscoding capacity; therefore,
it serves as a usefulqualitative
indicator of the considerable amount ofdivergence
between the 2species
at the molecular level.Although
other members of thespecies
subgroup
have beensubjected
to molecularanalysis
(eg,
Bodmer andAshburner,
1984),
noprevious
molecular studies have been made on D erecta.’
As one
might
expect, thecoding
region
showedconsiderably
lessinterspecies
divergence
than did thenon-coding intergenic region.
The 500bp
ofcoding
sequencepresented
here(see
Fig
3B),
shows 3.3% nucleotidedivergence
between the 2species;
to
give
astatistically
validapproximation
of the percentage of nucleotidedivergence
through
the wholecoding region.
Many analyses
havealready
been carried out to estimate thedivergence
timebetween the 8
species
of themedanogaster subgroup. Mainly
based onallozymes
and the 2-dimensionalgel
electrophoresis
(Cariou, 1987),
mitochondrial DNA(Solignac
et
at,
1986),
ribosomal DNA and histone genefamily organization
(Coen
etat,
1982),
DNA sequence data
(Bodmer
andAshburner,
1984)
andpolytene
chromosomesbanding
patterns(see
Lemeunier etat,
1986),
the Dmelanogaster subgroup
has beensplit
into 3complexes:
the D erecta - D orenacomplex,
the Dyakuba -
D Teissiericomplex,
and the Dmelanogaster complex
(including
Dmelanogaster,
Dsimulans,
D mau7itiana and Dsechedlia) (see
Lachaise etat,
1988;
Lemeunier etat,
1986,
for areview).
Moreover, mainly
because of its members’amylase
allozyme
patterns, mitochondrial DNA patterns and its
inability
to interbreed withspecies
from the othercomplexes,
the D erecta - D orenacomplex
appears to be farthest apart from the 2others,
and thedivergence
time has been estimated to be ashigh
as 15 million years.Since some of our data concern the
coding region
of the gene,they
can be used to confirm the above estimatesof
species
divergence
times. Asexpected,
the level ofdivergence
found between D erecta and Dmelanogaster
(3%)
issignificantly
higher
than what was found between thea-amylase
genes of different strains of Dmelanogaster
(approximately 1%),
indicating
agreater
divergence
between the2
species
than within the latter(Boer
andHickey,
1986;
Langley
etat,
1988;
and ourunpublished
data).
The percentagedivergence
between the 2species
at silentsites
(6.1%)
can be used to estimate time since thedivergence
of D erecta and Dmelanogaster.
If we use the rate of 5.6 x10-
9
changes
persite,
per year,proposed
by Hayashida
andMiyata
(1983)
for silent sites within exons, this percentagegives
a
divergence
time ofapproximately
11 million years; the order of which is consistentwith the
previous
estimates ofdivergence
time. It should benoted,
however,
thatthis may be an underestimate of the real
divergence
time. Forinstance,
as notedabove,
gene conversion can eliminate allsubstitutions,
including
those at silentsize,
between genecopies
within aspecies,
and this would result in a gross underestimates of the age of theduplication
if we relied on sequencedivergence
at silent sites.Although
gene conversion cannot, of course, occur between sequences which areseparated
in differentlineages,
otherfactors,
such as biased GC content(see
Hickey
etat,
1989b),
could reduce the observed rate ofdivergence
at silentsites;
this wouldresult in a lowered estimate of the
divergence
time betweenspecies.
ACKNOWLEDGMENTS
This work was
supported
by
anOperating
Grant from NSERC Canada to DAHickey
and a PostdoctoralFellowship
fromCNRS, France,
to VPayant.
We areREFERENCES
Benkel
BF,
AbukashawaS,
BoerPH,
Hickey DA,
(1987)
Molecularcloning
of DNAcomplementary
toDrosophila melanogaster a-amylase
mRNA. Genome29,
510-515Benkel
BF,
Hickey
DA(1987)
ADrosophila
gene issubject
toglucose
repression.
Proc Natl Acad Sci USA
84,
1337-1339Benton
WD,
David RW(1977)
Screening
lambdagt
recombinant clonesby
hy-bridization tosingle
plaques
in satu. Science190,
180-182Bodmer
M,
Ashburner M(1984)
Conservation andchange
in theDNA
sequencescoding
for alcoholdehydrogenase
insibling
species
ofDrosophila.
Nature309,
425-430Boer
PH,
Hickey
DA(1986)
Thea-amylase
gene inDrosophila
melanogaster:
nucleotide sequence, gene structure and
expression
motifs. Nucl Acids Res14,
8399-8411
Cariou ML
(1987)
Biochemicalphylogeny
of theeight
species
in theDrosophila
melanogaster subgroup, including
D sechellza and D orena. Genet Res50,
181-186 CoenE,
StrachanT,
Dover G(1982)
Dynamics
of concerted evolutionspecies
subgroup
ofDrosophila.
J MolBiol 158,
17-35Dainou
0,
CariouML,
DavidJR, Hickey
DA(1987)
Amylase
geneduplication:
an ancestral trait in theDrosophila melanogaster
species
subgroup.
Heredity
59,
245-251Doane
WW,
GemmillRM,
SchwartzPE, Hawley SA,
Norman RA(1987)
Structuralorganization
of thea-amylase
gene locus inDrosophila melanogaster
andDrosophila
miranda. In:Isozymes:
CurrentTopics
inBiological
and Medical Research. Alan RLiss,
NewYork,
Vol14,
229-266Gemmill
RM,
ShwartzPE,
Doane VVW(1986)
Structuralorganization
of theAmy
locus in seven strains ofDrosophila melanogaster.
Nucd Acids Res14,
5337-5352Hayashida H, Nliyata
T(1983)
Unusualevolutionary
conservation andfrequent
DNA segmentexchange
in class I genes of themajor
histocompatibility complex.
Proc Natl Acad Sci USA
80,
2671-2675Hickey DA,
BenkelBF,
AbukashawaS,
Haus S(1988)
DNArearrangement
causesmultiple changes
in geneexpression
at theamylase
locus inDrosophila
melanogaster.
Biochem Genet26,
757-768Hickey
DA, Bally-Cuif
L,
AbukashawaS,
Benkel BF(1990)
Concerted evolution ofduplicated protein-coding
genes inDrosophila.
Submitted forpublication
Hickey DA,
BenkelBF,
Magoulas
C(1989b)
Molecularbiology
of enzymeadapta-tions in
higher
eukaryotes.
Genome31,
272-283Lachaise
D,
CariouML,
DavisJR,
LemeunierF,
TsacasL,
Ashburner M(1988)
Historical
biogeography
of the Drraelanogaster
species
subgroup.
Evol Biol22,
159-225Langley CH,
Shrimpton
AE,
YamazakiT, Miyashita
N,
MatsuoY,
Aquadro
CF(1988)
Naturally
occurring
variations in the restriction map of theAmy region
ofLemeunier
F,
DavidJR,
TsacasL,
Ashburner M(1986)
Themelanogaster
species
subgroup.
In: The Genetics andBiology of Drosophila
(Ashburner
M,
Thompson
JR,
CarsonHL,
eds)
AcademicPress, London,
Vol 3, 147-256Levy JN,
GemmillRM,
Doane WW(1985)
Molecularcloning
ofa-amylase
genes from Dmelanogaster.
II. Cloneorganization
and verification. Genetics110,
314-324Payant
V,
AbukashawaS,
SassevilleM,
BenkelBF,
Hickey DA,
David J(1988)
Evolutionary
conservation of the chromosomalconfiguration
andregulation
ofamylase
genes amongeight
species
of theDrosophila melanogaster species
subgroup.
Mol Biol
Evol 5,
560-567Queen C,
Korn LJ(1984)
Acomprehensive
sequenceanalysis
program for the IBMpersonal
computer. Nucl Acids Res12,
581-599Sanger
F,
NicklenS,
Coulson AR(1977)
DNAsequencing
withchain-terminating
inhibitors. Proc Natl Acad Sci USA
74,
5463-5467Singh
RS, Hickey DA,
David J(1982)
Genetic differentiation betweengeographi-cally
distantpopulations
of DMelanogaster.
Geneticslol,
235-256Solignac
M,
MonnerotM,
Mounolou JC(1986)
Mitochondrial DNA evolution inthe
melanogaster
species
subgroup
ofDrosophila.
J Mol Evol29,
31-40Southern EM