
Formation and stability of secondary structures in globular proteins


HAL Id: jpa-00247827

https://hal.archives-ouvertes.fr/jpa-00247827

Submitted on 1 Jan 1993



J. Bascle, T. Garel, Henri Orland

To cite this version:

J. Bascle, T. Garel, Henri Orland. Formation and stability of secondary structures in globular proteins. Journal de Physique II, EDP Sciences, 1993, 3 (2), pp.245-253. ⟨10.1051/jp2:1993126⟩. ⟨jpa-00247827⟩


Classification Physics Abstracts

87.15B 36.20 05.90 61.40D 87.10


J. Bascle, T. Garel and H. Orland (*)

Service de Physique Théorique (**), CE-Saclay, 91191 Gif-sur-Yvette Cedex, France

(Received 10 August 1992, accepted 27 October 1992)

Résumé. We study two models for the formation and stacking of helices or sheets in the globular (compact) phase of proteins. These models, based on weighted Hamiltonian paths on a lattice, have a first-order phase transition between (i) a compact high-temperature phase with non-extended secondary structures and (ii) a quasi-frozen compact phase in which the secondary structures invade the whole lattice. The quasi-frozen phase, whose temperature dependence is very weak, is identified with the native phase of proteins; the high-temperature phase may be related to the "molten globule" phase of proteins.

Abstract. We study two models for the formation and packing of helices and sheets in globular (compact) proteins. These models, based on weighted Hamiltonian paths on a regular lattice, both exhibit a first-order transition between a compact high-temperature phase, with no extended secondary structures, and a quasi-frozen compact phase, with secondary structures invading the whole lattice. The quasi-frozen phase, with its very weak temperature dependence, is identified as the native phase of proteins, whereas the high-temperature phase may be relevant to the so-called molten globule state of proteins.

1 Introduction.

Proteins are weakly branched polymers, built out of twenty species of monomers (amino acids); they have the property of folding into an (almost) unique compact native structure. This native structure is of great interest, since it is intimately related to the biological function of the protein [1]. The generic formula of amino acids (except for proline) is NH2-CHR-COOH, where the variable part R is called the residue. Polycondensation of $N_0$ amino acids leads to the formation of proteins, whose typical size is in the range $N_0 \simeq 60$-$1000$ (e.g. sperm-whale myoglobin has $N_0 = 153$).

(*) Also at Groupe de Physique Théorique Statistique, Université de Cergy-Pontoise, 95806 Cergy-Pontoise Cedex, France.

(**) Laboratoire de la Direction des Sciences de la Matière du Commissariat à l'Energie Atomique.

Several levels of complexity may be defined in the folding problem. One may for instance consider the relevant interactions in proteins: at a microscopic level, one only deals with Coulomb interactions. On a longer scale, these microscopic interactions give rise to covalent bonding, effective Coulomb interactions (with partial charges or screening), hydrogen bonding, Van der Waals interactions and hydrophobic effects with the solvent. It is clear that all these interactions are interdependent. The complexity of the interactions is also reflected in the multi-level structure of the folded proteins. Roughly speaking, the primary structure (i.e. the chemical sequence of the protein) is due to covalent bonding, whereas, as shown by Pauling [2], hydrogen bonding is responsible for the existence of secondary structures, i.e. α-helices or β-sheets (see Fig. 1). Finally, the tertiary structure is mostly driven by the hydrophobic effect: the native protein is compact so that an apolar residue may hide away from the solvent.

Fig. 1. Schematic representation of hydrogen bonds in (a) an α-helix and (b) a β-sheet.

A physical approach to the study of low-energy compact structures can be stated in the following way: the number of hydrogen bonds (H-bonds) with the solvent is proportional to the surface of the protein. This surface being constant, one is left with a balance between intraglobular H-bonds and the requirement of globular compactness. Fully saturated H-bonds imply a one-dimensional (α-helix) or two-dimensional (β-sheet) local structure, both being incompatible with a compact three-dimensional structure. This situation therefore bears some resemblance to the one encountered in glasses, where local order (e.g. five-fold symmetry) is not compatible with the global space-filling constraint [3]. Indeed, one of the most striking features of the native structure of proteins is its almost frozen character. On the experimental side, the folding (or denaturation) transition seems, in some cases, to proceed in two steps [4], the intermediate phase being the so-called molten globule.

In this paper, we study the statistical mechanics of (bulk) H-bonds in the compact phase. Two models, describing respectively the formation of α-helices and β-sheets, are introduced. Both models are formulated in terms of weighted Hamiltonian paths; in a mean-field approach, we get a first-order transition between a high-temperature compact liquid phase (which we interpret as the molten globule) and a low-temperature crystal phase (which we interpret as the native state). Note that the crystal phase has (almost) fully saturated H-bonds and is thus (almost) frozen.

2 Physical models.

Since we are only concerned with the globular state of the protein, we will consider compact chains, represented as Hamiltonian paths on a lattice [5]. This representation, which is usual in the modelling of collapsed polymer chains, can be summarized as follows: consider a hypercubic lattice in a d-dimensional space, with $N = L^d$ sites, and periodic boundary conditions. A Hamiltonian path on the lattice is a walk which goes through all sites once and only once. Such a walk satisfies both the compactness and self-avoidance requirements. Moreover, we will restrict ourselves to closed paths, since boundary conditions play a role only in subdominant terms [5].

We will now introduce two different models, to mimic the formation of α-helices or β-sheets in such compact structures.

2.1 α-helices. In this model, each link of the Hamiltonian path represents a helical turn. We recall that, in real proteins, a helical turn corresponds to 3.6 amino acids on average, and that H-bonds stabilize extended helical structures. Thus, in our model, we attribute an energy cost $\epsilon$ to the breaking of a "helix", that is, whenever the Hamiltonian path makes a turn (corner). The partition function of the system at inverse temperature $\beta = 1/k_B T$ reads:

$$Z = \sum_{\{\mathcal{H}\}} e^{-\beta\epsilon N_c(\mathcal{H})} \tag{1}$$

where $\{\mathcal{H}\}$ denotes the ensemble of all Hamiltonian paths and $N_c(\mathcal{H})$ denotes the number of corners present in path $\mathcal{H}$. In figure 2, we show an example of a Hamiltonian path.

Fig. 2. A Hamiltonian path in d = 2 with $N_c = 14$ corners.
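To make the definition of Z in (1) concrete, the following sketch enumerates all closed Hamiltonian paths on a small square lattice by brute force and sums the Boltzmann weights. It is only an illustration: it uses free rather than periodic boundary conditions, works in d = 2, and the function names (`hamiltonian_cycles`, `count_corners`, `partition_function`) are ours, not the paper's.

```python
import math
from itertools import product


def hamiltonian_cycles(L):
    """Enumerate closed Hamiltonian paths on an L x L square grid (free boundaries).

    Each cycle is returned once, as a list of sites starting at (0, 0).
    """
    sites = set(product(range(L), repeat=2))
    start = (0, 0)
    cycles = []

    def neighbours(site):
        x, y = site
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (x + dx, y + dy)
            if nxt in sites:
                yield nxt

    def extend(path, visited):
        if len(path) == len(sites):
            # The walk must close on the start site; keep one orientation per cycle.
            if start in neighbours(path[-1]) and path[1] < path[-1]:
                cycles.append(list(path))
            return
        for nxt in neighbours(path[-1]):
            if nxt not in visited:
                visited.add(nxt)
                path.append(nxt)
                extend(path, visited)
                path.pop()
                visited.remove(nxt)

    extend([start], {start})
    return cycles


def count_corners(cycle):
    """Number of sites where the closed path changes direction (N_c)."""
    n = len(cycle)
    corners = 0
    for i in range(n):
        prev, here, nxt = cycle[i - 1], cycle[i], cycle[(i + 1) % n]
        step_in = (here[0] - prev[0], here[1] - prev[1])
        step_out = (nxt[0] - here[0], nxt[1] - here[1])
        if step_in != step_out:
            corners += 1
    return corners


def partition_function(L, beta_eps):
    """Z = sum over closed Hamiltonian paths of exp(-beta * epsilon * N_c)."""
    return sum(math.exp(-beta_eps * count_corners(c)) for c in hamiltonian_cycles(L))


if __name__ == "__main__":
    cycles = hamiltonian_cycles(4)
    print(len(cycles), sorted(count_corners(c) for c in cycles))
    print(partition_function(4, 1.0))
```

Brute-force enumeration is only practical for very small lattices, which is why the text turns to a field-theoretic representation and a mean-field evaluation.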

This model has been studied in the context of the melting of semi-flexible polymer chains [6, 7].

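In order to calculate Z, we follow references [7, 8]. We introduce on each site $r$ and for each direction $\alpha = 1, \ldots, d$ an n-component real field $\varphi_\alpha(r)$. The partition function Z can be rewritten:

$$Z = \lim_{n\to 0}\frac{1}{n}\int \prod_r \prod_{\alpha=1}^{d} d\varphi_\alpha(r)\; e^{-A_G} \prod_r \left( \frac{1}{2}\sum_{\alpha=1}^{d} \varphi_\alpha^2(r) + e^{-\beta\epsilon} \sum_{1\le\alpha<\gamma\le d} \varphi_\alpha(r)\cdot\varphi_\gamma(r) \right) \tag{2a}$$

with

$$A_G = \frac{1}{2}\sum_{\alpha=1}^{d}\sum_{r,r'} \varphi_\alpha(r)\,(\Delta_\alpha^{-1})_{rr'}\,\varphi_\alpha(r') \tag{2b}$$

where the operator $(\Delta_\alpha)_{rr'}$ is 1 if $r$ and $r'$ are nearest neighbours in direction $\alpha$, and 0 otherwise.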

In order to prove the equivalence of (2) and (1), we will use Wick's theorem: we define, from (2b), the elementary contraction:

$$\langle \varphi_\alpha^{u}(r)\,\varphi_\gamma^{v}(r') \rangle = \delta_{uv}\,\delta_{\alpha\gamma}\,(\Delta_\alpha)_{rr'} \tag{3}$$

where $u$ and $v$ are component indices of $\varphi_\alpha(r)$, running from 1 to n. Expanding the product over $r$ in (2a), we must choose at each site $r$ either a term of the form $\frac{1}{2}\varphi_\alpha^2(r)$ or one of the form $e^{-\beta\epsilon}\,\varphi_\alpha(r)\cdot\varphi_\gamma(r)$, corresponding respectively to a path going straight through $r$ in direction $\alpha$, or to a path making a turn at $r$, from direction $\alpha$ to direction $\gamma$. By contracting the fields according to (3), we construct a sum over all self-avoiding compact closed paths with appropriate weights, with an additional factor n (due to the summation over component index $u$) for each closed loop. As usual [9], we extract the single-chain contribution by taking the limit $n \to 0$. This concludes the proof.

From this proof, it is clear that vacancies (empty sites) can easily be included in the model, by adding to the terms in brackets in (2a) a term of the form $e^{-\beta\mu}$, where $\mu$ is the chemical potential for the vacancies. In that case, equation (4) (see below) is slightly modified, giving rise to the possibility of phase separation between vacancy-rich and vacancy-poor (globular) phases.

In order to evaluate (2a), we will resort to a mean-field theory, that is, a saddle-point method. The ground state of the model consists of straight paths, making turns on the surface of the lattice (their free energy being non-extensive, of order $L^{d-1}$). A correct mean-field treatment should, a priori, take this one-dimensional character into account, but we have shown elsewhere [7] that excellent results are obtained by using a straightforward isotropic homogeneous mean field. We have also shown in [7] that fluctuations around this isotropic mean field should not be included, since they spoil the quality of the results.

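The mean-field equations (saddle point on (2a)) read:

$$\sum_{r'} (\Delta_\alpha^{-1})_{rr'}\,\varphi_\alpha(r') = \frac{\varphi_\alpha(r) + e^{-\beta\epsilon}\sum_{\gamma(\ne\alpha)} \varphi_\gamma(r)}{\frac{1}{2}\sum_{\gamma} \varphi_\gamma^2(r) + e^{-\beta\epsilon}\sum_{\gamma<\gamma'} \varphi_\gamma(r)\cdot\varphi_{\gamma'}(r)} \tag{4}$$

The isotropic homogeneous mean field assumes $\varphi_\alpha(r) = \varphi$ for any $\alpha = 1, \ldots, d$ and $r$. We further break the O(n) symmetry by choosing $\varphi$ in a given direction, say 1. From (4), we obtain:

$$\varphi^2 = \frac{4}{d} \tag{5a}$$

At this mean-field level, the free energy per site reads:

$$f = -k_B T \log\left(\frac{q(\beta)}{e}\right) \tag{5b}$$

with

$$q(\beta) = 2 + 2(d-1)\,e^{-\beta\epsilon} \tag{5c}$$

and e = 2.71828... Note that $q(\beta)$ plays the role of an effective coordination number (see Ref. [8]).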
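The step from the saddle-point condition to (5a)-(5c) can be made explicit. The short derivation below is a sketch under two assumptions: that the site factor in (2a) is $\frac{1}{2}\sum_\alpha \varphi_\alpha^2 + e^{-\beta\epsilon}\sum_{\alpha<\gamma}\varphi_\alpha\cdot\varphi_\gamma$ with the Gaussian weight (2b), and that each site has two nearest neighbours in a given direction, so that $\Delta_\alpha$ acting on a uniform field gives a factor 2.

```latex
% Homogeneous isotropic ansatz: \varphi_\alpha(r) = \varphi for all \alpha and r.
\begin{align*}
\frac{A_G}{N} &= \frac{1}{2}\, d\, \frac{\varphi^2}{2} = \frac{d\,\varphi^2}{4}, \\
\frac{1}{2}\sum_{\alpha}\varphi^2 + e^{-\beta\epsilon}\sum_{\alpha<\gamma}\varphi^2
  &= \frac{\varphi^2}{2}\bigl(d + d(d-1)\,e^{-\beta\epsilon}\bigr)
   = \frac{d\,\varphi^2}{4}\, q(\beta),
  \qquad q(\beta) = 2 + 2(d-1)\,e^{-\beta\epsilon}, \\
-\beta f &= \log\!\left[\frac{d\,\varphi^2}{4}\, q(\beta)\right] - \frac{d\,\varphi^2}{4}
  \quad\text{(stationary in } \varphi^2\text{)}, \\
\frac{1}{\varphi^2} - \frac{d}{4} = 0 \;&\Rightarrow\; \varphi^2 = \frac{4}{d},
  \qquad -\beta f = \log q(\beta) - 1 = \log\frac{q(\beta)}{e}.
\end{align*}
```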

On the other hand, the free energy is a decreasing function of temperature (since the entropy is positive), and as mentioned above, the ground state free energy per site vanishes (it is of order 1/L). Thus, the free energy per site remains negative (or zero) at all temperatures. There is a temperature $T_F$ for which the effective coordination number $q(\beta)$ is equal to e, and $f$ vanishes. For d = 3, $k_B T_F = 0.58\,\epsilon$. Below this temperature, the free energy remains equal to zero (Fig. 3).

Fig. 3. Plot of the free energy per site (Eq. (5b)) versus temperature, for d = 3. t is the reduced temperature: $t = k_B T/\epsilon$.
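As a numerical check of the freezing temperature, the sketch below evaluates (5b) and (5c) in reduced units ($k_B = 1$, $\epsilon = 1$) and solves $q(\beta_F) = e$ in closed form. The function names are illustrative and do not come from the paper.

```python
import math


def q_eff(t, d=3):
    """Effective coordination number q = 2 + 2(d-1) exp(-1/t), with t = k_B T / epsilon."""
    return 2.0 + 2.0 * (d - 1) * math.exp(-1.0 / t)


def free_energy(t, d=3):
    """Mean-field free energy per site in units of epsilon.

    Above t_F this is -t log(q/e); below t_F the frozen value f = 0 is used.
    """
    return min(0.0, -t * math.log(q_eff(t, d) / math.e))


def freezing_temperature(d=3):
    """Solve q(beta_F) = e, i.e. exp(-1/t_F) = (e - 2) / (2(d-1)); requires d >= 2."""
    return 1.0 / math.log(2.0 * (d - 1) / (math.e - 2.0))


if __name__ == "__main__":
    t_f = freezing_temperature(3)
    print(f"t_F = {t_f:.3f}")
    for t in (0.4, 1.0, 2.0):
        print(f"t = {t:.2f}  f/epsilon = {free_energy(t):+.4f}")
```

Taking the minimum with zero encodes the statement that the free energy stays at its ground-state value below $T_F$; it is a convenience of this sketch rather than part of the mean-field calculation itself.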

Physically, there is a competition between the entropy gain of making turns and the corresponding energy cost. At high temperature, the corners are mobile in the bulk, leading to a liquid-like structure, whereas at low temperature, the system is frozen in stretched walks, with the corners expelled to the surface. The two regimes are separated by a first-order phase transition at $T_F$. The average length of a helix is given by

$$\ell_F = \frac{N\epsilon}{U(\beta_F)} \tag{6}$$

where $U(\beta) = -\frac{\partial}{\partial\beta}\log Z$ is the internal energy. At the freezing point, in d = 3, the length is equal to $\ell_F \simeq 3.78$, and it is infinite ($\ell = L$) in the low-temperature phase. Note that $\ell_F$ corresponds to a typical number of amino acids per α-helix of the order of 15.
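The value $\ell_F \simeq 3.78$ can be recovered in a few lines. The sketch below assumes that $\log Z = N\log(q(\beta)/e)$ at the mean-field level, with $q(\beta) = 2 + 2(d-1)e^{-\beta\epsilon}$, and that the helix length is the inverse of the corner density, $\ell = N\epsilon/U$.

```latex
\begin{align*}
\frac{U}{N} &= -\frac{\partial}{\partial\beta}\log\frac{q(\beta)}{e}
            = \epsilon\,\frac{2(d-1)\,e^{-\beta\epsilon}}{q(\beta)}, \\
\ell &= \frac{N\epsilon}{U} = \frac{q(\beta)}{2(d-1)\,e^{-\beta\epsilon}}
      = \frac{q(\beta)}{q(\beta) - 2}, \\
\ell_F &= \frac{e}{e-2} \simeq 3.78 \quad\text{since } q(\beta_F) = e, \\
3.6\,\ell_F &\simeq 13.6 \ \text{amino acids per helix.}
\end{align*}
```

Under these assumptions $\ell_F$ does not depend on the dimension d, because the freezing condition fixes $q(\beta_F) = e$.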

In a more elaborate treatment [7], one finds a very weak temperature dependence of the low-temperature phase, but the overall freezing picture remains correct. Indeed, in the frozen phase, the typical length of a helix is

$$\ell \simeq \frac{e^{6\beta\epsilon}}{12}$$

yielding $\ell = 2592$ just below $T_F$ (corresponding to 9330 residues). This length scale is clearly out of reach in any realistic protein or polymer system. The (first-order) complete freezing picture is thus adequate for any practical purpose.
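The two numbers quoted above follow from simple arithmetic. The snippet below is a minimal check, assuming the frozen-phase estimate $\ell \simeq e^{6\beta\epsilon}/12$, the rounded value $k_B T_F = 0.58\,\epsilon$ and 3.6 amino acids per helical turn; the variable and function names are ours.

```python
import math

T_F = 0.58               # reduced freezing temperature k_B T_F / epsilon (rounded value from the text)
RESIDUES_PER_TURN = 3.6  # amino acids per helical turn


def frozen_helix_length(t):
    """Typical helix length (in links, i.e. helical turns) in the frozen phase: exp(6/t) / 12."""
    return math.exp(6.0 / t) / 12.0


ell = frozen_helix_length(T_F)
residues = RESIDUES_PER_TURN * ell
print(f"helix length ~ {ell:.0f} turns, i.e. ~ {residues:.0f} residues")
```

Because of the exponential, the estimate is sensitive to the rounding of $T_F$: using more digits of the freezing temperature gives a noticeably smaller length, although still far beyond the size of any real protein.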

2.2 β-sheets. We again use the Hamiltonian path formalism, with the slight modification that a link of a path is now to be interpreted as an amino acid. The model can be described in the following way. Consider a Hamiltonian path. To mimic the formation of CO-HN, i.e. an H-bond, in β-sheets, we allow an H-bond (energy gain $\epsilon$) whenever two pairs of aligned links belong to two non-intersecting neighbouring strands (see Fig. 4). The partition function reads:

$$Z = \sum_{\{\mathcal{H}\}} \sum_{\{\text{H-bonds}\}} e^{\beta\epsilon N_H(\mathcal{H})} \tag{7}$$

Fig. 4. The two different possible types of H-bonds in the β-sheet model.

The summation runs over all possible Hamiltonian paths $\mathcal{H}$, and over all possible sets of H-bonds compatible with the path (see below). We show in figure 5 such a Hamiltonian path.

Fig. 5. A Hamiltonian path in d = 2 with $N_H = 4$ (out of five possible) hydrogen bonds.

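In order to give an integral representation of Z, in addition to the previous fields $\varphi_\alpha(r)$ which generate the paths, we need to introduce two scalar fields $\psi^\dagger_\alpha(r)$ and $\psi_\alpha(r)$ which respectively initiate and terminate an H-bond at site $r$ in direction $\alpha$. We obtain

$$Z = \lim_{n\to 0}\frac{1}{n}\,\frac{\int \prod_{\alpha,r} d\varphi_\alpha(r)\,d\psi_\alpha(r)\,d\psi^\dagger_\alpha(r)\; e^{-A_G}\,\prod_r D(r)}{\int \prod_{\alpha,r} d\varphi_\alpha(r)\,d\psi_\alpha(r)\,d\psi^\dagger_\alpha(r)\; e^{-A_G}} \tag{8a}$$

with

$$A_G = \sum_{\alpha=1}^{d}\sum_{r,r'} \left[ \frac{1}{2}\,\varphi_\alpha(r)\,(\Delta_\alpha^{-1})_{rr'}\,\varphi_\alpha(r') + \psi^\dagger_\alpha(r)\,(\Delta'^{-1}_\alpha)_{rr'}\,\psi_\alpha(r') \right] \tag{8b}$$

and

$$D(r) = \sum_{\alpha} \frac{1}{2}\,\varphi_\alpha^2(r)\,G_\alpha(r) + \sum_{\alpha<\gamma} \varphi_\alpha(r)\cdot\varphi_\gamma(r) \tag{8c}$$

and

$$G_\alpha(r) = 1 + e^{\beta\epsilon/2} \sum_{\gamma(\ne\alpha)} \left(\psi^\dagger_\gamma(r) + \psi_\gamma(r)\right) + e^{\beta\epsilon} \sum_{\gamma(\ne\alpha)} \psi^\dagger_\gamma(r)\,\psi_\gamma(r) \tag{8d}$$

The operator $(\Delta_\alpha)_{rr'}$ is the one defined in section 2.1, whereas $(\Delta'_\alpha)_{rr'} = \delta(r' - (r + e_\alpha))$, where $e_\alpha$ is the unit vector in direction $\alpha$.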

Expanding the product over $r$ in (8a), at each site $r$, we must choose either (i) a term of the form $\frac{1}{2}\varphi_\alpha^2(r)\,G_\alpha(r)$ or (ii) a term of the form $\varphi_\alpha(r)\cdot\varphi_\gamma(r)$. The latter (ii) represents a corner and does not allow for an H-bond. The former (i) represents a path going straight through $r$ in direction $\alpha$ and, according to (8d), allows for four possibilities, namely, no bond, with weight 1; one bond entering or leaving $r$ in direction $\gamma\,(\ne\alpha)$, with weight $e^{\beta\epsilon/2}$; and finally, one bond entering and one leaving at $r$ in direction $\gamma\,(\ne\alpha)$, with weight $e^{\beta\epsilon}$.

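As in section 2.1, the identification of (7) and (8a) proceeds through the use of Wick's theorem, with the following elementary contractions:

$$\langle \varphi^{u}_\alpha(r)\,\varphi^{v}_\gamma(r') \rangle = \delta_{\alpha\gamma}\,\delta_{uv}\,(\Delta_\alpha)_{rr'}, \qquad \langle \psi^\dagger_\alpha(r)\,\psi_\gamma(r') \rangle = \delta_{\alpha\gamma}\,(\Delta'_\alpha)_{rr'}, \qquad \langle \psi_\alpha(r)\,\psi_\gamma(r') \rangle = \langle \psi^\dagger_\alpha(r)\,\psi^\dagger_\gamma(r') \rangle = 0 \tag{9}$$

These contractions indeed generate the required partition function. As mentioned in the previous section, vacancies can be included in the model by a slight modification of $D(r)$.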

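As in the previous section, we study this model in a mean-field approximation. The mean-field equations read:

$$\sum_{r'} (\Delta_\alpha^{-1})_{rr'}\,\varphi_\alpha(r') = \frac{\varphi_\alpha(r)\,G_\alpha(r) + \sum_{\gamma(\ne\alpha)} \varphi_\gamma(r)}{D(r)} \tag{10a}$$

$$\sum_{r'} (\Delta'^{-1}_\alpha)_{rr'}\,\psi_\alpha(r') = \frac{\left(e^{\beta\epsilon/2} + e^{\beta\epsilon}\,\psi_\alpha(r)\right)\sum_{\gamma(\ne\alpha)} \frac{1}{2}\varphi_\gamma^2(r)}{D(r)} \tag{10b}$$

$$\sum_{r'} (\Delta'^{-1}_\alpha)_{r'r}\,\psi^\dagger_\alpha(r') = \frac{\left(e^{\beta\epsilon/2} + e^{\beta\epsilon}\,\psi^\dagger_\alpha(r)\right)\sum_{\gamma(\ne\alpha)} \frac{1}{2}\varphi_\gamma^2(r)}{D(r)} \tag{10c}$$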

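Restricting ourselves to an isotropic homogeneous solution, $\varphi_\alpha(r) = \varphi$, $\psi^\dagger_\alpha(r) = \psi_\alpha(r) = \psi$, we get:

$$\varphi^2 = \frac{4}{d} \tag{11a}$$

and $\psi$ satisfies the equation:

$$\psi = \frac{2(d-1)\,e^{\beta\epsilon/2}\left(1 + e^{\beta\epsilon/2}\,\psi\right)}{d\,D} \tag{11b}$$

with

$$D = 2d + 4(d-1)\,e^{\beta\epsilon/2}\,\psi + 2(d-1)\,e^{\beta\epsilon}\,\psi^2 \tag{11c}$$

The free energy per site reads:

$$f = T\left(1 + d\,\psi^2 - \log D\right) \tag{12}$$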

As in the previous model, the ground state corresponds to frozen, fully stretched configurations saturated with transverse H-bonds (stacked β-sheets), with energy per site equal to $-\epsilon$. Therefore, we have

$$f \le -\epsilon \tag{13}$$

Solving equation (11b) numerically, we find again a first-order freezing transition at a temperature $T_F$ where $f = -\epsilon$. At d = 3, we get $k_B T_F \simeq 0.86\,\epsilon$.
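The numerical solution mentioned above can be sketched as follows. The code assumes the homogeneous mean-field expressions $D = 2d + 4(d-1)e^{\beta\epsilon/2}\psi + 2(d-1)e^{\beta\epsilon}\psi^2$ and $f = T(1 + d\psi^2 - \log D)$, takes $\psi$ as the stationary point of $f$, and works in reduced units ($k_B = \epsilon = 1$). The bisection brackets and function names are our own choices, not the authors' procedure.

```python
import math


def d_of_psi(psi, t, d=3):
    """D(psi) at reduced temperature t = k_B T / epsilon."""
    x = math.exp(0.5 / t)  # x = exp(beta * epsilon / 2)
    return 2 * d + 4 * (d - 1) * x * psi + 2 * (d - 1) * x * x * psi * psi


def psi_saddle(t, d=3, tol=1e-12):
    """Solve df/dpsi = 0, i.e. 2 d psi = D'(psi) / D(psi), by bisection."""
    x = math.exp(0.5 / t)

    def h(psi):
        d_prime = 4 * (d - 1) * x * (1 + x * psi)
        return 2 * d * psi - d_prime / d_of_psi(psi, t, d)

    # For d >= 2, D'/D is largest at psi = 0, so the root lies below D'(0) / (2 d D(0)).
    lo, hi = 0.0, (d - 1) * x / d**2
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def liquid_free_energy(t, d=3):
    """Free energy per site of the liquid-like branch, in units of epsilon."""
    psi = psi_saddle(t, d)
    return t * (1 + d * psi * psi - math.log(d_of_psi(psi, t, d)))


def freezing_temperature(d=3, t_lo=0.5, t_hi=1.5, tol=1e-10):
    """Find t_F where the liquid branch crosses the frozen value f = -epsilon.

    The default bracket [0.5, 1.5] is chosen for d = 3.
    """
    while t_hi - t_lo > tol:
        t_mid = 0.5 * (t_lo + t_hi)
        if liquid_free_energy(t_mid, d) > -1.0:
            t_lo = t_mid  # liquid branch still above -epsilon: the frozen phase wins
        else:
            t_hi = t_mid
    return 0.5 * (t_lo + t_hi)


if __name__ == "__main__":
    print(f"k_B T_F / epsilon ~ {freezing_temperature(3):.3f}")
```

With these expressions the crossing should land close to the quoted $k_B T_F \simeq 0.86\,\epsilon$; a small difference in the last digit may simply reflect rounding.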

The physics of this model is very similar to that of the previous model (section 2.1), namely, a liquid-like high-temperature phase with no definite β-sheet structure, and a low-temperature frozen phase, consisting of stacks of parallel β-sheets. A non-isotropic type of mean field, as in reference [7], would probably again induce a very weak temperature dependence in the low-temperature phase, but the biological robustness of the two-dimensional β-sheets should make the temperature dependence even weaker than in the previous case. Furthermore, due to the higher dimensionality of the β-sheets, we believe the isotropic mean-field approximation to be even better than in the previous case.

3 Conclusion.

We have studied, in a mean-field approximation, two models for the formation of secondary structures in globular proteins (in the thermodynamic limit). These models exhibit a first-order transition between a high-temperature liquid-like compact phase and a quasi-frozen low-temperature phase. In this quasi-frozen phase, secondary structures span the whole system.

As far as comparison with proteins is concerned, several caveats are in order.

(i) The lattice approach, where one link is taken as a helix turn or an amino acid, may be too crude.

(ii) The models are studied in the thermodynamic limit ($N = L^d$ going to infinity). Choosing arbitrarily a size of $N = 10^2$-$10^3$ amino acids for a protein, the quasi-frozen phase is in fact completely frozen in the bulk, since thermal fluctuations would be present only in much larger systems, $N \sim 10^4$ amino acids (see Sect. 2). However, surface modes should not be frozen, and a more detailed study of finite-size effects is in order.

(iii) Since we have considered only compact structures, we have implicitly assumed the collapse energy $E_c$ to be much larger than the intramolecular H-bond energy $E_H$. In real systems, the ratio $E_c/E_H$ seems rather large, of order 20 [10]. A different limit is currently being studied, where $E_c \ll E_H$, and the formation of secondary structures is initiated in the unfolded (non-compact) phase.

(iv) Finally, let us note that amino acids have given probabilities to be present in α-helices, β-sheets or turns (i.e. no secondary structure). This is reflected in the Ramachandran plots [1]; a more elaborate approach should include these weights.

In our models, the denaturation proceeds in two steps: in a first step, secondary structures unfreeze and become mobile; this phase seems closely related to the molten globule of reference [4]. In a second step (not described by the above models), the compact globule would unfold into a swollen coil.

References

[1] (a) Creighton T. E., Proteins (W. H. Freeman, New York, 1984); (b) Levitt M., Current Opinion in Structural Biology 1 (1991) 224.

[2] (a) Pauling L. and Corey R. B., P.N.A.S. 37 (1951) 235, 251, 272, 729; (b) Richardson J. S., Adv. Prot. Chem. 34 (1981) 167.

[3] (a) Kléman M. and Sadoc J. F., J. Phys. 40 (1974) L569; (b) Nelson D. R. and Spaepen F. S., Solid State Phys. 42 (1988) and references therein.

[4] (a) Ptitsyn O. B., J. Protein Chem. 6 (1987) 273; (b) Stigter D., Alonso D. O. V. and Dill K. A., P.N.A.S. 88 (1991) 4176.

[5] des Cloizeaux J. and Jannink G., Les Polymères en Solution (Editions de Physique, Les Ulis, 1987).

[6] Flory P. J., Proc. R. Soc. London Ser. A 234 (1956) 60.

[7] Bascle J., Garel T. and Orland H., J. Phys. A 25 (1992) L1323.

[8] Orland H., Itzykson C. and De Dominicis C., J. Phys. Lett. 46 (1985) L353.

[9] de Gennes P. G., Phys. Lett. A 38 (1972) 339.

[10] Garnier J., private communication.
