Classification
Physics
Abstracts87.10 64.40 75.10H
Storage of spatially correlated patterns in autoassociative memories
Rdmi Monasson
Laboratoire de
Physique Th60rique
de l'Ecole NormaleSupdrieure (°),
24 rue Lhomond, 75231 Paris Cedex 05, France(Received
2 November 1992, accepted in final form 8January 1993)
Abstract. The effects of spatially
organised
data on autoassociative neural networks areinvestigated
in theoptimal
storage case. Ananalytical study
ispossible
for weakspatial
cor-relations. It
predicts
anincreasing
of the storage capacity ac andferromagnetic
means for thecouplings.
Numerical simulations confirm these results forlarge spatial
correlations.1. Introduction.
Since it was
introduced,
theconcept
of attractor neural network hasgained
much attention from thephysics community [I].
Numerical and theoretical studies havealready captured
manyinteresting
features(e.g. working
as an associative memory, robustnessagainst degradation,
fast
parallel processing, learning
fromexamples, [2]).
Formainly
technical reasons, these results were obtained in the absence of anyspatial
structure of both the neural networks and theinputs.
However most real lifeapplications
deal withspatially organised
data.In this paper, we focus on the effects of such a
spatial organisation
of theinput patterns
on the behaviour of an autoassociative neural network. Such a network exists in a d- dimensionalphysical
space and realisticpatterns
exhibitspatial
correlations.The aim is twofold.
First,
it is of interest to see how the results derived for nonspatially
structured networks are modified in this new framework. We shall concentrate upon the
opti-
mal
storage capacity, using
the statistical mechanics toolsdeveloped by
Gardner-Derrida[3].
Secondly,
we shallinvestigate
how the networkorganises
itself(from
theweights
distributionstandpoint)
in the presence ofspatially
correlatedpatterns.
It hasalready
been shown for hetero-associativemapping
[4] and informationprocessing [5],
[6] that theresulting couplings organisation
is non trivial. We shall see here that the latter isdeeply
different from the usualuncorrelated case and influences the retrieval
properties
of the network near the saturation.(°)
Urdt6 Pmpm de Recherche du CNRS, assoc16e h l'Ecole Norrrale Sup6rieure et h l'Urdversit6 de Paris-Sud2.
Description
of the model.The network consists of N neurons, each connected to every
other,
andtaking
a value +I.Neuron
S;
is connected to neuronSj
via a real valued synapseJ;j (J;;
=0, Vi).
Thedynamics
of thesystem
follows asequential updating
rule of the forms; (i
+i)
msign l~ J;j sj
i)j
(i)
>(x;) The
training
set consists of P= ON
binary patterns (ff ) belonging
to a d-dimensional space.Zerc-mean
activity
andspatial
correlation areimposed by
the constraints~~' ~ ~~~
~~~' ~~"'~i (2)
where the overbars
represent
an average over theprobability
distribution of thepattern
bits.C is a
symmetric
andpositive
matrix with a I on thediagonal, dependent
on the distanceii ii (the subscripts
I andj
may be seen as d-dimensional vectors but all the simulations will beperformed
here for d = I). Binary patterns obviously
can not beentirely
definedby
the two first moments of their distribution. We shall discuss thispoint
in the nextpart.
The
stability
of thepattern
~t at site I isgiven by
&S'-
~S'~ j,,~S' (3)
I i 'J I
>(Xi)
The
training
set is said to be stored if all thepatterns
are fixedpoints
of thedynamics (I),
I-e.if all the stabilities are
positive. However,
an associative memory networkrequires
thatlarge
attraction basins surround these storedpatterns.
It has been shown(at
least for uncorrelateddata)
that their size increases with the value of theparameter
~ =mini,~(Af) [8].
We further define a
quantity
which will appear in thefollowing
work. It is the N I dimensional matrixd;j
=C;j C;icji (I, j
= 2.N) (4)
One can
easily verify
that the inverse matrix isld~~)
=(C~~);. (I, j
= 2.N) (5)
;j J
We define the normalised trace of any matrix A
by
TrA =) £; A;;.
3.
Analytical study
of the network.In this
part,
wetry
to answer thefollowing question
what is the maximal size a of thetraining
set the above network is able to store ? Given a
training
set(ff),
for eachpossible synaptic
matrix(J;j ),
we define an error functionp N
E
(lJ;>I, lftl)
=~ ~ (~ At) (6)
where K is a
positive parameter. Using
the framework createdby
Gardner-Derrida[3],
thepartition
function of the network isz(lffl)
=
/ fldJ;> fld ~ J( lexP (-SE (lJ;> Ii lffl)) (7)
I#>
>(#;)
where
fl plays
the role of an inversetemperature.
In thefollowing,
we shall sendfl
- cc, sothat the
integration
in(7)
takes into account thesynaptic weights storing
all thepatterns
with stabilitieslarger
than ~. The choice of the cost function(6)
is natural in the zerotemperature
limitonly
below the criticalcapacity. Training
the network with otherenergies
E(including error-size,
and notonly error-number, dependences)
may be more efficient atrelatively
smallfl
andhigh storage
levels toimprove
the retrievalperformences
of the network[9].
Using
thereplica method,
wecompute lnZ,
which is assumed to beself-averaging
in the limit oflarge
network size. Since the domain of suitablecouplings
remains connected even with correlatedpatterns
we make thereplica symmetric
Ansatz. This Ansatz has also been shown toprovide
a stablesaddle-point
for heterc-associativestorage [4].
3.I THE REPLICA CALCULATION OF THE PARTITION FUNCTION. The network
partition
function Z defined in
(7)
isequal
to theproduct
of the Nsingle
sitepartition
functionsZ;
wherei=I. N.
Z;
=/ fl dJ;j
d~j J;( lexp -flf9
~
if ~j J,jf( (8)
>(Xi)
j(#I)
P"~ >(xi)
~
and in the
large
N limit[3],
~ W= ~ ~j@= @"
rim rim~~~~
N N
(9)
~i
N '~~°°"~° ~~
We introduce n
replicas (J(),
a = I. n and average over thepatterns
distribution(2)
@ fl
d jafl
j£
ja)2 fl fi
l~
I
j,a lj a j lj
p,a~ ~«~)
X exp
(-fl £ 9(~ t(
+£
~
i(t(
+£~ I~) (10)
P>~ P>
where
I~
= In exp~j ii J( f(f( I= ~j I( ~j Jij Cij
+~ i(I$ ~j Jij J)kdjk
a,f a J ab jk
~j I]I$I[ ~j Jij J)k J/i c)((
+ iiabc jkl
The tensor
appearing
in the third term of the r-h-s-equals
C)~~
"~l~j~k~l Cljckl Clkcjl Cllcjk
+ 2Cljclkcll (~~)
We may assume, as for the hetero-associative
storage [4],
that thepatterns satisfy
someclustering conditions,
I.e. that thek-points
connected correlation function of theirprobability
distribution
always
decreaseexponentially
with the distanceseparating
the two closestpoints
among the k ones.
For any
pattern
~t, theproducts f(f)
are in averageequal
toCo
Thestability
of thepattern
/J definedby
formula(3)
will be increased if the synapses betweenneighbouring
neurons arestrengthened
with the correctsigns such
thatJijcij
>0, Vj.
It is thusplausible
thatJij
=O(I)
when the neuronj
is close to the neuronI,
while it is of orderfi
in the usual uncorrelated case. As a consequence, the truncation of theI~'s
at thequadratic
term in t is notvalid,
even in thelarge
N limit(as
it used to be in[4]).
An exact calculation ofW
forany
spatially
correlated distribution of theinput patterns
does not seempossible.
It would bothrequire
theknowledge
of all the moments of the statistical distribution of thepatterns
and thecomputation
of non Gaussianintegrals
inI.
In the
following,
we resort to anexpansion
of thepartition
function around the usual un- correlated case(C;j
cid;j), keeping only
the first two terms of theexpansion (11).
To be moreprecise,
whenC;j
=x"~i'
and x <I,
this Gaussianapproximation
can be shown toprovide
the critical
capacity
up tox~.
3.2 THE RESULTS FOR WEAK CORRELATIONS. For
C;j
mb,j,
theentropy
of thesynaptic
weights
isequal
to)
In Z m'~l' )q#
+ s@ + mih + fi)Tr ~j
U + s-q
)~ (ln(2fiI
+(2§ #)b)j
+)fli~ £)~_~ Cji[(2fiI
+(2§ #)d)~~]jkcki
+ of
Dz In(H (@)j (13)
, -q
where
~-z ~/2 +m
~~
@
'~~~~
~~ ~~~~The three order
parameters
m, q, s aregiven by
'~
~~~
~ ~~~ ~
N
~
,~ ~ik
< J j > ~ jJ,k=2
~ lk >
N
~
j~~ ~
~ ~~lJ ~lk
>(15)
where < > denotes the thermal average over the
couplings J;j.
Variablesfli, #,
§ areLagrange parameters enforcing
constraints(15).
fi ensures the normalisation of each line of thesynaptic
matrix to I. These seven parameters are foundby solving
thesaddle-point equations
of(13).
We restrict our
analysis
to the critical linea~(~)
where s q - 0 and the fourLagrange multipliers diverge.
We then obtain twocoupled equations linking
the common critical value s~ of q, s to the criticalparameter
m~N
~mc "
vc~jcji[(d+v~I)-~]~~C~i
j,k=2
Tr
~ ~2 v2 (
~_
j(/~
~~~~-2j
~(/~
+ ~i)2
c ii c jk kl~ ik=2
/~
'
N
(16)
~~
i~
+ Vcl '~~~~~ ~ ~~~ ~~
~ ii [(~
+Vci) ~jjkckl
I,k=2
where
/~ e~~ (17)
Vc(~,
mC, ~C) ~'~ ~~~~~
~ ~2~ H
@)
Once m~ and s~ have been
computed,
the criticalcapacity a~(K)
isgiven by
~~
H
j-j
~~b
~(vcll~~~~
Let us now
apply
these results to the one-dimensional case withexponentially decreasing
correlationsCij
=x'~~i',
where x < I. Theoptimal capacity
is reached when K = 0 and fromequations (16 18),
we obtainac(x)
= 2 + ~x~
+O(x~) (19)
ir
Let us evaluate indeed the value of the first
neglected
term(that involving C(~),
see(12)).
Asexpected,
the J's do not vanish forlarge
N and are order x or less when x is small. The cubic term inI
therefore scales as at leastx~.
Thisscaling
holds for thehigher
terms in theexpansion (11)
while thequadratic
term inI
is of orderx~.
We conclude that the results derived whenforgetting
all but the first two terms are valid up to order~2.
4.
Interpretation
and simulations.4,I THE STORAGE CAPACITY AND THE QUANTITY OF INFORMATION. The results derived
at the end of the
previous part
show that the criticalcapacity
of attractor neural networks increases with the presence ofspatial
correlations in thetraining patterns.
For feed-forward networks and hetero-associativemapping,
thestorage capacity
remainedequal
to2,
whateverGil
141.The
patterns presenting
thespatial
correlationsC;j
=x"~J
may be
generated
asequilibrium configurations
of a one dimensionalIsing
model attemperature
T such that x =tanh(I IT).
Thus the
quantity
of informationI(x)
in apattern
issimply
theentropy
of thisIsing
latticeI(x)
= N(- ~)
ln~ ~)
~~)
ln~ ( ~)j (20)
2 2 2
When x <
I,
the totalquantity
of informationI(x)
stored in thetraining
set is1(x)
=a~(x)N.I(x)
=
N~
21n 2 + ~~~ ~ lx~
+O(x~) (21)
7r
Though
thecapacity
of autc-associative networks increases whenthey
arepresented
with spa-tially
correlateddata,
we see that(as
far as we can trust the small xresult)
thequantity
ofinformation the neural network
actually
stores decreases. This situationalready
occurred with biasedpatterns [3].
Some numerical simulations were carried out to check the increase of the
storage capacity
with the amount of
spatial
correlations inside thetraining
set. The patterns aregenerated
through
a Monte Carlo simulation of theIsing
model and the Minoveralgorithm
finds thecouplings achieving
thelargest stability
K[4], [7].
The sizes of the network wereequal
to N=
50,
100 and 200. The differences between these three simulations were lower than the errorz-o
5 i~
I j
(
Ii.o '~.
i '",
[
"._)
""._ .~o.5
"".,
~~'~.~
o-o "'
o-o 0.5 1.0 1.5 2.0
Size of the
training
set aFig-I-
theoptimal stability
~ for different sizes a obtained from the Minoveralgorithm (the
fullcurve is a
polynomial fit).
The patterns are drawn from a one dimensionalIsing
model(z
= 0.7 see
text)
and the number of neurons is N = 200. The dotted line is the theoretical curve for the usual uncorrelated case(z
=
0).
The dash-dotted curvedisplays
the solution of thesaddle-point equations
valid for low z.
bars
(due
to the statistical fluctuations and the finiterunning
time of theMinover).
Weplot
here the results obtained for N = 200.
Figure
I compares the results obtained for x = 0.7 to the classical uncorrelated case(x
=0)
[3] and to the solutionprovided by
thesaddle-point
equations.
We can notice that there is agood agreement
between the simulations and the small xtheory
for small a.However,
when x -I,
the criticalcapacity computed
from thesaddle-points equations
scales as"~~~'~
~~ ~fi~~~ 2(1~ )~
~~~~
which is
clearly
unrealistic since thequantity
of informationI(x)
woulddiverge.
4.2 THE FERROMAGNETIC STRUCTURE OF THE SYNAPTIC WEIGHTS. In order to shed some
light
on the way the network increases itsstorage capacity,
wecompute
the first moments of the distribution of itsweights (which
isgaussian
for smallx)
on the critical lineo~(~)
N
<
Jo
> =Ac ~ Cik ((£l
+vcl)~~j
~.
Ac
=) (23)
k=2 ~
~
~~J'~~~
~ ~ ~~J ~'~ ~~~ ~(© ~I)2~
'
~C f/j~)~i ~)~
~~~)
jk c @
In the usual uncorrelated case
(C;j
=d;j),
all the synapses are of order§,
with zero mean.When the
input
data becomespatially correlated,
theweights J;j
areequal
to the sum of two contributions* a
ferromagnetic
termJ;(
: the mean values of the synapseslinking
two close cells(in
thesense that
ii j(
is not muchlarger
that the correlationlength
of theinput data)
are finiteand
positive. They
decreasequickly
with the distanceseparating
the units but do not vanish when N goes toinfinity.
* a
spin-glass
termJ)j~' equation (24)
shows that thecouplings
fluctuate around thesemean values from
sample
tosample. Furthermore,
thetypical
deviations are of order),
like with uncorrelatedpatterns.
The
origin
of the first termJ(
hasalready
beenexplained
inpart
3.I. Theparameter
mc may beinterpreted
as thetypical
value of theferromagnetic part
of the stabilities in the limit of small x.For more
general
correlation matrices(I.e.
not close to theidentity),
thereplica
calculations ofpart
3.I are nolonger
valid. We can neverthelesspredict
thequalitative
behaviour ofthe network. With
respect
to the usual uncorrelated case, theweights
fluctuate around aferromagnetic background J(.
The latter isself-averaging
itdepends only
on the statistical distributionC;j
of thepatterns,
and not on theparticular training
set. Thispositive synaptic
"bump" provides
all the units of the network with an additional field whose mean isequal
to Nh+
=~j CijJ( (25)
j=2
for each
pattern.
As aresult,
the effective field due to thetuning
of theweights (fluctuating parts)
ishs
g, = ~
h+
rather than ~. Near the saturation(when
~ = 0 forinstance), hs,g,
isnegative,
while the total field hs_g_ +h+
is stillpositive.
Thisexplains
how the criticalcapacity
of the
present
model may exceed 2[3],
as found inpart
3.2.The results of simulations done with
C;j
=(0.7)'~~~'
and a = I aredisplayed
infigure
2.The mean values of the
weights J(
=§
and thetypical
fluctuationsJ(~
=J;j~
areplotted
for the distances k=
1,...,
5 and the different sizes N=
50,100
and 200(the
numbers ofsamples
the data areaveraged
over range from 10 to100).
The standard deviations seem to vanish likefi
as it ispredicted
in formula(24).
Moreoverthey
areroughly independent
on the distance
ii j( separating
the units. Other simulationsperformed
from o = 0.5 toa = 2 show that the coefficient B does not vary a lot when the size of the
training
setchanges (0.7
< B < 0.9typically).
The meancouplings J+ depend
veryweakly
on the number ofneurons N
(the
variations are taken into accountby
the sizes of thesquares). However, J(
decreases
quickly
with iij(. Figure
3a shows that thisdecreasing
is indeedexponential.
It may be well fittedby
thefollowing
lawJ(
=Jt v(x, ~Y)"->' (26)
Figure
3bdisplays
the values ofy(x,a
=0.5)
obtained fromfigure
3a and compares them to the theoreticalexpectations. Equation (23)
alsopredicts
anexponential decreasing
of thesynaptic weights.
We see that theagreement
isquite good although
thetheory
is exact up to orderx~ only.
In addition to the statisticalfluctuations,
the error barsappearing
infigure
0.4
co~ N (j,( (j,(x4
~'~
~~
~ ~jj
200 o 0 06 0.84~
l~
~
~ 0. 2
lj C
~ +
~' ,.,,f
--X,.,., ,.,,.,-~.,,.,.-
w o ~ ~
'~ O-I
JZ +...
~- ,,.,.,.--~,,,,. ~-- -~
b0I ...,,o...,- ~--,... ,e,.. ...~... ,~.,
~ C o
f
~ ~ a$i
0 2 3 4 5 6
Distance
(I-j(
Fig.2.
the meancouplings J(
=
( (small squares)
and the fluctuationsJ(~
=fi
for different sizes N
=
50,100
and 200 and a = I. The distancesii j(
range from I to 5. Thescaling
of the deviations iscompatible
with afi
law. The meanweights
seem to decreaseexponentially
with the distanceseparating
two neurons(see figure 3).
The size of the squaresgives
an upper bound of the statistical fluctuations of theJ+ (from sample
tosample
and inside eachsample,
fromperceptron
toperceptron)
and of their variations for the different N.3b include the
uncertainty
due the finiterunning
time of the Minover.Using
the notations of[7] and
[4],
one must take into account theperformance guarantee
factorAT.
It compares thelargest stability
~m achievable for onegiven sample (I.e.
after an infinitetime)
to ~T obtainedby
thealgorithm
at time T ~T < ~m <AT
~T. All the simulations wereperformed
with1.02 <
AT
< 1.06.Figure
4displays
the behaviour ofJ;(
for agiven
x = 0.7 and different sizes of thetraining
set. There is a
good qualitative agreement
with the Gaussiantheory.
For low a, the latterpredicts
J;(
ci@ C;j (a
-0) (27)
This may be
easily understood,
since the Hebb rule becomesoptimal (for
thestability
cri-terium)
for finite P.Thus,
at lowstorage,
thecouplings
reflect the inner structure of theinput patterns.
Near thesaturation,
the theoretical result isJt
Ci -~(C~~);> (a
- C1c)(28)
where ~ is a
positive
constant derived from thesaddle-point equations.
ForC,j
=xii-ii,
-1
".[")"._
-~
~~
..
'l'
.. ._
~ "'.'«.
~ ., ".
u ..
w ,
-3 « .._ ...
uo '. '~ ".
° ". ".. "..
- m_
..
~ ",
._ .., "._
~ -4 .. ".. "., "..
£
; i, ',_~ K .. > ..
tu ., ._
I ", "._
~ -5 '.
".j
"._ .._Ic I ".
g
., ._ ._~~
-6
~.
"._
'j
~. "..
., ., "._
-?
l 2 3 4 5 6 ? 8 9
distance
ii-ii
o 5
I
~ 04
j
cl
03u
j
'~
~ 0.2
b
01
'j/
0 0
0 0 0.2 0.4 0 6 0
input correlation x
Fig.3. (a)
the mean values of thecouplings J(
as a function of the distance iij( (semi-log plot).
The networks includes N = 200 neurons and the data are
averaged
over 100samples.
The s12e of thetraining
set is a = 0.5 and the different correlationstrengths
are from top to down : z =0.8, 0.7, 0.6, 0.4 and 0.2. The numerical dataprovided by
the Minoveralgorithm
seem toobey
anexponential
law proportional to
y"~?' (see
equation(26)). (b)
the dotted line showsy(z)
for a = 0.5 as obtained from the weak correlationtheory.
The points are deduced flom thefigure
3a. The error bars take intoaccount the statistical fluctuations and the
uncertainty
due to the stop of the Minoveralgorithm
aftera finite time
(see
part4.2).
JOURNAL DE PHYSIOUE , -T I N' ~ MAY iU91 4jj
o.5
3
O.4a a a a a a
-- °
01 a
~
I
u0.3
-~
§
01+ -n
0l 0.2
J$~
flf
~ ~
~ +
~'~ ~
~
~ +
zg +
~ +
o-o
~
o-O 0.5 1-O 1.5 Z-O
Size of the
training
set aFigA.
the meancouplings J;(
as functions of thetraining
set size a for x = 0.7. Fromtop
to
down, ii j(=1, 2, 3,
4 and 5. The small xpredictions (frill curves)
arecompared
to the simulation results(the
size of thesymbols represent
thelargest
statistical standarddeviation).
Note the
qualitative agreement
for small a(J(
m@ x'~~j')
and forlarge
o(J;(
« I ifii ii #1).
the
only subsisting ferromagnetic couplings
link the nearestneighbours [4].
Thisprediction (I.e.
the meanweights
aregiven by
the inverse matrix at the criticalcapacity)
seenw to becorroborated
by
the simulations.4.3 THE DYNAMICAL STABILITY OF THE PATTERNS. The
question
we consider now is thefollowing
does theferromagnetic
structure of theweights
influence the retrieval abilities of the network ? Let us start from a storedpattern
~t. We then choose a site at random(numbered I)
andflip
thecorresponding spin.
In the usual uncorrelated case, the stabilities of the otherspins
arechanged by O(h).
In thelarge
Nlimit,
thissingle flip
cannot affect the other units and thetraining pattern
isperfectly
retrievedthrough
thedynamical
evolution of the network(the
notion of attraction basins isclassically
defined from the reversal of a finite fraction of the total number of neurons N[8]).
When the
couplings
haveferromagnetic
means, the situation is morecomplicated.
The storedpattern
~t exhibitspatial
correlations whosetypical length
is L withC;j
=exp(-(I j (IL) (e,g.
L ci 2.8 for x = 0.7 in theprevious simulations).
It may beroughly
described as thejuxtaposition
of domains of samesign spins
and thelengths
of which are around L. Let us suppose now that the site + I(or I) belongs
to the same domain as I. After theflip,
itsstability
is decreasedby
dAf+1
=-2Ji,1+iffff+1
=-2Ji,1+1 (29)
which becomes
equal
to-2J£
in thelarge
N limit. As soon as a islarge enough (I.e.
~ <2J£),
the
neighbours
are in turnflipped
and the initialperturbation propagates. However,
when it reaches theedge
of thedomain,
thestability
of the first neuron m of theopposite spins
block ischanged by
bA$
=-2Jm-1,m($-1($
=+2Jm-1,m (30)
which is now
positive.
This neuron is thus stable. We see that the the initialspin flip
causethe reversal of the whole domain this unit
belonged
to. We can conclude that the fixedpoint
of the
dynamics
differs from thepattern
~t in a finite number of bits(roughly L).
There is noperfect
retrieval but forlarge networks,
theoverlap
between the twopatterns
goes to I.The above
reasoning implicitely
takesonly
into account the nearestneighbours
interactions.It is
justified by
thesharp decreasing
of theJ+
with the distanceseparating
the units(see Fig.2).
This discussion alsopointed
out the main difference between thespins belonging
tocenters of the domains
(I.e. having
the samesign
as theirneighbours)
and thespins
located at theedgq
of the blocks. The " center" units are,roughly speaking,
stabilisedby
theJ+ coup1blgs
while the field
incoming
onto the"edge" spins
due to theferromagnetic weights
vanishes. The latter canonly
be stored in theirright
directions(up
ordown) by
thetuning
of the J~.E.couplings.
Since the number of such"edge" spins
is a fraction of N(as
soon as L >0),
thestorage capacity
increases.5.
Sununary
and discussion.In this paper, we have focused on the
storage properties
of autc-associative memories whenthey
are
presented
withspatially
correlatedpatterns.
We havecomputed
the free energy for weak correlations. Italready predicts
two mabl differences withrespect
to the classical uncorrelatedcase.
First,
thestorage capacity
increases but the totalquantity
of information seenw to decrease.This situation is reminiscent of the biased
patterns
case [3] andemphasizes
thenecessity
of
decorrelating
the data toimprove
theefficiency
of the memories. One must however be cautious about thepossible analogy
withneurobiological
facts(decorrelation
of the visual stimulithrough
the first retinalayers
forinstance).
The results we obtained heredepend strongly
on thebinary
nature of the neurons. Inparticular,
theferromagnetic weights
areuseful since
they
allow the units tostrengthen
theinput
fields in theright
direction. Sucha
principle
would not beadequate
anylonger
for continuous neurons(which
may be more relevant forbiological modelizations),
where theintensity
of the fields(and
notonly
theirsigns)
matters to the cells responses.Secondly,
thesynaptic weights
exhibit a richer structure than for uncorrelatedpatterns.
In addition to the usual
O(j~) fluctuating weights (I.e.
tunedduring
thetraining),
there is aO(I) self-averaging
andferromagnetic background.
These short rangecouplings
takeadvantage
of thespatial
correlations of theinput patterns
to enhance the cellsfields,
withoutaffecting
too much the retrievalperformances
oflarge
networks. Some numerical simulationswere
performed
forexponentially decreasing
correlations and one dimensionalpatterns. They
show a
surprisingly good agreement
with aboveanalytical predictions (valid
for weakspatial correlations).
All the results we
presented
here were derived for continuousweights.
It would beinteresting
to know what
happens
when otherprescriptions
areimposed
over thecouplings.
Withbinary
weights (J;j
=+fi)
or bounded synapses((J;j(
<§),
theferromagnetic
means could not be of order I and the nature of theoptimal synaptic
matrix would be different.Acknowledgments.
am
grateful
to M. Mdzard for numerousilluminating
discussions andreadings
of themanuscript
I also thank W. Krauth for useful remarks
concerning
the numerical simulations and I.Kocher,
D.
O'Kane,
A. ~eves for theirhelp during
thecompletion
of this work.References
[ii
J. J.Hopfield,
Proc Nat' Acad Sci USA 79(1982)
2554.[2] D.
Amit, "Modeling
Brain Function"Cambridge University
Press(1989).
[3] E. Gardner, J
Phys
A 21(1988)
257;E. Gardner, B.
Derrida,
JPhys
A 21(1988)
271 [4] R. Monasson, J Phys A 25(1992)
3701.[5] J. J.
Atick,
A. N. Redlich, NeuralComput.
2(1990)
308.[6] J. P.
Nadal,
N.Parga,
"Informationprocessing
by a perceptron"Proceedings
ofELBA,
Neural Networksfrom Biology
toHigh Energy Physics (1992).
[7] W.
Krauth,
M.M6zard,
JPhys
A 20(1987)
L745.[8] T. B.
Kepler,
L. F. Abbott, JPhysique
49(1988)
1657;B. M. Forrest, J
Phys
A 21(1988)
245.[9] D.
Amit,
M.Evans,
H.Homer,
K. Y. M.Wong,
JPhys
A 23(1990)
3361;M.