Storage of spatially correlated patterns in autoassociative memories

(1)

HAL Id: jpa-00246786

https://hal.archives-ouvertes.fr/jpa-00246786

Submitted on 1 Jan 1993

HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

Storage of spatially correlated patterns in autoassociative memories

Rémi Monasson

To cite this version:

Rémi Monasson. Storage of spatially correlated patterns in autoassociative memories. Journal de

Physique I, EDP Sciences, 1993, 3 (5), pp.1141-1152. �10.1051/jp1:1993107�. �jpa-00246786�

(2)

Classification Physics Abstracts

87.10 64.40 75.10H

Storage of spatially correlated patterns ⁱⁿ autoassociative memories

Rdmi Monasson

Laboratoire de Physique Th60rique de l'Ecole Normale Supdrieure

(°),

24 _rue Lhomond, 75231 Paris Cedex 05, France

(Received

2 November 1992, accepted in final form 8 January

1993)

Abstract. The effects of spatially organised data _on autoassociative neural networks _are

investigated in the optimal storage case. An analytical study is possible for weak spatial _cor-

relations. It predicts _an increasing of the storage capacity _ac and ferromagnetic _means for the

couplings. Numerical simulations confirm these results for large spatial correlations.

1. Introduction.

Since it _was

introduced,

the concept ^of ^attractor ^neural ^network ^has

gained

much attention from the

physics

community

[I].

Numerical and theoretical studies have

already

captured _many

interesting

features

(e.g. working

_as _an associative _memory, robustness

against degradation,

fast

parallel processing, learning

from

examples, [2]).

For

mainly

technical _reasons, these results _were obtained in the absence of any

spatial

structure of both the neural networks and the

inputs.

However _most real life

applications

deal with

spatially organised

data.

In this _paper, _we focus _on the effects of such _a

spatial organisation

of the

input

patterns _on the behaviour of _an autoassociative neural network. Such _a network exists in

a d- dimensional

physical

_space and realistic patterns ^exhibit

spatial

correlations.

The aim is twofold. First, it is of interest to _see how the results derived for _non

spatially

structured networks _are modified in this _new framework. We shall concentrate upon the

opti-

mal storage

capacity, using

the statistical mechanics tools

developed by

Gardner-Derrida [3].

Secondly,

_we shall

investigate

how the network

organises

itself

(from

the

weights

distribution

standpoint)

in the _presence of

spatially

correlated patterns. It has

already

been shown for hetero-associative

mapping

_[4] and information

processing

[5], [6] ^that ^the

resulting couplings organisation

is _non trivial. We shall _see here that the latter is

deeply

different from the usual

uncorrelated _case and influences the retrieval

properties

of the network _near the saturation.

(°) Urdt6 Pmpm de Recherche du CNRS, assoc16e h l'Ecole Norrrale Sup6rieure et h l'Urdversit6 de Paris-Sud

(3)

2.

Description

of the model.

The network consists of N neurons, each connected to every

other,

and

taking

_a value +I.

Neuron

S;

is connected to _neuron

Sj

via _a real valued _synapse

J;j (J;;

₌

0, Vi).

^The

^dynamics

of the system ^follows a

sequential updating

rule of the form

s;

(i

+

i)

_m

sign l~ _{J;j sj} ^i)j (i)

>(x;) The

training

set consists of P

= ON

binary

patterns

(ff ) belonging

to a d-dimensional _space.

Zerc-mean

activity

and

spatial

correlation _are

imposed by

the constraints

~~' ~ ~~~

~' "'~i (2)

where the overbars represent _an _average _over the

probability

distribution of the pattern ^bits.

C is _a

symmetric

and

positive

matrix with _a I _on the

diagonal, dependent

_on the distance ii

ii (the subscripts

I and

j

_may be _seen _as d-dimensional vectors but all the simulations

will be

performed

here for d

= I

). Binary

patterns

obviously

_can not be

entirely

defined

by

the two first moments of their distribution. We shall discuss this

point

in the next part.

The

stability

of the pattern _~t at site I is

given by

&S'- _~S'

~ j,,~S' (3)

I i 'J I

>(Xi)

The

training

set is said to be stored if all the patterns are fixed

points

of the

dynamics (I),

I-e.

if all the stabilities _are

positive. However,

_an associative _memory network

requires

that

large

attraction basins surround these stored patterns. ^It ^has ^been ^shown

(at

least for uncorrelated

data)

that their size increases with the value of the parameter _~ ₌

mini,~(Af)

_[8].

We further define _a

quantity

^which ^will _appear in the

following

work. It is the N I dimensional matrix

d;j

=

C;j C;icji (I, j

= 2.

N) (4)

One _can

easily verify

that the inverse matrix is

ld~~)

=

(C~~);. (I, ^j

= 2.

N) (5)

;j ^J

We define the normalised trace of _any matrix A

by

TrA ₌

) £;

A;;.

3.

Analytical study

of the network.

In this part, we try to _answer the

following question

what is the maximal size _a of the

training

set the above network is able to store ? Given _a

training

set

(ff),

for each

possible synaptic

matrix

(J;j ),

_we define _an _error function

p N

E

(lJ;>I, lftl)

=

~ ~ _(~ _At) ₍₆₎

where _K is _a

positive

parameter.

Using

the framework created

by

Gardner-Derrida [3], ^the

partition

function of the network is

z(lffl)

₌

/ _fldJ;> _fld _~ _J( ^lexP (-SE (lJ;> Ii lffl)) (7)

I#>

>(#;)

(4)

where

fl plays

the role of _an inverse temperature. In the

following,

_we shall send

fl

_- _cc, _so that the

integration

in

(7)

^takes into account the

synaptic weights storing

all the patterns ^with stabilities

larger

than _~. The choice of the cost function

(6)

is natural in the _zero temperature limit

only

below the critical

capacity. Training

the network with other

energies

E

(including error-size,

and not

only error-number, dependences)

_may be _more efficient _at

relatively

small

fl

and

high

storage ^levels to

improve

the retrieval

performences

of the network [9].

Using

^the

replica method,

we compute

lnZ,

which is assumed to be

self-averaging

in the limit of

large

network size. Since the domain of suitable

couplings

remains connected _even with correlated patterns _we make the

replica symmetric

Ansatz. This Ansatz has also been shown to

provide

_a stable

saddle-point

for heterc-associative storage [4].

3.I THE _REPLICA CALCULATION OF THE PARTITION FUNCTION. The network

partition

function Z defined in

(7)

is

equal

to the

product

of the N

single

site

partition

functions Z;

wherei=I. N.

Z; ₌

/ _fl _dJ;j

_d

_~j

J;(

lexp -flf9

~

if ~j _J,jf( ₍₈₎

>(Xi)

j(#I)

P"~ >(xi)

~

and in the

large

N limit [3],

~

W=

^~

~j@= @"

_rim _rim

^~~~~ ₍₉₎

N N

~i

N '~~°°"~° ~~

We introduce _n

replicas (J(),

_a ₌ I. _n and _average _over the patterns distribution

(2)

@

fl

d ja

fl

j

£

ja )2

fl fi

l~

I

j,a lj _a j lj

p,a~ ~«~)

X exp

(-fl ^£ 9(~ t(

+

£

~

i(t(

₊

_£~ _I~) ₍₁₀₎

P>~ P>

where

I~ ₌ In _exp

~j _{ii J(} _f(f( I= ~j _I( ~j _Jij

_Cij ₊

~ _i(I$ ~j _Jij J)kdjk

a,f a J ab jk

~j I]I$I[ ~j _Jij _J)k _J/i _c)((

₊ _ii

abc jkl

The tensor

appearing

in the third term of the r-h-s-

equals

C)~~ ^"

~l~j~k~l Cljckl Clkcjl Cllcjk

+ 2

Cljclkcll (~~)

We _may assume, as for the hetero-associative storage [4], that the patterns

satisfy

_some

clustering conditions,

I.e. that the

k-points

^connected correlation function of their

probability

distribution

always

decrease

exponentially

^with the distance

separating

the two closest

points

among the k _ones.

For _any pattern _~t, the

products f(f)

_are in average

equal

to

Co

The

stability

of the pattern _/J defined

by

formula

(3)

will be increased if the synapses between

neighbouring

neurons are

strengthened

with the correct

signs

such that

Jijcij

>

0, Vj.

It is thus

plausible

that

(5)

Jij

₌

O(I)

when the _neuron

j

is close to the _neuron I, while it is of order

fi

in the usual uncorrelated

case. As _a consequence, the truncation of the

I~'s

at the

quadratic

term in t is

not

valid,

_even in the

large

N limit

(as

it used to be in

[4]).

^An exact calculation of

W

for any

spatially

correlated distribution of the

input

patterns does not _seem

possible.

It would both

require

the

knowledge

of all the moments of the statistical distribution of the patterns and the

computation

of _non Gaussian

integrals

in

I.

In the

following,

_we resort to an

expansion

of the

partition

function around the usual _un- correlated

case

(C;j

ci

d;j), keeping only

the first two terms of the

expansion (11).

To be _more

precise,

when

C;j

₌

^x"~i'

and _x <

I,

this Gaussian

approximation

_can be shown _to

provide

the critical

capacity

_up to x~.

3.2 THE _RESULTS _FOR _WEAK CORRELATIONS. For

C;j

_m

b,j,

the entropy ^of the

synaptic

weights

is

equal

to

)

In Z _m

'~l' _)q#

₊ _s@ ₊ _mih _{+ fi}

_)Tr ~j

U + s-q

)~ _(ln(2fiI

₊

_(2§ _#)b)j

+)fli~ £)~_~ Cji[(2fiI

₊

(2§ #)d)~~]jkcki

₊ _o

f

Dz In

(H (@)j ₍₁₃₎

, -q

where

~-z ~/2 +m

~~

@

^'

^~~~~

^~~ ^~~~~

The three order parameters _{m, q,} _s _are

given by

'~

~~~

~ ~~~ ^~

N

~

,~ _~ik

_< _J

j > ~ j J,k=2

~ lk >

N

~

j~~ ^~

^~ ^~

^~lJ

^~lk ^>

⁽¹⁵⁾

where < > denotes the thermal _average _over the

couplings J;j.

Variables fli, #,§ _are

Lagrange

parameters

enforcing

constraints

(15).

fi _ensures the normalisation of each line of the

synaptic

matrix to I. These _seven parameters are found

by solving

the

saddle-point equations

of

(13).

We restrict _our

analysis

to the critical line

a~(~)

where _s _q

- 0 and the four

Lagrange multipliers diverge.

We then obtain _two

coupled equations linking

the _common critical value s~ of _q, _s to the critical parameter _m~

N

~mc "

vc~jcji[(d+v~I)-~]~~C~i

j,k=2

Tr

~

~2 v2

(

~_

j(/~

~~

~~-2j

~

(/~

+ ~

i)2

^c ⁱⁱ ^c ^jk ^kl

~ ik=2

/~

'

N

(16)

~~

i~

+ Vcl '~~~~~ ~ ~~~ ~~

~ _ii [(~

₊

_Vci) ~jjkckl

I,k=2

(6)

where

/~ ^e~~

Vc(~,

mC, ~C) ~'~ ~~

~~~

~ _~

(17)

2~ H

@)

Once _m~ and _s~ have been

computed,

the critical

capacity a~(K)

^is

given by

~~

H

j-j

^~~

^b

^~

(vcll~~~~

Let _us now

apply

these results to the one-dimensional _case with

exponentially decreasing

correlations

Cij

₌

x'~~i',

where _x < I. The

optimal capacity

is reached when _K

= 0 and from

equations (16 18),

we obtain

ac(x)

= 2 +

~ x~ ₊

O(x~) (19)

ir

Let _us evaluate indeed the value of the first

neglected

term

(that involving C(~),

_see

(12)).

As

expected,

the J's do not vanish for

large

N and _are order _x _or less when _x is small. The cubic term in I _therefore _scales

as at least x~. _This

scaling

holds for the

higher

terms in the

expansion (11)

while the

quadratic

term in I is of order x~. _We _conclude _that _the _results _derived _when

forgetting

all but the first two terms _are valid _up to order ~2.

4.

Interpretation

and simulations.

4,I THE STORAGE CAPACITY AND THE QUANTITY OF INFORMATION. The results derived

at the end of the

capacity

of attractor neural networks increases with the presence of

spatial

correlations in the

training

patterns. For feed-forward networks and hetero-associative

mapping,

^the storage

capacity

remained

equal

to

2,

whatever

Gil

_141.

The patterns

presenting

the

spatial

correlations

C;j

₌

x"~J

may be

generated

_as

equilibrium configurations

of _a _one dimensional

Ising

model at temperature T such that _x ₌

tanh(I IT).

Thus the

quantity

of information

I(x)

in _a pattern is

simply

the entropy ^of this

Ising

lattice

I(x)

= N

(- ~)

^ln

^~ ~)

^~

~)

^ln

^~ ( _~)j ₍₂₀₎

2 2 2

When _x < I, ^the ^total

quantity

of information

I(x)

stored in the

training

set is

1(x)

₌

a~(x)N.I(x)

=

N~

_21n _{2 +} ^{~~~ ~} _l x~ ₊

O(x~) ⁽²¹⁾

7r

Though

the

capacity

of autc-associative networks increases when

they

_are

presented

with _spa-

tially

correlated

data,

_we _see that

(as

far _as _we _can trust the small _x

result)

the

quantity

of

information the neural network

actually

stores decreases. This situation

already

occurred with biased patterns [3].

Some numerical simulations _were carried out to check the increase of the storage

capacity

with the _amount of

spatial

correlations inside the

training

set. The patterns _are

generated

through

_a Monte Carlo simulation of the

Ising

model and the Minover

algorithm

finds the

couplings achieving

the

largest stability

_K _{[4], [7].} The sizes of the network _were

equal

to N

=

50,

100 and 200. The differences between these three simulations _were lower than the _error

(7)

z-o

5 i~

I j

(

I

i.o _'~.

i

'",

[

"._

)

""._ _.~

o.5

"".,

^~

~'~.~

o-o ^"'

o-o 0.5 1.0 1.5 2.0

Size of the

training

set _a

Fig-I-

the optimal stability _~ for different sizes

a obtained from the Minover algorithm

(the

full

curve is _a polynomial

fit).

The patterns are drawn from _a _one dimensional Ising model

(z

= 0.7 _see

text)

and the number of _neurons is N

= 200. The dotted line is the theoretical _curve for the usual uncorrelated _case

(z

= 0). The dash-dotted _curve displays the solution of the saddle-point equations

valid for low _z.

bars

(due

to the statistical fluctuations and the finite

running

time of the

Minover).

We

plot

here the results obtained for N

= 200.

Figure

I _compares the results obtained for _x

= 0.7

to the classical uncorrelated _case

(x

₌

0)

[3] ^and ^to ^the solution

provided by

the

saddle-point equations.

We _can notice that there is _a

good

agreement between the simulations and the small _x

theory

for small _a.

However,

when _x _-

I,

the critical

capacity computed

from the

saddle-points equations

scales _as

"~~~'~

_~~ _~

fi~~~ 2(1~ )~

~~~~

which is

clearly

unrealistic since the

quantity

of information

I(x)

would

diverge.

4.2 _THE FERROMAGNETIC STRUCTURE OF THE SYNAPTIC WEIGHTS. In order to shed _some

light

_on the _way the network increases its storage

capacity,

_we compute the first moments of the distribution of its

weights (which

is

gaussian

for small

x)

on the critical line

o~(~)

N

<

Jo

> ₌

Ac ~ Cik ((£l

⁺

vcl)~~j

~.

Ac ₌

) ₍₂₃₎

k=2 ^~

(8)

~

J'~

~ ~ ~~J ^~'~ ~~~ ~

(© ~I)2~

'

~C

f/j~)~i _~)~

~~~)

jk c @

In the usual uncorrelated _case

(C;j

₌

d;j),

^all the _synapses _are of order

§,

with _zero _mean.

When the

input

data become

spatially correlated,

the

weights J;j

_are

equal

to the _sum of two contributions

* a

ferromagnetic

term

J;(

: the _mean values of the synapses

linking

two close cells

(in

the

sense that ii

j(

is _not much

larger

that the correlation

length

of the

input data)

_are finite

and

positive. They

decrease

quickly

with the distance

separating

^the ^units but do not vanish when N goes ^to

infinity.

* a

spin-glass

term

J)j~' ^equation (24)

shows that the

couplings

fluctuate around these

mean values from

sample

to

sample. Furthermore,

the

typical

deviations _are of order

),

like with uncorrelated patterns.

The

origin

^{of the} ^first term

J(

has

already

been

explained

in part 3.I. The parameter _mc

may be

interpreted

as the

typical

value of the

ferromagnetic

part of the stabilities in the limit of small _x.

For _more

general

correlation matrices

(I.e.

not close to the

identity),

the

replica

calculations of part 3.I _are _no

longer

valid. We _can nevertheless

predict

the

qualitative

behaviour of

the network. With respect to the usual uncorrelated case, the

weights

fluctuate around _a

ferromagnetic background J(.

The latter is

self-averaging

it

depends only

on the statistical distribution

C;j

^of ^the patterns, and not on the

particular training

set. This

positive synaptic

"bump" provides

all the units of the network with _an additional field whose _mean is

equal

to N

h+

=

~j _CijJ( ₍₂₅₎

j=2

for each pattern. ^As a

result,

the effective field due to the

tuning

of the

weights (fluctuating parts)

is hs

g, ⁼ ^~

h+

rather than _~. Near the saturation

(when

_~ ₌ 0 for

instance),

_hs,g, is

negative,

while the total field hs_g_ +

h+

is still

positive.

This

explains

how the critical

capacity

of the present ^model _may ^exceed 2 [3], as found in part 3.2.

The results of simulations done with

C;j

₌

(0.7)'~~~'

and _a ₌ I _are

displayed

in

figure

^2.

The _mean values of the

weights J(

=

§

and the

typical

fluctuations

J(~

₌

J;j~

are

plotted

for the distances k

= 1,..., 5 and the different sizes N

=

50,100

and 200

(the

numbers of

samples

the data _are

averaged

_over _range ^from 10 to

100).

^The ^standard ^deviations seem to vanish like

fi

_as it is

predicted

in formula

(24).

Moreover

they

_are

roughly independent

on the distance

ii j( separating

the units. Other simulations

performed

from _o ₌ 0.5 to

a = 2 show that the coefficient B does not vary a lot when the size of the

training

set

changes (0.7

_< B _< 0.9

typically).

The _mean

couplings

^J+

depend

_very

weakly

_on the number of

neurons N

(the

variations _are taken into account

by

the sizes of the

squares).

However,

J(

decreases

quickly

with ii

j(. Figure

3a shows that this

decreasing

is indeed

exponential.

It may be well fitted

by

the

following

law

J(

=

Jt v(x,

~Y)"->'

(26)

Figure

3b

displays

the values of

y(x,a

₌

0.5)

obtained from

figure

3a and _compares them to the theoretical

expectations. Equation (23)

also

predicts

_an

exponential decreasing

of the

synaptic weights.

We _see that the agreement is

quite good although

the

theory

is exact _up to order x~

only.

In addition to the statistical

fluctuations,

the _error bars

appearing

in

figure

(9)

0.4

co

~ N (j,( (j,(x4

~'~ ~~ ^~ ^~

jj

₂₀₀

o 0 06 0.84

~

l~

~

~ 0. 2

lj C

~

+

~' ,.,,f

--X,.,., ,.,,.,-~.,,.,.-

w o ^~ ~

'~ O-I

JZ _+...

~- ,,.,.,.--~,,,,. ~-- -~

b0I ...,,o...,- ~--,... ^,e,.. ...~... ,~.,

~

o

C

f

~ ~ ^a

$i

0 2 3 4 5 6

Distance

(I-j(

Fig.2.

the _mean

couplings J(

=

( _(small squares)

and the fluctuations

J(~

₌

fi

for different sizes N

= 50,100 and 200 and _a

= I. The distances ii

j(

_range from I to 5. The

scaling

of the deviations is

compatible

with _a

fi

law. The _mean

weights

seem to decrease

exponentially

with the distance

separating

two _neurons

(see figure 3).

The size of the squares

gives

an upper bound of the statistical fluctuations of the J+

(from sample

to

sample

and inside each

sample,

from perceptron to

perceptron)

and of their variations for the different N.

3b include the

uncertainty

due the finite

running

time of the Minover.

Using

the notations of

[7] ^and [4], one must take into account the

performance guarantee

^factor AT. It compares the

largest stability

_~m achievable for _one

given sample (I.e.

^after an infinite

time)

to _~T obtained

by

the

algorithm

at time T _~T _< _~m _< AT _~T. All the simulations _were

performed

with

1.02 _< AT _< 1.06.

Figure

4

displays

the behaviour of

J;(

for _a

given

x = 0.7 and different sizes of the

training

set. There is

a

good qualitative

agreement ^with ^the Gaussian

theory.

For low _a, the latter

predicts

J;(

_ci

^@ C;j (a

_-

0) (27)

This _may be

easily understood,

since the Hebb rule becomes

optimal (for

the

stability

cri-

terium)

for finite P. Thus, at low storage, ^the

couplings

reflect the inner structure of the

input

patterns. ^Near ^the

saturation,

the theoretical result is

Jt

_Ci _-~

_(C~~);> (a

- C1c)

(28)

where _~ is _a

positive

constant derived from the

saddle-point equations.

For

C,j

₌

xii-ii,

(10)

-1

".[")"._

-~

~~

..

'l'

.. ._

~ "'.'«.

~ _., ".

u ..

w ,

-3 « .._ ...

uo '. '~ ".

° ". ".. "..

-

m_

..

~ ",

._ .., "._

~ -4

.. ".. "., "..

£

; ^i, ',_

~ K .. > ..

tu _., ._

I _", _{"._}

~ -5 '.

".j

"._ .._I

c I ^".

g ., ._ ._~

~

-6 ~.

"._

'j

~. "..

., ., "._

-?

l 2 3 4 5 6 ? 8 9

distance

ii-ii

o 5

I

~ 04

j

c

l

03

u

j

'~

~ 0.2

b

01

'j/

0 0

0 0 0.2 0.4 0 6 0

input correlation _x

Fig.3. (a)

the _mean values of the couplings

J(

as a function of the distance ii

j( (semi-log plot).

The networks includes N

= 200 _neurons and the data _are averaged

over 100 samples. The s12e of the training set is _a

= 0.5 and the different correlation strengths _are from top to down _: _z =0.8, 0.7, 0.6, 0.4 and 0.2. The numerical data provided by the Minover algorithm _seem to obey _an exponential

law proportional to

y"~?' _(see

_equation

(26)). (b)

the dotted line shows

y(z)

for _a

= 0.5

as obtained from the weak correlation theory. The points _are deduced flom the figure 3a. The _error bars take into

account the statistical fluctuations and the uncertainty due to the stop of the Minover algorithm after

a finite time

(see

_part

4.2).

JOURNAL DE PHYSIOUE , -T I N'~ MAY iU91 4jj

(11)

o.5

3

_O.4

a a a a a a

-- °

01 a

~

I

u

0.3

-~

§

01

+ -n

0l 0.2

~

J$

flf

~ ^~

~ +

~'~ ^~

~

~ +

zg +

~ +

o-o

~

o-O 0.5 1-O 1.5 Z-O

Size of the

training

set _a

FigA.

the _mean

couplings J;(

as functions of the

training

set size _a for _x ₌ 0.7. From top

to

down,

_ii

j(=1, 2, 3,

4 and 5. The small _x

predictions (frill curves)

_are

compared

to the simulation results

(the

size of the

symbols

represent the

largest

statistical standard

deviation).

Note the

qualitative agreement

for small _a

(J(

_m

@ x'~~j')

and for

large

_o

(J;(

« I if

ii ii #1).

the

only subsisting ferromagnetic couplings

link the nearest

neighbours

[4]. ^This

prediction (I.e.

^the mean

weights

_are

given by

^the inverse matrix at the critical

capacity)

_seenw to be

corroborated

by

the simulations.

4.3 THE DYNAMICAL STABILITY OF THE PATTERNS. The

question

_we consider _now is the

following

does the

ferromagnetic

structure of the

weights

influence the retrieval abilities of the network ? Let _us start from _a stored pattern _~t. We then choose _a site at random

(numbered I)

^and

flip

the

corresponding spin.

In the usual uncorrelated _case, the stabilities of the other

spins

are

changed by O(h).

In the

large

N

limit,

this

single flip

cannot affect the other units and the

training

pattern ^is

perfectly

retrieved

through

the

dynamical

evolution of the network

(the

notion of attraction basins is

classically

defined from the reversal of _a finite fraction of the total number of _neurons N

[8]).

When the

couplings

have

ferromagnetic

_means, the situation is _more

complicated.

The stored pattern _~t ^exhibit

spatial

correlations whose

typical length

is L with

C;j

₌

exp(-(I

j

(IL) (e,g.

L _ci 2.8 for _x

= 0.7 in the

previous simulations).

It may be

roughly

described _as the

juxtaposition

of domains of _same

sign spins

and the

lengths

of which _are around L. Let _us suppose now that the site _{+ I}

(or I) belongs

to the _same domain _as I. After the

flip,

its

(12)

stability

^is ^decreased

by

dAf+1

=

-2Ji,1+iffff+1

₌

-2Ji,1+1 (29)

which becomes

equal

to

-2J£

in the

large

N limit. As _soon _as _a is

large enough (I.e.

~ <

2J£),

the

neighbours

_are in turn

flipped

and the initial

perturbation propagates. However,

when it reaches the

edge

of the

domain,

the

stability

of the first _neuron _m of the

opposite spins

block is

changed by

bA$

₌

-2Jm-1,m($-1($

=

+2Jm-1,m (30)

which is _now

positive.

This _neuron is thus stable. We see that the the initial

spin flip

_cause the reversal of the whole domain this unit

belonged

to. We _can conclude that the fixed

point

of the

dynamics

differs from the pattern _~t ⁱⁿ a finite number of bits

(roughly L).

There is _no

perfect

retrieval but for

large networks,

the

overlap

between the _two patterns _goes to I.

The above

reasoning implicitely

takes

only

into account the nearest

neighbours

interactions.

It is

justified by

the

sharp decreasing

of the J+ with the distance

separating

the units

(see Fig.2).

This discussion also

pointed

out the main difference between the

spins belonging

to

centers of the domains

(I.e. having

^the same

sign

as their

neighbours)

and the

spins

located at the

edgq

of the blocks. The " center" units _are,

roughly speaking,

stabilised

by

the J+

coup1blgs

while the field

incoming

onto the

"edge" spins

due to the

ferromagnetic weights

vanishes. The latter _can

only

be stored in their

right

directions

(up

or

down) by

the

tuning

of the J~.E.

couplings.

Since the number of such

"edge" spins

is _a fraction of N

(as

_soon _as L _>

0),

the

storage capacity

increases.

5.

Sununary

and discussion.

In this _paper, _we have focused _on the storage

properties

of autc-associative memories when

they

are

presented

with

spatially

correlated patterns. We have

computed

^the free energy for weak correlations. It

already predicts

two mabl differences with respect to the classical uncorrelated

case.

First,

the

storage capacity

increases but the total

quantity

of information _seenw _to decrease.

This situation is reminiscent of the biased patterns case [3] and

emphasizes

the

necessity

of

decorrelating

the data to

improve

the

efficiency

of the memories. One _must however be cautious about the

possible analogy

with

neurobiological

facts

(decorrelation

of the visual stimuli

through

the first retina

layers

for

instance).

The results _we obtained here

depend strongly

on the

binary

nature of the _neurons. In

particular,

the

ferromagnetic weights

are

useful since

they

allow the units to

strengthen

the

input

fields in the

right

direction. Such

a

principle

would not be

adequate

_any

longer

for continuous _neurons

(which

_may be _more relevant for

biological modelizations),

where the

intensity

of the fields

(and

not

only

their

signs)

matters to the cells _responses.

Secondly,

the

synaptic weights

exhibit _a richer structure than for uncorrelated patterns.

In addition to the usual

O(j~) fluctuating weights (I.e.

tuned

during

the

training),

there is _a

O(I) self-averaging

and

ferromagnetic background.

These short _range

couplings

take

advantage

^{of the}

spatial

correlations of the

input

patterns to enhance the cells

fields,

without

affecting

too much the retrieval

performances

of

large

networks. Some numerical simulations

were

performed

for

exponentially decreasing

correlations and _one dimensional patterns.

They

show _a

surprisingly good

agreement ^with ^above

analytical predictions (valid

for weak

spatial correlations).

All the results _we

presented

here _were derived for continuous

weights.

It would be

interesting

to know what

happens

when other

prescriptions

_are

imposed

_over the

couplings.

With

binary

(13)

weights (J;j

₌

+fi)

_or bounded _synapses

((J;j(

<

§),

the

ferromagnetic

_means could not be of order I and the _nature of the

optimal synaptic

matrix would be different.

Acknowledgments.

am

grateful

to M. Mdzard for _numerous

illuminating

discussions and

readings

of the

manuscript

I also thank W. Krauth for useful remarks

concerning

the numerical simulations and I.

Kocher,

D.

O'Kane,

A. ~eves for their

help during

the

completion

of this work.

References

[ii

J. J. Hopfield, ^Proc Nat' Acad Sci USA 79

(1982)

2554.

[2] ^D. Amit, "Modeling Brain Function" Cambridge University Press

(1989).

[3] ^E. Gardner, J Phys A 21

(1988)

257;

E. Gardner, B. Derrida, J Phys A 21

(1988)

271 [4] ^R. Monasson, J Phys A 25

(1992)

3701.

[5] ^J. ^J. Atick, A. N. Redlich, Neural Comput. 2

(1990)

308.

[6] ^J. ^P. Nadal, N. Parga, "Information processing by _a perceptron" Proceedings of ELBA, Neural Networks from Biology _to High Energy Physics

(1992).

[7] W. Krauth, M. M6zard, J Phys A 20

(1987)

L745.

[8] ^T. B. Kepler, L. F. Abbott, J Physique 49

(1988)

1657;

B. M. Forrest, J Phys A 21

(1988)

245.

[9] D. Amit, M. Evans, H. Homer, K. Y. M. Wong, J Phys A 23

(1990)

3361;

M. Griniasty, H. Gutfreund, J Phys A 24

(1991)

715.

Storage of spatially correlated patterns in autoassociative memories

HAL Id: jpa-00246786

https://hal.archives-ouvertes.fr/jpa-00246786

Submitted on 1 Jan 1993

HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

Storage of spatially correlated patterns in autoassociative memories

Rémi Monasson

To cite this version:

Rémi Monasson. Storage of spatially correlated patterns in autoassociative memories. Journal de

Physique I, EDP Sciences, 1993, 3 (5), pp.1141-1152. �10.1051/jp1:1993107�. �jpa-00246786�

Storage of spatially correlated patterns in autoassociative memories

(°),

(Received

1993)

introduced,

gained

physics

[I].

already

interesting

(e.g. working

against degradation,

parallel processing, learning

examples, [2]).

mainly

spatial

inputs.

applications

spatially organised

spatial organisation

input

physical

spatial

spatially

opti-

capacity, using

developed by

Secondly,

investigate

organises

(from

weights

standpoint)

spatially

already

mapping

processing

resulting couplings organisation

deeply

properties

Description

other,

taking

S;

Sj

J;j (J;;

0, Vi).

dynamics

sequential updating

(i

i)

sign l~ J;j sj i)j (i)

training

binary

(ff ) belonging

activity

spatial

imposed by

~~~' ~~"'~i (2)

probability

symmetric

positive

diagonal, dependent

ii (the subscripts

j

performed

). Binary

obviously

entirely

Storage of spatially correlated patterns ⁱⁿ autoassociative memories

^dynamics

sign l~ _{J;j sj} ^i)j (i)

~' "'~i (2)

(C~~);. (I, ^j

~ ~ _(~ _At) ₍₆₎

/ _fldJ;> _fld _~ _J( ^lexP (-SE (lJ;> Ii lffl)) (7)