
HAL Id: jpa-00246757

https://hal.archives-ouvertes.fr/jpa-00246757

Submitted on 1 Jan 1993


Attraction domains in neural networks

L. Viana, A. Coolen

To cite this version:

L. Viana, A. Coolen. Attraction domains in neural networks. Journal de Physique I, EDP Sciences, 1993, 3 (3), pp. 777-786. DOI: 10.1051/jp1:1993162. HAL: jpa-00246757.


J. Phys. I France 3 (1993) 777-786    MARCH 1993, PAGE 777

Classification
Physics Abstracts
87.30 75.10H 64.601

Attraction domains in neural networks

L. Viana and A. C. C. Coolen

Lab. de Ensenada, Instituto de Física, UNAM, A. Postal 2681, 22800 Ensenada, B.C., México
Department of Theoretical Physics, University of Oxford, 1 Keble Road, Oxford OX1 3NP, U.K.

(Received 28 April 1992, accepted in revised form 8 October 1992)

Abstract. We performed a systematic study of the sizes of the basins of attraction in a Hebbian-type neural network in which small numbers of patterns were stored with non-uniform embedding strengths w_μ. This was done by iterating numerically the flux equations for the «overlaps» between the stored patterns and the dynamical state of the system for zero noise level T. We found that the existence of attractors related to mixtures of three or more pure memories depends on the specific values of the embedding strengths involved. With the same method we also obtained the domain sizes for the standard Hopfield model for p ≤ 18.

1. Introduction.

In the last few years, neural networks (NN) have attracted a fair amount of attention due to their properties as content-addressable memories [1]. In these systems the p stored patterns are stable states which act as attractors of the dynamics of an N-spin system, thus allowing the recoverability of information from partial or noisy data. Amit et al. [2] demonstrated that in the thermodynamical limit it is indeed possible to study the space of configurations of NN in a systematic way, by using statistical mechanics tools. In their studies for the Hebb model, they predict the existence of a huge number of spurious stable states in addition to those corresponding to the stored patterns (to be called pure memories); these spurious states deteriorate the memory function, for they also act as attractors of the dynamics of the system.

Their approach has been used to study other Hopfield-like models [3, 4]; however, although it gives us essential information about the existence and stability of attractive states, it does not allow us to evaluate their importance, in the sense that it does not deal with the size of their basins of attraction. If we want to assess the importance of pure and spurious fixed-point attractors, then it becomes necessary to go one step further and evaluate the size of the attraction domains, as this quantity is directly related to the content-addressability of the stored information.

Several analytical and numerical approaches have been followed, at various levels of description, in order to evaluate the basins of attraction for a number of models [5-7]. Among such work, we can mention that of Forrest [6], who calculated numerically the mean fraction f(q_0) of states which are recalled with less than N/16 errors for various initial overlaps q_0 and p = αN, and the work of Horner et al. [7], who made a non-equilibrium treatment for a Hopfield-type NN with different levels of activity that allowed them to study dynamical properties. They evaluated the critical value q_c of the initial (pure) overlap needed to trigger retrieval of this particular pattern, and subsequently defined the quantity R = 1 − q_c to be the corresponding size of the basin of attraction (again for α = p/N finite).

In this work we will define the «size of basin of attraction» as the fraction f_μ of all microstates which evolve towards the μ-th stored pattern, and will calculate this quantity for finite p in the thermodynamical limit N → ∞. This calculation can be done at either a macroscopic or a microscopic level. At a microscopic level, it consists in considering a NN composed of N elements where p patterns have been stored, and then carrying out actual simulations of the (Monte Carlo) dynamics starting from random initial states [8]. On the other hand, the macroscopic level concerns the overlap vector q, whose components q_μ constitute a macroscopic measure of the resemblance between the present microscopic state {S_i} of the system and each of the stored patterns ξ^μ. The procedure is based on the iteration of the flux equations for these overlaps, starting from random Gaussian initial states. This level of treatment is especially convenient, as not all the microscopic details of the system, at the spin level, are relevant. However, it can only be implemented in the case α = 0 (p finite as N → ∞).

By using this last method, Coolen calculated the cumulative size of the attraction domains of the p stored patterns, defined as f_p ≡ 2 Σ_μ f_μ, for the Hebb model [9] (the factor 2 comes from the symmetry ξ^μ ↔ −ξ^μ). He performed this calculation analytically for p ≤ 3 and numerically for p ≥ 4. He found an interesting result: after an initial decrease, the cumulative domain size of the stored patterns begins to increase for p ≥ 6; that is, the effect of mixed states is reduced by increasing the number of patterns stored.

The same kind of work was also performed for a modified Hebb model in which each pattern is stored with a different weight [10]; the importance of this last model is that it is possible to increase or decrease the domain size of individual stored patterns by increasing or decreasing the weight associated to them, so various degrees of training can be accounted for [3]. For this model and p = 3, the sizes of the basins of attraction were found provided the restriction w_μ < Σ_{λ≠μ} w_λ, for all μ, applied; however, there was no answer for the cases violating this restriction. Therefore, in this paper we perform a detailed numerical study of the domain sizes for this modified Hebb model as a function of the embedding strengths {w_μ} for p = 3, 4; we put special emphasis on the problem of spurious memories, by evaluating their cumulative basin of attraction. We also analyse the domain sizes for the standard Hebb model as a function of p, for large values of p.

2. Analytical background.

We will consider the Hamiltonian

    H = − Σ_{(i<j)} J_{ij} S_i S_j ,    S_i = ±1 ,    (1)

describing a system composed of N neuron-like Ising elements S_i, whose (symmetrical) interactions J_{ij} between pairs (ij) reflect the storage of a finite number p of random unbiased patterns ξ_i^μ = ±1, with μ = 1, …, p, according to a modified Hebb rule [3]

    J_{ij} = (1/N) Σ_{μ=1}^{p} w_μ ξ_i^μ ξ_j^μ ,    (2)

where w_μ is the weight associated to the μ-th pattern. In the case where all embedding strengths are equal (w_μ = 1 for all μ), equation (2) reduces to the Hebb rule.
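To make the setting concrete, the following minimal NumPy sketch builds the coupling matrix of equation (2) for a small network and runs a zero-temperature synchronous update S_i ← sign(Σ_j J_ij S_j), monitoring the overlaps with the stored patterns. The network size, the weights and the initial noise level are illustrative choices, not values used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

N, p = 500, 3                          # network size and number of stored patterns (illustrative)
w = np.array([1.0, 0.8, 0.6])          # embedding strengths w_mu, with w_1 = 1 by convention
xi = rng.choice([-1, 1], size=(p, N))  # random unbiased patterns xi^mu_i = +/-1

# Modified Hebb rule, Eq. (2): J_ij = (1/N) sum_mu w_mu xi^mu_i xi^mu_j
J = (xi.T * w) @ xi / N
np.fill_diagonal(J, 0.0)               # no self-interaction

def overlaps(S):
    """Overlaps q_mu = (1/N) sum_i xi^mu_i S_i with each stored pattern."""
    return xi @ S / N

def step(S):
    """Zero-temperature synchronous update S_i <- sign(sum_j J_ij S_j)."""
    return np.where(J @ S >= 0, 1, -1)

# Start from a noisy copy of pattern 1 and watch the overlap vector evolve
S = xi[0] * rng.choice([1, -1], size=N, p=[0.85, 0.15])
for n in range(4):
    print(n, np.round(overlaps(S), 3))
    S = step(S)
```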

The equilibrium properties of this system are characterized at a macroscopic level by the existence of p order parameters q_μ, or «overlaps», which measure the resemblance between a microscopic stable state of the system and the μ-th pattern. In the thermodynamical limit (N → ∞), the value of these overlaps is given [2, 3] by the solutions to the set of p coupled equations

    q_μ = ⟨ ξ^μ tanh( β Σ_ν w_ν q_ν ξ^ν ) ⟩ ,    μ = 1, …, p ,    (3)

where β is defined as the inverse of the noise level (β ∝ 1/T), and the brackets ⟨ ⟩ indicate an average over the random variables {ξ_i^μ}. In this limit strong averaging applies [11]; as a consequence, for zero noise level (T = 0), this equation can be written as

    q_μ = ⟨⟨ η_μ sign( Σ_ν w_ν q_ν η_ν ) ⟩⟩_η ,    (4)

where the double bracket ⟨⟨ ⟩⟩_η indicates an average over the 2^p corners η (η_μ = ±1) of the hypercube surrounding m = 0.

This static picture has a dynamical counterpart: for a system with a synchronous parallel dynamics, the time evolution of the overlaps is given by the mapping [9]

    q(n + 1) = F(q(n)) ,    (5)

where F(q) is given by

    F(q) = ⟨⟨ η sign( Σ_ν w_ν q_ν η_ν ) ⟩⟩_η ,    (6)

in such a way that fixed points of the dynamics (attractors), that is, points satisfying q(n + 1) = q(n), correspond to the stable points given by equation (4).
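As an illustration of how equations (5)-(6) can be iterated in practice, here is a minimal NumPy sketch of the zero-noise flux map, in which the double-bracket average is carried out explicitly over the 2^p corners of the hypercube. The weights and initial overlap vectors are only illustrative choices (note that numpy.sign returns 0 on the measure-zero boundary corners discussed below).

```python
import numpy as np
from itertools import product

def flux_map(q, w):
    """One synchronous step of the overlap dynamics, Eqs. (5)-(6):
    F_mu(q) = << eta_mu * sign(sum_nu w_nu q_nu eta_nu) >>,
    the average running over the 2^p corners eta in {-1,+1}^p."""
    p = len(q)
    corners = np.array(list(product([-1, 1], repeat=p)), dtype=float)
    fields = corners @ (np.asarray(w) * np.asarray(q))   # sum_nu w_nu q_nu eta_nu
    return (corners * np.sign(fields)[:, None]).mean(axis=0)

def iterate(q0, w, max_steps=50):
    """Iterate q(n+1) = F(q(n)) until a fixed point is reached."""
    q = np.asarray(q0, dtype=float)
    for n in range(max_steps):
        q_next = flux_map(q, w)
        if np.allclose(q_next, q):
            return q_next, n
        q = q_next
    return q, max_steps

# Hebb weights, p = 3: one initial state recalls a pure memory,
# another one ends on the symmetric mixture (1/2, 1/2, 1/2)
print(iterate([0.4, 0.1, 0.05], [1.0, 1.0, 1.0]))   # converges to (1, 0, 0)
print(iterate([0.3, 0.2, 0.15], [1.0, 1.0, 1.0]))   # converges to (0.5, 0.5, 0.5)
```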

On the other hand, the domain size f_μ, related to the μ-th pattern, can be written as

    f_μ ≡ lim_{N→∞} ∫_{Δ_μ} dq D(q) ,    (7a)

where Δ_μ is defined as the region containing all the initial states which eventually evolve towards that pattern, and D(q) is the density of states in the «overlaps» space, given by

    D(q) ≡ 2^{−N} Σ_{{S_i}} δ( q − q({S_i}) ) .    (7b)

In some particular cases and approximations, f_μ can be obtained analytically [9, 10]; in others, this quantity might be evaluated by numerical iteration of the flux equations (5)-(6).

The regions where F(q) takes a given value f_α are convex (bounded by the planes Σ_λ w_λ q_λ η_λ = 0) and, for p finite, this vector can only take a finite number of values f_α, each of them associated to a region D_α in the «overlaps» space. If we now define the set

    R_μ ≡ { q ∈ R^p : w_μ |q_μ| > Σ_{λ≠μ} w_λ |q_λ| } ,    (8)

then we know that F(q_0) = (0, …, q'_μ, …, 0), with q'_μ = ±1, for all q_0 ∈ R_μ. Therefore the set R_μ has the following properties:

(1) R_μ is convex.
(2) For all q in R_μ, F²(q) = F(q). Therefore, all initial q-states in R_μ will evolve towards the μ-th pattern.

This quantity allows us to calculate analytically the fraction of microstates which evolve towards the μ-th pattern in one single step, and therefore gives us a lower bound to the fraction f_μ (clearly R_μ ⊆ Δ_μ). It is important to notice that the boundary of the region defined by R_μ, namely the set obtained by using an «=» sign in equation (8) instead of «>», does not satisfy F²(q) = F(q); however, as N → ∞ this set has measure 0.
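The following sketch estimates this one-step lower bound by simple Monte Carlo: it samples zero-mean Gaussian overlap vectors (the membership condition (8) is scale invariant, so the width of the Gaussian is irrelevant) and counts how often they fall inside ∪_μ R_μ. It is written for the Hebb case w_μ = 1; the sample size is an arbitrary choice.

```python
import numpy as np

def lower_bound_fraction(p, n_samples=200_000, seed=1):
    """Monte Carlo estimate of the one-step lower bound on the cumulative
    domain size for the Hebb model (all w_mu = 1): the fraction of Gaussian
    initial overlap vectors q lying in the union of the sets R_mu of Eq. (8),
    |q_mu| > sum_{lambda != mu} |q_lambda|."""
    rng = np.random.default_rng(seed)
    q = rng.normal(size=(n_samples, p))            # zero-mean Gaussian initial overlaps
    a = np.abs(q)
    in_R = 2 * a > a.sum(axis=1, keepdims=True)    # |q_mu| > sum over the other components
    return in_R.any(axis=1).mean()

for p in (2, 3, 4, 6, 10):
    print(p, round(lower_bound_fraction(p), 3))
```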

Figure 1 includes the lower bound of f_p (×××) as a function of p for the Hebb model (all w_μ = 1), as calculated by integrating equation (7) over the region ∪_μ R_μ; it also includes the value of f_p as obtained by a numerical iteration of the flux equations. As we can see, after an initial decrease, the cumulative domain size of the stored patterns begins to increase for p > 6 and tends asymptotically to a value around 0.88; that is, the effect of mixed states is reduced by increasing the number of stored patterns. This result improves that reported by Coolen et al. [9] by eliminating some finite size effects.

[Figure 1: plot of the cumulative domain size f_p versus p (p = 1 to 20) for the Hebb model, {w_μ} = 1.]

Fig. 1. This figure shows the cumulative domain size f_p = 2 Σ_μ f_μ for the Hebb model, as obtained by a numerical iteration of the flux equations, as a function of p, and the analytical lower bound to this quantity (×××).

3. Fixed points of the flux equations.

Equations (5)-(6) have a number of fixed points, some of them acting as attractors of the dynamics of the system. It has been a common belief that, for a given finite number p of stored patterns in the thermodynamic limit (α = 0, N → ∞), there exists a fixed point related to each of the p pure memories, plus additional fixed points related to any combination of r pure memories, where 3 ≤ r ≤ p (plus their symmetrical counterparts). Although this is true for the Hebb model (for α = 0), in this paper we will show that for the modified Hebb model the existence of attractors related to any mixture of three or more pure memories depends on the specific values of the {w_μ} involved. To this end, we ordered the weights, without losing generality, in such a way that w_1 = 1 and 0 < w_i ≤ w_j for i > j.

Fixed points q(n + 1) = q(n) are classified into two groups depending on their number of non-zero components, these corresponding to the pure and spurious states. A fixed point related only to the μ-th stored memory ξ^μ is characterized mathematically by having only one component different from zero, that is q_μ ≠ 0 and q_ν = 0 for all ν ≠ μ. On the other hand, a fixed point corresponding to an attractor related to a mixture of several stored memories is one that has simultaneously more than one component, or overlap, different from zero.

3.1 p = 1, 2. By inspection of equations (5)-(8) we can observe that, in the case when less than three patterns have been stored, ∪_μ R_μ covers the whole overlap space. Therefore, the system does not have any spurious attractors, and equation (1) can be solved exactly.

3.2 p = 3. For the case p = 3, it is possible to demonstrate (see Appendix A) that the relationship F²(q) = F(q) holds for all q if w_μ < Σ_{λ≠μ} w_λ for all μ; this means that for any initial value q, the flux equations will converge to a fixed point (related either to a pure or to a mixed memory) in a single time step. Due to our convention in the ordering of the weights, this restriction can be summarized as 1 < w_2 + w_3. Clearly, the Hebb case lies within this category.

By an exhaustive analysis of the possible fixed points of the flux equations (5)-(6) we found the following: this system has one fixed point q_ν = δ_νμ related to each stored pattern μ (plus its symmetrical counterpart q_ν = −δ_νμ); these fixed points exist for any set {w_μ} (1 = w_1 ≥ w_2 ≥ w_3 > 0). Additionally, we found two different kinds of fixed points corresponding to mixture states (with any combination of signs):

I)  q_1 = ±1/2, q_2 = ±1/2, q_3 = ±1/2, existent in the region w_2 + w_3 > w_1;

II) q_1 = ±1/4, q_i = ±3/4, q_j = ±1/4 (i, j = 2, 3), existent along the line w_i = (w_1 + w_j)/3.

A stability analysis shows that only solution I, which exists for w_2 + w_3 > w_1 = 1, is stable, being stable in the whole region where it exists. In other words, the solutions of type II do not correspond to attractors. It is interesting to note that fixed points related to spurious memories do not exist for w_2 + w_3 < w_1 = 1; this is remarkable, for it has been a common belief that having more than two memories in Hebbian-type models for α = 0 (N → ∞) implies the existence of spurious stable states. These fixed points are indicated in figure 2.
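The stability statement above can be probed numerically with the flux map of equations (5)-(6): perturb a candidate fixed point slightly and check whether the iteration returns to it. The sketch below does this for solutions I and II; the weight values are illustrative, and flux_map is the same construction as in the sketch given after equation (6).

```python
import numpy as np
from itertools import product

def flux_map(q, w):
    """Zero-noise synchronous overlap map, Eqs. (5)-(6)."""
    corners = np.array(list(product([-1, 1], repeat=len(q))), dtype=float)
    fields = corners @ (np.asarray(w) * np.asarray(q))
    return (corners * np.sign(fields)[:, None]).mean(axis=0)

def attracts(q_star, w, eps=0.02, trials=200, steps=20, seed=3):
    """Return True if every slightly perturbed trajectory comes back to q_star."""
    rng = np.random.default_rng(seed)
    q_star = np.asarray(q_star, dtype=float)
    for _ in range(trials):
        q = q_star + eps * rng.normal(size=q_star.size)
        for _ in range(steps):
            q = flux_map(q, w)
        if not np.allclose(q, q_star, atol=1e-9):
            return False
    return True

w = [1.0, 0.9, 0.8]                       # w_2 + w_3 > w_1: mixture solution I exists
print(attracts([0.5, 0.5, 0.5], w))       # solution I:  expected True (attractor)

w2 = [1.0, 0.9, (1.0 + 0.9) / 3]          # w_3 = (w_1 + w_2)/3: line where solution II exists
print(attracts([0.25, 0.25, 0.75], w2))   # solution II: expected False (not an attractor)
```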

3.3 p = 4. In this case, all solutions mentioned above exist. That is, there is an attractor related to each of the 4 stored patterns, plus one spurious attractor related to each combination of r = 3 stored memories whenever the condition w_i < w_j + w_k, for i ≠ j, k and any value of w_l, is satisfied; here (i, j, k, l) is any permutation of (1, 2, 3, 4) (this restriction leaves out many regions of the parameters' space). Additionally, new attractors appear which are mixtures of 4 patterns. Due to the large number of parameters it is not simple to find out, for p > 3, which are the conditions required for particular types of fixed points to exist. Similarly, it is not possible to derive analytically the number of iterations required for the flux equations to converge, so it becomes necessary to find out numerically the answer to these questions.

[Figure 2: contour plot in the (w_2, w_3) plane, 0 ≤ w_2, w_3 ≤ 1, for p = 3.]

Fig. 2. p = 3. The behaviour of the network in w-space is separated into two regimes by the line w_2 + w_3 = 1. Above this line, the contour levels represent the cumulative domain size f_p. In the shadowed area f_3 = 1 exactly; in this region broken lines represent the percentage of times the flux equations converge on the first iteration. Along the lines marked (+ + +) and (+ +) there exist unstable fixed points.

4. Numerical iteration of the flux equations.

In order to evaluate the domain sizes of the p = 3, 4 attractors, a systematic study was performed of the evolution in time of the flux equations for a NN with synchronous dynamics. The p memories were stored according to the modified Hebb rule [Eqs. (1)-(2)]. This study was done as a function of the embedding strengths {w_μ}, with the conventions previously indicated, and by considering a grid in the parameters' space given by Δw = 0.04. The idea was to calculate the cumulative size of the attraction domains f_p = 2 Σ_μ f_μ by iterating the flux equations (5)-(6) with random initial values for q, obtained from a Gaussian distribution D(q) with zero mean and a dispersion σ → 0. This was done 10,000 times. The choice of this distribution reflects random initial states {S_i} when a set of non-biased random patterns {ξ^μ} has been stored in a network composed of N ≫ 1 spins.
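As a rough illustration of this procedure, the sketch below iterates the flux map from many narrow zero-mean Gaussian initial overlap vectors and counts the fraction of trajectories terminating on a pure-memory fixed point (exactly one non-zero overlap), which gives the cumulative domain size f_p. The number of samples, the Gaussian width (irrelevant here, since the map depends only on signs of linear combinations of q) and the example weight sets are illustrative.

```python
import numpy as np
from itertools import product

def flux_map(q, w, corners):
    """Zero-noise synchronous overlap map, Eqs. (5)-(6)."""
    fields = corners @ (w * q)
    return (corners * np.sign(fields)[:, None]).mean(axis=0)

def cumulative_domain_size(w, n_init=10_000, max_steps=50, seed=2):
    """Fraction of random initial overlap vectors ending on a pure memory
    (a single non-zero component q_mu = +/-1), i.e. f_p = 2 sum_mu f_mu."""
    w = np.asarray(w, dtype=float)
    corners = np.array(list(product([-1, 1], repeat=len(w))), dtype=float)
    rng = np.random.default_rng(seed)
    pure = 0
    for _ in range(n_init):
        q = rng.normal(scale=1e-3, size=len(w))       # narrow zero-mean Gaussian initial state
        for _ in range(max_steps):
            q_new = flux_map(q, w, corners)
            if np.allclose(q_new, q):
                break
            q = q_new
        if np.count_nonzero(np.abs(q) > 1e-9) == 1:   # pure-memory fixed point
            pure += 1
    return pure / n_init

print(cumulative_domain_size([1.0, 1.0, 1.0]))   # Hebb case, p = 3
print(cumulative_domain_size([1.0, 0.5, 0.4]))   # w_2 + w_3 < w_1: expect f_3 = 1
```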

The results obtained were the following. For p = 3 the behaviour of the network was found to be separated into two regimes by the line w_2 + w_3 = 1 (in general, by the line w_2 + w_3 = w_1), as follows:


• Below this line, for w_2 + w_3 < w_1 = 1, all attractors correspond to one of the stored patterns, that is, no spurious memories exist, so f_p = 1. In this regime, the flux equations converge in either one or two time steps. Broken lines in figure 2 show the contour levels for the percentage of times the flux equations converge on the first iteration; the remaining times they require two iterations to converge.

• As the line w_2 + w_3 = w_1 is crossed (entering the parameter region w_2 + w_3 > w_1 = 1), there is an abrupt transition into a different regime: here, all flux equations converge on the first iteration; however, not all the attractor states are related to only one of the p stored patterns. Solid lines in figure 2 indicate the contour levels for the percentages of microstates f_p which evolve towards states related to pure memories; these lines can be obtained analytically [10]. It is interesting to note that the contour lines in figure 2 seem to continue across the line w_2 + w_3 = w_1, but have a different meaning on each side. This indicates that all those cases in which the flux equations do not converge on the first step get transformed into spurious memories as one switches to the other regime.

Another indication of this is the behaviour along the line w_2 + w_3 = w_1 = 1, which happens to be on the boundary of the region R_1; on this line, all the points show both a percentage of spurious memories and a percentage of pure memories for which two steps were needed to obtain the fixed point. In these cases, these two percentages sum to the same value as the percentage of the contour lines they lie on.

For p = 4 we found the following: there are large regions in the parameter space where no spurious attractors with r = 3 exist. Additionally, by numerical iteration of the flux equations we found that there is a region where there are no attractors related to spurious memories at all. However, contrary to what happens for p = 3, the transition between regions with and without spurious memories is a soft one. That is, as we change the embedding strengths {w_μ}, the fraction of microstates which evolve towards a spurious attractor goes smoothly from values equal to zero to values different from zero.

Figures 3 and 4 show the contour lines for the cumulative domain size f_4, for some sets of values {w_μ} of the embedding strengths, as calculated by numerical iteration of the flux equations starting from random initial states, as indicated above. In these figures, shadowed areas correspond to regions with f_4 = 1. The main figure in figure 3 corresponds to w_1 = 1, w_2 = 1, 0 < w_4 ≤ w_3 ≤ w_2. As we can see, the Hebb case presents the lowest cumulative domain size, with f_4 ≈ 0.5; this value increases as w_3 and w_4 decrease, up to a value of almost f_4 = 1. For any set of values {w_μ} included on this graph, some of the pure memories require more than one iteration to converge to a fixed point, the average number of iterations required being the highest in the Hebb case, with n_av = 1.42, and decreasing to about 1.10 for w_1 = 1, w_2 = 1, and w_3, w_4 → 0. The inset in figure 3 represents the contour lines for the case w_1 = 1 and 0 < w_4 ≤ w_3 ≤ w_2 = 0.4; as we can see, there is a soft (second order) transition between a region with f_4 = 1 and another one with f_4 < 1. Figure 4 depicts the contour lines of f_4 for three other cuts in the parameters' space.

[Figure 3: contour plot of f_4 (main panel and inset); lines marked (+ + +) and (+ +) are drawn.]

Fig. 3. The main figure shows the contour lines of f_4 for the case 0 < w_4 ≤ w_3 ≤ w_2 = w_1 = 1; in this region of {w_μ} there are always spurious memories. The inset shows the case 0 < w_4 ≤ w_3 ≤ w_2 = 0.4, with w_1 = 1; the shadowed area shows the region in w-space with f_4 = 1.

[Figure 4: three contour panels of f_4 in the (w_3, w_4) plane at w_1 = 1.0 and different values of w_2.]

Fig. 4. Contour lines for f_4, for three different regions in the parameter space {w_μ}; the shadowed areas indicate the regions where f_4 = 1.

5. Discussion.

The flux equations which are used in this approach are exact in the thermodynamical limit. However, in the results obtained numerically there are two possible sources of finite size effects, in addition to an error of about 1% related to the number of random initial states considered. The first possible source of finite size effects is related to whether the union of the convex sets R_μ (which determine the fixed point to which a given initial state will evolve) for small p indeed covers overlap space. This is true except for those overlap vectors located exactly at the boundaries of the regions, i.e. those for which w_μ |q_μ| = Σ_{λ≠μ} w_λ |q_λ| for some μ.

In the thermodynamical limit these regional boundaries are sets of measure zero.

The second possible source of finite size effects is related to the choice of initial values for the overlaps. Random initial conditions {S_i} lead to a Gaussian distribution for the overlaps {q_μ}, with zero mean and a deviation given by σ ~ 1/√N. The use of a non-zero width Gaussian distribution for the initial overlaps would introduce finite size effects for non-zero noise levels. However, in the noiseless case (T = 0), these effects disappear.

It is very important to stress that storing more than three patterns does not imply having spurious memories, for there are some regions in w-space for which no spurious memories exist. Therefore, it is possible to eliminate the existence of spurious stable states by modifying the weights associated to the patterns. This has an intuitive explanation if we make a comparison with hyperspheres of different sizes, which we know «fill» the space better than spheres of approximately the same size, by leaving less intersection space.

Acknowledgements.

One of the authors (LV) wishes to thank Dr. Miguel Avalos for his advice in computing matters, and C. Martinez for her collaboration in the production of figure 1. This work was partially supported by project DGAPA IN013189 of the National University of Mexico.

Appendix.

The expression F²(q) = F(q) can also be written as

    W ≡ ( Σ_μ w_μ q_μ η_μ ) ( Σ_λ w_λ η_λ F_λ(q) ) ≥ 0 ,    (A.1)

for every corner η of the hypercube surrounding m = 0 (η_μ = ±1). We define z_μ ≡ η_μ w_μ q_μ and Z ≡ Σ_μ z_μ, with Z ≠ 0 (i.e. excluding region boundaries), and consider the case Z > 0; the result for Z < 0 can be obtained by switching z → −z and using W(−q) = W(q). In this way, W can be written as

    2^{p−1} W/Z = Σ_{η: η_1 = +1} ( Σ_μ w_μ η_μ ) sgn( Σ_μ z_μ η_μ ) ,    (A.2)

where the sum runs over the 2^{p−1} corners η with η_1 = +1. For p = 1, 2, 3 this expression corresponds to the following.

Pure solutions (p = 1):

    W/Z = w_1 sgn(z_1) ≥ 0 .

p = 2:

    W/Z = (w_1 + w_2)/2 · sgn(z_1 + z_2) + (w_1 − w_2)/4 · sgn(z_1 − z_2) + (w_2 − w_1)/4 · sgn(z_2 − z_1)
        = (w_1/2) [1 + sgn(z_1 − z_2)] + (w_2/2) [1 − sgn(z_1 − z_2)] > 0 .

As we can see, for p = 1, 2 the expression F²(q) = F(q) is always true for any set of values {w_μ}.

For p = 3, equation (A.2) corresponds to

    4 W/Z = w_1 + w_2 + w_3 + (w_1 + w_2 − w_3) sgn(z_1 + z_2 − z_3) + (w_1 − w_2 + w_3) sgn(z_1 − z_2 + z_3) + (w_1 − w_2 − w_3) sgn(z_1 − z_2 − z_3)
          = w_1 [1 + sgn(z_1 + z_2 − z_3) + sgn(z_1 − z_2 + z_3) + sgn(z_1 − z_2 − z_3)]
          + w_2 [1 + sgn(z_1 + z_2 − z_3) − sgn(z_1 − z_2 + z_3) − sgn(z_1 − z_2 − z_3)]
          + w_3 [1 − sgn(z_1 + z_2 − z_3) + sgn(z_1 − z_2 + z_3) − sgn(z_1 − z_2 − z_3)]
          = w_1 [Λ − 2 sgn(y_1)] + w_2 [Λ − 2 sgn(y_2)] + w_3 [Λ − 2 sgn(y_3)] ,

where y_i ≡ Σ_j z_j − 2 z_i and Λ ≡ 1 + Σ_j sgn(y_j). Since Σ_i y_i = Z > 0, at least one of the y_i must be positive; let us assume that y_μ > 0, with (μ, ν, ρ) any permutation of (1, 2, 3); therefore

    4 W/Z = w_μ [sgn(y_ν) + sgn(y_ρ)] + w_ν [2 + sgn(y_ρ) − sgn(y_ν)] + w_ρ [2 − sgn(y_ρ) + sgn(y_ν)] .

This expression will be positive if w_μ ≤ w_ν + w_ρ for (μ, ν, ρ) any permutation of (1, 2, 3). Therefore, this is a necessary condition for having F²(q) = F(q) in the case p = 4.
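For a quick numerical cross-check of this appendix (and of the claim in section 3.2), one can sample many overlap vectors and test directly whether F(F(q)) = F(q). A sketch follows; the weight choices are illustrative, one satisfying the condition w_μ < Σ_{λ≠μ} w_λ and one violating it.

```python
import numpy as np
from itertools import product

def flux_map(q, w, corners):
    """Zero-noise synchronous overlap map, Eqs. (5)-(6)."""
    return (corners * np.sign(corners @ (w * q))[:, None]).mean(axis=0)

def one_step_convergence_holds(w, n_samples=20_000, seed=4):
    """Sample random overlap vectors and test whether F(F(q)) = F(q) for all of
    them, i.e. whether the flux equations converge in a single step."""
    w = np.asarray(w, dtype=float)
    corners = np.array(list(product([-1, 1], repeat=len(w))), dtype=float)
    rng = np.random.default_rng(seed)
    for q in rng.normal(size=(n_samples, len(w))):
        f1 = flux_map(q, w, corners)
        f2 = flux_map(f1, w, corners)
        if not np.allclose(f1, f2):
            return False
    return True

print(one_step_convergence_holds([1.0, 0.9, 0.8]))   # w_2 + w_3 > w_1: expect True
print(one_step_convergence_holds([1.0, 0.4, 0.3]))   # w_2 + w_3 < w_1: expect False
```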

References

[1] HOPFIELD J. J., Proc. Natl. Acad. Sci. USA 79 (1982) 2554.
[2] AMIT D. J., GUTFREUND H., SOMPOLINSKY H., Phys. Rev. A 32 (1985) 1007.
[3] VIANA L., J. Phys. France 49 (1988) 167.
[4] COOLEN A. C. C. and RUIJGROK Th. W., Phys. Rev. A 38 (1988) 4253.
[5] GARDNER E., J. Phys. A 22 (1989) 1969.
    KEPLER T. B. and ABBOTT L. F., J. Phys. France 49 (1988) 1657.
    KOHRING G. A., Europhys. Lett. 8 (1989) 697.
    KRAUTH W., NADAL J. P. and MÉZARD M., J. Phys. A 21 (1988) 2995.
    KRATSCHMAR J. and KOHRING G. A., J. Phys. France 51 (1990) 223.
[6] FORREST B. M., J. Phys. A 21 (1988) 245.
[7] HORNER H., BORMANN D., FRICK M., KINZELBACH H. and SCHMIDT A., Z. Phys. B 76 (1989) 381.
[8] VIANA L., COTA E., MARTINEZ C., in Statistical Mechanics of Neural Networks, L. Garrido Ed., Lecture Notes Phys. 368 (Springer-Verlag, 1990) p. 97.
[9] COOLEN A. C. C., JONKER H. J. J. and RUIJGROK Th. W., Phys. Rev. A 40 (1989) 5295.
[10] COOLEN A. C. C., Europhys. Lett. 16 (1991) 73.
[11] VAN HEMMEN J. L., GRENSING D., HUBER A. and KÜHN R., Z. Phys. B 65 (1986) 53.
