HAL Id: jpa-00246638
https://hal.archives-ouvertes.fr/jpa-00246638
Submitted on 1 Jan 1992
On the problems of neural networks with multi-state neurons
G. Kohring
To cite this version:
G. Kohring. On the problems of neural networks with multi-state neurons. Journal de Physique I, EDP Sciences, 1992, 2 (8), pp. 1549-1552. 10.1051/jp1:1992226. jpa-00246638.
Classification Physics Abstracts
87.10 89.70 64.60C
Short Communication
On the problems of neural networks with multi-state neurons

G.A. Kohring

Institut für Theoretische Physik, Universität zu Köln, Zülpicherstrasse 77, D-5000 Köln, Germany

(Received 1 June 1992, accepted 11 June 1992)
Abstract. — For realistic neural network applications the storage and recognition of gray-tone patterns, i.e., patterns where each neuron in the network can take one of Q different values, is more important than the storage of black and white patterns, although the latter has been more widely studied. Recently, several groups have shown the former task to be problematic with current techniques, since the useful storage capacity, α, generally decreases like α ∼ Q⁻². In this paper one solution to this problem is proposed, which leads to the storage capacity decreasing like α ∼ (log₂ Q)⁻². For realistic situations, where Q = 256, this implies an increase of nearly four orders of magnitude in the storage capacity. The price paid is that the time needed to recall a pattern increases like log₂ Q. This price can be partially offset by an efficient parallel program which runs at 1.4 Gflops on a 32-processor iPSC/860 Hypercube.

Attractor neural networks are usually defined as N spins, S_i, fully coupled together via a connection matrix J_ij and obeying a dynamical equation of the form:

S_i(t+1) = f( Σ_j J_ij S_j(t) )    (1)
The fixed points of this system are determined by the connection matrix, J, and the function f. These fixed points may be preset by an appropriate choice of J. By observation of biological systems, Hebb [1] proposed that the P fixed points, or patterns, {ξ^μ}, could be stored (i.e. preset) in this system by choosing J to be:

J_ij = Σ_μ ξ_i^μ ξ_j^μ    (2)
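As an illustrative sketch (not the original program), equations (1) and (2) can be written out in a few lines of NumPy; the choice of ±1 spins, zeroed self-couplings and synchronous updating are assumptions made only for this example:

```python
import numpy as np

def hebb_couplings(patterns):
    # Equation (2): J_ij = sum over patterns mu of xi_i^mu * xi_j^mu.
    J = patterns.T @ patterns
    np.fill_diagonal(J, 0)  # no self-coupling (an assumption)
    return J

def update(S, J):
    # Equation (1) with the two-state choice f(x) = sign(x),
    # applied synchronously to all spins.
    return np.where(J @ S >= 0, 1, -1)

rng = np.random.default_rng(0)
patterns = rng.choice([-1, 1], size=(5, 200))  # P = 5 patterns, N = 200 spins
J = hebb_couplings(patterns)

# Well below the critical capacity (P/N = 0.025 << 0.14), each stored
# pattern is, with overwhelming probability, a fixed point of the dynamics.
assert np.array_equal(update(patterns[0], J), patterns[0])
```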
To date, the most commonly studied attractor neural networks consist of neurons with two states [2], for which the dynamical function, f, in the noise-free limit, is usually taken as f(x) = sign(x). The properties of these networks, which serve as simple models of pattern recognition in biological systems, have been studied extensively using both analytical [3] and numerical
[4-6] methods. It is by now well established that the simplest such network, the Hopfield model (consisting of only the above ingredients), undergoes a (first order) phase transition in the quality of its retrieval properties [3-6] at a storage capacity, α ≡ P/N, of approximately 0.142 [6]. Quality of retrieval is meant here to include both the size of the basins of attraction and the location of the nearest fixed point to the stored pattern. This simple model, however, does not represent the best performance that can be achieved for these networks; indeed, the maximum storage capacity for uncorrelated patterns has been shown to be α = 2 [7], and the quality of retrieval has been shown to degrade via a second order phase transition [4]. Achieving this larger storage capacity requires the use of different connections than those given by equation (2) and has been investigated by many different researchers (for a review see [8]).
Although networks with two-state neurons are useful for building a theoretical understanding of the principles involved in neural computation, realistic applications in biological and artificial systems require the use of multi-state neurons, i.e., the storage of gray-tone patterns. An obvious starting point for studies of multi-state neurons is a simple extension of the Hopfield model. This was done by Rieger [9], who assumed that the neurons are allowed to occupy any of Q different states, that the patterns are stored using equation (2), and that the dynamical function, f, is defined by:

f(x) = a_k  for  x ∈ (R_{k−1}, R_k],  k = 1, …, Q    (3)
where the a_k are the elementary neuron states and the R_k are chosen to give optimal performance. He then found that the storage capacity falls off as α ∼ Q⁻². This result is somewhat surprising, since in the limit of analog neurons, Q → ∞, it implies α → 0. However, a previous study by Marcus et al. [10] on analog neurons had predicted an α nearly the same as the two-state α. This seeming contradiction can be resolved by looking more closely at the work of Marcus et al.: in their papers they used analog neurons to store two-state patterns, whereas Rieger stores patterns whose neurons may occupy any of the possible Q states. Since the latter requirement is more sensible, it can be concluded that the storage of gray-tone patterns using a simple extension of the Hopfield model is not practical for realistic applications, where Q is typically of order O(256).
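Equation (3) is simply a quantizer: it maps the local field onto one of the Q allowed levels a_k. A minimal sketch, in which the equally spaced levels and midpoint thresholds R_k are illustrative assumptions (the paper only requires that the R_k be chosen for optimal performance):

```python
import numpy as np

def multistate_f(x, levels, thresholds):
    # Equation (3): f(x) = a_k for x in (R_{k-1}, R_k].
    # searchsorted returns, for each input, the index of its interval.
    return levels[np.searchsorted(thresholds, x)]

Q = 4
levels = np.linspace(0.0, 1.0, Q)            # a_1, ..., a_Q (assumed equally spaced)
thresholds = (levels[:-1] + levels[1:]) / 2  # R_1, ..., R_{Q-1} (assumed midpoints)
print(multistate_f(np.array([-0.3, 0.2, 0.4, 0.9]), levels, thresholds))
```

Fields below the lowest threshold map to a_1 and fields above the highest map to a_Q, so the function saturates at the extreme gray levels.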
In an effort to overcome the limitations of the Hopfield-type models, Kohring [11] calculated the maximum possible storage capacity of networks with Q-state neurons and found that α → 1 as Q → ∞. This result would have meant it was only a matter of finding the correct learning rule in order to apply multi-state neurons. However, Mertens et al. [12] pointed out that this result is of little utility, since the basins of attraction shrink to zero size at the maximum storage capacity, i.e., at the maximum storage capacity the stored patterns are unstable with respect to a single spin flip. Mertens et al. [12] then calculated the storage capacity at a fixed size of the basins of attraction and found, as did Rieger, the storage capacity to decrease like Q⁻².
While these results may be encouraging to extremists who view the world in terms of green and non-green, they are discouraging for those who prefer to distinguish between the various gray levels. Furthermore, these results could very well dispel claims about the usefulness of neural networks.
One path out of this cauldron can be found by careful consideration of the relationship between two-state models and multi-state models. In two-state models, the neurons can be described by a single bit, whereas log₂ Q bits are required to describe multi-state neurons. The bits of a single neuron interact with each other through equation (3). This, then, is the source of the difficulties: the interactions between the bits of a single neuron are not always constructive; in fact, the bits interfere with each other. As a first approximation to solving this problem, one can split the bits off into separate non-interacting networks, i.e., the neurons are processed so that the first bit of each neuron goes into one black-and-white network, the second bit goes into a second network, etc. When the separate networks have each reached a fixed point, the bits are recombined to form the final state of each neuron. Using this non-interacting-bit approximation requires log₂ Q more couplings than the straightforward approach, but one expects to achieve the same storage capacity as the two-state neuron models. In particular, for Hebb couplings in each subnetwork, the storage capacity is expected to be:

α_Q = α_2 / log₂ Q    (4)

where α_2 ≈ 0.142. Hence, for a realistic value of Q = 256, the present approximation should increase the storage capacity by nearly four orders of magnitude compared with α_2/Q².
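The non-interacting-bit scheme described above can be sketched as follows. This is a reconstruction under stated assumptions, not the paper's parallel code: `subnet_relax` is a hypothetical stand-in for whatever two-state dynamics each of the log₂ Q subnetworks runs.

```python
import numpy as np

def split_bits(pattern, q_bits):
    # Bit b of every neuron forms its own two-state (+-1) pattern.
    return [np.where((pattern >> b) & 1, 1, -1) for b in range(q_bits)]

def recombine(bit_states, q_bits):
    # Map each subnetwork's +-1 spins back to bits, then bits to gray levels.
    gray = np.zeros_like(bit_states[0])
    for b in range(q_bits):
        gray |= (bit_states[b] > 0).astype(gray.dtype) << b
    return gray

def recall(pattern, subnet_relax, q_bits=8):
    # The log2(Q) bit-planes relax independently (in the paper, on
    # different processor groups), then the bits are recombined.
    planes = split_bits(pattern, q_bits)
    relaxed = [subnet_relax(b, plane) for b, plane in enumerate(planes)]
    return recombine(relaxed, q_bits)

# Trivial stand-in dynamics: each bit-plane is already at a fixed point,
# so recall reproduces the pattern exactly.
identity = lambda b, plane: plane
p = np.array([0, 7, 128, 255], dtype=np.int64)
assert np.array_equal(recall(p, identity), p)
```

For Q = 256 this uses 8 independent subnetworks, so equation (4) gives an expected capacity of α_Q ≈ 0.142/8 ≈ 0.018, versus α_2/Q² ≈ 2 × 10⁻⁶ for the straightforward extension — the four-orders-of-magnitude gain quoted above.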
These expectations are borne out by simulations. Many of the simulations were performed on an Intel iPSC/860 Hypercube with 32 processors. The algorithm takes the input pattern, splits off the log₂ Q bits from each node and sends them to log₂ Q different groups of processors, where the networks relax in parallel. After convergence, the bits are recombined and compared to the stored pattern one is trying to recall. This algorithm is ideal for parallel processing, and on a 32-processor machine it runs at 1.4 Gflops. For the gray-tone patterns, the neurons take on values in the interval [0,1]; hence, the error in recalling a stored pattern is given by the Euclidean distance between the stored pattern and the final state of the network, i.e.:

D^μ = (1/N) Σ_{i=1}^{N} (S_i − ξ_i^μ)²    (5)
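For neurons scaled to the interval [0,1], the error of equation (5), as reconstructed here with a 1/N normalization (an assumption, chosen so the error is independent of network size), is just the mean squared deviation:

```python
import numpy as np

def recall_error(final_state, stored_pattern):
    # Equation (5): normalized squared Euclidean distance between the
    # network's final state and the stored gray-tone pattern.
    diff = np.asarray(final_state) - np.asarray(stored_pattern)
    return np.dot(diff, diff) / diff.size

xi = np.array([0.0, 0.5, 1.0, 0.25])
assert recall_error(xi, xi) == 0.0                               # perfect recall
assert recall_error(np.ones_like(xi), np.zeros_like(xi)) == 1.0  # maximal error
```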
With this definition of the error, the retrieval qualities of gray-tone patterns were measured. Figure 1 shows the distance from the stored patterns to the nearest fixed point. As can be seen, for α ≈ 0.14 the system undergoes a first order transition from a phase of high retrieval to a phase of low retrieval. Previous high-precision simulations of networks composed of two-state neurons showed that this transition proceeded in two steps [6]: there was an intermediate region, 0.138 < α < 0.144, where some fraction of the patterns had a fixed point nearby and others did not, while above α ≈ 0.144 there were no states with fixed points nearby. This intermediate region is washed out in the present study, because it requires that a given pattern have nearby fixed points in all log₂ Q networks simultaneously. As log₂ Q becomes large, this is increasingly unlikely. The second figure shows the basins of attraction as a function of α. Here again, a first order jump in the quality of retrieval can be seen near α ≈ 0.14, in agreement with previous calculations for the two-state neuron Hopfield model.
model.It should be mentioned that for the
simple Hopfield
model the use of real-valuedcouplings
is inefficient [6],
however,
since the abovealgorithm
is not restricted to anyparticular
formof the
coupling
matrix it can be used when thecouplings
are set via a morecomplicated learning
processes. Theproperties
of such networks arecurrently
underinvestigation
and will bereported
upon elsewhere.In summary, the
problem
ofstoring
gray-tone patterns has been discussed and one viable solution in terms ofnon-interacting
bits has beenproposed.
This solution is well suited forparallel
computers and aperformance
of IAGflops
was achieved on theiPSC/860 Hypercube
with 32 nodes. For random uncorrelated patterns, the
proposed
solution worksquite well,
Fig. 1. — The circles indicate the average distance to the nearest fixed point from a stored pattern as a function of the storage capacity, α. The +'s indicate the minimum initial distance to a stored pattern such that a fixed point near the stored pattern is eventually reached, i.e., they indicate the basins of attraction.
however, for correlated patterns it may be advantageous to consider interacting-bit models in order to reduce the redundant storage of information.

Acknowledgments.

I would like to thank D. Stauffer for helpful comments, the HLRZ at KFA Jülich for a grant of time on their Intel iPSC/860 Hypercube and Cray-YMP, as well as the University of Cologne for a grant of time on their NEC-SX3. Financial support for this work came from the SFB-341.
References

[1] Hebb D.O., The Organization of Behavior (Wiley, New York, 1949).
[2] Hopfield J.J., Proc. Nat. Acad. Sci. USA 79 (1982) 2554.
[3] Amit D., Gutfreund H. and Sompolinsky H., Ann. Phys. 173 (1987) 30; Newman C.M., Neural Networks 1 (1988) 223; Komlós J. and Paturi R., Neural Networks 1 (1988) 239.
[4] Forrest B.M., J. Phys. A 21 (1988) 245; Kritzschmar J. and Kohring G.A., J. Phys. France 51 (1990) 223.
[5] Horner H., Bormann D., Frick M., Kinzelbach H. and Schmidt A., Z. Phys. B 76 (1989) 381.
[6] Kohring G.A., J. Stat. Phys. 59 (1990) 1077.
[7] Gardner E., J. Phys. A 21 (1988) 257.
[8] Abbott L.F., Network 1 (1990) 105.
[9] Rieger H., J. Phys. A 23 (1990) L1273.
[10] Marcus C.M., Waugh F.R. and Westervelt R.M., Phys. Rev. A 41 (1990) 3355.
[11] Kohring G.A., J. Stat. Phys. 62 (1991) 563.
[12] Mertens S., Köhler H.M. and Bös S., J. Phys. A 24