HAL Id: jpa-00208477
https://hal.archives-ouvertes.fr/jpa-00208477
Submitted on 1 Jan 1976
HAL
is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire
HAL, estdestinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.
The clustering problem : some Monte Carlo results
D.H. Fremlin
To cite this version:
D.H. Fremlin. The clustering problem : some Monte Carlo results. Journal de Physique, 1976, 37
(7-8), pp.813-817. �10.1051/jphys:01976003707-8081300�. �jpa-00208477�
(Reçu
le5 janvier 1976,
révisé le 1 er mars1976, accepté
le 2 mars1976)
Résumé. 2014 Je
présente
une série de résultats statistiques sur la distribution de la taille des amasformés par les atomes répartis aléatoirement dans un milieu continu de deux ou trois dimensions.
Si la portée des interactions est r0, les densités critiques
(auxquelles apparaissent
les amasinfinis)
sont probablement de 2,7 ± 0,1atomes/sphère
de rayon r0 (en 3 dimensions) et 4,4 ± 0,2atomes/disque
de rayon r0 (en 2 dimensions).
Abstract. 2014 I present a series of statistical results on the distribution of the size (i.e. the number of atoms) of clusters formed by atoms scattered
randomly
in a continuous medium of two or three dimensions. If the range at which they interact is r0, the critical densities (at which infinite clustersappear)
areprobably
of 2.7 ± 0.1atoms /sphere
of radius r0 (in 3 dimensions) and 4.4 ± 0.2 atoms/discof
radius r0
(in 2 dimensions).1. Introduction. - In
[8]
and[9]
thefollowing problem
was introduced.Suppose
that atoms arerandomly
distributed in 2- or 3-dimensional spaceaccording
to a Poissonpoint
process.Suppose
thatatoms are linked
by
some interaction whenthey
arewithin a
distance ro
of each other. What is the pro-bability
that there will be an infinite cluster of linked atoms ? The answerplainly depends
on thedensity
Wof the atomic cloud. It is known
(see [10]
for ageneral
review of this and allied
problems)
thatthere
is acritical
density We
suchthat,
for WWc,
theproba- bility
that any infinite cluster exists is zero, while for W >Wc,
theprobability
that agiven
atombelongs
toan infinite cluster is non-zero, and the
probability
that some infinite cluster exists is 1.
The
problem
that I wish to address here is that ofestimating WC.
Theproof
thatWe
existsyields
weakupper and lower bounds for it
(see [10], § 7),
but thesebounds are
widely separated,
and the theoreticalproblems
associated withnarrowing
themusefully
aredaunting.
In this paper I offer statisticalevidence,
based on computer simulation of random
clusters,
which I believe enables us to locate the critical values with a certain
degree
of confidence.I should like to
acknowledge
mygratitude
toPr. M.
Gordon,
forbringing
thisproblem
to mynotice;
to Dr. D. D.Freund,
forhelpful discussions ;
and to Mr. C.
Bowman,
formaking special
effortsto
provide
thecomputing
facilities I needed. I am alsograteful
to the referees for theirhelpful
comments.2. Method. - I
employed
a PDP-10 computer(at
theUniversity
ofEssex)
to generate and countclusters, using
apseudo-random-number
generator.The method was
essentially
that of[3],
with somerefinements in the
programming.
For eachcluster,
anatom was fixed at the
origin,
and thesphere
of radius ro around itsearched,
i. e.points
weregenerated
in thissphere
in such a way as to simulate a Poissonpoint
process,
using
the distribution functions described in[9],
part4,
and[8],
part 3. The atomsfound
at thisstage were listed and the
ro-sphere
around each wassearched
again.
In order to eliminatedouble-counting,
each atom found must be checked to determine whether it lies in a volume
already searched;
i.e. whether it is within a distance ro of some atom whoseneighbour-
hood has
already
beeninvestigated.
To reduce the timerequired
for these checks(which
appear at firstsight
to
require 0(n2) operations,
where n is the size of thecluster),
I used scatter-storage(hash-coding)
to list thecoordinates of the atoms found.
(The
method isdescribed under the name open type
addressing
system in
[7].)
In my programs the coordinates of each atom were stored in an addresskeyed by
theinteger
parts of the coordinates
(modulo ro) ;
thusonly
9 or27 areas of the store had to be referred to while
checking
each new atom, and thecomputing
timerequired
to generate a cluster of n elements became0(n).
Thelater,
morerefined,
versions of the programran in about 140 K of store,
handling
clusters up to 25 K insize,
and took about 20 ms per atom.Following [3]
and[6],
I also in most of the workcounted the
generation
sizes. Here the O’thgeneration
consists of the
starting
atom; the firstgeneration,
those atoms within
ro of
thestarting
atom; the secondgeneration,
those atoms within ro of an atom in theArticle published online by EDP Sciences and available at http://dx.doi.org/10.1051/jphys:01976003707-8081300
814
FIG. 1 a. - Histograms of cluster sizes found for 3-dimensional clusters, for values of W between 2.60 and 2.85. The bars at the bottom correspond to clusters that reached the computational limit of 25 001 atoms. A modified logarithmic scale is used for the vertical axis. The numbers of clusters on which the histograms are
based are as follows :
FIG. lb. - Histograms of cluster sizes found for 2-dimensional clusters, for values of W between 4.0 and 4.6. The numbers of clusters
on which the histograms are based are as follows :
first
generation (other
than those in the O’th and firstgenerations);
and so on. The informationyielded
issummarized in
figure
2.Two caveats must be made here. As with any
computer-generated numbers,
the coordinates of the atoms must be distributed notcontinuously
but on alattice. In my programs, the
grain
of the lattice was2-11 ro
(in
threedimensions;
2-15 ro in two dimen-sions).
This seems tocorrespond
to an error inro of
one or two
parts
in211,
or an error in W of three to sixparts in
211;
I am confident that the real error from this cause is less.Secondly,
there isalways
room fordoubt
concerning
random-numbergenerators;
myown is a home-made version which has been checked
(in
asimple-minded fashion)
and seems, inparticular,
to have no
periodicity
of order 2 or 3(see [1]).
3.
Theory.
- In theinterpretation
of the resultsit will be
helpful
to know thefollowing
facts. LetGk
=Gk( W)
be the mean size of the k’thgeneration
of a cluster
generated
withdensity parameter
W.FIG. 2a. - Growth curves for mean generation sizes in 3-dimen- sional clusters, for values of W between 2.60 and 2.80. A square-root scale is used for the vertical axis. Each point plotted is based on at
least 20 clusters.
FIG. 2b. - Growth curves for mean generation sizes in 2-dimen- sional clusters, for values of W between 4.0 and 4.6. Each point
plotted is based on at least 20 clusters.
Proof.
- For arigorous proof
I think thetheory
ofdisintegration
of measures is needed.However,
I believe that thefollowing
argument isessentially
sound.
Imagine
that the clusters aregenerated
as inthe computer program,
generation by generation,
so
that,
at the n’th stage, we have aregion V
that hasalready
beensearched,
and a finite number of atoms of the n’thgeneration,
atpoints
ai, ..., are For the rest of thegrowth
of thecluster,
it isonly
theregion V (from
which atoms are henceforthexcluded)
and thepoints
a1, ..., ar that aresignificant ;
the distributions of the atoms of earliergenerations
withiri v is imma- terial. Writefor the
expected
total number of atoms in the genera- tions n +1,
..., n + m,given
that the atoms of theFor the effect of
excluding
theregion V
from ourfurther
searching
must be to reduce the number of descendants.And E Gk
isby
definitionQ(Ø ; 0),
I -k-m
which
by
thehomogeneity
of theunderlying
Poissonprocess is
equal
to6(0 ? a)
for any a.Thus we see that
So, given
that the n’thgeneration
has r atoms, theexpectation
of the number of atoms in thegenerations
n +
1,
..., n + m is not greaterthan r E Gk- (The
I -k-m
argument above
implicitly
assumes thatr > 0 ;
butobviously
the last statement at least is true also forr =
0.) Summing
over all thepossible
numbers ofatoms in the n’th
generation,
where
p(n, r)
is theprobability
that the n’thgeneration
contains exactly r atoms.
Corollary.
-If,
for some n,Gn(W) 1,
then theexpected
total size of thecluster,
is
finite,
and WWc’
00
Proof.
-Clearly y Gk( W)
is theexpected
totalk=0
size of the cluster. If
Gn(W) 1,
then for any m > 0we have
so that
and now W W’ ,
Wr.
Remark. - It is clear from this
corollary
thatG(W)
is finite if there is some n for which
Gn(W) 1;
in thiscase, of course, the
Gk’s
decreaseexponentially
aftera certain
point (see
thegraphs
inFig. 2).
It appears to be unknown whetherG(W)
is finite for everyW
Wc.
4. Results. -
Following [3]
and[6],
I use thedensity parameter W
to mean the average number of atoms in asphere (in
3dimensions)
or a disc(in
2
dimensions)
of radius ro, where ro is the distance up to which the atoms interact. Thus each atom is onaverage connected to W others.
In
figure
1 I presenthistograms
of the cluster sizes found. Since the computer is limited in store, the bar at the bottom of each column collectsindiscriminately
all clusters of sizes greater than 25 000.
Naturally,
thisbar
lengthens
as W increases. Thephenomenon
thatgives
us somehope
that we haveactually passed through
a critical value is the dramatic fall in theproportion
of clusters of size 1 000-25 000. I suggest that this isbecause,
for W >Wc,
thelarger
a clusterbecomes,
the better is its chance ofbecoming
aninfinite cluster.
Unfortunately,
I have noproof
thatthe visible advance and retreat of
the finite part
of thehistogram
- thatreferring
to clusters of size up to 25 000 - is not an artefact of thelogarithmic
scale Iam
using,
or of the actual limit 25 000. Another way ofexpressing
theseproblems
is set out in table I. Here Igive
the average sizes of clusters of size less than orequal
to 25 000. Thesegive
dramaticpeaks
aroundW = 2.7 in 3 dimensions and W = 4.4 in 2 dimen- sions. But if we repeat the exercise with a limit of 2 500 instead of 25
000,
thepicture changes;
thepeak
is less dramatic - as is natural - but also there is a
distinct shift downwards in the value of W at which the
peak
occurs in 2’dimensions. The statistical errors in the tabulated values are enormous, but the realproblem
is to find someargument
to assure us that there would not be acorresponding
rise in the appa- rent critical value if we were torepeat
this work witha
larger
limit on the cluster size. In view offigure 1, I
find it difficult to
believe
thatWc
canreally
be greater than 2.85 in 3dimensions,
but I have to admit that thismethod still
gives
noreally
solidgrounds
forbelieving
that I have found upper bounds for
Wc.
Concerning
lowerbounds,
the situation is much816
TABLE I
(a) Average
sizesof
3-dimensional clustersof
size not greater than no,for
no = 25 000 and 2 500 and valuesof
W between 2.6 and 2. 85. The rangesgiven
are the estimated standard deviationsof
the numbersthey follow.
The numbers in brackets are the numbersof
trials in each case.(b) Average
sizesof
2-dimensional clustersof
size not greater than no,for
no = 25 000 and 2 500 and valuesof
W between 4.0 and 4.6(*) The discrepancy between these figures and those above them is due to the inclusion in the lower figures of data from clusters
generated with a limit set between 2 500 and 25 000.
happier. Following [3]
and[6],
I present infigure
2curves of mean
generation
sizeplotted against
gene- ration numbers for selected values of W. As remarked inparagraph 3,
we know that ifGk,
the mean size ofthe k’th
generation,
is ever less than1,
then WWc.
Subject
to thepossibility
of statistical error,therefore,
the curves here show that
Wc >
2.65 in 3 dimensionsand
We
> 4.2 in 2 dimensions.(Fortunately,
we haveupper bounds for
Gk (see [3], § 3.2),
so that theprobability
of aspurious
result is at leastestimable.)
However,
weagain
have no way ofbeing
sure thateven the
steepest
curvesplotted
here are notgoing eventually
topeak
out and decline towards zero.So
concerning
upper bounds forWe
this method iseven less
satisfactory
than thehistogram approach.
Finally,
Igive
threedrawings
of actual two-dimen- sional clusters(Fig. 3).
There has been a certain amount. of
speculation concerning the frontiers
of these clusters(see
e.g.[5]),
and it seems worth whilehaving
furtherpictures
tosupplement
those of[2].
5. Conclusions. -
Subject
to thepossibility
ofstatistical errors, the
probability
of which is estimable and can be reducedby repetition
of the above calcu-lations,
the curves infigure
2 show thatWc >
2.65 in3
dimensions,
andWc >
4.2 in 2 dimensions. Thehistograms
offigure
1 and the statistics of table I suggest thatWe
2.8 in 3 dimensions andWc
4.6in 2
dimensions,
but there are deficiencies in the method which mean that without some further theore- tical advance no amount ofrepetition
of the calcula- tions canreally
establish thesefigures.
FIG. 3. - Three two-dimensional clusters. The atoms are drawn
as circles of diameter ro, so that two atoms interact if the correspond- ing circles overlap. The solid circle in each picture is the starting
atom. Those atoms from which the cluster is, or may be, still growing
are marked with crosses. a) W = 4.3; 4 397 atoms, complete.
b) W = 4.4; 7 000 atoms, still growing. c) W = 4.5 ; 6 203 atoms,
complete.
ture of clusters, J. Physique 34 (1973) 341-344. percolation theory, Adv. Phys. 20 (1971) 325-357.