• Aucun résultat trouvé

The clustering problem : some Monte Carlo results

N/A
N/A
Protected

Academic year: 2021

Partager "The clustering problem : some Monte Carlo results"

Copied!
6
0
0

Texte intégral

(1)

HAL Id: jpa-00208477

https://hal.archives-ouvertes.fr/jpa-00208477

Submitted on 1 Jan 1976

HAL

is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire

HAL, est

destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

The clustering problem : some Monte Carlo results

D.H. Fremlin

To cite this version:

D.H. Fremlin. The clustering problem : some Monte Carlo results. Journal de Physique, 1976, 37

(7-8), pp.813-817. �10.1051/jphys:01976003707-8081300�. �jpa-00208477�

(2)

(Reçu

le

5 janvier 1976,

révisé le 1 er mars

1976, accepté

le 2 mars

1976)

Résumé. 2014 Je

présente

une série de résultats statistiques sur la distribution de la taille des amas

formés par les atomes répartis aléatoirement dans un milieu continu de deux ou trois dimensions.

Si la portée des interactions est r0, les densités critiques

(auxquelles apparaissent

les amas

infinis)

sont probablement de 2,7 ± 0,1

atomes/sphère

de rayon r0 (en 3 dimensions) et 4,4 ± 0,2

atomes/disque

de rayon r0 (en 2 dimensions).

Abstract. 2014 I present a series of statistical results on the distribution of the size (i.e. the number of atoms) of clusters formed by atoms scattered

randomly

in a continuous medium of two or three dimensions. If the range at which they interact is r0, the critical densities (at which infinite clusters

appear)

are

probably

of 2.7 ± 0.1

atoms /sphere

of radius r0 (in 3 dimensions) and 4.4 ± 0.2 atoms/disc

of

radius r0

(in 2 dimensions).

1. Introduction. - In

[8]

and

[9]

the

following problem

was introduced.

Suppose

that atoms are

randomly

distributed in 2- or 3-dimensional space

according

to a Poisson

point

process.

Suppose

that

atoms are linked

by

some interaction when

they

are

within a

distance ro

of each other. What is the pro-

bability

that there will be an infinite cluster of linked atoms ? The answer

plainly depends

on the

density

W

of the atomic cloud. It is known

(see [10]

for a

general

review of this and allied

problems)

that

there

is a

critical

density We

such

that,

for W

Wc,

the

proba- bility

that any infinite cluster exists is zero, while for W >

Wc,

the

probability

that a

given

atom

belongs

to

an infinite cluster is non-zero, and the

probability

that some infinite cluster exists is 1.

The

problem

that I wish to address here is that of

estimating WC.

The

proof

that

We

exists

yields

weak

upper and lower bounds for it

(see [10], § 7),

but these

bounds are

widely separated,

and the theoretical

problems

associated with

narrowing

them

usefully

are

daunting.

In this paper I offer statistical

evidence,

based on computer simulation of random

clusters,

which I believe enables us to locate the critical values with a certain

degree

of confidence.

I should like to

acknowledge

my

gratitude

to

Pr. M.

Gordon,

for

bringing

this

problem

to my

notice;

to Dr. D. D.

Freund,

for

helpful discussions ;

and to Mr. C.

Bowman,

for

making special

efforts

to

provide

the

computing

facilities I needed. I am also

grateful

to the referees for their

helpful

comments.

2. Method. - I

employed

a PDP-10 computer

(at

the

University

of

Essex)

to generate and count

clusters, using

a

pseudo-random-number

generator.

The method was

essentially

that of

[3],

with some

refinements in the

programming.

For each

cluster,

an

atom was fixed at the

origin,

and the

sphere

of radius ro around it

searched,

i. e.

points

were

generated

in this

sphere

in such a way as to simulate a Poisson

point

process,

using

the distribution functions described in

[9],

part

4,

and

[8],

part 3. The atoms

found

at this

stage were listed and the

ro-sphere

around each was

searched

again.

In order to eliminate

double-counting,

each atom found must be checked to determine whether it lies in a volume

already searched;

i.e. whether it is within a distance ro of some atom whose

neighbour-

hood has

already

been

investigated.

To reduce the time

required

for these checks

(which

appear at first

sight

to

require 0(n2) operations,

where n is the size of the

cluster),

I used scatter-storage

(hash-coding)

to list the

coordinates of the atoms found.

(The

method is

described under the name open type

addressing

system in

[7].)

In my programs the coordinates of each atom were stored in an address

keyed by

the

integer

parts of the coordinates

(modulo ro) ;

thus

only

9 or

27 areas of the store had to be referred to while

checking

each new atom, and the

computing

time

required

to generate a cluster of n elements became

0(n).

The

later,

more

refined,

versions of the program

ran in about 140 K of store,

handling

clusters up to 25 K in

size,

and took about 20 ms per atom.

Following [3]

and

[6],

I also in most of the work

counted the

generation

sizes. Here the O’th

generation

consists of the

starting

atom; the first

generation,

those atoms within

ro of

the

starting

atom; the second

generation,

those atoms within ro of an atom in the

Article published online by EDP Sciences and available at http://dx.doi.org/10.1051/jphys:01976003707-8081300

(3)

814

FIG. 1 a. - Histograms of cluster sizes found for 3-dimensional clusters, for values of W between 2.60 and 2.85. The bars at the bottom correspond to clusters that reached the computational limit of 25 001 atoms. A modified logarithmic scale is used for the vertical axis. The numbers of clusters on which the histograms are

based are as follows :

FIG. lb. - Histograms of cluster sizes found for 2-dimensional clusters, for values of W between 4.0 and 4.6. The numbers of clusters

on which the histograms are based are as follows :

first

generation (other

than those in the O’th and first

generations);

and so on. The information

yielded

is

summarized in

figure

2.

Two caveats must be made here. As with any

computer-generated numbers,

the coordinates of the atoms must be distributed not

continuously

but on a

lattice. In my programs, the

grain

of the lattice was

2-11 ro

(in

three

dimensions;

2-15 ro in two dimen-

sions).

This seems to

correspond

to an error in

ro of

one or two

parts

in

211,

or an error in W of three to six

parts in

211;

I am confident that the real error from this cause is less.

Secondly,

there is

always

room for

doubt

concerning

random-number

generators;

my

own is a home-made version which has been checked

(in

a

simple-minded fashion)

and seems, in

particular,

to have no

periodicity

of order 2 or 3

(see [1]).

3.

Theory.

- In the

interpretation

of the results

it will be

helpful

to know the

following

facts. Let

Gk

=

Gk( W)

be the mean size of the k’th

generation

of a cluster

generated

with

density parameter

W.

FIG. 2a. - Growth curves for mean generation sizes in 3-dimen- sional clusters, for values of W between 2.60 and 2.80. A square-root scale is used for the vertical axis. Each point plotted is based on at

least 20 clusters.

FIG. 2b. - Growth curves for mean generation sizes in 2-dimen- sional clusters, for values of W between 4.0 and 4.6. Each point

plotted is based on at least 20 clusters.

Proof.

- For a

rigorous proof

I think the

theory

of

disintegration

of measures is needed.

However,

I believe that the

following

argument is

essentially

sound.

Imagine

that the clusters are

generated

as in

the computer program,

generation by generation,

so

that,

at the n’th stage, we have a

region V

that has

already

been

searched,

and a finite number of atoms of the n’th

generation,

at

points

ai, ..., are For the rest of the

growth

of the

cluster,

it is

only

the

region V (from

which atoms are henceforth

excluded)

and the

points

a1, ..., ar that are

significant ;

the distributions of the atoms of earlier

generations

withiri v is imma- terial. Write

for the

expected

total number of atoms in the genera- tions n +

1,

..., n + m,

given

that the atoms of the

(4)

For the effect of

excluding

the

region V

from our

further

searching

must be to reduce the number of descendants.

And E Gk

is

by

definition

Q(Ø ; 0),

I -k-m

which

by

the

homogeneity

of the

underlying

Poisson

process is

equal

to

6(0 ? a)

for any a.

Thus we see that

So, given

that the n’th

generation

has r atoms, the

expectation

of the number of atoms in the

generations

n +

1,

..., n + m is not greater

than r E Gk- (The

I -k-m

argument above

implicitly

assumes that

r > 0 ;

but

obviously

the last statement at least is true also for

r =

0.) Summing

over all the

possible

numbers of

atoms in the n’th

generation,

where

p(n, r)

is the

probability

that the n’th

generation

contains exactly

r atoms.

Corollary.

-

If,

for some n,

Gn(W) 1,

then the

expected

total size of the

cluster,

is

finite,

and W

Wc’

00

Proof.

-

Clearly y Gk( W)

is the

expected

total

k=0

size of the cluster. If

Gn(W) 1,

then for any m > 0

we have

so that

and now W W’ ,

Wr.

Remark. - It is clear from this

corollary

that

G(W)

is finite if there is some n for which

Gn(W) 1;

in this

case, of course, the

Gk’s

decrease

exponentially

after

a certain

point (see

the

graphs

in

Fig. 2).

It appears to be unknown whether

G(W)

is finite for every

W

Wc.

4. Results. -

Following [3]

and

[6],

I use the

density parameter W

to mean the average number of atoms in a

sphere (in

3

dimensions)

or a disc

(in

2

dimensions)

of radius ro, where ro is the distance up to which the atoms interact. Thus each atom is on

average connected to W others.

In

figure

1 I present

histograms

of the cluster sizes found. Since the computer is limited in store, the bar at the bottom of each column collects

indiscriminately

all clusters of sizes greater than 25 000.

Naturally,

this

bar

lengthens

as W increases. The

phenomenon

that

gives

us some

hope

that we have

actually passed through

a critical value is the dramatic fall in the

proportion

of clusters of size 1 000-25 000. I suggest that this is

because,

for W >

Wc,

the

larger

a cluster

becomes,

the better is its chance of

becoming

an

infinite cluster.

Unfortunately,

I have no

proof

that

the visible advance and retreat of

the finite part

of the

histogram

- that

referring

to clusters of size up to 25 000 - is not an artefact of the

logarithmic

scale I

am

using,

or of the actual limit 25 000. Another way of

expressing

these

problems

is set out in table I. Here I

give

the average sizes of clusters of size less than or

equal

to 25 000. These

give

dramatic

peaks

around

W = 2.7 in 3 dimensions and W = 4.4 in 2 dimen- sions. But if we repeat the exercise with a limit of 2 500 instead of 25

000,

the

picture changes;

the

peak

is less dramatic - as is natural - but also there is a

distinct shift downwards in the value of W at which the

peak

occurs in 2’dimensions. The statistical errors in the tabulated values are enormous, but the real

problem

is to find some

argument

to assure us that there would not be a

corresponding

rise in the appa- rent critical value if we were to

repeat

this work with

a

larger

limit on the cluster size. In view of

figure 1, I

find it difficult to

believe

that

Wc

can

really

be greater than 2.85 in 3

dimensions,

but I have to admit that this

method still

gives

no

really

solid

grounds

for

believing

that I have found upper bounds for

Wc.

Concerning

lower

bounds,

the situation is much

(5)

816

TABLE I

(a) Average

sizes

of

3-dimensional clusters

of

size not greater than no,

for

no = 25 000 and 2 500 and values

of

W between 2.6 and 2. 85. The ranges

given

are the estimated standard deviations

of

the numbers

they follow.

The numbers in brackets are the numbers

of

trials in each case.

(b) Average

sizes

of

2-dimensional clusters

of

size not greater than no,

for

no = 25 000 and 2 500 and values

of

W between 4.0 and 4.6

(*) The discrepancy between these figures and those above them is due to the inclusion in the lower figures of data from clusters

generated with a limit set between 2 500 and 25 000.

happier. Following [3]

and

[6],

I present in

figure

2

curves of mean

generation

size

plotted against

gene- ration numbers for selected values of W. As remarked in

paragraph 3,

we know that if

Gk,

the mean size of

the k’th

generation,

is ever less than

1,

then W

Wc.

Subject

to the

possibility

of statistical error,

therefore,

the curves here show that

Wc >

2.65 in 3 dimensions

and

We

> 4.2 in 2 dimensions.

(Fortunately,

we have

upper bounds for

Gk (see [3], § 3.2),

so that the

probability

of a

spurious

result is at least

estimable.)

However,

we

again

have no way of

being

sure that

even the

steepest

curves

plotted

here are not

going eventually

to

peak

out and decline towards zero.

So

concerning

upper bounds for

We

this method is

even less

satisfactory

than the

histogram approach.

Finally,

I

give

three

drawings

of actual two-dimen- sional clusters

(Fig. 3).

There has been a certain amount

. of

speculation concerning the frontiers

of these clusters

(see

e.g.

[5]),

and it seems worth while

having

further

pictures

to

supplement

those of

[2].

5. Conclusions. -

Subject

to the

possibility

of

statistical errors, the

probability

of which is estimable and can be reduced

by repetition

of the above calcu-

lations,

the curves in

figure

2 show that

Wc >

2.65 in

3

dimensions,

and

Wc >

4.2 in 2 dimensions. The

histograms

of

figure

1 and the statistics of table I suggest that

We

2.8 in 3 dimensions and

Wc

4.6

in 2

dimensions,

but there are deficiencies in the method which mean that without some further theore- tical advance no amount of

repetition

of the calcula- tions can

really

establish these

figures.

FIG. 3. - Three two-dimensional clusters. The atoms are drawn

as circles of diameter ro, so that two atoms interact if the correspond- ing circles overlap. The solid circle in each picture is the starting

atom. Those atoms from which the cluster is, or may be, still growing

are marked with crosses. a) W = 4.3; 4 397 atoms, complete.

b) W = 4.4; 7 000 atoms, still growing. c) W = 4.5 ; 6 203 atoms,

complete.

(6)

ture of clusters, J. Physique 34 (1973) 341-344. percolation theory, Adv. Phys. 20 (1971) 325-357.

Références

Documents relatifs

This last quantity is not exactly equal to zero, because it's due to the finite aperture of the diaphragms : we don't study rigorously the light scattered on the axis of the

(ii) The cluster was subject to periodic boundary conditions. - Schematic one-dimensional analogy of the potentials and boundary conditions used in the calculations a)

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des

When the number of students is large, an optimal static 6-clustering will isolate four groups and split one in two halves in every snapshots ; whereas the optimal dynamic

On the positive side, we provide a polynomial time algorithm for the decision problem when the schedule length is equal to two, the number of clusters is constant and the number

Schoen discovered the gyroid by applying the Bonnet transformation to the surface system formed by the adjoint P and D surfaces (4), and by monitoring the movements of the

In summary, we conclude that the Co atoms dif- fused in Si and implanted in Si and subsequently annealed form segregated intermetallic phases with Si, similarly to Fe implanted

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des