T OM A. B. S NIJDERS M ARINUS S PREEN
Segmentation in personal networks
Mathématiques et sciences humaines, tome 137 (1997), p. 25-36
<http://www.numdam.org/item?id=MSH_1997__137__25_0>
© Centre d’analyse et de mathématiques sociales de l’EHESS, 1997, tous droits réservés.
L’accès aux archives de la revue « Mathématiques et sciences humaines » (http://
msh.revues.org/) implique l’accord avec les conditions générales d’utilisation (http://www.
numdam.org/conditions). Toute utilisation commerciale ou impression systématique est consti- tutive d’une infraction pénale. Toute copie ou impression de ce fichier doit conte- nir la présente mention de copyright.
Article numérisé dans le cadre du programme Numérisation de documents anciens mathématiques
http://www.numdam.org/
SEGMENTATION IN PERSONAL NETWORKS Tom A.B.
SNIJDERS1 ,
MarinusSPREEN1
RÉSUMÉ 2014 La segmentation des réseaux personnels
On propose un concept et plusieurs mesures de la segmentation des réseaux personnels. Les auteurs défendent la
thèse que les implications de la segmentation des réseaux personnels sont, d’un certain point de vue, opposées de
celles des réseaux totaux. Les mesures sont illustrées par l’exemple d’un réseau de relations de confiance dans un service de fonctionnaires. Des estimateurs du degré de segmentation sont proposés pour le cas où les relations dans le réseau personnel sont observées sur la base d’un échantillonnage plutôt que dans leur totalité.
SUMMARY - A concept and several measures for segmentation of personal networks are proposed. It is argued that the implications of segmentation of personal networks are, in a sense, the opposite of those of segmentation of entire networks. The measures are illustrated by the example of the trust network in a civil
service departement. For the case where relations in the personal network are observed by a sample rather than
completely, estimators for the segmentation measures are given.
INTRODUCTION
The concept of
segmentation
wasproposed
in Baerveldt &Snijders (1994)
as a structural concept forcomplete
networks. The definedsegmentation
index intends to measure thedegree
of division ot’ a social network into
suhgroups
withhigh within-group
and lowbetween-group
densities. The theoretical définition of
segmentation
isgiven
as follows(Baerveldt
&Snijders, p.214):
is the
(legt-ee
to thereis, , for
actors in the netYvork, a contrast,or
(!istance,
their and the rest; oneiiiight
say, bem’eenin-group
«ntl out-grol/jJ.In this paper a local, or
ego-oriented,
concept ofsegmentation
is introducedby using personal
networks as unit of
analysis.
Différences ininterprétation
ofsegmentation
ofpersonal
and ofcomplete
networks will hebriefly
discussed and somehypothèses
will be formulated. Thepersonal segmentation
index will he illustratedby
data from a networkstudy
in a civilorganisation
conducted andreported by
Bulder,Flap
& I,eeuw(1993).
For those situations in which it isimpossihle
to interview all memhers of thepersonal
network(too expensive,
toomuch time,
etc.),
thepersonal
network index must bcestimated;
somesampling possibilities
will he discussed.
1 Dcpartjnent ofStatistics & Measurcment t ’I’lieory/1 CS, University of Groningcn, The Netherlands.
SEGMENTATION IN COMPLETE NETWORKS
Consider an undirected
graph
with N vertices andadjacency
matrix X,reprcsenting
a set ofsocial actors and some
relationship
between them. Theadjacency
matrix is defincdhy X4
= 1if there is an
edge
betwccn vertices i andj,
andXij
= () otherwise(Xii
= 0 for alli).
Sincethe
graph
is undirectedXj
=Xji
for all i,j.
Thedegree
of vertex i isX;+
=X+1
=E j
We say that i
and,j
areacquainted
= 1.Thé
segmentation
index is hased on the concept of sociometric distance in agraph,
defined asthe
length
of the shortestpath
between two vertices in agraph:
(1,d(i,.j)
= 1 if thcre isa relation between i
and j, (J( i,j)
= 2 if there is no rclation between iand .j
but both arcacquainted
with a common vertcx k, etc.. If nopath
is ohserved hetwcen iand /B
l.e.,they
are vertices in diffcrent
components of
thegraph,
the sociometric distancc is infinitc, denotedlJ(i,j)
= 00. In an undirectedgraph
of N persons, there areN(N- 1)/2
distances, some ot’which may hc infinité. Tllc
segmentation
indcx rellects apai-ticular
kindol’ dispcrsion
of thèseN(N- )I2
distanccs. The mathematical definitions l’orsegmentation
in acomplete
networkdefined
hy
an undirectedgraph
on N vertices is thcfollowing :
. Define
hy D,-
the numher ofpairs
of vCl1iccs at mutual distance r;. Detine the 1’raction of
pairs
of vertices at mutual distance rhy
e Detinc the tractions of
pairs
at distance r or greaterhy
Then F 1 is thc
density
of thegraph
whileP2
= 1 -FI
is the fraction ofpairs
of vertices thatare not
acquaintcd.
ThescgmcnLation
mesure is then del-ined as:1,c., the il-actions ol’ vertex
pairs
at distance r or grcater among thosepairs
that arc notdircctly
connected. It is
brieny ai-gued
in Baervcldt &Snijders ( 194)
that in acomplete
nctworkdistance 2 may he
interpretcd
as a closedistance, 3
as intcrmediatc, and 4 or more aslong.
Therefore Baei-veldt &
Snijders
propose to work withS-3
or54.
Note that:and that
Sr
= 1 for /’ > 3 if andonly
if thegraph
eonsists of 2 or more disconnectedcliques.
To illusti-ate the
segmentation
index considerFigure
1.Figure
1 is the network ofrolationships
hascd on mutual trust among 29
employées
in a civilorganization
asreported
at page 46 in Bulder et al.(1993).
The researeh about thisorganisation
is also discussed in Butder, Leeuw,and
Flap ( 1996)).
Mutual trust relations are defined as those rulati()ns hetweenemployées
inwhich the
suhjects
of discussion are notsotety
"ncutral" work-relatedtopies,
such as the futureof the
organization.
The authors consider trust relations as an indication for a certaindegree
oufpersonal
involvement with théorganization
and for a certaindegree
of trust in each other.Therefore the relations are
symmotrica) hy
nature.Figure
1. Mutual trust relations in civilorganization
To determine the
frequency
distribution of sociometric distances between the 406 relations the network program GRADAP(Sprenger
&Stokman, 1989)
was used(see
table1).
Table 1.
Frequency
distribution of distances within the networkNote that the
density
of the network isFi
=.14 ,
and that the network consists of 1 connected component, so that infinite distances do not occur. Thefrequency
distribution of distances leadsto the
following segmentation
measures:SEGMENTATION IN PERSONAL NETWORKS
The concept of
segmentation
make sense also inpersonal
networks. Thepersonal
network ofego may be said to be
highly segmented
when the set of his directacquaintances (alters)
iscomposed
of two or moresubgroups
withhigh within-group
and lowbetween-group
densities.The theoretical définition of
segmentation
of thepersonal
network isanalogous
to that ofcomplete
networks:Segmentation
is thedegree
towhich,
within the setof
actorsdirectly
connected toego
(the "alters "),
thereis for
each alter a gap between the other alters to whom he ishimself directly connected,
and the other alters to whom he is notdirectly
connectedThe
segmentation
ofego’s personal
network is defined as thesegmentation
of the networkformed
by
his alters. Thesegmentation
index thus is based on distances betweenpairs
of alters.The formula will be
given
below.The
interpretation
ofsegmentation
ofpersonal
networks isquite
distinct from theinterpretation
of
segmentation
ofcomplete
networks. If ego has ahighly segmented network,
then:(a) -
ego has moreopportunities
to"manage"
what others know about him: he can offer different views of himself to alters who are not, oronly distantly,
related to eachother;
this willgive
him more freedom ofaction;
(b) -
it islikely, although
not necessary, that there will also be differences in the characteristics of the alters(segmentation
is often associated withsegregation
as describedby Freeman, 1978);
if this is the case, then ego has access to more varied social resources, and his social
capital
willbe
larger.
This
implies that,
in a certain sense, the consequences of this local version ofsegmentation
arejust
the reverse of those of theglobal segmentation:
·
persons/actors
are less influencedby
others whenthey
have a morestrongly segmented personal network;
· behaviour of
persons/actors
is lesspredictable
whenthey
have morestrongly segmented personal
networks.This reversion can be
easily
understoodby noticing
that a person with ahighly segmented personal
network is a"bridge"
betweenparts
of the networkthat,
withouthim,
would havebeen not, or
only distantly,
connected.Therefore,
individuals withhighly segmented personal
networks lead to a smaller
segmentation
of thecomplete
network. Note that this way ofreasoning
is similar to the "structural holes"theory
asproposed by
Burt(1992, 1995).
Theconcept
ofpersonal
networksegmentation, however,
is different from theconcept
of structural holes. The latter is based on thestrength
of thestrategic position
of ego with respect to his alters.E.g.,
when thepersonal
network iscomposed
of two disconnectedcliques
then thesegmentation
is maximal whereas Burt’s measure for structural holes is moderate.Just as in the case of
segmentation
of acomplete network,
thedensity
is an essentialconcept
that can beregarded
as more basic thansegmentation.
Thesegmentation
measure must controlin some sense for the
density.
For the mathematical notation we define:the
neighbourhood
at distance 1 of vertex i. Thisneighbourhood
will be identified with thegraph generated by
vertex set Both the vertex set and thegraph commonly
are referredto as the
(first-order) personal
network of i. The number of elements ofUl(i)
is thedegree
ofvertex i,
denotedXi+.
The sociometric distance within the
graph UI(i)
is denoteddli ; thus,
forvertices j
and hthat both are elements of
Ul(i) , d1i(j,h)
is thelength
of the shortestpath
contained inVI(i) between j
and h. This distance cannot be shorter than the distance in the entirenetwork,
so thatd(j,h) _
The
density
of thepersonal
network is thedensity
of thegraph
where
The
segmentation
of thepersonal
network is defined as:Note that:
i.e.,
the number ofpairs
in thepersonal
network that are notmutually directly
related.If
q(1)
islow,
then there are few direct relations between theacquaintances
of i. Theparameter q(1)
can beinterpreted
as a localtransitivity parameter:
thedegree
to which i’sacquaintances
aremutually acquainted
among themselves. IfSJi)
ishigh,
then thereis,
amongthe
acquaintances
of i, thetendency
that eitherthey
aredirectly related,
orthey
are far apart(as
measured within the set of i’s
acquaintances).
For an illustration of
Sr(i)
consider the first-order network of vertex 17 inFigure
1. Thedegree
of vertex 17 is 6 and its
neighbourhood
isU1(17) _ {2,19,2~,27,29,30}. Figure
2gives
thecorresponding graph.
Figure
2.Graph ~/i(17)
Of the 15
pairs
of alters ingraph Ul(17),
3pairs
are at distance 1 and 12pairs
at distance 00.The
density
islow; T’)( 17) =
.10. Thepersonal segmentation
index for node 17 is800(17)
=1, i.e.,
thepersonal
network ismaximally segmented
because thegraph
consists of a1-clique
and3 isolated nodes.
Which distances may be considered to be
long,
when measured within first orderpersonal
networks? There are two reasons
why
it is sensible to use a lower threshold for"long"
distances in
personal
than incomplete
networks. In the firstplace, personal
networks will oftenbe situated within smaller social circles than
complete networks,
so the same absolute value fora distance may
be,
relative to an overall distribution ofdistances,
consideredlarger
in apersonal
than in a
complete
network. In the secondplace,
when two persons aredistantly
related within thepersonal
network of individuali,
e.g.,d¡i(h,j)
=4,
it isquite
wellpossible
that in thesurrounding complete
networkthey
are moreclosely related,
e.g.,d(h, j)
= 2 or 3. For themeaning
of the indirect socialrelationship
between twoacquaintances
ofi,
it is of minorimportance
whether the shortest socialpath
between them consistsentirely
of other individuals who are alsoacquaintances
ofi,
or whether this shortestpath
is not contained within thenetwork of i’s direct
acquaintances.
It was remarked above thatd(h, j)
for alli, h, j.
However, d1i(h,j)
= 1 if andonly
ifd(j,h)
= 1 : direct relations between i’sacquaintances
canbe traced
already
within his first-orderpersonal
network.For the two reasons
mentioned,
we shallinterpret
a sociometric distance of 3 within apersonal
network
already
as alarge
distance. Theproposed
measure of localsegmentation
is:The
preceding
argumentimplies
that it also makes sense to consider distances betweenacquaintances j
and h ofperson i
as shortestlengths
ofpaths
notnecessarily
contained within i’spersonal
networkU1(i).
Since we arethinking
ofpersonal networks,
it is of doiibtfulapplicability
to try tostudy large
distances between individuals as defined in the whole network:large
distances will beexceedingly
hard to measure. We restrict attention to distances in the second-orderpersonal network,
defined a:All definitions
given
above forsegmentation
measures can berepeated, substituting
distanceswithin
U2(i)
for distances withinV1(i) .
The distance within thegraph
is denotedd2i~
An alternative
segmentation
index of thepersonal
network is definedby:
The second
equality
holds becaused1i(j,h)
= 1 if andonly
ifd2i(j,h)
= 1 if andonly
ifd(j,h)
= 1(see above).
Forsegmentation
measures ofpersonal
networks that refer to the directas well as the indirect social
surroundings
of the focalindividual,
we propose the measure:Since for direct
acquaintances j
and h ofi,
the distanced2i(j,h )
= 2 if andonly
ifd(j,h)
=2,
the définition of
S3 (i)
can also begiven
asThis means that for the definition of
S3 (i) ,
it is not necessary to make referencespecifically
toU2(i).
To compare the relevance of the two
proposed
measuresS3(i)
andS3 (i) ,
thefollowing
considerations can be
put
forward:· Measure
S3 (i)
isconceptually
moremeaningful,
because for the indirect relations between individuals in i’s socialsurroundings,
it is a rather irrelevant restriction to measure themby paths
that must be contained within i’s direct socialsurroundings.
,
· For the determination of
S3 (i),
more information isrequired
than for the determination ofS~(i) ; therefore, S3(i)
can be used in cases when the value ofS3(i)
cannot be determined.e The influence of the second-order network on the
segmentation
ofego’s
direct social environment can beexpressed
aslà(i) = (S3(i) - S3 (i) ).
This isnon-negative.
To calculate these local
segmentation
measures, it is not necessary to observe the total network:it is sufficient to observe the relations in the
ego-centered
network(U1(i)
orrespectively).
For an illustration of this measure, consider
figure 3,
which is the second orderpersonal
environment of vertex 17 in the total network. The dashed lines are the relations of the alters of vertex 17 to vertices in
U2(i)
that are not relevant forcomputing S3 (i).
The circle illustrates theboundary
betweenU, (17)
andU2(17) .
Figure
3.Graph U2(l 7) ( the
part that is relevant forcomputing S3(i)).
Only
those persons inego’s
second-order environment who reduce the distance betweenpairs
of alters in the first-order
environment,
are relevant forcomputing
thepersonal segmentation
index.
In
figure 3,
vertex 10 reduces the sociometric distance betweenpair (27, 30)
from infinite to 2.Vertex 13 also connects two alters
of ego,
but the sociometric distance between these two alters remains 1. This means thatby taking
into account the second-order network of vertex 17 thefrequency
distribution of distanceschanges
into 3pairs
at distance1,
onepair
at distance2,
and11
pairs
at distance ou, which results inS3 (i)=
.92. Thisimplies
that thesegmentation
of theenvironment of vertex 17 remains
high,
but the direct environment of vertex 17 consists now of 1component
of 4 vertices and 2 isolated vertices.Table 2
gives
a summary of allpersonal (local) segmentation
indices of theemployees
in theconfidence network.
Table 2. Personal
segmenttion
indices of confidence networkIt is obvious that for vertices of
degree 1,
thepersonal
networksegmentation
indices are notdefined. For vertices of
degree
2and 3, S3(i)
andS3 (i)
aredefined,
but for such lowdegrees
the
interpretation
of these indices is somewhat troublesome. Thesegmentation
indices of a vertex ofdegree
2 can have two values:S3(i)
= 1 if no line is observed between the two altersor 0
otherwise; S3 (i)
= 1 if there is no personconnecting
both alters or 0 otherwise. A vertex ofdegree
3 can have two values forS3(i) :
1 if no lines oronly
1 line is observed inUl(i)
or0
otherwise; Si(1)
canalready
have 5 values(see figure 4,
values 0 and 1omitted).
Figure
4. Possible valuesS3 (i) strictly
between 0 and 1 fordegree
3 ifS3(i~
= 1(vertices
inU1(i)
areblack,
inU2(i)
arewhite)
The
influence A(1)
of the second-order network on thesegmentation
ofego’s
directenvironment for vertices of
degree
2 is 0 or 1. For vertices ofdegree 3, 6(i)
is0, .33, .5, .67,
or 1. For these cases,
6. (i)
can be used as an indication forexamining
thedegree
ofconnectedness of vertex i in the total network
by assuming
that vertices ofdegree
2 or 3 aremore isolated or
peripheral.
In table 2 there are 13 vertices ofdegree
2 or3,
of which 8 have amaximal first-order
segmentation.
For 5 of them(vertices 3, 5, 10, 12,
and26)
their second- order network does count for a reduction of thesegmentation.
Thisimplies
thatthey
are moreinfluenced
by
the total network structure than the other vertices whose first-order network is notinfluenced
by
total network structure. In otherwords,
the other 8 vertices can be viewed asmore isolated or
peripheral.
For the 12 vertices of
degree 4
orhigher
thefollowing
observations in the first-order networkcan be made:
- The
personal
networks of 2 vertices are notsegmented (S3(i)
=0) .
- The
personal
networks of 5 vertices aresegmented
in therange .25 - .62 .
- The
personal
networks of 5 vertices aremaximally segmented (S3(i) =1) .
Examining
the influence6.(i)
of the second-order environment for vertices i withdegree 4
orhigher,
note thatonly
3 vertices(17, 27, 30)
showpositive
influence of this environment.However,
these influences are rather low. Note that vertices 4 and 19 with both arelatively high degree
andhigh
first-ordersegmentation
have the same second-ordersegmentation.
Also vertex30 with the
highest degree
has more or less the same first-order and second-ordersegmentation
index.
Recalling
that thedensity
of the total network is low(.14),
as a conclusion onemight
say that the
segmentation
of mutual trust relations in the directpersonal
environments of theemployees
isquite locally
determined.ESTIMATION OF
S3(i)
ANDS3 (i)
The mutual trust network is a
relatively
small network which can be"easily"
observed. Forlarger
networks in which it isimpractical
to interview all networkmembers,
asample
is neededto estimate the
proposed segmentation
indices. Forinstance, interviewing only
10ego’s
eachmentioning
15 altersalready requires
150 interviews tocompute
the indices.To
compute segmentation
indicesS3(i)
andS3 (i)
inpersonal
networkdata,
severalsteps
areneeded in the data collection
procedure.
1. From
individual i,
in his/her role of"ego",
the set of hisacquaintances
is elicitedthrough
aname-generating procedure.
2. All the
acquaintances j
ofego i
arepresented
with a list of otheracquaintances (known
fromstep 1),
andeach j
is asked to indicate for each of the otheracquaintances
whether ornot j
has the
particular type,
ortypes,
of relation with him/her.3. All the
acquaintances j
ofego i
are interviewed and for eachalter j
the set of his/heracquaintances
is elicitedthrough
aname-generating procedure.
Because ofstep 2,
it issufficient to ask information on the set
of j’s acquaintances
that are notalready
contained inthe set of i’s
acquaintances.
To compute
S3(i)
step 1 and 2 areenough,
to computeS3 (i)
step 3 is needed. For situations in which these three steps can beapplied,
bothsegmentation
indices can becomputed.
Forthose situations in which it is
impractical
to interview each individual or alter of ego i,sampling
and observation
designs
are needed to estimate thesegmentation
indices. Severalsampling
andobservation
designs
arepossible
of which some will be discussed.SAMPLING AND OBSERVATION DESIGN FOR
Step
1: Eachego i
mentions his set ofacquaintances through
somename-generating procedure.
Step
2: A randomsample S(i)
withoutreplacement
of size n is drawn from theacquaintances j
of
i,
andeach j
ispresented
the list of all other(Xi+ - 1) acquaintances
to indicatewhether or
not j
has theparticular
type of relation with them.Which characteristics are known in this observation
situation,
and which have to be estimated?From
step 1,
we knowXi+.
What is not known and has to be estimated is thedensity q(1)
andthe number of
pairs
of i’sacquaintances
at distance 3 andlarger
within .For estimating i7(i) ,
the number of relationsR1(i)
inUl(i)
must becomputed.
This number is called the size ofgraph vl(i) .
Several statisticalgraph-size
estimators are discussedby Capobianco
& Frank( 1981 ). First, step
2 is considered.A random
sample
of size n is drawn fromU1(i).
Theprobability
thatacquaintance j
isdrawn is denoted For
each j E S(i)
his relations with the(Xi+ -1) acquaintances Xi+
*
are observed.
Capobianco
& Frank(1981)
define various estimators forRi(1) .
One ofthese,
which we also propose to usehere,
was denoted in their paperby
R ’ and isgiven by
where
the
probability
that neither of 2specified
vertices are in thesample.
Note that we can write:
where
We now have estimated
Rl(i),
and we still must find an estimator forR2(1 ) .
For
vertices j
and h inU1(i),
a distanced1i(j,h)
is observed in thesampling design if they
are not
directly
related(i.e., Xjh
=0 )
while at least one vertex k is observedacquainted
tothem both
(i.e.,
maxkXjk Xhk = 1).
TheHorvitz-Thompson
estimator(cf.
Cochran(1977)
oranother text on
sampling theory)
forR2(i)
is the number of suchpairs (j,h)
dividedby
theprobability that j
and h are in thesample,
where 1 This leads to the estimator
SAMPLING AND OBSERVATION DESIGN FOR
S3’(i)
,
To estimate the second-order
personal
networksegmentation
indexS3 (i), step
2 must bereplaced by step
3.Step
3: each selectedacquaintance j
of ego i ispresented
with a list of all the other(Xi+ -1) acquaintances,
andeach j
is asked to mention hisacquaintances
on this list and thosenot on the list of
ego i.
,
For the estimation of
S3 (i),
it is convenient to write:where
H(i)
is the number of i’sacquaintances
who areindirectly
related because both areacquainted
with thesame k,
but not with a common k who is himselfacquainted
with i.Step
3 allows to estimateH(i)
in the same way asR2(i).
The estimator can be written as:The estimator for
S3 (~)
now followsstraightforwardly.
BIBLIOGRAPHY
BAERVELDT,
C. &SNIJDERS, T.A.B.,1994,
"Influences on and from thesegmentation
ofnetworks:
hypotheses
andtests",
SocialNetworks, 16,
213-232.BULDER, A, FLAP,
H. D. & LEEUW F.L., 1993, "Netwerken, Overheidsorganisaties
eneffectiviteit"
(Networks,
GovemmentOrganizations,
andEfficiency),
DenHaag,
Ministerie vanBinnenlandse Zaken.
(In Dutch).
BULDER, A,
LEEUW F. L. &FLAP,
H.D., 1996,
"Networks andevaluating public-sector reforms", Evaluation, 2,
261-276.BURT,
R.S., 1992,
Structural Holes: The social structureof competition,
HarvardUniversity Press, Cambridge, Massachusetts,
andLondon, England.
BURT,
R.S., 1995,
"Lecapital social,
les trous structuraux etl’entrepreneur",
RevueFrancaise de