• Aucun résultat trouvé

Maximizing the Cohesion is NP-hard

N/A
N/A
Protected

Academic year: 2021

Partager "Maximizing the Cohesion is NP-hard"

Copied!
12
0
0

Texte intégral

(1)

HAL Id: inria-00621065

https://hal.inria.fr/inria-00621065v2

Submitted on 9 Oct 2011

HAL

is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire

HAL, est

destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

Maximizing the Cohesion is NP-hard

Adrien Friggeri, Eric Fleury

To cite this version:

Adrien Friggeri, Eric Fleury. Maximizing the Cohesion is NP-hard. [Research Report] RR-7734,

INRIA. 2011. �inria-00621065v2�

(2)

a p p o r t

d e r e c h e r c h e

ISSN0249-6399ISRNINRIA/RR--7734--FR+ENG

INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET EN AUTOMATIQUE

Maximizing the Cohesion is NP-hard

Adrien Friggeri — Eric Fleury

N° 7734 — version 2

initial version 8 September 2011 — revised version 3 October 2011

(3)
(4)

Centre de recherche INRIA Grenoble – Rhône-Alpes 655, avenue de l’Europe, 38334 Montbonnot Saint Ismier

Téléphone : +33 4 76 61 52 00 — Télécopie +33 4 76 61 52 52

Maximizing the Cohesion is NP-hard

Adrien Friggeri , Eric Fleury

Th`eme : R´eseaux et t´el´ecommunications Equipe-Projet DNET´

Rapport de recherche n°7734 — version 2 — initial version 8 September 2011

— revised version 3 October 2011 — 8 pages

Abstract: We show that the problem of finding a set with maximum cohesion in an undirected network isNP-hard.

Key-words: social networks, complex networks, cohesion, np-complete, com- plexity

(5)

Maximiser la Coh´ esion est NP-dur

R´esum´e : Nous montrons que le probl`eme de trouver un ensemble de coh´esion maximum dans un graphe non orient´e estNP-dur.

Mots-cl´es : r´eseaux sociaux, r´eseaux complexes, co´esion, np-complet, compl´exit´e

(6)

Maximizing the Cohesion is NP-hard 3

Introduction

In [1], we have introduced a new metric called the cohesion which rates the community-ness of a group of people in a social network from a sociological point of view. Through a large scale experiment on Facebook, we have estab- lished that the cohesion is highly correlated to the subjective user perception of the communities. In this article, we show that finding a set of vertices with maximum cohesion isNP-hard.

Notations

LetG= (V, E) be a graph with vertex setV and edge setEof sizen=|V| ≥4.

For all verticesu∈V, we writedG(u) the degree ofu, or more simplyd(u)1. A triangle inGis a triplet of pairwise connected vertices.

For all sets of verticesS ⊆V, letG[S] = (S, ES) be the subgraph induced by S onG. We write m(S) = |ES| the number of edges in G[S], and i(S) =

|{(u, v, w)∈S3: (uv, vw, uw)∈E3}|the number of triangles inG[S]. We define o(S) =|{(u, v, w),(u, v)∈S2, w ∈V \S : (uv, vw, uw)∈E3}|, the number of outbound triangles ofS, that is: triangles inGwhich have exactly two vertices in S.

Moreover, for all (u, v) in E, let△(uv) =|{w∈V : (uw, vw)∈E2}|be the number of triangles the edgeuv belongs to inG.

Finally, we recall the definition of the cohesion of a setS inG: C(S) = i(S)2

|S| 3

(i(S) +o(S))

An example is given on Figure 1. The cohesion of a given set S in Gcan naively be computed in O(n3) by listing all triangles in Gand counting those inside and outbound to S.

S

Figure 1: In this example,i(S) = 2 ando(S) = 1, thusC(S) =16 In this article we examine the problem of finding a set of verticesS ⊆V of maximum cohesion, i.e. for all subsetS ⊆V,C(S)≤ C(S).

Outline

We now proceed to prove that finding a set of vertices with maximum cohesion inGisNP-hard. We will first show in Section 1 that this problem is equivalent

1Here, as elsewhere, we drop the index referring to the underlying graph if the reference is clear.

RR n°7734

(7)

4 Adrien Friggeri , Eric Fleury

to that of finding a connected set of vertices with maximum cohesion inG. The decision problem associated to the latter isConnected-Cohesive.

Then, we shall prove thatConnected-CohesiveisNP-complete by reduc- ingClique(problem GT19 in [2]). From there we deduce that the optimization problem of finding a set of vertices with maximum cohesion isNP-hard.

Problems

1. Connected-Cohesive:

Input A graphG= (V, E),λ∈Q,λ∈[0,1]

Question Is there a subset connectedS ofV such thatC(S)≥λ? 2. Clique:

Input A graphG= (V, E),k∈N, k≤ |V|

Question Is there a subsetS ofV such that|S|=k and the subgraph induced byS is a clique?

1 A maximum cohesive group is connected

In order to prove that a set of vertices with maximum cohesion in a given network is connected, we need the following lemma:

Lemma 1.1. Let S1 ⊆ V and S2 ⊆ V be two disconnected sets of vertices ((S1×S2)∩E=∅). IfC(S1)≤ C(S1∪S2)thenC(S2)>C(S1∪S2).

Proof. SupposeC(S1)≤ C(S1∪S2) andC(S2)≤ C(S1∪S2). Given thatS1and S2are disconnected,i(S1∪S2) =i(S1) +i(S2) ando(S1∪S2) =o(S1) +o(S2).

We can then write:

i(S1)2

|S1| 3

≤(i(S1) +o(S1))C(S1∪S2) (1) i(S2)2

|S2| 3

≤(i(S2) +o(S2))C(S1∪S2) (2) By summing (1) and (2), we obtain:

i(S1)2

|S1| 3

+i(S2)2

|S2| 3

≤(i(S1) +o(S1) +i(S2) +o(S2))C(S1∪S2)

≤(i(S1∪S2) +o(S1∪S2))C(S1∪S2)

≤(i(S1) +i(S2))2

|S1|+|S2| 3

Furthermore, given that|S1|,|S2|>1, |S1|

3

+ |S2|

3

<

|S1|+|S2| 3

INRIA

(8)

Maximizing the Cohesion is NP-hard 5

We then have:

i(S1)2

|S1| 3

+i(S2)2

|S2| 3

< (i(S1) +i(S2))2

|S1| 3

+ |S32| Which simplifies to:

|S2| 3

i(S1)−

|S1| 3

i(S2)

2

<0

Hence the contradiction. Therefore, for allS1, S2⊆V, disconnected:

C(S1)≤ C(S1∪S2)⇒ C(S2)>C(S1∪S2)

Theorem 1.2. Let S be the set of vertices ofGwith the highest cohesion,S is connected.

Proof. Suppose S is not connected, then their exist two disconnect subsets S1, S2 ⊆ S such that S = S1∪S2. Given that S has maximum cohesion, we have C(S)≥ C(S1). Thus per Lemma 1.1: C(S) < C(S2) and S does not have the highest cohesion, hence the contradiction.

Corollary 1.3. Per Theorem 1.2, the problem of searching for a set of ver- tices with maximum cohesion is strictly equivalent to that of searching a set of connected vertices with maximum cohesion.

2 Connected-Cohesive is NP-complete

First note that given a set S of vertices of G, it is possible to verify that S is a solution of Connected-Cohesive by computing its cohesion, its size, its connectivity and the minimum degree of its vertices, all in polynomial time.

ThereforeConnected-Cohesiveis inNP.

Algorithm 1Transforms an instance ofCliquein an instance ofConnected- Cohesive

Require: G= (V, E), k∈N

1: W :=∅

2: E :=E

3: foruv∈V2\E do

4: letK be a clique of size 2 n34

5: W ←W ∪K

6: E ←E∪ {uv} ∪({u, v} ×K)

7: end for

8: return G = (V ∪W, E), λ= (k3) (k3)+(k2)(n−k)

Let us now reduceCliquetoConnected-Cohesive. Let (G= (V, E), k∈ N) be an instance ofClique2. We can assume thatGis connected (if not, we

2We consider here that |G|> 2 and k > 2, although this is not exactly Clique, this problem is clearly NP-complete, given that the complexity of Cliquedoes not arise from those small values.

RR n°7734

(9)

6 Adrien Friggeri , Eric Fleury

u v

w1

wi

w2 n34

Figure 2: Illustration of Algorithm 1. At this step, wejoinuandv, add a clique of size 2 n34

to the network, and joinuandvto all vertices in the added clique.

use the following reasoning separately on each connected component ofG). We construct an instance (G = (V, E), λ) ofConnected-Cohesive by adding an edge between all non connected verticesuandv inGand then linking those two vertices to all vertices in a clique of size 2 n34

which we add to the network, as described in Algorithm 1 and illustrated by Figure 2.

Theorem 2.1. There exist a clique of size k in G iff there exist a connected group of vertices ofG with cohesion λ≥ (k3)

(k3)+(k2)(nk).

Proof. Let K ⊆ V, be a clique of size |K| =k in G. Given that no node or edges are deleted when constructingG,Gis a subgraph ofG and thusK is a clique inG andiG(K) = k3

.

Moreover, by construction,G[V] is a clique and for alluunK, the neighbors ofuare also inV. Therefore, each edge inKforms one triangle with each vertex inV \K, which leads tooG(K) = k2

(n−k). Finally, this gives a cohesion:

CG(K) =

k 3

k 3

+ k2 (n−k)

Conversely, let S ⊆ V be a connected set of vertices such that CG(S) ≥ (k3)

(k3)+(k2)(nk). We will show that S is a clique of size larger than k and that S ⊆V. First note that |S| ≥3, because by definition, if |S|<3, CG(S) = 0 which would lead to a contradiction.

First, suppose thatS is not a clique inG, then let us distinguish two cases:

1. If S ⊆ V and S is not a clique, then S contains two vertices u, v ∈V2 such thatuv6∈E.

2. IfS 6⊆V, then∃u∈S\V, andSbeing connected, there existv∈V such thatuv6∈E.

INRIA

(10)

Maximizing the Cohesion is NP-hard 7

Therefore, ifSis not a clique inG, it contains an edgeuv6∈Eand by construc- tion, this edge belongs to at least 2 n34

triangles, which leads to:

iG(S) +oG(S)≥K CG(S)≤ iG(S)2

2 |S3| n 3

4

≤ 1 2 n32

<

k 3

k 3

+ k2 (n−k)

Hence the contradiction, thereforeSmust be a clique inG. From there it comes that:

CG(S) =

k 3

k

3

+ k2

(n−k) wherek=|S|. Therefore:

CG(S)≥

k 3

k 3

+ k2

(n−k) ⇔

k 2

(n−k)

k

3

k 2

(n−k)

k 3

⇔ n−k

k−3 ≤ n−k k−3

⇔k ≥k

Therefore, we can now conclude that if there exist a connected setSinG with cohesionCG(S)≥ (k3)

(k3)+(k2)(n−k), thenS is a clique of size at leastk in G, and thus there exist a cliqueK⊆S of sizekin G.

Theorem 2.2. Connected-Cohesive isNP-complete.

Proof. Per Theorem 2.1, there exist a clique of size k in G iff there exist a connected subset of vertices ofG of cohesionλ≥ (k3)

(k3)+(k2)(n−k) and the trans- formation fromG, ktoG, λruns in polynomial time. ThusCliqueis reducible toConnected-Cohesive andConnected-Cohesiveis NP-hard.

Given that Connected-Cohesive is in NP, the problem is thus NP- complete.

3 Conclusion

The associated decision problem being NP-complete, the problem of finding a set of vertices with maximum cohesion isNP-hard3.

3Note that the problem of finding a set of vertices of maximum cohesion containing a set of predefined vertices is alsoNP-hard, by an immediate reduction

RR n°7734

(11)

8 Adrien Friggeri , Eric Fleury

References

[1] Adrien Friggeri, Guillaume Chelius, and Eric Fleury. Triangles to Capture Social Cohesion. In Third IEEE International Conference on Social Com- puting, Cambridge, United States, September 2011.

[2] M.R. Garey and D.S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman, San Francisco, 1979.

INRIA

(12)

Centre de recherche INRIA Grenoble – Rhône-Alpes 655, avenue de l’Europe - 38334 Montbonnot Saint-Ismier (France)

Centre de recherche INRIA Bordeaux – Sud Ouest : Domaine Universitaire - 351, cours de la Libération - 33405 Talence Cedex Centre de recherche INRIA Lille – Nord Europe : Parc Scientifique de la Haute Borne - 40, avenue Halley - 59650 Villeneuve d’Ascq

Centre de recherche INRIA Nancy – Grand Est : LORIA, Technopôle de Nancy-Brabois - Campus scientifique 615, rue du Jardin Botanique - BP 101 - 54602 Villers-lès-Nancy Cedex

Centre de recherche INRIA Paris – Rocquencourt : Domaine de Voluceau - Rocquencourt - BP 105 - 78153 Le Chesnay Cedex Centre de recherche INRIA Rennes – Bretagne Atlantique : IRISA, Campus universitaire de Beaulieu - 35042 Rennes Cedex Centre de recherche INRIA Saclay – Île-de-France : Parc Orsay Université - ZAC des Vignes : 4, rue Jacques Monod - 91893 Orsay Cedex

Centre de recherche INRIA Sophia Antipolis – Méditerranée : 2004, route des Lucioles - BP 93 - 06902 Sophia Antipolis Cedex

Éditeur

INRIA - Domaine de Voluceau - Rocquencourt, BP 105 - 78153 Le Chesnay Cedex (France)

http://www.inria.fr ISSN 0249-6399

Références

Documents relatifs

نﺎﻓرﻋو رﻜﺸ ،ﻪﺘﻤﺤر باوﺒأ ﺢﺘﻔﺒ ﺎﻴﻠﻋ مﻌﻨأ يذﻝا ﷲا دﻤﺤأو رﻜﺸأ ﻲﻌﺴﻝا ﻲﻓ ﻲﻝ ﻪﻘﻴﻓوﺘ ﻰﻠﻋو ،ﻪﻨﻤ ةدازﺘﺴﻻاو مﻠﻌﻝا بﻠط ﺔﻤﻌﻨ ءارو .مﻴرﻜﻝا ﻪﻝوﺴر ﻰﻠﻋ مﻬﻠﻝا

(3) ( 1 ) ءاجو يف رارقلا يسنرفلا ون زوجي لامعتسا ليجستلا يتوصلا وفصوب لايلد يف تابثلإا كلذو يف تلااحلا يتلا لا اييف طرتشي نوناقلا ةيمكشلا

mRNA expression level (A); and protein expression level (B) of TNF-α, normalized to the corresponding level expression of housekeeping gene β-Actin, in differentiated SH-SY5Y

Sans mener une analyse détaillée des trois textes journalistiques pris en compte pour cette étude, un coup d’œil aux seuls titres des articles cités plus haut suffit à saisir

حلاصإ ىلع زكر رىاط خيشلا نأ وى و ويف ةجيتن مىأ لىإ يثبح ةتماخ في تصلَخ امك يسايسلا و يعامتجلاا حلاصلإا قيقتح في ةيلمعلا هذى ةيهملأ ميلعتلا و ةيبترلا بيلاسأ ,

In Chapter 3 , we explore the hardware architecture developed for simulating Spiking Neural Networks and propose an approach fitting dynamic sensors to be used with Precise

Par exemple, elle peut transformer de manière sûre un int en double, mais elle peut aussi effectuer des opérations intrinsèquement dan- gereuses comme la conversion d’un void*

resolution. 6.25: Simulated absorption of the THEDEX bolometer... 6.26: Simulated reflectivity of the THEDEX FPA surface. 6.27: Comparison between simulated and measured