• Aucun résultat trouvé

Computational aspects of optimal strategic network diffusion

N/A
N/A
Protected

Academic year: 2021

Partager "Computational aspects of optimal strategic network diffusion"

Copied!
17
0
0

Texte intégral

(1)

HAL Id: hal-03058609

https://hal.archives-ouvertes.fr/hal-03058609

Submitted on 4 Mar 2021

HAL is a multi-disciplinary open access

archive for the deposit and dissemination of

sci-entific research documents, whether they are

pub-lished or not. The documents may come from

teaching and research institutions in France or

abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est

destinée au dépôt et à la diffusion de documents

scientifiques de niveau recherche, publiés ou non,

émanant des établissements d’enseignement et de

recherche français ou étrangers, des laboratoires

publics ou privés.

Computational aspects of optimal strategic network

diffusion

Marcin Waniek, Khaled Elbassioni, Flávio Pinheiro, César A. Hidalgo,

Aamena Alshamsi

To cite this version:

Marcin Waniek, Khaled Elbassioni, Flávio Pinheiro, César A. Hidalgo, Aamena Alshamsi.

Computa-tional aspects of optimal strategic network diffusion. Theoretical Computer Science, Elsevier, 2020,

814, pp.153 - 168. �10.1016/j.tcs.2020.01.027�. �hal-03058609�

(2)

Contents lists available atScienceDirect

Theoretical

Computer

Science

www.elsevier.com/locate/tcs

Computational

aspects

of

optimal

strategic

network

diffusion

Marcin Waniek

a

,

b

,

,

Khaled Elbassioni

b

,

Flávio L. Pinheiro

c

,

d

,

César A. Hidalgo

d

,

Aamena Alshamsi

b

aComputerScience,NewYorkUniversityAbuDhabi,AbuDhabi,UnitedArabEmirates bKhalifaUniversityofScienceandTechnology,AbuDhabi,UnitedArabEmirates

cNovaInformationManagementSchool(NOVAIMS),UniversidadeNovadeLisboa,Lisboa,Portugal dMITMediaLab,MassachusettsInstituteofTechnology,Cambridge,USA

a

r

t

i

c

l

e

i

n

f

o

a

b

s

t

r

a

c

t

Articlehistory:

Received 5 July 2019

Received in revised form 19 December 2019 Accepted 26 January 2020

Available online 30 January 2020 Communicated by L.M. Kirousis Keywords: Strategic diffusion Network contagion Complex networks Influence maximization

Diffusiononcomplexnetworksisoftenmodeled asastochasticprocess.Yet,recentwork on strategic diffusionemphasizes the decision power of agents[1] andtreats diffusion as a strategic problem. Herewe studythe computational aspects of strategic diffusion,

i.e.,finding theoptimal sequenceofnodes toactivate anetwork intheminimum time. We provethat finding anoptimal solution tothisproblem isNP-complete inageneral case.To overcomethiscomputationaldifficulty, wepresentan algorithmtocomputean optimal solution based on a dynamic programming technique. We also show that the problemisfixedparameter-tractablewhenparametrizedbytheproductofthetreewidth andmaximumdegree.Weanalyzethepossibilityofdevelopinganefficientapproximation algorithmandshowthattwoheuristicalgorithmsproposedsofarcannothavebetterthan alogarithmicapproximationguarantee.Finally,weprovethattheproblemdoesnotadmit betterthanalogarithmicapproximation,unlessP=NP.

©2020ElsevierB.V.Allrightsreserved.

1. Introduction

Diffusion in social networks has beena widely studied applicationin problems such as epidemic outbreaksand the adoption of behaviors andinnovations [2–7]. The approach to model diffusion can be roughly divided between simple andcomplexcontagion processes.Undersimplecontagion transmissionrequirescontactwithone individualataconstant ratesuch astransmission ofinfectious diseases [8]. An exampleof simplecontagion process isthe independentcascade model [9].Incontrast,complexcontagion requiresreinforcementfrommultiplesources.Anexampleofcomplexcontagion processisthelinearthresholdmodel [9].Complexcontagionhasbeenwidelystudiedinthecontextofcascadephenomena, withparticular applicationsto viralmarketing campaigns [10–12].In that context,the problemunderanalysis isthat of findingthe initialsetofseeds thatwouldmaximize diffusion.Differentapproachestosolving theseedselection problem includeusingcentralitymeasures [13],networkpercolation [14],andpropagationtraces [15],amongmanyothers [16–18]. Someworksintheliteraturetakeanadaptiveapproachtothetopic,basingtheselectionofseedsonadditionalknowledge gatheredduringtheprocess,beiteitherthecurrentstateofadynamicnetworktopology [19],orexpandedpoolofpotential seeds [20,21].Analternativewayofincreasingthediffusioncoverageisspreadingavailableseedsovertime [22,23].Despite thestrategicaspectsofdifferentmethodsofseedselection,thediffusionprocessitselfispurelystochastic.

*

Corresponding author at: Computer Science, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates.

E-mailaddress:mjwaniek@nyu.edu(M. Waniek). https://doi.org/10.1016/j.tcs.2020.01.027

(3)

However, thereare casesinwhichdiffusiontakesplaceaccordingtothesamerules ofcomplexcontagion,butthatare not purely stochastic (i.e., they have a strategic component) andin which thenetwork itself is not, necessarily, a social network. Take for instancehow regions develop neweconomic activities [24] or start production innew research fields [25]. Inthese casesagents are not embedded in the network, insteadthey are takingactions over a networkedsystem, which captures the relatedness between economic or academic activities. More importantly, not only the choice of the initial state is strategic, but also the whole process of diffusion is strategic. Stochasticity in this context is not at the pairwise interactionsbetweenagents,butinthesuccessrateofagentsinenteringnewactivities.Incaseoftheeconomic development, theprobability ofsuccess while entering a newactivity (represented by a node in the network ofrelated economic activities)increaseswiththenumberofotheralreadydevelopedactivities thatareconnectedtoit.Thus,agents canfailorsucceeddependingontheparticularcontextofeachstrategicaction.

Alshamsi et al. [1] proposedtostudythediversificationofeconomicactivitiesthroughthelensesofastrategicdiffusion model.Inthismodeltheentireprocessofdiffusionisguidedbyastrategicagent.Theagent’sactionsconcerntheselection of a node ina network tobe targeted ateach time step.Hence, the node canbecome activated inthe next stepof the action sequenceoftheagent, ornot.Following existingempiricalfindings [24,26],the modelassumesthatthe activation probability (i.e.,the probability ofsuccess ofthe agent’saction) iscaptured by acomplex contagion process. Thegoal is toactivatetheentirenetwork,startingfromasingleactivenode,intheminimumtime possible.WhileAlshamsi et al. [1] showedforawiderangeofnetworktopologiesthatactivatinganetworkinanoptimalmannerrequiresabalancebetween exploitationandexplorationstrategies,manyquestionsregardingthecomputationalfeasibilityoffindingoptimalstrategies remainedopen.Here,weexploreseveraltheoreticalcomputationalconsiderationsofstrategicdiffusionprocesses.

1.1. Resultsandmethods

Westartbyexploringthecomputationalfeasibilityoffindinganoptimalstrategythatminimizesthetotaldiffusiontime,

i.e., answering the question ofwhatis thebest sequenceof activatinga givennumberofnodesin thenetwork (starting fromaseed node)inthe shortestexpectedtime,whiletakingintoconsiderationthat thetimenecessarytoactivateeach nodeisproportionaltothenumberofactiveneighbors.

WeshowthatthedecisionversionoftheproblemisNP-complete(Theorem3).Weproveitusingareductionfromthe Set Cover problem.Thisobservation indicatesthat developinga polynomialalgorithm forfindingan optimalsequence of strategicdiffusionisnotrealistic,unlessP=NP.

Giventhisdifficulty,weproposean algorithmforcomputingan optimalwayofactivatingagivennumberofnodesin theshortesttimepossible(Algorithm1).Whilethealgorithmtakesexponentialtimetofindthesolution,itstilloutperforms simple exhaustivesearch andcan be usedto compute an optimalsolutionfor networksofmoderatesize. Thealgorithm utilizesthedynamicprogrammingtechniquetofindthefastestwaytoreacheverypossiblestateofnetworkactivation.

Wealsopresenttwo algorithmsfindinganoptimalsolutionforamorerestrictiveclassofnetworks,i.e.,networkswith bounded treewidth andboundedmaximal degree. Theytraverse a tree decompositionofa network andfind an optimal wayofactivatingeitheranentirenetworkora givennumberofnodes,respectively.Asa consequence,we showthat the problemisfixedparameter-tractablewhenparametrizedbythesumofthetreewidthandmaximumdegree.

Asfindinganoptimalsolutionprovestobecomputationallydemanding,weturnourattentiontoassessingthepossibility ofobtaininganefficientapproximationalgorithm.Wefirstinvestigatewhethertwoeffectiveheuristicalgorithmsproposed by Alshamsi et al. [1] have constant approximationratios. Oneofthem isthe greedyalgorithm,i.e., always targetingthe nodewiththehighestprobabilityofactivation.Theotheristhemajorityalgorithm,i.e.,targetingthenodewiththehighest number ofactive neighbors.Weshow that the approximationratiofor bothofthese algorithmsis

(

log n

)

(Theorems7

and 8). The proofs are based on constructing a sequence of networks where the ratio between total expected time of activationofthesolutionobtainedusingtheheuristicalgorithmandtheoptimalexpectedtimeofactivationgoestoinfinity. Finally, we show that, unless P=NP, there is no way to approximate the problem within a ratio better than ln n, in particular it is impossibleto construct an r-approximationalgorithm for a constant r (Theorem9). We prove this claim by showing a reduction from the Minimum Set Cover problem and using the fact that Minimum Set Cover cannot be approximatedwithinaratioof

(

1



)

ln n forany



>

0,unlessP=NP [27].

Organizationofthemanuscript. Theremainderofthearticleisorganizedasfollows.Section2describesthenotationsand computational problemsused in our reductions. Section 3 presentsa formal definitionof the strategic diffusionprocess and the main problemconsidered in our study.In Section 4 we present ourhardness resultforthe decision version of theproblemandwedescribeadynamic programmingalgorithm tofindan optimalwayofstrategicdiffusion.Section4.4

introduces algorithms computingoptimalsolutionfornetworkswithboundedtreewidthandmaximaldegree.Section 4.5

describesourresultsconcerning approximationoftheoptimalsolution.Section 5presentsconclusions andpotentialideas forfuturework.

2. Preliminaries&notation

(4)

2.1. Basicnetworknotation

LetG

= (

V

,

E

,

W

)

denoteanetworkwithweightededges,where V

= {

1

,

. . . ,

n

}

denotesthesetofn nodes,E

V

×

V

denotes thesetofedges and W

∈ R

n×n denotes thematrixwithweights ofedges. Wedenote anedge betweennodesi

and j by i j.In thiswork we considernetworksthat are undirected, i.e., we donot discern betweenedges i j and ji.We alsoassumethatnetworksdonotcontainself-loops,i.e.,

iVii

/

E.Wedenoteby NG

(

i

)

thesetofneighbors ofi inG,i.e.,

NG

(

i

)

= {

j

V

:

i j

E

}

.WedenotebydG

(

i

)

thedegree ofi inG,i.e.,dG

(

i

)

= |

NG

(

i

)

|

.

Weconsider networkswithweighted edges.We denoteby wi j

∈ R

theweightoftheconnection fromi to j,wewill

callthisvalueinfluence thati hason j.Wedonotassumethattherelationofinfluenceissymmetric,i.e.,itispossiblethat fori j

E wehavewi j

=

wji.Unlessstatedotherwise,wewillassumethat

i,jVwi j

0.Wealsoassumethatifi j

/

E then

wi j

=

0.Wedenoteby wi thesumofinfluenceonnodei,i.e., wi

=



jN(i)wji.Wewilltypicallyassumethat

iVwi

>

0.

Let

(

V

)

denotethesetofallorderedsequences ofelementsfromV without repetitions.Let

γ

idenotethei-thelement

(node)ofsequence

γ

∈ (

V

)

,and

|

γ

|

denotethenumberofelementsin

γ

∈ (

V

)

.Finally,wecall

γ

∈ (

V

)

afull sequence

if

|

γ

|

= |

V

|

andwedenotethesetoffullsequencesby



(

V

)

.

Tomakethenotation morereadable,wewilloftenomitthenetworkitselffromthenotation whenitisclearfromthe context,e.g.,bywritingN

(

i

)

insteadofNG

(

i

)

.Wesometimestreatsequencesassets,whentheorderisnotimportant.We

use

todenotetheconcatenationoperationoversequences.

2.2. Computationalproblems

InourreductionswewillusetwostandardversionsoftheSetCoverproblem,decisionandcombinatorialoptimization.

Definition1(SetCover [28]).AninstanceoftheSetCoverproblemisdefinedbyauniverseU

= {

u1

,

. . . ,

u|U|

}

,acollectionof

sets

S = {

S1

,

. . . ,

S|S|

}

suchthat

jSj

U ,andanintegerk

≤ |

S|

.Thegoalistodeterminewhetherthereexistk elements

of

S

theunionofwhichequalsU .

SetCoverisoneoftheclassic21 Karp’sNP-completeproblems.

Theorem1([28]).SetCoverproblemisNP-complete.

Fortheproofoftheapproximationhardnesswewillusetheminimizationversionoftheproblem.

Definition2(MinimumSetCover).AninstanceoftheMinimumSetCoverproblemisdefinedbyauniverseU

= {

u1

,

. . . ,

u|U|

}

andacollectionofsets S

= {

S1

,

. . . ,

S|S|

}

suchthat

jSj

U .Thegoalistofindsubset S

S suchthatthe unionof S

equalsU andthesizeofS∗ isminimal.

For

α

1,an

α

-approximationforagiveninstanceofaminimization problemisafeasiblesolutionwhoseobjectiveis withinafactorof

α

ofanyoptimalsolution.WewillusethefactthatMinimumSetCoverproblemishardtoapproximate.

Theorem2([27]).Foranyfixed



>

0 MinimumSetCovercannotbeapproximatedtowithin

(

1



)

ln n,unlessP=NP.

Wenowmovetodefiningthemodelofstrategicdiffusionandthemainoptimizationproblemofourstudy.

3. Problemdefinition

Inthissectionwedescribe theprocessofstrategicnetworkdiffusion,whichisthemainfocusofourstudy,aswellas thecomputationalproblemconcerningit.

Mostdiffusionmodelsare purelystochastic [10,9].In thiswork however,wefocus onthestrategicmodelofdiffusion toaccountforcasesinwhichtheinterconnectionsbetweentargetsaffecttheiractivationtime andthereforechoosingthe orderinwhichnodeswillbetargetedforactivationisstrategicallyplanned.

Inthismodel,theprocess ofdiffusionisdriven bya strategicagent. Atthebeginning oftheprocess onlyone chosen node ofthenetwork,the seednode iS,isactive. Then,theagent chooses asequence

γ

∈ (

V

)

that providestheorder in

whichthenodeswillbeactivated.Theprobabilityofsuccessfulactivationofanodei inoneattemptisgivenby:

p

(

i

)

= β



jN(i)Awji wi



α

where A isthesetofcurrentlyactivenodes(atthebeginningoftheprocessitconsistsonlyoftheseed),and

α

,

β

∈ [

0

,

1

]

areconstants.Unlessstatedotherwise,wewillassumethat

α

= β =

1.Noticethattheexpectedtimeofactivationofnode

(5)

i withnon-zeroactivationprobabilityis

τ

(

i

)

=

p1(i).By

τ

(

γ

)

wewilldenotetheexpectedtimeofactivationofallnodesin thesequence

γ

.

This model was proposed by Alshamsi et al. [1] for undirected networkswith unweighted edges. If we assume that

wi j

=

1

⇐⇒

i j

E thenourmodelisexactlyequivalenttothemodelproposedbyAlshamsi et al. [1].

Wenowdefinethemaincomputationalproblemofourstudy.

Definition3(OptimalPartialDiffusionSequence).Thisproblemisdefinedbyatuple

(

G

,

iS

,

z

)

,whereG

= (

V

,

E

,

W

)

isagiven

networkwithweightededges,iS

V istheseednodeandz

n inthenumberofnodestoactivate.Thegoalistoidentify

γ

∈ (

V

)

suchthat

γ

1∗

=

iS,

|

γ

|

=

z and

τ

(

γ

)

isminimal.

In other words, we intend to find the fastest way to activate z nodes in the network. When z

= |

V

|

in the above definition,theproblemissimplycalledOptimal(Full)DiffusionSequence.

Remark1.Inthecaseofintegerweightsofpolynomiallength,foranyinstanceoftheOptimalDiffusionSequenceproblem thereexistsapolynomiallyequivalentinstancewithbinaryweights.

Proof. Let

(

G

,

iS

,

|

V

|)

be agiveninstanceoftheOptimalDiffusion Sequence.Inordertoconstructanequivalent instance

with 0

/

1-weights we replaceevery edge i j with a set of wi j parallel 2-paths

{

i

,

ki,j,1

,

j

,

. . . ,



i

,

ki,j,wi j

,

j

}

,and we set

weightsofthenewedgesto wiki, j,r

=

wki, j,rj

=

1, wki, j,ri

=

wjki, j,r

=

0,forr

=

1

,

. . . ,

wi j.

Given a feasible solution

γ

to the original instance, we obtain a feasible solution

γ

 to the constructed instanceby activating all nodes ki,j,r immediatelyafter activatingnode i in the original sequence (notice that the expected time of

activationofeverysuchnodeki,j,ris1).Thus,ifthevalueofthesolution

γ

ist,thenthevalueofthe

γ

ist

+



i jE



wi j

+

wji



.

Givenafeasiblesolution

γ

totheconstructedinstanceofthebinaryversionoftheproblem,we canobtainasolution withalesserorequalvaluebyactivatingallnodeski,j,rinoneblockimmediatelyafteractivatingnodei (again,theexpected

timeofactivationofnodeki,j,risalways1,whileitcontributestothetimeofactivationofnode j).Forsuchsequence,we

canobtaincorrespondingsolution

γ

totheoriginalinstanceoftheproblembyremovingfromitallnodeski,j,r.Ifthevalue

ofthesolution

γ

 ist,thenthevalueofthe

γ

ist



i jE



wi j

+

wji



.

NotethatthisreductiondoesnotworkingeneralfortheOptimalPartialDiffusionSequenceifz

<

|

V

|

.

2

4. Computinganoptimalsolution

WenowdescribeourresultsonfindinganefficientwaytocomputeanexactsolutiontotheOptimalDiffusionSequence problem.

4.1. Hardnessoffindinganoptimalsolution

First,weshowthatthedecisionversionofOptimalPartialDiffusionSequenceproblemisNP-completeinageneralcase.

Theorem3.OptimalPartialDiffusionSequenceproblemisNP-complete,evenifallweightsarein

{

0

,

1

}

.

Proof. The decision version of the Optimal Partial Diffusion Sequence problem is the following: given a network G

=

(

V

,

E

,

W

)

, the seed node iS, the number of nodes to activate z, and a value t

∈ R

+, does there exist a sequence of

nodes

γ

∈ (

V

)

startingwithiS suchthat

|

γ

|

=

z and

τ

(

γ

)

t∗?

ThisproblemclearlyisinNP,asgivenasolution,i.e.,asequenceofnodes

γ

∈ (

V

)

,wecancomputeitsexpectedtime ofactivationinpolynomialtime.

ToprovetheNP-hardnessoftheproblemwewillshowareductionfromtheNP-completeSetCoverproblem.

Let

(

U

,

S,

k

)

beaninstanceoftheSetCover problem.Wedefinenetwork G asfollows(anexampleofsuchnetworkis presentedinFig.1):

Thesetofnodes: Forevery Si

S

,wecreatethreenodes,denotedby Si,qiandqi.Foreveryui

U ,wecreateasingle

nodeui.Additionally,wecreateasinglenodeiS.

Thesetofedges: Foreverynode Si,wecreateanedge SiiS.Forevery nodeqi,we createedges qiSi andqiqi.Finally,

foreverynodeuiandevery Sjsuchthatui

Sj wecreateanedgeuiSj.

Theweightmatrix: ForeveryedgeuiSj wesetweightstowuiSj

=

0 andwSjui

=

1.ForeveryedgeSiqi wesetweights

to wSiqi

=

0 and wqiSi

= |

U

||

S|

.Forallotherpairsofnodesconnectedwithanedgewesettheweightsto1.

Let z

=

k

+ |

U

|

+

1 (weintendthat thesolutionsequencewillactivatenodeiS,k nodesin

S

andallnodesinU ),and

(6)

Fig. 1. NetworkG constructed

for the proof of Theorem

3. Blue numbers next to edges express their weights. If there are no arrows next to edge i j then wi j=wji. Otherwise weight wi jis denoted next to the arrow pointing towards node j,

and weight

wjiis denoted next to the arrow pointing towards

node i.

(For interpretation of the colors in the figures, the reader is referred to the web version of this article.)

Wewillnowshowthatasolutiontothisinstanceofvalueatmostt∗ correspondstoasolutiontothegiveninstanceofthe SetCoverproblem.

Noticethattheexpectedtimeofactivationofeverynode Siisalways

|

U

||

S|

+

1,whilefortheexpectedtimeofactivation

ofanodeuiwehave

τ

(

ui

)

≤ |{

Sj

:

ui

Sj

}|

≤ |

S|

.Noticealsothatneitheranyofthenodesqi noranyofthenodesqican

beactivatedwheniS istheseednode,astheinfluenceof Si onqi iszero.

First, wewill show that ifthere exists asolution

S

∗ to the giveninstance ofthe SetCover problem, then therealso existsasolutiontotheconstructedinstanceoftheOptimalPartialDiffusionSequenceproblemofvalueatmostt∗.Wecan constructsuch asolutionby firstactivatingevery node Si

S

(k nodesactivatedinexpectedtime

|

U

||

S|

+

1 each)and

thenactivatingallnodesui(

|

U

|

nodesactivatedinexpectedtimenotexceeding

|

S|

each).Such

γ

∗ activatesk

+ |

U

|

nodes

inexpectedtimenotexceedingk

(

|

U

||

S|

+

1

)

+ |

U

||

S|

,thereforeitisasolutiontotheconstructedinstanceoftheOptimal PartialDiffusionSequenceproblemofvalueatmostt∗.

Second,tocompletetheproofoftheNP-hardness,wehavetoshowthatifthereexistsasolution

γ

∗ totheconstructed instance ofthe Optimal Partial Diffusion Sequence problemofvalue atmostt∗,then there also exists a solutionto the giveninstanceofthe SetCover problem.Such asolution is

S

=

γ

S

, i.e.,choosing sets Si corresponding to nodes Si

occurringinsequence

γ

∗.Noticethat therecannotbemorethank suchnodes,asactivatingk

+

1 nodes Si hasexpected

time

(

k

+

1

)(

|

U

||

S|

+

1

)

>

t∗. Sincethisisthecase, inorderto activatek

+ |

U

|

nodesother than iS,sequence

γ

∗ hasto

activateallnodesinU .However,toactivateanodeui,wefirsthavetoactivateatleastoneofitsneighbors,i.e., node Sj

such that ui

Sj.Therefore, forevery node ui there mustexist atleastone node Sj

γ

S such that ui

Sj.Hence,

S

=

γ

S

isavalidsolutiontothegiveninstanceoftheSetCoverproblem.

Finally,wecanusetheconstructioninRemark1toreplaceeveryedgeqiSibyasetofparallelpathswithedgeweightsin

{

0

,

1

}

.Notethatsuchareplacementdoesnotchangethevalueoftheobjectiveasthenodesqi,andhencetheintermediate

nodesaddedontheparallelpaths,areneveractivated.Thisconcludestheproof.

2

Therefore,thereexistsnopolynomialalgorithmfindingtheoptimalwaytoactivateagivennumberofnodesinthe pro-cessofstrategicdiffusion,unlessP=NP.However,wenowproposeanexponentialalgorithmbasedondynamicprogramming technique.

4.2. Dynamicprogrammingalgorithm

We present an algorithm for computing an optimal solution to the problem using the dynamic programming tech-nique [29]. The mainidea behind dynamic programming isbreaking the mainproblem intoa numberof smaller,easier tosolvesub-problemsinarecursive manner.However,unlike instandard recursionwhereoftenthesamecomputationis repeatedmultipletimes,indynamicprogrammingthesolutiontoeach sub-problemiscomputedonlyonceandstoredin memoryforfutureuse.Overtheyearsthedynamicprogrammingtechniquesweredevelopedinvariousdirections [30–33], howeverouralgorithmisbasedontheoriginalversionofthetechnique.

Noticethatinoursettingtheactivationprobabilityofanygivennodedependsonthesetofactivenodesinthenetwork, howeveritdoesnotdepend ontheorderinwhichthesenodeswere activated.Hence,thesolutionforeach setofactive nodeshavetobecomputedonlyonce.Moreover,solvingthesub-problemoffindinganoptimalwayofactivatingk nodesin thenetworkallowstoefficientlyfindthesolutionfork

+

1 nodes,asthetotalexpectedtimeofactivationcanbeexpressed in a recursive manneras a sumof time ofactivation of the initialk nodes andthe time ofactivation of the last node. The idea of usingsolutions for smaller sub-problemsto solve a larger problem isthe core concept behind the dynamic programming,hencewefinditasuitableoptimizationmethodforsolvingtheOptimalPartialDiffusionSequenceproblem.

Thealgorithmisbasedonthefollowingrecurrencerelation,where

τ

(

C

)

istheminimalexpectedtimeofactivationof asetofnodesC :

(7)

Algorithm 1: Dynamicprogrammingalgorithmforstrategicdiffusion.

Input: A

weighted network

(V,E,W), a seed node iSV ,

and the number of nodes to activate

z. Output: Sequence

of activation of

z nodes

starting with

iSwith minimal expected time of activation.

1 for CV do 2 τ∗[C]← ∞ 3 τ∗[{iS}]←0 4 γ∗[{iS}]← iS 5 for k=1,. . . ,z1 do 6 for CV: (|C|=k)∧ (τ∗[C]<∞)do 7 for iV: (i/C)∧ (N(i)C= ∅)do 8 τ← wi jN(i)Cwji 9 ifτ∗[C]+ τ<τ∗[C∪ {i}]then 10 τ∗[C∪ {i}]←τ∗[C]+ τ 11 γ∗[C∪ {i}]←γ∗[C]⊕ i

12 returnγ∗[arg minCV:|C|=∗[C]]

Fig. 2. Example

of using dynamic programming. Each large orange state represents a possible state of activation of the network, with black nodes

repre-senting active nodes, and white nodes reprerepre-senting inactive nodes. The value of τ∗ indicates the minimal expected time required to reach this state of

activation. Value on each arrow represents the cost of activation of a single node, required to move between states. Red arrows represent optimal sequence of activation, which corresponds to the least expensive path from the initial state to the state where entire network is activated.

τ

(

C

)

=

0 if C

= {

iS

}

min iC

τ

(

C

\ {

i

}) +

 wi jN(i)Cwji if

|

C

| >

1

otherwise

whereweassumethatsummationoveremptysetresultsinzeroanddivisionbyzeroresultsin

.

Outofsetsofsizeone,onlythesetconsistingoftheseednodehasfinitetimeofactivation.Forsetsthatarelargerthan onewedividetheproblemintocomputingtheoptimaltimeofactivationofallnodesbutthelastone,andthencomputing the time of activation of thelast node. Since inthe strategic diffusionprocess the nodes are activated sequentially,one of the nodes i fromthe set C hasto be activated last, andwe are able to compute its time of activation based onthe informationofwhichothernodesofthenetworkarealreadyactive.Noticethatattemptingtoactivateanodewithoutany activeneighbors(orwithsumofinfluencesfromtheseneighborsequaltozero)willresultinthevalueoftheformulaequal to

.

Pseudocode of the dynamic programming algorithm is presented as Algorithm 1, while Fig. 2 presents the intuition behindthealgorithmbyshowinghowitworksonaspecificgraphstructure.

Inentry

τ

[

C

]

wecomputetheminimalexpectedtimenecessarytoactivateallnodesinsetC ,whileinentry

γ

[

C

]

we keepasequenceofactivationallowingustoachievethisoptimalexpectedtime.Inthek-thexecutionoftheloopinline 5 we computevaluesofentriesin

τ

∗ and

γ

∗ forsets C suchthat

|

C

|

=

k

+

1.Wedoitby iterating(in loopinline 6)over all setsofnodesofsizek thatcanbeactivatedwhenwestarttheprocess fromtheseednode iS andtheniterate(inloop

inline 7) overallpossible nodesthat canbe targeted,i.e., nodeswithnon-zeroprobabilityofactivation,whenthe setof active nodesisC .Inlines 8-11weupdatethebestwayofactivatingnodesinsetC

∪ {

i

}

whenactivatingnodesinC first

andactivatingnodei afterwardshaslowerexpectedtimethancurrentlyknownfastestwaytoactivatenodesinC

∪ {

i

}

. Asfortheimplementationdetails,

τ

∗and

γ

∗canbeimplementedashashtables.Inthiscasecheckingwhether

τ

[

C

]

<

isequivalenttocheckingwhether

τ

∗ containstheentryforC andthereforelines 1-2canbeomitted.

Noticealsothatduringthek-thexecutionoftheloopinline 5weonlyneedentriesintables

τ

∗ and

γ

∗ forsetsC such

that

|

C

|

=

k.Hence,toreduce memoryrequirements,wecanonlykeepinmemoryentriesfromtables

τ

∗ and

γ

∗ fortwo sizes ofsets:theonesforsizek,computedinthepreviousexecutionoftheloop(orinitialized inlines 3-4incaseofthe firstexecution)andtheonesforsizek

+

1,beingcomputedinthecurrentexecutionoftheloop.

(8)

Fig. 3. Decomposition

of graph into biconnected components. Each biconnected component is marked with different color, with cut nodes marked white

and seed nodes marked black. Gray nodes mark stubs added to provide proper influence sum for original cut nodes.

Despite theseoptimization possibilities, the dynamic programmingalgorithm remains exponential, as it considers all subsetsofnodespossibletobeactivated.Ifthenumberofnodestobeactivatedisconstantthenthealgorithmis polyno-mial.

Theorem4.Algorithm1solvestheOptimalPartialDiffusionSequenceproblemintime

O(|

V

|

z+1

|

E

|)

.

InSection4.4,wegiveadynamicprogramthatworksmoreefficientlywhenthenetworkhasbothboundeddegreeand boundedtreewidth.Notethatitiscommontousedynamicprogrammingforsolvingoptimizationproblemsongraphswith boundedtreewidth,see,e.g.,[34].

Analternativeapproachtodevelopinganalgorithmoffindinganeffectivewayofactivatingthenetworkwouldbetouse thereinforcementlearningtechniques [35].Reinforcementlearningisbasedontheideaofmakingasequenceofdecisions by finding abalance betweenexploitation (performing thecurrently bestaction toobtain short-term profits) and explo-ration (investigatingother actions in thehope ofa long-termgain). Incaseofstrategic diffusion,the exploitation would be activatinganodewiththehighestactivationprobability,whiletheexploration wouldbe activatinganode withlower activationprobability inordertomakeits neighborseasiertoactivate.Nevertheless,an algorithmbasedonreinforcement learningwouldnotguaranteethatidentifiedactivationsequenceistheoptimalsolution,unlikethepresenteddynamic pro-grammingalgorithm,whichofferssuchcertainty.Hence,weleavethedevelopmentofanalgorithmbasedonreinforcement learningasapotentialfuturework.

Wenowdiscussotherwaysofoptimizationformorerestrictednetworkstructures.

4.3. Decompositionintobiconnectedcomponents

Wewillnowshowthatlookingfortheoptimalwayofactivatingallnodesofthenetworkcanbemadesimplerbyusing decompositionintobiconnectedcomponents.

Cut node ina network (also calledan articulation point)is a node the removalofwhich causes network tofall into twoormoreconnectedcomponents.Byseparatingthenetworkincutnodesweobtainadecompositionintobiconnected components.Biconnectedcomponentisa maximalsubnetwork(intermsofinclusion)suchthat removinganynode from itwillnotdisconnectit.AnexampleofadecompositionintobiconnectedcomponentsisshowninFig.3.Noticethatacopy ofacutnodeappearsineverybiconnectedcomponentitbelongsto.

The strategic diffusion process in every biconnected component can be considered separately. This is because every biconnected components has only one possiblestarting point of the diffusion (being it either the seed node in caseof biconnected componentcontaining itor thecut node closest to theseed node in caseofall other components). Thisis indicatedbyblackcolorofanodeinFig.3.OncethisstartingpointofagivencomponentC isactivated,theactivationstate inotherbiconnectedcomponentsofthenetworkdoesnotaffectactivationisC .

This is because activation probability of a node is only affected by the state of activation of its neighbors and the value of wi (toremind the reader, wi

=



jN(i)wji). Only cut nodes haveneighbors inother biconnected components,

henceforallnon-cutnodesthisobservationistrivial.Asforanycutnodei,noticethatitsneighborsinother biconnected componentscanonly beactivatedafteri is activated(asall thepathsbetweenthemandtheseed noderun throughthe cutnode).Thereforetheonlywayinwhichneighborsofacutnodei inotherbiconnectedcomponentsaffecttheactivation probability ofi is addingtothevalue of wi.Toaccountforthisfact,we createadditionalstubsconnectedto copiesofi

(9)

Toremindthereader,inthisworkweconsiderundirectednetworks(noticethatthedefinitionofbiconnected compo-nents isdifferent fordirected networks). However, we do not assume that the relation ofinfluence is symmetric,i.e., it is possiblethat fori j

E we have wi j

=

wji.Sincewe considerthe decompositionintodisconnectedcomponentsin the

context ofagivenseednode,thereisnevera disambiguationintermsofwhichedge weightshavetobeused incaseof the cut nodes.To compute the activation probability ofa cut node i, foreach neighbor j onlyweight wji is takeninto

consideration. Whatismore,onlyneighborsof i thatareclosertothe seednode thani canhaveapositive contribution intoitsactivationprobability,astheneighborsofi thatarefurtherawayfromtheseednodecanonlybeactivatedafteri is

activated.Atthesametime,i affectstheactivationtimeofitsneighbor j throughtheweight wi j,anditcanhaveapositive

contributionbothwhen j iscloserandwhenitsfurtherawaytothesourcenodethani.

4.4. Optimalsolutionfornetworkswithboundedtreewidth

We will nowshow a polynomial algorithm that findsan optimal waytoactivate anetwork withboth treewidth and degreeboundedbyconstants.

LetG

= (

V

,

E

,

W

)

beanarbitrarynetwork.Let

(

T

,

F

)

beanetworkwhereeverynodecontains asubsetofnodesfrom

V ,i.e.,

tTt

V (wewillcalleachsuchsubsetabag).Such

(

T

,

F

)

isatreedecompositionofG (see,e.g.,[36])ifandonly

ifthefollowingconditionsaremet:

Every nodefromV iscontainedinatleastonebag,i.e.,

tTt

=

V .

For everyedgee

E thereexistsabagthatcontainsbothendsofe,i.e.,

i jE

tT

{

i

,

j

}

T .

For everynode v

V subnetworkof

(

T

,

F

)

inducedbybagscontaining v isconnected.

The treewidth

θ

ofadecompositionisthesize ofits largestbagminus one,i.e.,

θ

=

maxtT

|

t

|

1.The treewidthofa

network is theminimumover treewidthsofall its treedecompositions. Aproblemis said tobe fixed-parametertractable

[37],withrespect to parameterk, ifanyinstance ofthe problemofsize N can be solved intime f

(

k

)

·

NO(1),forsome

computable function f

(

·)

.WeshowthattheOptimal StrategicDiffusionproblemisfixed-parametertractablewithrespect tothesumoftreewidthandmaximumdegree.

4.4.1. Fulldiffusion

Tobetterexplaintheidea,wefirstgiveanalgorithmforthefulldiffusioncase.

Theorem5.LetG

= (

V

,

E

,

W

)

beanetworkwithmaximaldegree

δ

.Thereexistsanalgorithmthat,givenatreedecompositionof treewidth

θ

,findsanoptimalwayofactivatingnodesinV intime

O(θδ(θδ)!

2n

)

.

In whatfollows,weassume that thetreedecompositionisrootedinsome nodetR,thesameforbottom-up and

top-downorder.Weusethefollowingnotation:

c

(

t

)

denotesthesequenceofchildrenoft;

γ

|

X denotesthesubsequenceof

γ

consistingonlyofthenodesinset X ;

γ

x denotesthesubsequenceof

γ

consistingofallelementsprecedingx;

γ

x denotesthesubsequenceof

γ

consistingofx aswellasallelementsfollowingx;

TtopdowndenotessequenceofnodesinT inatop-downorder.

Inwhatfollowswewillcalltwopermutations

γ

and

γ

ofnodesfromG compatible,denoted

γ

γ

,ifandonlyifthey activatethenodesthattheyhaveincommoninthesameorder,i.e.,

γ

|

γ

=

γ



|

γ .

Foreachnodeoft thetreedecompositionT wecomputearecordconsistingofthefollowingfields:

• ϒ[

t

]

storesallpermutationsofnodesinbagt andtheirneighborsthatcanbeapartofavalidsolutiontotheproblem;

τ

[

t

,

γ

,

i

]

storestheexpectedtimeofactivationofeverynodeini

t whenactivatedintheordergivenby

γ

;

τ

[

t

,

γ

]

storesthesmallestexpectedtime to activateallnodes inthesubnetwork of G inducedby thenodes inthe subtreeofT rootedatt,whenthenodesint andtheirneighborsareactivatedintheordergivenby

γ

.

Tocomputetherecordswetraversethetreedecompositioninabottom-uporder(giventheroottR)andforeverynode

t weperformthefollowingsteps:

1. Compute:

ϒ

[

t

] =

γ

∈ 

(

t

it N

(

i

))

: (

iS

/

t

γ

1

=

iS

)



γit: γi=iS

j<i

γ

j

N

(

γ

i

)





t∈c(t)

γ∈ϒ[t]

γ



γ

 

;

(10)

Algorithm 2: Algorithm reconstructingtheoptimalwayofactivatingall nodesofanetworkwithboundedtreewidth anddegree.

Input: Tree

decomposition

(T,F)of a weighted network (V,E,W), with computed values of τ∗.

Output: Sequence

of activation of nodes in

V starting

with

iSwith minimal expected time of activation.

1 γ∗←  2 for tTtopdowndo 3 γ←arg minγ∈ϒ[t]:γ∼γτ∗[t,γ] 4 γ←  5 forγiγ do 6 ifγithen 7 γ←γ⊕ γi 8 else 9 γ∗←γ∗∗ iγ γi 10 γ←  11 γ∗←γ∗⊕γ 12 returnγ

2. Forevery

γ

∈ ϒ[

t

]

andevery

γ

i

γ

|

t compute:

τ

[

t

,

γ

,

γ

i

] =

wγi



j<iwγjγi

;

3. Forevery

γ

∈ ϒ[

t

]

compute:

τ

[

t

,

γ

] =



γiγ|t

τ

[

t

,

γ

,

γ

i

] +



t∈c(t) min γ∈ϒ[t]:γ∼γ

τ

[

t

,

γ



] −



γi∈γ|t

τ

[

t

,

γ



,

γ

i

]

⎠ .

Aftertraversingthetreeinthisfashion,theminimaltimerequiredtoactivatedentirenetworkG startingwithiS canbe

identifiedasminγ∈ϒ[tR]

τ

[

tR

,

γ

]

.Inorderto reconstructthesequenceallowing toactivateentirenetworkinthe optimal

time,wecanrunAlgorithm2.

Letusnowcommentontheprocedureoffillingtherecords.

In step 1we gather all permutationsof nodesin bagt and their neighbors that can be apart ofa valid solutionto theproblem,accordingtothreeconditions.Thefirstcondition assertsthatifthepermutation containstheseednode,itis activatedastheveryfirstnode.Thesecond conditionassures thatevery nodeinbagt otherthanthesourcenode hasat leastoneactiveneighboratthemomentofactivation(noticethatforeverynodeint allofitsneighborsarepresentinthe permutation).Thethirdconditionprovidesthatthepermutation

γ

allowstoactivateallnodesinthesubtreeoft,i.e.,that foreverychildoft inthetreedecompositionthereexistsatleastonevalidpermutation

γ

thatactivatesnodesfrom

γ

in thesameorderas

γ

.

Instep 2wesimplycomputethetimeofactivationofeverynodeinbagt,accordingtothedefinitiongiveninSection3, andstoreitintable

τ

.Again,noticethatforevery nodeint allofitsneighborsarepresentinthepermutation,hencewe haveenoughinformationtocomputeitsexpectedtimeofactivation.

Finally,instep 3wecompute theoptimaltimeofactivationofallnodesinthesubtreeoft whileusingpermutation

γ

andstore itintable

τ

∗.The expressionconsistsofa sumoftime ofactivationofthe nodesinbagt anda sumoverall children oft,whereforeachofthemwe compute theminimumoverall compatiblepermutationsandsubtractthe time ofactivationofnodesint.Notice thatsince

(

T

,

F

)

isthetreedecomposition,theonlypossibleoverlapbetweennodesin thesubtreesofdifferentchildrenoft arenodesinbagt.Otherwisethegraphinducedbybagscontainingsuchoverlapping nodeswould not be connected,andhence one ofthe conditionsof beingthe treedecomposition wouldnot be met for

(

T

,

F

)

.

Afterhavingfilledtables

ϒ

,

τ

∗ and

τ

,wetraverse thetreedecompositionagain, thistime ina top-downorderusing Algorithm2. Weconstructan optimalsolution tothe problemonvariable

γ

∗.In line3we selectas

γ

thepermutation withminimaltimenecessaryto activateall nodesinthesubtree oft,that iscompatiblewiththesolutionconstructedso far.Inlines4-11wemerge

γ

intosequence

γ

∗ usingauxiliaryvariable

γ

.

As for the time complexity of the algorithm, the most costly operations (both of which are equally expensive) are validatingexistence ofacompatiblesequenceforeverychildof t in thethirdconditioninstep 1andcomputingthetime necessary to activatethe nodesin the subtreesof all children of t inthe second sum instep 3. Forassessing the time complexityofthealgorithmwewillfocusonthecostinstep 3.Sinceeverynodetisachildofatmostoneothernodein thetreedecomposition,foreveryt

T thecomputationofminimumisexecuted

O((θδ)!)

times(the numberofdifferent permutations

γ

∈ ϒ[

t

]

). Sincethere are

O(

n

)

nodesinthetreedecomposition,the computationofminimum isexecuted

O((θδ)!

n

)

times.Thecostofexecutingitonceis

O(θδ(θδ)!)

,sincewe havetocheck

O((θδ)!)

manypermutationsin

ϒ

[

t

]

and for each of them check whether

γ



|

γ

=

γ

|

γ in time

O(θδ)

. Hence, the total time complexity of the algorithm is

(11)

Algorithm 3: Algorithmcomputingthevalueof

τ

[

t

,

γ

,

k

]

.

Input: Tree

decomposition

(T,F)of a weighted network (V,E,W), with values of τ∗computed so far, node t with

sequence of children

c(t)= t1,. . . ,t|c(t)|.

Output: The

value of

τ∗[t,γ,k].

1 for ti∈ t1,. . . ,t|c(t)|do 2 for m=0,. . . ,k−γ|tdo 3 if i=1 then 4 τ[t,1,γ,m]← min γ∈ϒ[t1]: γ∼γ  τ∗[t1,m+ |γ|t|] −γi∈γ|[t1 ,γ i]  5 else 6 τ[t,i,γ,m]← min m∈{0,...,m}  τ[t,i−1,γ,m]+ min γ∈ϒ[ti]: γ∼γ  τ∗[ti,γ,mm+ |γ|t|] −γi∈γ|tτ[ti,γ,γi]   7 returnγi∈γ |tτ[t,γ,γi]+τ t,|c(t)|,γ,kγ|t 4.4.2. Partialdiffusion

Wenextpresentanalgorithmforpartialdiffusioninnetworkswithboundedtreewidthandboundeddegree.

Theorem6.LetG

= (

V

,

E

,

W

)

beanetworkwithmaximaldegree

δ

.Thereexistsanalgorithmthat,givenatreedecompositionof treewidth

θ

,findsanoptimalwayofactivatingz nodesinV intime

O(θδ(θδ)!

2z2n

)

.

We useessentially thesame notation asinprevious section.However, in whatfollowswe will calltwo permutations

γ

∈ (

X

)

and

γ



∈ (

X

)

of nodes fromG compatible, denoted

γ

γ

,if andonly if they activate thenodes that they haveincommoninthesameorder,i.e.,

γ

|

γ

=

γ



|

γ ,andagreeonthenon-activatednodes,i.e.,

γ

|

X\γ

=

γ



|

X\γ

= 

(this

differenceistheresultofconsideringalsopermutationsthatarenotfull,i.e.,permutations

γ

∈ (

X

)

suchthat

γ

<

|

X

|

). Therecordthatwecomputeforeachnodeoft thetreedecompositionT consistingnowofthefollowingfields:

• ϒ[

t

]

storesallpermutationsofnodesinbagt andtheirneighborsthatcanbeapartofavalidsolutiontotheproblem;

τ

[

t

,

γ

,

i

]

storestheexpectedtimeofactivationofeverynodeini

t whenactivatedintheordergivenby

γ

;

τ

[

t

,

γ

,

k

]

denotesthesmallestexpectedtimetoactivatek nodesinthesubnetworkofG inducedbythenodesinthe subtreeofT rootedatt,whenthenodesint andtheirneighborsareactivatedintheordergivenby

γ

.

Hence, the only difference in comparison to the full diffusion version is that table

τ

∗ is additionally indexed with the numberofnodestoactivate.

Tocomputetherecordswetraversethetreedecompositioninabottom-uporder(giventheroottR)andforeverynode

t weperformthefollowingsteps:

1. Compute:

ϒ

[

t

] =

γ

∈ (

t

it N

(

i

))

: (

iS

/

t

γ

1

=

iS

)



γit: γi=iS

j<i

γ

j

N

(

γ

i

)





t∈c(t)

γ∈ϒ[t]

γ



γ

 

;

2. Forevery

γ

∈ ϒ[

t

]

andevery

γ

i

γ

|

t compute:

τ

[

t

,

γ

,

γ

i

] =

wγi



j<iwγjγi

;

3. Forevery

γ

∈ ϒ[

t

]

andfork

= |

γ

|

t

|,

. . . ,

z computethevalueof

τ

[

t

,

γ

,

k

]

usingAlgorithm3.

Aftertraversing the treeinthisfashion, theminimal time requiredto activatez nodes innetwork G starting withiS

canbeidentifiedasminγ∈ϒ[tR]

τ

[

tR

,

γ

,

z

]

.Inordertoreconstructthesequenceallowingtoactivatez nodesintheoptimal

time,wecanrunAlgorithm4.

Letusnowcommentontheprocedureoffillingtherecords.

Steps 1and 2 are almostidentical tothose ofthe algorithmforfull diffusion,withnotable difference beingthat this timeweconsidersequencesthatarenotnecessarilyfull.

Instep 3wecallAlgorithm3tocomputethevalueof

τ

[

t

,

γ

,

k

]

,i.e.,theoptimaltimeofactivatingk nodesinthesubtree oft whileusingpermutation

γ

.Tothisend,wevisitallthechildrenoft,intheordert1

,

. . . ,

t|c(t)|.Form

∈ {

0

,

. . . ,

k

− |

γ

|

t

|}

we compute

τ



[

t

,

i

,

γ

,

m

]

,theminimumexpectedtimerequiredtoactivatem nodesotherthantheonesactivatedint,inthe subnetwork inducedbyunionofthesubtreesrootedatt1

,

. . . ,

ti.Wedothisusingthevaluesof

τ

computedforchildren

oft precedingti,i.e., fort1

,

. . . ,

ti−1,aswell asthe valuesof

τ

∗ computedforti.Intheselection processm denotesthe

(12)

Algorithm 4: ReconstructPartial

(

t

,

γ

,

k

)

:algorithmreconstructingtheoptimalwayofactivatingk nodesinanetwork withboundedtreewidthanddegree.

Input: Tree

decomposition

(T,F)of a weighted network (V,E,W), with computed values of τ∗and τ, a bag t,

a sequence

γ∈ (V)and an

integer kn.

Output: Sequence

of activation of

z nodes

in

V starting

with

iSwith minimal expected time of activation.

1 if k>0 then 2 γ←arg minγ∈ϒ[t]:γ∼γτ∗[t,γ,k] 3 γ←  4 forγiγ do 5 ifγithen 6 γ←γ⊕ γi 7 else 8 γ∗←γ∗∗ iγ γi 9 γ←  10 γ∗←γ∗⊕γ 11 k←k− |γ|t| 12 for ti∈ t|c(t)|,. . . ,t1do 13 m←arg minm∈{0,...,k}  τ[t,i−1,γ,m]+minγ∈ϒ[ti]: γ∼γ  τ∗[ti,γ,k−m+ |γ|t|] −γi∈γ|[ti,γ ,γ i]   14 γ∗←ReconstructPartial(ti,γ,k−m+ |γ|t|) 15 k←m 16 returnγ

inthesubtreerootedinti.Thereturnedvalueoftheminimaltimeofactivationofk nodesinthesubtreeoft whileusing

permutation

γ

iscomputedasthe sumofthetime ofactivation ofnodesfromt and theoptimaltime ofactivatingthe remainingk



γ

|

t



nodesinall

|

c

(

k

)

|

childrenoft.

Thetime complexityofthealgorithmis similartothefulldiffusioncase, exceptthatwe havenow thetwoadditional loops,oneoverk instep 3andtheotheroverm inline6ofAlgorithm3,whichcontributeanadditionalfactorof

O(

z2

)

to therunningtime.

We are unable to compute an optimal solutionin polynomial time in generalcase. We may howeverhope to find a polynomialapproximationalgorithm.Wenowmovetotheanalysisofpossiblewaysofapproximatingtheoptimalsolution. We now move to assessing the possibility of approximating the optimal solution to the Optimal Diffusion Sequence Problem.

4.5. Lowerboundsonheuristicalgorithms

Alshamsi et al. [1] suggesttwoheuristicstrategiesforactivatinganetworkinaprocessofstrategicdiffusion:thegreedy strategy(alwaystargetnodewiththehighestprobabilityofactivation)andthemajoritystrategy(alwaystargetnodewith thehighestnumberofactiveneighbors).

Wenowshowthatneitherofthesestrategiesapproximatesanoptimalsolutiontowithinaconstantratio.

Theorem 7.There exists no constant r

>

1 such that choosing the sequence of activation using the greedy strategy is an r-approximationalgorithm.Inparticular,theapproximationratioofthegreedystrategyis

(

logn

)

.

Proof. Wewillnowshowaseriesofnetworkswheretheapproximationratioofthegreedyalgorithmgoestoinfinity. Foragivenk

∈ N

weconstructanetwork G

(

k

)

,wherethesetofnodesisV

= {

iS

,

a1

,

. . . ,

ak2

,

b1

,

. . . ,

bk−1

}

andwhere thesetofedgesis:

E

=

k2

i=1

{

iSai

} ∪

k2

i=1 k

−1 j=1

{

aibj

}.

Weassumethattheweightofeveryedgeis1.ThestructureofnetworkG

(

k

)

ispresentedinFig.4.

Letx bethenumberofcurrentlyactivatedai nodesandlet y bethenumberofcurrentlyactivatedbi nodes.We have

that

τ

(

ai

)

=

y+k1 andwehavethat

τ

(

bi

)

=

k

2

x.

Wewillnowanalyzetwodifferentwaysofactivatingallnodesinnetwork G

(

k

)

.First,letusconsiderthegreedy algo-rithm.Lettheindicesofnodesai,aswell astheindicesofnodesbj,beordered accordingtotheorderofactivation,i.e.,

nodea1isthefirstactivatednodeai,whilenodeb1 isthefirstactivatednodebi.Noticethatatleastik nodesa hastobe

activatedbeforetheactivationofnodebi.Wewillprovethisclaimbycontradiction.Assumethatbicanhave j

<

ik active

neighborsatthemomentofactivation.Takenodeaj+1 (thefirstnodea activatedafterbi).Atthemomentofactivationof

bi itsprobability ofactivation is kj2,whilethe probabilityofactivation ofaj+1 is

i

(13)

Fig. 4. Network G(k)constructed for the proofs of Theorems7and8.

andbi wasactivatedbeforeaj+1,we havetohave kj2

i

k.However,thisisnottruesince j

<

ik.Wehaveacontradiction,

thereforeatleastik nodesa hastobeactivatedbeforetheactivationofnodebi.

Becauseofthisatleastk nodesai hastobeactivatedwithonlyoneneighboractive,thenatleastk nodesaihastobe

activatedwithonlytwoneighborsactive,andsoon.Focusingonlyonthetimeofactivationofnodesai wehavethatthe

totalexpectedtimeofactivationforthegreedyalgorithmis:

τ

G

(

G

(

k

))

k2



i=1

τ

(

ai

)

k



j=1 kk j

=

k 2H k

where Hkisthek-thharmonicnumber.

Now, let usconsider an algorithm A, wherewe first activatek nodes ai,then all nodes bi andfinally theremaining

nodesai.Timeofactivationoftheentirenetworkisthen:

τ

A

(

G

(

k

))

=

k2

+ (

k

1

)

k

+ (

k2

k

)

=

3k2

2k

sinceeachofthefirstk nodesai isactivatedinexpectedtimek (asitonlyhasoneactiveneighbor,namelyiS),eachnode

bi isactivatedinexpectedtimek (asithasdegreek2andk activeneighborsatthemomentofactivation)andeachofthe

remainingk2

k nodesa

iisactivatedinexpectedtime1.

ConsidersequenceofnetworksG

(

k

)

.Forthissequencewehave

lim k→∞

τ

G

(

G

(

k

))

τ

(

G

(

k

))

klim→∞

τ

G

(

G

(

k

))

τ

A

(

G

(

k

))

lim k→∞ Hk 3

= ∞.

Therefore,solutionprovidedbythegreedyalgorithmcanbearbitrarilyworsethantheoptimalsolution.

2

Wealsoshowanalogicalresultforthemajoritystrategy.

Theorem 8.There exists no constant r

>

1 such that choosing the sequence of activation using the majority strategy is an r-approximationalgorithm.Inparticular,theapproximationratioofthemajoritystrategyis

(

logn

)

.

Proof. WewillnowshowthatfortheseriesofnetworksconstructedintheproofofTheorem7theapproximationratioof themajorityalgorithmalsogoestoinfinity.

Letx be thenumberofcurrentlyactivatedai nodesandlet y bethenumberofcurrentlyactivatedbi nodes.Wehave

that

τ

(

ai

)

=

y+k1 andwehavethat

τ

(

bi

)

=

k

2

x.

LetusconsidertheexpectedactivationtimeofanentirenetworkG

(

k

)

usingmajorityalgorithm.Lettheindicesofnodes

ai,aswell astheindices ofnodes bj, be ordered accordingto the orderof activation,i.e., node a1 isthe first activated node ai,whilenodeb1 isthefirstactivatednodebi.Noticethatnodebiatthemomentofactivationhasatmosti

+

1 active

neighbors.Wewillprovethisclaimbycontradiction.Assumethatbi canhave j

>

i

+

1 activeneighborsatthemomentof

activation. Consider numberofactiveneighborsatthe momentofactivationofnode aj (the last nodea activated before

bi).Sincewe areusingthemajorityalgorithm,ithastohaveatleast j

1 active neighbors(otherwise nodebi,havingat

themoment j

1 activeneighbors,wouldhavebeenchosenbeforenodeaj).However,nodeajcanhaveatmosti

<

j

1

activeneighbors,namelynodesb1

,

. . . ,

bi−1 andnodeiS (asnodebi isnotactiveyet).Wehaveacontradiction,thereforebi

atthemomentofactivationhasatmosti

+

1 activeneighbors.

Focusing onlyonthetimeofactivationofnodesbi wehavethat thetotalexpectedtime ofactivationforthemajority

(14)

Fig. 5. NetworkG(X,r)constructed for the proof of Theorem9. Blue numbers next to edges express their weights. If there are no arrows next to edge

i j thenwi j=wji. Otherwise weight wi j is denoted next to the arrow pointing towards node j,

and weight

wji is denoted next to the arrow pointing

towards node i.

τ

C

(

G

(

k

))

k−1



i=1

τ

(

bi

)

k−1



i=1 k2 i

+

1

=

k 2

(

H k

1

)

whereHkisthek-thharmonicnumber.

Let A bethe alternativealgorithmdescribed inthe proofofTheorem 7.Consider sequenceofnetworks G

(

k

)

.Forthis sequencewehave lim k→∞

τ

C

(

G

(

k

))

τ

(

G

(

k

))

klim→∞

τ

C

(

G

(

k

))

τ

A

(

G

(

k

))

lim k→∞ Hk 3

= ∞

Therefore,solutionprovidedbythemajorityalgorithmcanbearbitrarilyworsethantheoptimalsolution.

2

4.6. Inapproximability

Finally,we show that infact the problemcannot be approximatedwithin a ratioof

(

1



)

ln n for any



>

0,unless

P

=

N P .

Theorem9.TheOptimalPartialDiffusionSequenceproblemcannotbeapproximatedwithinaratioof

(

1



)

ln n forany



>

0,

unlessP

=

N P .

Proof. Inordertoprovethetheorem,wewillusetheresultbyDinurandSteurer [27] thattheMinimumSetCoverproblem cannotbeapproximatedwithinaratioof

(

1



)

ln n forany



>

0,unlessP

=

N P .

LetX

= (

U

,

S

)

beaninstanceoftheMinimumSetCoverproblem.Toremindthereader,U istheuniverse

{

u1

,

. . . ,

u|U|

}

, whileS isacollection

{

S1

,

. . . ,

S|S|

}

ofsubsetsofU .Thegoalhereistofindsubset S

S suchthattheunionofS∗equals

U andthesizeofS∗ isminimal.

First, we will show a function f

(

X

)

that based on an instance of the Minimum Set Cover problem X constructs an instanceoftheOptimalPartialDiffusionSequenceproblem.

LetnetworkG

(

X

)

bedefinedasfollows(anexampleofsuchnetworkispresentedinFig.5):

Thesetofnodes: Forevery Si

S,wecreatethreenodes,denotedby Si,qi andqi.Foreveryui

U ,wecreate

|

S

|

+

1

nodes,denotedbyui,1

,

. . . ,

ui,|S|+1.Additionally,wecreateasinglenodeiS.

Thesetofedges: Forevery node Si,wecreateanedge SiiS.Foreverynode qi,wecreateedgesqiSi andqiqi.Finally,

foreverynodeui,i andevery Sjsuchthatui

Sj wecreateanedgeui,iSj.

Theweightmatrix: Foreveryedge SiqiwesetitsweightstowSiqi

=

0 andwqiSi

=

α

1,where

α

=

z

|

S

|

λ+1 forsome

λ

>

0.Forevery edgeui,iSj weset itsweightsto wui,iSj

=

0 and wSjui,i

=

1.Forallother pairsofnodesconnected

withanedgewesettheirweightsto1.

Tocomplete the constructedinstance ofthe Optimal PartialDiffusion Sequenceproblem, we set theseed node to be

iS andwe setthe numberof nodesto be activatedto z

= |

U

|(|

S

|

+

1

)

+

1.Hence, the formulaoffunction f is f

(

X

)

=

Figure

Fig. 1. Network G constructed for the proof of Theorem 3. Blue numbers next to edges express their weights
Fig. 2. Example of using dynamic programming. Each large orange state represents a possible state of activation of the network, with black nodes repre- repre-senting active nodes, and white nodes representing inactive nodes
Fig. 3. Decomposition of graph into biconnected components. Each biconnected component is marked with different color, with cut nodes marked white and seed nodes marked black
Fig. 4. Network G ( k ) constructed for the proofs of Theorems 7 and 8.
+2

Références

Documents relatifs

The article is structured as follows: the second section highlights some aspects on bayesian networks and particularly on bayesian network classifiers; the third

Building on this construction, we address the issue of pasting weak solutions to (1), or, more, generally, the issue of pasting stopping strategies for the optimal stopping problem

Naderi, “Reducing the data transmission in wireless sensor networks using the principal component analysis,” in Intelligent Sensors, Sensor Networks and Information Processing

Under small interactions, a more central agent always plays a higher action and a shock on one agent ends up affecting every other agent in connected networks.. In contrast under

Participatory epidemiology and formal disease surveillance network did not detect the same number of outbreaks.. This is likely to be due to sequential implementation of the

Pour commencer, nous avons récolté après reconstruction des spectres, l’intégral des métabolites contenus dans les voxels d’une coupe sélectionnée sur deux

In the previous work, the sensitivity analysis was performed on a parallel asynchronous cellular genetic algorithm [2] for scheduling of independent tasks in a computational grid..

It aims at analysing MERCOSUL road network measures of integration, synergy and centrality in order to display the correlations between the borderline twin cities