Computational aspects of optimal strategic network diffusion

(1)

HAL Id: hal-03058609

https://hal.archives-ouvertes.fr/hal-03058609

Submitted on 4 Mar 2021

HAL is a multi-disciplinary open access

archive for the deposit and dissemination of

sci-entific research documents, whether they are

pub-lished or not. The documents may come from

teaching and research institutions in France or

abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est

destinée au dépôt et à la diffusion de documents

scientifiques de niveau recherche, publiés ou non,

émanant des établissements d’enseignement et de

recherche français ou étrangers, des laboratoires

publics ou privés.

Computational aspects of optimal strategic network

diffusion

Marcin Waniek, Khaled Elbassioni, Flávio Pinheiro, César A. Hidalgo,

Aamena Alshamsi

To cite this version:

Marcin Waniek, Khaled Elbassioni, Flávio Pinheiro, César A. Hidalgo, Aamena Alshamsi.

Computa-tional aspects of optimal strategic network diffusion. Theoretical Computer Science, Elsevier, 2020,

814, pp.153 - 168. �10.1016/j.tcs.2020.01.027�. �hal-03058609�

(2)

Contents lists available atScienceDirect

Theoretical

Computer

Science

www.elsevier.com/locate/tcs

Computational

aspects

of

optimal

strategic

network

diffusion

Marcin Waniek

a

,

b

,

∗

,

Khaled Elbassioni

b

,

Flávio L. Pinheiro

c

,

d

,

César A. Hidalgo

d

,

Aamena Alshamsi

b

a_Computer_Science,_New_York_University_Abu_Dhabi,_Abu_Dhabi,_United_Arab_Emirates b_Khalifa_University_of_Science_and_Technology,_Abu_Dhabi,_United_Arab_Emirates

c_Nova_Information_Management_School_(NOVA_IMS),_Universidade_Nova_de_Lisboa,_Lisboa,_Portugal d_MIT_Media_Lab,_{Massachusetts}_Institute_of_Technology,_Cambridge,_USA

a

r

t

i

c

l

e

i

n

f

o

a

b

s

t

r

a

c

t

Articlehistory:

Received 5 July 2019

Received in revised form 19 December 2019 Accepted 26 January 2020

Available online 30 January 2020 Communicated by L.M. Kirousis Keywords: Strategic diffusion Network contagion Complex networks Inﬂuence maximization

Diffusiononcomplexnetworksisoftenmodeled asastochasticprocess.Yet,recentwork on strategic diffusionemphasizes the decision power of agents[1] andtreats diffusion as a strategic problem. Herewe studythe computational aspects of strategic diffusion,

i.e.,finding theoptimal sequenceofnodes toactivate anetwork intheminimum time. We provethat finding anoptimal solution tothisproblem isNP-complete inageneral case.To overcomethiscomputationaldifficulty, wepresentan algorithmtocomputean optimal solution based on a dynamic programming technique. We also show that the problemisfixedparameter-tractablewhenparametrizedbytheproductofthetreewidth andmaximumdegree.Weanalyzethepossibilityofdevelopinganefficientapproximation algorithmandshowthattwoheuristicalgorithmsproposedsofarcannothavebetterthan alogarithmicapproximationguarantee.Finally,weprovethattheproblemdoesnotadmit betterthanalogarithmicapproximation,unlessP=NP.

1. Introduction

Diffusion in social networks has beena widely studied applicationin problems such as epidemic outbreaksand the adoption of behaviors andinnovations [2–7]. The approach to model diffusion can be roughly divided between simple andcomplexcontagion processes.Undersimplecontagion transmissionrequirescontactwithone individualataconstant ratesuch astransmission ofinfectious diseases [8]. An exampleof simplecontagion process isthe independentcascade model [9].Incontrast,complexcontagion requiresreinforcementfrommultiplesources.Anexampleofcomplexcontagion processisthelinearthresholdmodel [9].Complexcontagionhasbeenwidelystudiedinthecontextofcascadephenomena, withparticular applicationsto viralmarketing campaigns [10–12].In that context,the problemunderanalysis isthat of ﬁndingthe initialsetofseeds thatwouldmaximize diffusion.Differentapproachestosolving theseedselection problem includeusingcentralitymeasures [13],networkpercolation [14],andpropagationtraces [15],amongmanyothers [16–18]. Someworksintheliteraturetakeanadaptiveapproachtothetopic,basingtheselectionofseedsonadditionalknowledge gatheredduringtheprocess,beiteitherthecurrentstateofadynamicnetworktopology [19],orexpandedpoolofpotential seeds [20,21].Analternativewayofincreasingthediffusioncoverageisspreadingavailableseedsovertime [22,23].Despite thestrategicaspectsofdifferentmethodsofseedselection,thediffusionprocessitselfispurelystochastic.

*

Corresponding author at: Computer Science, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates.

E-mailaddress:mjwaniek@nyu.edu(M. Waniek). https://doi.org/10.1016/j.tcs.2020.01.027

(3)

However, thereare casesinwhichdiffusiontakesplaceaccordingtothesamerules ofcomplexcontagion,butthatare not purely stochastic (i.e., they have a strategic component) andin which thenetwork itself is not, necessarily, a social network. Take for instancehow regions develop neweconomic activities [24] or start production innew research ﬁelds [25]. Inthese casesagents are not embedded in the network, insteadthey are takingactions over a networkedsystem, which captures the relatedness between economic or academic activities. More importantly, not only the choice of the initial state is strategic, but also the whole process of diffusion is strategic. Stochasticity in this context is not at the pairwise interactionsbetweenagents,butinthesuccessrateofagentsinenteringnewactivities.Incaseoftheeconomic development, theprobability ofsuccess while entering a newactivity (represented by a node in the network ofrelated economic activities)increaseswiththenumberofotheralreadydevelopedactivities thatareconnectedtoit.Thus,agents canfailorsucceeddependingontheparticularcontextofeachstrategicaction.

Alshamsi et al. [1] proposedtostudythediversificationofeconomicactivitiesthroughthelensesofastrategicdiffusion model.Inthismodeltheentireprocessofdiffusionisguidedbyastrategicagent.Theagent’sactionsconcerntheselection of a node ina network tobe targeted ateach time step.Hence, the node canbecome activated inthe next stepof the action sequenceoftheagent, ornot.Following existingempiricalfindings [24,26],the modelassumesthatthe activation probability (i.e.,the probability ofsuccess ofthe agent’saction) iscaptured by acomplex contagion process. Thegoal is toactivatetheentirenetwork,startingfromasingleactivenode,intheminimumtime possible.WhileAlshamsi et al. [1] showedforawiderangeofnetworktopologiesthatactivatinganetworkinanoptimalmannerrequiresabalancebetween exploitationandexplorationstrategies,manyquestionsregardingthecomputationalfeasibilityoffindingoptimalstrategies remainedopen.Here,weexploreseveraltheoreticalcomputationalconsiderationsofstrategicdiffusionprocesses.

1.1. Resultsandmethods

Westartbyexploringthecomputationalfeasibilityofﬁndinganoptimalstrategythatminimizesthetotaldiffusiontime,

i.e., answering the question ofwhatis thebest sequenceof activatinga givennumberofnodesin thenetwork (starting fromaseed node)inthe shortestexpectedtime,whiletakingintoconsiderationthat thetimenecessarytoactivateeach nodeisproportionaltothenumberofactiveneighbors.

WeshowthatthedecisionversionoftheproblemisNP-complete(Theorem3).Weproveitusingareductionfromthe Set Cover problem.Thisobservation indicatesthat developinga polynomialalgorithm forﬁndingan optimalsequence of strategicdiffusionisnotrealistic,unlessP=NP.

Giventhisdifficulty,weproposean algorithmforcomputingan optimalwayofactivatingagivennumberofnodesin theshortesttimepossible(Algorithm1).Whilethealgorithmtakesexponentialtimetofindthesolution,itstilloutperforms simple exhaustivesearch andcan be usedto compute an optimalsolutionfor networksofmoderatesize. Thealgorithm utilizesthedynamicprogrammingtechniquetofindthefastestwaytoreacheverypossiblestateofnetworkactivation.

Wealsopresenttwo algorithmsfindinganoptimalsolutionforamorerestrictiveclassofnetworks,i.e.,networkswith bounded treewidth andboundedmaximal degree. Theytraverse a tree decompositionofa network andfind an optimal wayofactivatingeitheranentirenetworkora givennumberofnodes,respectively.Asa consequence,we showthat the problemisfixedparameter-tractablewhenparametrizedbythesumofthetreewidthandmaximumdegree.

Asfindinganoptimalsolutionprovestobecomputationallydemanding,weturnourattentiontoassessingthepossibility ofobtaininganefficientapproximationalgorithm.Wefirstinvestigatewhethertwoeffectiveheuristicalgorithmsproposed by Alshamsi et al. [1] have constant approximationratios. Oneofthem isthe greedyalgorithm,i.e., always targetingthe nodewiththehighestprobabilityofactivation.Theotheristhemajorityalgorithm,i.e.,targetingthenodewiththehighest number ofactive neighbors.Weshow that the approximationratiofor bothofthese algorithmsis

(

log n

)

(Theorems7

and 8). The proofs are based on constructing a sequence of networks where the ratio between total expected time of activationofthesolutionobtainedusingtheheuristicalgorithmandtheoptimalexpectedtimeofactivationgoestoinﬁnity. Finally, we show that, unless P=NP, there is no way to approximate the problem within a ratio better than ln n, in particular it is impossibleto construct an r-approximationalgorithm for a constant r (Theorem9). We prove this claim by showing a reduction from the Minimum Set Cover problem and using the fact that Minimum Set Cover cannot be approximatedwithinaratioof

(

1

−

)

ln n forany

>

0,unlessP=NP [27].

Organizationofthemanuscript. Theremainderofthearticleisorganizedasfollows.Section2describesthenotationsand computational problemsused in our reductions. Section 3 presentsa formal deﬁnitionof the strategic diffusionprocess and the main problemconsidered in our study.In Section 4 we present ourhardness resultforthe decision version of theproblemandwedescribeadynamic programmingalgorithm toﬁndan optimalwayofstrategicdiffusion.Section4.4

introduces algorithms computingoptimalsolutionfornetworkswithboundedtreewidthandmaximaldegree.Section 4.5

describesourresultsconcerning approximationoftheoptimalsolution.Section 5presentsconclusions andpotentialideas forfuturework.

2. Preliminaries&notation

(4)

2.1. Basicnetworknotation

LetG

= (

V

,

E

,

W

)

denoteanetworkwithweightededges,where V

= {

1

,

. . . ,

n

}

denotesthesetofn nodes,E

⊆

V

×

V

denotes thesetofedges and W

∈ R

n×n _denotes _the_matrix_with_weights _of_edges. _We_denote _an_edge _between_nodes_i

and j by i j.In thiswork we considernetworksthat are undirected, i.e., we donot discern betweenedges i j and ji.We alsoassumethatnetworksdonotcontainself-loops,i.e.,

∀

i∈Vii

∈

/

E.Wedenoteby NG

(

i

)

thesetofneighbors ofi inG,i.e.,

NG

(

i

)

= {

j

∈

V

:

i j

∈

E

}

.WedenotebydG

(

i

)

thedegree ofi inG,i.e.,dG

(

i

)

= |

NG

(

i

)

|

.

Weconsider networkswithweighted edges.We denoteby wi j

∈ R

theweightoftheconnection fromi to j,wewill

callthisvalueinﬂuence thati hason j.Wedonotassumethattherelationofinﬂuenceissymmetric,i.e.,itispossiblethat fori j

∈

E wehavewi j

=

wji.Unlessstatedotherwise,wewillassumethat

∀

i,j∈Vwi j

≥

0.Wealsoassumethatifi j

∈

/

E then

wi j

=

0.Wedenoteby wi thesumofinﬂuenceonnodei,i.e., wi

=

j∈N(i)wji.Wewilltypicallyassumethat

∀

i∈Vwi

>

0.

Let

(

V

)

denotethesetofallorderedsequences ofelementsfromV without repetitions.Let

γ

idenotethei-thelement

(node)ofsequence

γ

∈ (

V

)

,and

|

γ

|

denotethenumberofelementsin

γ

∈ (

V

)

.Finally,wecall

γ

∈ (

V

)

afull sequence

if

|

γ

|

= |

V

|

andwedenotethesetoffullsequencesby

∗

(

V

)

.

Tomakethenotation morereadable,wewilloftenomitthenetworkitselffromthenotation whenitisclearfromthe context,e.g.,bywritingN

(

i

)

insteadofNG

(

i

)

.Wesometimestreatsequencesassets,whentheorderisnotimportant.We

use

⊕

todenotetheconcatenationoperationoversequences.

2.2. Computationalproblems

InourreductionswewillusetwostandardversionsoftheSetCoverproblem,decisionandcombinatorialoptimization.

Deﬁnition1(SetCover [28]).AninstanceoftheSetCoverproblemisdeﬁnedbyauniverseU

= {

u1

,

. . . ,

u|U|

}

,acollectionof

sets

S = {

S1

,

. . . ,

S|S|

}

suchthat

∀

jSj

⊂

U ,andanintegerk

≤ |

S|

.Thegoalistodeterminewhetherthereexistk elements

of

S

theunionofwhichequalsU .

SetCoverisoneoftheclassic21 Karp’sNP-completeproblems.

Theorem1([28]).SetCoverproblemisNP-complete.

Fortheproofoftheapproximationhardnesswewillusetheminimizationversionoftheproblem.

Deﬁnition2(MinimumSetCover).AninstanceoftheMinimumSetCoverproblemisdeﬁnedbyauniverseU

= {

u1

,

. . . ,

u|U|

}

andacollectionofsets S

= {

S1

,

. . . ,

S|S|

}

suchthat

∀

jSj

⊂

U .Thegoalistoﬁndsubset S∗

⊆

S suchthatthe unionof S∗

equalsU andthesizeofS∗ isminimal.

For

α

≥

1,an

α

-approximationforagiveninstanceofaminimization problemisafeasiblesolutionwhoseobjectiveis withinafactorof

α

ofanyoptimalsolution.WewillusethefactthatMinimumSetCoverproblemishardtoapproximate.

Theorem2([27]).Foranyﬁxed

>

0 MinimumSetCovercannotbeapproximatedtowithin

(

1

−

)

ln n,unlessP=NP.

Wenowmovetodeﬁningthemodelofstrategicdiffusionandthemainoptimizationproblemofourstudy.

3. Problemdeﬁnition

Inthissectionwedescribe theprocessofstrategicnetworkdiffusion,whichisthemainfocusofourstudy,aswellas thecomputationalproblemconcerningit.

Mostdiffusionmodelsare purelystochastic [10,9].In thiswork however,wefocus onthestrategicmodelofdiffusion toaccountforcasesinwhichtheinterconnectionsbetweentargetsaffecttheiractivationtime andthereforechoosingthe orderinwhichnodeswillbetargetedforactivationisstrategicallyplanned.

Inthismodel,theprocess ofdiffusionisdriven bya strategicagent. Atthebeginning oftheprocess onlyone chosen node ofthenetwork,the seednode iS,isactive. Then,theagent chooses asequence

γ

∈ (

V

)

that providestheorder in

whichthenodeswillbeactivated.Theprobabilityofsuccessfulactivationofanodei inoneattemptisgivenby:

p

(

i

)

= β

j∈N(i)∩Awji wi

α

where A isthesetofcurrentlyactivenodes(atthebeginningoftheprocessitconsistsonlyoftheseed),and

α

,

β

∈ [

0

,

1

]

areconstants.Unlessstatedotherwise,wewillassumethat

α

= β =

1.Noticethattheexpectedtimeofactivationofnode

(5)

i withnon-zeroactivationprobabilityis

τ

(

i

)

=

_p1₍_i₎.By

τ

(

γ

)

wewilldenotetheexpectedtimeofactivationofallnodesin thesequence

γ

.

This model was proposed by Alshamsi et al. [1] for undirected networkswith unweighted edges. If we assume that

wi j

=

1

⇐⇒

i j

∈

E thenourmodelisexactlyequivalenttothemodelproposedbyAlshamsi et al. [1].

Wenowdeﬁnethemaincomputationalproblemofourstudy.

Deﬁnition3(OptimalPartialDiffusionSequence).Thisproblemisdeﬁnedbyatuple

(

G

,

iS

,

z

)

,whereG

= (

V

,

E

,

W

)

isagiven

networkwithweightededges,iS

∈

V istheseednodeandz

≤

n inthenumberofnodestoactivate.Thegoalistoidentify

γ

∗

∈ (

V

)

suchthat

γ

1∗

=

iS,

|

γ

∗

|

=

z and

τ

(

γ

∗

)

isminimal.

In other words, we intend to ﬁnd the fastest way to activate z nodes in the network. When z

= |

V

|

in the above deﬁnition,theproblemissimplycalledOptimal(Full)DiffusionSequence.

Remark1.Inthecaseofintegerweightsofpolynomiallength,foranyinstanceoftheOptimalDiffusionSequenceproblem thereexistsapolynomiallyequivalentinstancewithbinaryweights.

Proof. Let

(

G

,

iS

,

|

V

|)

be agiveninstanceoftheOptimalDiffusion Sequence.Inordertoconstructanequivalent instance

with 0

/

1-weights we replaceevery edge i j with a set of wi j parallel 2-paths

{

i

,

ki,j,1

,

j

,

. . . ,

i

,

ki,j,wi j

,

j

}

,and we set

weightsofthenewedgesto wiki, j,r

=

wki, j,rj

=

1, wki, j,ri

=

wjki, j,r

=

0,forr

=

1

,

. . . ,

wi j.

Given a feasible solution

γ

to the original instance, we obtain a feasible solution

γ

to the constructed instanceby activating all nodes ki,j,r immediatelyafter activatingnode i in the original sequence (notice that the expected time of

activationofeverysuchnodeki,j,ris1).Thus,ifthevalueofthesolution

γ

ist,thenthevalueofthe

γ

ist

+

i j∈E

wi j

+

wji

.

Givenafeasiblesolution

γ

totheconstructedinstanceofthebinaryversionoftheproblem,we canobtainasolution withalesserorequalvaluebyactivatingallnodeski,j,rinoneblockimmediatelyafteractivatingnodei (again,theexpected

timeofactivationofnodeki,j,risalways1,whileitcontributestothetimeofactivationofnode j).Forsuchsequence,we

canobtaincorrespondingsolution

γ

totheoriginalinstanceoftheproblembyremovingfromitallnodeski,j,r.Ifthevalue

ofthesolution

γ

ist,thenthevalueofthe

γ

ist

−

_{i j}_∈_E

wi j

+

wji

.

NotethatthisreductiondoesnotworkingeneralfortheOptimalPartialDiffusionSequenceifz

<

|

V

|

.

2

4. Computinganoptimalsolution

WenowdescribeourresultsonﬁndinganeﬃcientwaytocomputeanexactsolutiontotheOptimalDiffusionSequence problem.

4.1. Hardnessofﬁndinganoptimalsolution

First,weshowthatthedecisionversionofOptimalPartialDiffusionSequenceproblemisNP-completeinageneralcase.

Theorem3.OptimalPartialDiffusionSequenceproblemisNP-complete,evenifallweightsarein

{

0

,

1

}

.

Proof. The decision version of the Optimal Partial Diffusion Sequence problem is the following: given a network G

=

(

V

,

E

,

W

)

, the seed node iS, the number of nodes to activate z, and a value t∗

∈ R

+, does there exist a sequence of

nodes

γ

∗

∈ (

V

)

startingwithiS suchthat

|

γ

∗

|

=

z and

τ

(

γ

∗

)

≤

t∗?

ThisproblemclearlyisinNP,asgivenasolution,i.e.,asequenceofnodes

γ

∗

∈ (

V

)

,wecancomputeitsexpectedtime ofactivationinpolynomialtime.

ToprovetheNP-hardnessoftheproblemwewillshowareductionfromtheNP-completeSetCoverproblem.

Let

(

U

,

S,

k

)

beaninstanceoftheSetCover problem.Wedeﬁnenetwork G asfollows(anexampleofsuchnetworkis presentedinFig.1):

•

Thesetofnodes: Forevery Si

∈

S

,wecreatethreenodes,denotedby Si,qiandqi.Foreveryui

∈

U ,wecreateasingle

nodeui.Additionally,wecreateasinglenodeiS.

•

Thesetofedges: Foreverynode Si,wecreateanedge SiiS.Forevery nodeqi,we createedges qiSi andqiq_i.Finally,

foreverynodeuiandevery Sjsuchthatui

∈

Sj wecreateanedgeuiSj.

•

Theweightmatrix: ForeveryedgeuiSj wesetweightstowuiSj

=

0 andwSjui

=

1.ForeveryedgeSiqi wesetweights

to wSiqi

=

0 and wqiSi

= |

U

||

S|

.Forallotherpairsofnodesconnectedwithanedgewesettheweightsto1.

Let z

=

k

+ |

U

|

+

1 (weintendthat thesolutionsequencewillactivatenodeiS,k nodesin

S

andallnodesinU ),and

(6)

Fig. 1. NetworkG constructed

for the proof of Theorem

3. Blue numbers next to edges express their weights. If there are no arrows next to edge i j then wi j=wji. Otherwise weight wi jis denoted next to the arrow pointing towards node j,

and weight

wjiis denoted next to the arrow pointing towards

node i.

(For interpretation of the colors in the ﬁgures, the reader is referred to the web version of this article.)

Wewillnowshowthatasolutiontothisinstanceofvalueatmostt∗ correspondstoasolutiontothegiveninstanceofthe SetCoverproblem.

Noticethattheexpectedtimeofactivationofeverynode Siisalways

|

U

||

S|

+

1,whilefortheexpectedtimeofactivation

ofanodeuiwehave

τ

(

ui

)

≤ |{

Sj

:

ui

∈

Sj

}|

≤ |

S|

.Noticealsothatneitheranyofthenodesqi noranyofthenodesq_ican

beactivatedwheniS istheseednode,astheinﬂuenceof Si onqi iszero.

First, wewill show that ifthere exists asolution

S

∗ to the giveninstance ofthe SetCover problem, then therealso existsasolutiontotheconstructedinstanceoftheOptimalPartialDiffusionSequenceproblemofvalueatmostt∗.Wecan constructsuch asolutionby ﬁrstactivatingevery node Si

∈

S

∗ (k nodesactivatedinexpectedtime

|

U

||

S|

+

1 each)and

thenactivatingallnodesui(

|

U

|

nodesactivatedinexpectedtimenotexceeding

|

S|

each).Such

γ

∗ activatesk

+ |

U

|

nodes

inexpectedtimenotexceedingk

(

|

U

||

S|

+

1

)

+ |

U

||

S|

,thereforeitisasolutiontotheconstructedinstanceoftheOptimal PartialDiffusionSequenceproblemofvalueatmostt∗.

Second,tocompletetheproofoftheNP-hardness,wehavetoshowthatifthereexistsasolution

γ

∗ totheconstructed instance ofthe Optimal Partial Diffusion Sequence problemofvalue atmostt∗,then there also exists a solutionto the giveninstanceofthe SetCover problem.Such asolution is

S

∗

=

γ

∗

∩

S

, i.e.,choosing sets Si corresponding to nodes Si

occurringinsequence

γ

∗.Noticethat therecannotbemorethank suchnodes,asactivatingk

+

1 nodes Si hasexpected

time

(

k

+

1

)(

|

U

||

S|

+

1

)

>

t∗. Sincethisisthecase, inorderto activatek

+ |

U

|

nodesother than iS,sequence

γ

∗ hasto

activateallnodesinU .However,toactivateanodeui,weﬁrsthavetoactivateatleastoneofitsneighbors,i.e., node Sj

such that ui

∈

Sj.Therefore, forevery node ui there mustexist atleastone node Sj

∈

γ

∗

∩

S such that ui

∈

Sj.Hence,

S∗

=

γ

∗

∩

S

isavalidsolutiontothegiveninstanceoftheSetCoverproblem.

Finally,wecanusetheconstructioninRemark1toreplaceeveryedgeqiSibyasetofparallelpathswithedgeweightsin

{

0

,

1

}

.Notethatsuchareplacementdoesnotchangethevalueoftheobjectiveasthenodesqi,andhencetheintermediate

nodesaddedontheparallelpaths,areneveractivated.Thisconcludestheproof.

2

Therefore,thereexistsnopolynomialalgorithmﬁndingtheoptimalwaytoactivateagivennumberofnodesinthe pro-cessofstrategicdiffusion,unlessP=NP.However,wenowproposeanexponentialalgorithmbasedondynamicprogramming technique.

4.2. Dynamicprogrammingalgorithm

We present an algorithm for computing an optimal solution to the problem using the dynamic programming tech-nique [29]. The mainidea behind dynamic programming isbreaking the mainproblem intoa numberof smaller,easier tosolvesub-problemsinarecursive manner.However,unlike instandard recursionwhereoftenthesamecomputationis repeatedmultipletimes,indynamicprogrammingthesolutiontoeach sub-problemiscomputedonlyonceandstoredin memoryforfutureuse.Overtheyearsthedynamicprogrammingtechniquesweredevelopedinvariousdirections [30–33], howeverouralgorithmisbasedontheoriginalversionofthetechnique.

Noticethatinoursettingtheactivationprobabilityofanygivennodedependsonthesetofactivenodesinthenetwork, howeveritdoesnotdepend ontheorderinwhichthesenodeswere activated.Hence,thesolutionforeach setofactive nodeshavetobecomputedonlyonce.Moreover,solvingthesub-problemoffindinganoptimalwayofactivatingk nodesin thenetworkallowstoefficientlyfindthesolutionfork

+

1 nodes,asthetotalexpectedtimeofactivationcanbeexpressed in a recursive manneras a sumof time ofactivation of the initialk nodes andthe time ofactivation of the last node. The idea of usingsolutions for smaller sub-problemsto solve a larger problem isthe core concept behind the dynamic programming,henceweﬁnditasuitableoptimizationmethodforsolvingtheOptimalPartialDiffusionSequenceproblem.

Thealgorithmisbasedonthefollowingrecurrencerelation,where

τ

∗

(

C

)

istheminimalexpectedtimeofactivationof asetofnodesC :

(7)

Algorithm 1: Dynamicprogrammingalgorithmforstrategicdiffusion.

Input: A

weighted network

(V,E,W), a seed node iS∈V ,

and the number of nodes to activate

z. Output: Sequence

of activation of

z nodes

starting with

iSwith minimal expected time of activation.

1 for C⊆V do 2 τ∗[C]← ∞ 3 τ∗[{iS}]←0 4 γ∗[{iS}]← iS 5 for k=1,. . . ,z−1 do 6 for C⊂V: (|C|=k)∧ (τ∗[C]<∞)do 7 for i∈V: (i∈/C)∧ (N(i)∩C= ∅)do 8 τ← wi j∈N(i)∩Cwji 9 ifτ∗[C]+ τ<τ∗[C∪ {i}]then 10 τ∗[C∪ {i}]←τ∗[C]+ τ 11 γ∗[C∪ {i}]←γ∗[C]⊕ i

12 returnγ∗[arg minC⊆V:|C|=zτ∗[C]]

Fig. 2. Example

of using dynamic programming. Each large orange state represents a possible state of activation of the network, with black nodes

repre-senting active nodes, and white nodes reprerepre-senting inactive nodes. The value of τ∗ indicates the minimal expected time required to reach this state of

activation. Value on each arrow represents the cost of activation of a single node, required to move between states. Red arrows represent optimal sequence of activation, which corresponds to the least expensive path from the initial state to the state where entire network is activated.

τ

∗

(

C

)

=

⎧

⎪

⎨

⎪

⎩

0 if C

= {

iS

}

min i∈C

τ

∗

₍

_C

_{\ {}

_i

_{}) +}

wi j∈N(i)∩Cwji if

|

C

| >

1

∞

otherwise

whereweassumethatsummationoveremptysetresultsinzeroanddivisionbyzeroresultsin

∞

.

Outofsetsofsizeone,onlythesetconsistingoftheseednodehasﬁnitetimeofactivation.Forsetsthatarelargerthan onewedividetheproblemintocomputingtheoptimaltimeofactivationofallnodesbutthelastone,andthencomputing the time of activation of thelast node. Since inthe strategic diffusionprocess the nodes are activated sequentially,one of the nodes i fromthe set C hasto be activated last, andwe are able to compute its time of activation based onthe informationofwhichothernodesofthenetworkarealreadyactive.Noticethatattemptingtoactivateanodewithoutany activeneighbors(orwithsumofinﬂuencesfromtheseneighborsequaltozero)willresultinthevalueoftheformulaequal to

∞

.

Pseudocode of the dynamic programming algorithm is presented as Algorithm 1, while Fig. 2 presents the intuition behindthealgorithmbyshowinghowitworksonaspeciﬁcgraphstructure.

Inentry

τ

∗

[

C

]

wecomputetheminimalexpectedtimenecessarytoactivateallnodesinsetC ,whileinentry

γ

∗

[

C

]

we keepasequenceofactivationallowingustoachievethisoptimalexpectedtime.Inthek-thexecutionoftheloopinline 5 we computevaluesofentriesin

τ

∗ and

γ

∗ forsets C suchthat

|

C

|

=

k

+

1.Wedoitby iterating(in loopinline 6)over all setsofnodesofsizek thatcanbeactivatedwhenwestarttheprocess fromtheseednode iS andtheniterate(inloop

inline 7) overallpossible nodesthat canbe targeted,i.e., nodeswithnon-zeroprobabilityofactivation,whenthe setof active nodesisC .Inlines 8-11weupdatethebestwayofactivatingnodesinsetC

∪ {

i

}

whenactivatingnodesinC ﬁrst

andactivatingnodei afterwardshaslowerexpectedtimethancurrentlyknownfastestwaytoactivatenodesinC

∪ {

i

}

. Asfortheimplementationdetails,

τ

∗and

γ

∗canbeimplementedashashtables.Inthiscasecheckingwhether

τ

∗

[

C

]

<

∞

isequivalenttocheckingwhether

τ

∗ containstheentryforC andthereforelines 1-2canbeomitted.

Noticealsothatduringthek-thexecutionoftheloopinline 5weonlyneedentriesintables

τ

∗ and

γ

∗ forsetsC such

that

|

C

|

=

k.Hence,toreduce memoryrequirements,wecanonlykeepinmemoryentriesfromtables

τ

∗ and

γ

∗ fortwo sizes ofsets:theonesforsizek,computedinthepreviousexecutionoftheloop(orinitialized inlines 3-4incaseofthe ﬁrstexecution)andtheonesforsizek

+

1,beingcomputedinthecurrentexecutionoftheloop.

(8)

Fig. 3. Decomposition

of graph into biconnected components. Each biconnected component is marked with different color, with cut nodes marked white

and seed nodes marked black. Gray nodes mark stubs added to provide proper inﬂuence sum for original cut nodes.

Despite theseoptimization possibilities, the dynamic programmingalgorithm remains exponential, as it considers all subsetsofnodespossibletobeactivated.Ifthenumberofnodestobeactivatedisconstantthenthealgorithmis polyno-mial.

Theorem4.Algorithm1solvestheOptimalPartialDiffusionSequenceproblemintime

O(|

V

|

z+1

_|

_E

_|)

_.

InSection4.4,wegiveadynamicprogramthatworksmoreeﬃcientlywhenthenetworkhasbothboundeddegreeand boundedtreewidth.Notethatitiscommontousedynamicprogrammingforsolvingoptimizationproblemsongraphswith boundedtreewidth,see,e.g.,[34].

Analternativeapproachtodevelopinganalgorithmoffindinganeffectivewayofactivatingthenetworkwouldbetouse thereinforcementlearningtechniques [35].Reinforcementlearningisbasedontheideaofmakingasequenceofdecisions by finding abalance betweenexploitation (performing thecurrently bestaction toobtain short-term profits) and explo-ration (investigatingother actions in thehope ofa long-termgain). Incaseofstrategic diffusion,the exploitation would be activatinganodewiththehighestactivationprobability,whiletheexploration wouldbe activatinganode withlower activationprobability inordertomakeits neighborseasiertoactivate.Nevertheless,an algorithmbasedonreinforcement learningwouldnotguaranteethatidentifiedactivationsequenceistheoptimalsolution,unlikethepresenteddynamic pro-grammingalgorithm,whichofferssuchcertainty.Hence,weleavethedevelopmentofanalgorithmbasedonreinforcement learningasapotentialfuturework.

Wenowdiscussotherwaysofoptimizationformorerestrictednetworkstructures.

4.3. Decompositionintobiconnectedcomponents

Wewillnowshowthatlookingfortheoptimalwayofactivatingallnodesofthenetworkcanbemadesimplerbyusing decompositionintobiconnectedcomponents.

Cut node ina network (also calledan articulation point)is a node the removalofwhich causes network tofall into twoormoreconnectedcomponents.Byseparatingthenetworkincutnodesweobtainadecompositionintobiconnected components.Biconnectedcomponentisa maximalsubnetwork(intermsofinclusion)suchthat removinganynode from itwillnotdisconnectit.AnexampleofadecompositionintobiconnectedcomponentsisshowninFig.3.Noticethatacopy ofacutnodeappearsineverybiconnectedcomponentitbelongsto.

The strategic diffusion process in every biconnected component can be considered separately. This is because every biconnected components has only one possiblestarting point of the diffusion (being it either the seed node in caseof biconnected componentcontaining itor thecut node closest to theseed node in caseofall other components). Thisis indicatedbyblackcolorofanodeinFig.3.OncethisstartingpointofagivencomponentC isactivated,theactivationstate inotherbiconnectedcomponentsofthenetworkdoesnotaffectactivationisC .

This is because activation probability of a node is only affected by the state of activation of its neighbors and the value of wi (toremind the reader, wi

=

j∈N(i)wji). Only cut nodes haveneighbors inother biconnected components,

henceforallnon-cutnodesthisobservationistrivial.Asforanycutnodei,noticethatitsneighborsinother biconnected componentscanonly beactivatedafteri is activated(asall thepathsbetweenthemandtheseed noderun throughthe cutnode).Thereforetheonlywayinwhichneighborsofacutnodei inotherbiconnectedcomponentsaffecttheactivation probability ofi is addingtothevalue of wi.Toaccountforthisfact,we createadditionalstubsconnectedto copiesofi

(9)

Toremindthereader,inthisworkweconsiderundirectednetworks(noticethatthedeﬁnitionofbiconnected compo-nents isdifferent fordirected networks). However, we do not assume that the relation ofinﬂuence is symmetric,i.e., it is possiblethat fori j

∈

E we have wi j

=

wji.Sincewe considerthe decompositionintodisconnectedcomponentsin the

context ofagivenseednode,thereisnevera disambiguationintermsofwhichedge weightshavetobeused incaseof the cut nodes.To compute the activation probability ofa cut node i, foreach neighbor j onlyweight wji is takeninto

consideration. Whatismore,onlyneighborsof i thatareclosertothe seednode thani canhaveapositive contribution intoitsactivationprobability,astheneighborsofi thatarefurtherawayfromtheseednodecanonlybeactivatedafteri is

activated.Atthesametime,i affectstheactivationtimeofitsneighbor j throughtheweight wi j,anditcanhaveapositive

contributionbothwhen j iscloserandwhenitsfurtherawaytothesourcenodethani.

4.4. Optimalsolutionfornetworkswithboundedtreewidth

We will nowshow a polynomial algorithm that ﬁndsan optimal waytoactivate anetwork withboth treewidth and degreeboundedbyconstants.

LetG

= (

V

,

E

,

W

)

beanarbitrarynetwork.Let

(

T

,

F

)

beanetworkwhereeverynodecontains asubsetofnodesfrom

V ,i.e.,

∀

t∈Tt

⊆

V (wewillcalleachsuchsubsetabag).Such

(

T

,

F

)

isatreedecompositionofG (see,e.g.,[36])ifandonly

ifthefollowingconditionsaremet:

•

Every nodefromV iscontainedinatleastonebag,i.e.,

_t_∈_Tt

=

V .

•

For everyedgee

∈

E thereexistsabagthatcontainsbothendsofe,i.e.,

∀

i j∈E

∃

t∈T

{

i

,

j

}

⊆

T .

•

For everynode v

∈

V subnetworkof

(

T

,

F

)

inducedbybagscontaining v isconnected.

The treewidth

θ

ofadecompositionisthesize ofits largestbagminus one,i.e.,

θ

=

maxt∈T

|

t

|

−

1.The treewidthofa

network is theminimumover treewidthsofall its treedecompositions. Aproblemis said tobe ﬁxed-parametertractable

[37],withrespect to parameterk, ifanyinstance ofthe problemofsize N can be solved intime f

(

k

)

·

NO(1)_,_for_some

computable function f

(

·)

.WeshowthattheOptimal StrategicDiffusionproblemisﬁxed-parametertractablewithrespect tothesumoftreewidthandmaximumdegree.

4.4.1. Fulldiffusion

Tobetterexplaintheidea,weﬁrstgiveanalgorithmforthefulldiffusioncase.

Theorem5.LetG

= (

V

,

E

,

W

)

beanetworkwithmaximaldegree

δ

.Thereexistsanalgorithmthat,givenatreedecompositionof treewidth

θ

,ﬁndsanoptimalwayofactivatingnodesinV intime

O(θδ(θδ)!

2n

)

.

In whatfollows,weassume that thetreedecompositionisrootedinsome nodetR,thesameforbottom-up and

top-downorder.Weusethefollowingnotation:

•

c

(

t

)

denotesthesequenceofchildrenoft;

• γ

|

X denotesthesubsequenceof

γ

consistingonlyofthenodesinset X ;

• γ

x denotesthesubsequenceof

γ

consistingofallelementsprecedingx;

• γ

x denotesthesubsequenceof

γ

consistingofx aswellasallelementsfollowingx;

•

TtopdowndenotessequenceofnodesinT inatop-downorder.

Inwhatfollowswewillcalltwopermutations

γ

and

γ

ofnodesfromG compatible,denoted

γ

∼

γ

,ifandonlyifthey activatethenodesthattheyhaveincommoninthesameorder,i.e.,

γ

|

γ

=

γ

|

γ .

Foreachnodeoft thetreedecompositionT wecomputearecordconsistingofthefollowingﬁelds:

• ϒ[

t

]

storesallpermutationsofnodesinbagt andtheirneighborsthatcanbeapartofavalidsolutiontotheproblem;

• τ

[

t

,

γ

,

i

]

storestheexpectedtimeofactivationofeverynodeini

∈

t whenactivatedintheordergivenby

γ

;

• τ

∗

[

t

,

γ

]

storesthesmallestexpectedtime to activateallnodes inthesubnetwork of G inducedby thenodes inthe subtreeofT rootedatt,whenthenodesint andtheirneighborsareactivatedintheordergivenby

γ

.

Tocomputetherecordswetraversethetreedecompositioninabottom-uporder(giventheroottR)andforeverynode

t weperformthefollowingsteps:

1. Compute:

ϒ

[

t

] =

γ

∈

∗

(

t

∪

i∈t N

(

i

))

: (

iS

∈

/

t

∨

γ

1

=

iS

)

∧

∀

γi∈t: γi=iS

∃

j<i

γ

j

∈

N

(

γ

i

)

∧

∀

t∈c(t)

∃

γ_∈ϒ[t]

γ

∼

γ

;

(10)

Algorithm 2: Algorithm reconstructingtheoptimalwayofactivatingall nodesofanetworkwithboundedtreewidth anddegree.

Input: Tree

decomposition

(T,F)of a weighted network (V,E,W), with computed values of τ∗.

Output: Sequence

of activation of nodes in

V starting

with

1 γ∗← 2 for t∈Ttopdowndo 3 γ←arg minγ_∈ϒ[t]:γ_∼γ∗τ∗[t,γ] 4 γ← 5 forγi∈γ do 6 ifγi∈/γ∗then 7 γ←γ⊕ γi 8 else 9 γ∗←γ_γ∗∗ i ⊕γ _⊕_γ∗ γ∗ i 10 γ← 11 γ∗←γ∗⊕γ 12 returnγ∗

2. Forevery

γ

∈ ϒ[

t

]

andevery

γ

i

∈

γ

|

t compute:

τ

[

t

,

γ

,

γ

i

] =

wγi

j<iwγjγi

;

3. Forevery

γ

∈ ϒ[

t

]

compute:

τ

∗

[

t

,

γ

] =

γi∈γ|t

τ

[

t

,

γ

,

γ

i

] +

t∈c(t) min γ∈ϒ[t]:γ∼γ

⎛

⎝

τ

∗

[

t

,

γ

] −

γ_i∈γ|t

τ

[

t

,

γ

,

γ

_i

]

⎞

⎠ .

Aftertraversingthetreeinthisfashion,theminimaltimerequiredtoactivatedentirenetworkG startingwithiS canbe

identiﬁedasminγ∈ϒ[tR]

τ

∗

[

tR

,

γ

]

.Inorderto reconstructthesequenceallowing toactivateentirenetworkinthe optimal

time,wecanrunAlgorithm2.

Letusnowcommentontheprocedureofﬁllingtherecords.

In step 1we gather all permutationsof nodesin bagt and their neighbors that can be apart ofa valid solutionto theproblem,accordingtothreeconditions.Theﬁrstcondition assertsthatifthepermutation containstheseednode,itis activatedastheveryﬁrstnode.Thesecond conditionassures thatevery nodeinbagt otherthanthesourcenode hasat leastoneactiveneighboratthemomentofactivation(noticethatforeverynodeint allofitsneighborsarepresentinthe permutation).Thethirdconditionprovidesthatthepermutation

γ

allowstoactivateallnodesinthesubtreeoft,i.e.,that foreverychildoft inthetreedecompositionthereexistsatleastonevalidpermutation

γ

thatactivatesnodesfrom

γ

in thesameorderas

γ

.

Instep 2wesimplycomputethetimeofactivationofeverynodeinbagt,accordingtothedeﬁnitiongiveninSection3, andstoreitintable

τ

.Again,noticethatforevery nodeint allofitsneighborsarepresentinthepermutation,hencewe haveenoughinformationtocomputeitsexpectedtimeofactivation.

Finally,instep 3wecompute theoptimaltimeofactivationofallnodesinthesubtreeoft whileusingpermutation

γ

andstore itintable

τ

∗.The expressionconsistsofa sumoftime ofactivationofthe nodesinbagt anda sumoverall children oft,whereforeachofthemwe compute theminimumoverall compatiblepermutationsandsubtractthe time ofactivationofnodesint.Notice thatsince

(

T

,

F

)

isthetreedecomposition,theonlypossibleoverlapbetweennodesin thesubtreesofdifferentchildrenoft arenodesinbagt.Otherwisethegraphinducedbybagscontainingsuchoverlapping nodeswould not be connected,andhence one ofthe conditionsof beingthe treedecomposition wouldnot be met for

(

T

,

F

)

.

Afterhavingﬁlledtables

ϒ

,

τ

∗ and

τ

,wetraverse thetreedecompositionagain, thistime ina top-downorderusing Algorithm2. Weconstructan optimalsolution tothe problemonvariable

γ

∗.In line3we selectas

γ

thepermutation withminimaltimenecessaryto activateall nodesinthesubtree oft,that iscompatiblewiththesolutionconstructedso far.Inlines4-11wemerge

γ

intosequence

γ

∗ usingauxiliaryvariable

γ

.

As for the time complexity of the algorithm, the most costly operations (both of which are equally expensive) are validatingexistence ofacompatiblesequenceforeverychildof t in thethirdconditioninstep 1andcomputingthetime necessary to activatethe nodesin the subtreesof all children of t inthe second sum instep 3. Forassessing the time complexityofthealgorithmwewillfocusonthecostinstep 3.Sinceeverynodetisachildofatmostoneothernodein thetreedecomposition,foreveryt

∈

T thecomputationofminimumisexecuted

O((θδ)!)

times(the numberofdifferent permutations

γ

∈ ϒ[

t

]

). Sincethere are

O(

n

)

nodesinthetreedecomposition,the computationofminimum isexecuted

O((θδ)!

n

)

times.Thecostofexecutingitonceis

O(θδ(θδ)!)

,sincewe havetocheck

O((θδ)!)

manypermutationsin

ϒ

[

t

]

and for each of them check whether

γ

|

γ

=

γ

|

γ in time

O(θδ)

. Hence, the total time complexity of the algorithm is

(11)

Algorithm 3: Algorithmcomputingthevalueof

τ

∗

[

t

,

γ

,

k

]

.

Input: Tree

decomposition

(T,F)of a weighted network (V,E,W), with values of τ∗computed so far, node t with

sequence of children

c(t)= t1,. . . ,t|c(t)|.

Output: The

value of

τ∗[t,γ,k].

1 for ti∈ t1,. . . ,t|c(t)|do 2 for m=0,. . . ,k−γ|tdo 3 if i=1 then 4 τ[t,1,γ,m]← min γ∈ϒ[t1]: γ∼γ τ∗[t1,γ,m+ |γ|t|] −γi∈γ|tτ[t1,γ _,_γ i] 5 else 6 τ[t,i,γ,m]← min m∈{0,...,m} τ[t,i−1,γ,m]+ min γ_∈ϒ[ti]: γ∼γ τ∗[ti,γ,m−m+ |γ|t|] −γ_i∈γ_|_tτ[ti,γ,γi] 7 returnγi∈γ |tτ[t,γ,γi]+τ _t_,_|_c₍_t₎_|,_γ_,_k₋_γ_|_t 4.4.2. Partialdiffusion

Wenextpresentanalgorithmforpartialdiffusioninnetworkswithboundedtreewidthandboundeddegree.

Theorem6.LetG

= (

V

,

E

,

W

)

beanetworkwithmaximaldegree

δ

.Thereexistsanalgorithmthat,givenatreedecompositionof treewidth

θ

,ﬁndsanoptimalwayofactivatingz nodesinV intime

O(θδ(θδ)!

2_z2_n

₎

_.

We useessentially thesame notation asinprevious section.However, in whatfollowswe will calltwo permutations

γ

∈ (

X

)

and

γ

∈ (

X

)

of nodes fromG compatible, denoted

γ

∼

γ

,if andonly if they activate thenodes that they haveincommoninthesameorder,i.e.,

γ

|

γ

=

γ

|

γ ,andagreeonthenon-activatednodes,i.e.,

γ

|

X\γ

=

γ

|

X\γ

=

(this

differenceistheresultofconsideringalsopermutationsthatarenotfull,i.e.,permutations

γ

∈ (

X

)

suchthat

γ

<

|

X

|

). Therecordthatwecomputeforeachnodeoft thetreedecompositionT consistingnowofthefollowingﬁelds:

• ϒ[

t

]

storesallpermutationsofnodesinbagt andtheirneighborsthatcanbeapartofavalidsolutiontotheproblem;

• τ

[

t

,

γ

,

i

]

storestheexpectedtimeofactivationofeverynodeini

∈

t whenactivatedintheordergivenby

γ

;

• τ

∗

[

t

,

γ

,

k

]

denotesthesmallestexpectedtimetoactivatek nodesinthesubnetworkofG inducedbythenodesinthe subtreeofT rootedatt,whenthenodesint andtheirneighborsareactivatedintheordergivenby

γ

.

Hence, the only difference in comparison to the full diffusion version is that table

τ

∗ is additionally indexed with the numberofnodestoactivate.

Tocomputetherecordswetraversethetreedecompositioninabottom-uporder(giventheroottR)andforeverynode

t weperformthefollowingsteps:

1. Compute:

ϒ

[

t

] =

γ

∈ (

t

∪

i∈t N

(

i

))

: (

iS

∈

/

t

∨

γ

1

=

iS

)

∧

∀

γi∈t: γi=iS

∃

j<i

γ

j

∈

N

(

γ

i

)

∧

∀

t∈c(t)

∃

γ∈ϒ[t]

γ

∼

γ

;

2. Forevery

γ

∈ ϒ[

t

]

andevery

γ

i

∈

γ

|

t compute:

τ

[

t

,

γ

,

γ

i

] =

wγi

j<iwγjγi

;

3. Forevery

γ

∈ ϒ[

t

]

andfork

= |

γ

|

t

|,

. . . ,

z computethevalueof

τ

∗

[

t

,

γ

,

k

]

usingAlgorithm3.

Aftertraversing the treeinthisfashion, theminimal time requiredto activatez nodes innetwork G starting withiS

canbeidentiﬁedasminγ∈ϒ[tR]

τ

∗

[

tR

,

γ

,

z

]

.Inordertoreconstructthesequenceallowingtoactivatez nodesintheoptimal

time,wecanrunAlgorithm4.

Letusnowcommentontheprocedureofﬁllingtherecords.

Steps 1and 2 are almostidentical tothose ofthe algorithmforfull diffusion,withnotable difference beingthat this timeweconsidersequencesthatarenotnecessarilyfull.

Instep 3wecallAlgorithm3tocomputethevalueof

τ

∗

[

t

,

γ

,

k

]

,i.e.,theoptimaltimeofactivatingk nodesinthesubtree oft whileusingpermutation

γ

.Tothisend,wevisitallthechildrenoft,intheordert1

,

. . . ,

t|c(t)|.Form

∈ {

0

,

. . . ,

k

− |

γ

|

t

|}

we compute

τ

[

t

,

i

,

γ

,

m

]

,theminimumexpectedtimerequiredtoactivatem nodesotherthantheonesactivatedint,inthe subnetwork inducedbyunionofthesubtreesrootedatt1

,

. . . ,

ti.Wedothisusingthevaluesof

τ

computedforchildren

oft precedingti,i.e., fort1

,

. . . ,

ti−1,aswell asthe valuesof

τ

∗ computedforti.Intheselection processm denotesthe

(12)

Algorithm 4: ReconstructPartial

(

t

,

γ

∗

,

k

)

:algorithmreconstructingtheoptimalwayofactivatingk nodesinanetwork withboundedtreewidthanddegree.

Input: Tree

decomposition

(T,F)of a weighted network (V,E,W), with computed values of τ∗and τ, a bag t,

a sequence

γ∗∈ (V)and an

integer k≤n.

Output: Sequence

of activation of

z nodes

in

V starting

with

1 if k>0 then 2 γ←arg minγ∈ϒ[t]:γ∼γ∗τ∗[t,γ,k] 3 γ← 4 forγi∈γ do 5 ifγi∈/γ∗then 6 γ←γ⊕ γi 7 else 8 γ∗←γ_γ∗∗ i ⊕γ _⊕_γ∗ γ∗ i 9 γ← 10 γ∗←γ∗⊕γ 11 k←k− |γ|t| 12 for ti∈ t|c(t)|,. . . ,t1do 13 m←arg minm_∈{0,...,k_} τ[t,i−1,γ,m]+minγ∈ϒ[ti]: γ∼γ τ∗[ti,γ,k−m+ |γ|t|] −γi∈γ|tτ[ti,γ _,_γ i] 14 γ∗←ReconstructPartial(ti,γ∗,k−m+ |γ|t|) 15 k←m 16 returnγ∗

inthesubtreerootedinti.Thereturnedvalueoftheminimaltimeofactivationofk nodesinthesubtreeoft whileusing

permutation

γ

iscomputedasthe sumofthetime ofactivation ofnodesfromt and theoptimaltime ofactivatingthe remainingk

−

γ

|

t

nodesinall

|

c

(

k

)

|

childrenoft.

Thetime complexityofthealgorithmis similartothefulldiffusioncase, exceptthatwe havenow thetwoadditional loops,oneoverk instep 3andtheotheroverm inline6ofAlgorithm3,whichcontributeanadditionalfactorof

O(

z2

₎

_to therunningtime.

We are unable to compute an optimal solutionin polynomial time in generalcase. We may howeverhope to ﬁnd a polynomialapproximationalgorithm.Wenowmovetotheanalysisofpossiblewaysofapproximatingtheoptimalsolution. We now move to assessing the possibility of approximating the optimal solution to the Optimal Diffusion Sequence Problem.

4.5. Lowerboundsonheuristicalgorithms

Alshamsi et al. [1] suggesttwoheuristicstrategiesforactivatinganetworkinaprocessofstrategicdiffusion:thegreedy strategy(alwaystargetnodewiththehighestprobabilityofactivation)andthemajoritystrategy(alwaystargetnodewith thehighestnumberofactiveneighbors).

Wenowshowthatneitherofthesestrategiesapproximatesanoptimalsolutiontowithinaconstantratio.

Theorem 7.There exists no constant r

>

1 such that choosing the sequence of activation using the greedy strategy is an r-approximationalgorithm.Inparticular,theapproximationratioofthegreedystrategyis

(

logn

)

.

Proof. Wewillnowshowaseriesofnetworkswheretheapproximationratioofthegreedyalgorithmgoestoinﬁnity. Foragivenk

∈ N

weconstructanetwork G

(

k

)

,wherethesetofnodesisV

= {

iS

,

a1

,

. . . ,

ak2

,

b1

,

. . . ,

bk−1

}

andwhere thesetofedgesis:

E

=

k2

i=1

{

iSai

} ∪

k2

i=1 k

−1 j=1

{

aibj

}.

Weassumethattheweightofeveryedgeis1.ThestructureofnetworkG

(

k

)

ispresentedinFig.4.

Letx bethenumberofcurrentlyactivatedai nodesandlet y bethenumberofcurrentlyactivatedbi nodes.We have

that

τ

(

ai

)

=

_y₊k₁ andwehavethat

τ

(

bi

)

=

k

2

x.

Wewillnowanalyzetwodifferentwaysofactivatingallnodesinnetwork G

(

k

)

.First,letusconsiderthegreedy algo-rithm.Lettheindicesofnodesai,aswell astheindicesofnodesbj,beordered accordingtotheorderofactivation,i.e.,

nodea1istheﬁrstactivatednodeai,whilenodeb1 istheﬁrstactivatednodebi.Noticethatatleastik nodesa hastobe

activatedbeforetheactivationofnodebi.Wewillprovethisclaimbycontradiction.Assumethatbicanhave j

<

ik active

neighborsatthemomentofactivation.Takenodeaj+1 (theﬁrstnodea activatedafterbi).Atthemomentofactivationof

bi itsprobability ofactivation is _kj2,whilethe probabilityofactivation ofaj+1 is

i

(13)

Fig. 4. Network G(k)constructed for the proofs of Theorems7and8.

andbi wasactivatedbeforeaj+1,we havetohave _kj2

≥

i

k.However,thisisnottruesince j

<

ik.Wehaveacontradiction,

thereforeatleastik nodesa hastobeactivatedbeforetheactivationofnodebi.

Becauseofthisatleastk nodesai hastobeactivatedwithonlyoneneighboractive,thenatleastk nodesaihastobe

activatedwithonlytwoneighborsactive,andsoon.Focusingonlyonthetimeofactivationofnodesai wehavethatthe

totalexpectedtimeofactivationforthegreedyalgorithmis:

τ

G

(

G

(

k

))

≥

k2

i=1

τ

(

ai

)

≥

k

j=1 kk j

=

k 2_H k

where Hkisthek-thharmonicnumber.

Now, let usconsider an algorithm A, wherewe ﬁrst activatek nodes ai,then all nodes bi andﬁnally theremaining

nodesai.Timeofactivationoftheentirenetworkisthen:

τ

A

(

G

(

k

))

=

k2

+ (

k

−

1

)

k

+ (

k2

−

k

)

=

3k2

−

2k

sinceeachoftheﬁrstk nodesai isactivatedinexpectedtimek (asitonlyhasoneactiveneighbor,namelyiS),eachnode

bi isactivatedinexpectedtimek (asithasdegreek2andk activeneighborsatthemomentofactivation)andeachofthe

remainingk2

₋

_{k nodes}_a

iisactivatedinexpectedtime1.

ConsidersequenceofnetworksG

(

k

)

.Forthissequencewehave

lim k→∞

τ

G

(

G

(

k

))

τ

∗

(

G

(

k

))

≥

klim→∞

τ

G

(

G

(

k

))

τ

A

(

G

(

k

))

≥

lim k→∞ Hk 3

= ∞.

Therefore,solutionprovidedbythegreedyalgorithmcanbearbitrarilyworsethantheoptimalsolution.

2

Wealsoshowanalogicalresultforthemajoritystrategy.

Theorem 8.There exists no constant r

>

1 such that choosing the sequence of activation using the majority strategy is an r-approximationalgorithm.Inparticular,theapproximationratioofthemajoritystrategyis

(

logn

)

.

Proof. WewillnowshowthatfortheseriesofnetworksconstructedintheproofofTheorem7theapproximationratioof themajorityalgorithmalsogoestoinﬁnity.

Letx be thenumberofcurrentlyactivatedai nodesandlet y bethenumberofcurrentlyactivatedbi nodes.Wehave

that

τ

(

ai

)

=

y+k1 andwehavethat

τ

(

bi

)

=

k

2

x.

LetusconsidertheexpectedactivationtimeofanentirenetworkG

(

k

)

usingmajorityalgorithm.Lettheindicesofnodes

ai,aswell astheindices ofnodes bj, be ordered accordingto the orderof activation,i.e., node a1 isthe ﬁrst activated node ai,whilenodeb1 istheﬁrstactivatednodebi.Noticethatnodebiatthemomentofactivationhasatmosti

+

1 active

neighbors.Wewillprovethisclaimbycontradiction.Assumethatbi canhave j

>

i

+

1 activeneighborsatthemomentof

activation. Consider numberofactiveneighborsatthe momentofactivationofnode aj (the last nodea activated before

bi).Sincewe areusingthemajorityalgorithm,ithastohaveatleast j

−

1 active neighbors(otherwise nodebi,havingat

themoment j

−

1 activeneighbors,wouldhavebeenchosenbeforenodeaj).However,nodeajcanhaveatmosti

<

j

−

1

activeneighbors,namelynodesb1

,

. . . ,

bi−1 andnodeiS (asnodebi isnotactiveyet).Wehaveacontradiction,thereforebi

atthemomentofactivationhasatmosti

+

1 activeneighbors.

Focusing onlyonthetimeofactivationofnodesbi wehavethat thetotalexpectedtime ofactivationforthemajority

(14)

Fig. 5. NetworkG(X,r)constructed for the proof of Theorem9. Blue numbers next to edges express their weights. If there are no arrows next to edge

i j thenwi j=wji. Otherwise weight wi j is denoted next to the arrow pointing towards node j,

and weight

wji is denoted next to the arrow pointing

towards node i.

τ

C

(

G

(

k

))

≥

k−1

i=1

τ

(

bi

)

≥

k−1

i=1 k2 i

+

1

=

k 2

₍

_H k

−

1

)

whereHkisthek-thharmonicnumber.

Let A bethe alternativealgorithmdescribed inthe proofofTheorem 7.Consider sequenceofnetworks G

(

k

)

.Forthis sequencewehave lim k→∞

τ

C

(

G

(

k

))

τ

∗

(

G

(

k

))

≥

klim→∞

τ

C

(

G

(

k

))

τ

A

(

G

(

k

))

≥

lim k→∞ Hk 3

= ∞

Therefore,solutionprovidedbythemajorityalgorithmcanbearbitrarilyworsethantheoptimalsolution.

2

4.6. Inapproximability

Finally,we show that infact the problemcannot be approximatedwithin a ratioof

(

1

−

)

ln n for any

>

0,unless

P

=

N P .

Theorem9.TheOptimalPartialDiffusionSequenceproblemcannotbeapproximatedwithinaratioof

(

1

−

)

ln n forany

>

0,

unlessP

=

N P .

Proof. Inordertoprovethetheorem,wewillusetheresultbyDinurandSteurer [27] thattheMinimumSetCoverproblem cannotbeapproximatedwithinaratioof

(

1

−

)

ln n forany

>

0,unlessP

=

N P .

LetX

= (

U

,

S

)

beaninstanceoftheMinimumSetCoverproblem.Toremindthereader,U istheuniverse

{

u1

,

. . . ,

u|U|

}

, whileS isacollection

{

S1

,

. . . ,

S|S|

}

ofsubsetsofU .Thegoalhereistoﬁndsubset S∗

⊆

S suchthattheunionofS∗equals

U andthesizeofS∗ isminimal.

First, we will show a function f

(

X

)

that based on an instance of the Minimum Set Cover problem X constructs an instanceoftheOptimalPartialDiffusionSequenceproblem.

LetnetworkG

(

X

)

bedeﬁnedasfollows(anexampleofsuchnetworkispresentedinFig.5):

•

Thesetofnodes: Forevery Si

∈

S,wecreatethreenodes,denotedby Si,qi andqi.Foreveryui

∈

U ,wecreate

|

S

|

+

1

nodes,denotedbyui,1

,

. . . ,

ui,|S|+1.Additionally,wecreateasinglenodeiS.

•

Thesetofedges: Forevery node Si,wecreateanedge SiiS.Foreverynode qi,wecreateedgesqiSi andqiq_i.Finally,

foreverynodeui,i andevery Sjsuchthatui

∈

Sj wecreateanedgeui,iSj.

•

Theweightmatrix: Foreveryedge SiqiwesetitsweightstowSiqi

=

0 andwqiSi

=

α

−

1,where

α

=

z

|

S

|

λ+1 _for_some

λ

>

0.Forevery edgeui,iSj weset itsweightsto wu_i,iSj

=

0 and wSjui,i

=

1.Forallother pairsofnodesconnected

withanedgewesettheirweightsto1.

Tocomplete the constructedinstance ofthe Optimal PartialDiffusion Sequenceproblem, we set theseed node to be