HAL Id: hal-03058609
https://hal.archives-ouvertes.fr/hal-03058609
Submitted on 4 Mar 2021
HAL is a multi-disciplinary open access
archive for the deposit and dissemination of
sci-entific research documents, whether they are
pub-lished or not. The documents may come from
teaching and research institutions in France or
abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire HAL, est
destinée au dépôt et à la diffusion de documents
scientifiques de niveau recherche, publiés ou non,
émanant des établissements d’enseignement et de
recherche français ou étrangers, des laboratoires
publics ou privés.
Computational aspects of optimal strategic network
diffusion
Marcin Waniek, Khaled Elbassioni, Flávio Pinheiro, César A. Hidalgo,
Aamena Alshamsi
To cite this version:
Marcin Waniek, Khaled Elbassioni, Flávio Pinheiro, César A. Hidalgo, Aamena Alshamsi.
Computa-tional aspects of optimal strategic network diffusion. Theoretical Computer Science, Elsevier, 2020,
814, pp.153 - 168. �10.1016/j.tcs.2020.01.027�. �hal-03058609�
Contents lists available atScienceDirect
Theoretical
Computer
Science
www.elsevier.com/locate/tcs
Computational
aspects
of
optimal
strategic
network
diffusion
Marcin Waniek
a,
b,
∗
,
Khaled Elbassioni
b,
Flávio L. Pinheiro
c,
d,
César A. Hidalgo
d,
Aamena Alshamsi
baComputerScience,NewYorkUniversityAbuDhabi,AbuDhabi,UnitedArabEmirates bKhalifaUniversityofScienceandTechnology,AbuDhabi,UnitedArabEmirates
cNovaInformationManagementSchool(NOVAIMS),UniversidadeNovadeLisboa,Lisboa,Portugal dMITMediaLab,MassachusettsInstituteofTechnology,Cambridge,USA
a
r
t
i
c
l
e
i
n
f
o
a
b
s
t
r
a
c
t
Articlehistory:
Received 5 July 2019
Received in revised form 19 December 2019 Accepted 26 January 2020
Available online 30 January 2020 Communicated by L.M. Kirousis Keywords: Strategic diffusion Network contagion Complex networks Influence maximization
Diffusiononcomplexnetworksisoftenmodeled asastochasticprocess.Yet,recentwork on strategic diffusionemphasizes the decision power of agents[1] andtreats diffusion as a strategic problem. Herewe studythe computational aspects of strategic diffusion,
i.e.,finding theoptimal sequenceofnodes toactivate anetwork intheminimum time. We provethat finding anoptimal solution tothisproblem isNP-complete inageneral case.To overcomethiscomputationaldifficulty, wepresentan algorithmtocomputean optimal solution based on a dynamic programming technique. We also show that the problemisfixedparameter-tractablewhenparametrizedbytheproductofthetreewidth andmaximumdegree.Weanalyzethepossibilityofdevelopinganefficientapproximation algorithmandshowthattwoheuristicalgorithmsproposedsofarcannothavebetterthan alogarithmicapproximationguarantee.Finally,weprovethattheproblemdoesnotadmit betterthanalogarithmicapproximation,unlessP=NP.
©2020ElsevierB.V.Allrightsreserved.
1. Introduction
Diffusion in social networks has beena widely studied applicationin problems such as epidemic outbreaksand the adoption of behaviors andinnovations [2–7]. The approach to model diffusion can be roughly divided between simple andcomplexcontagion processes.Undersimplecontagion transmissionrequirescontactwithone individualataconstant ratesuch astransmission ofinfectious diseases [8]. An exampleof simplecontagion process isthe independentcascade model [9].Incontrast,complexcontagion requiresreinforcementfrommultiplesources.Anexampleofcomplexcontagion processisthelinearthresholdmodel [9].Complexcontagionhasbeenwidelystudiedinthecontextofcascadephenomena, withparticular applicationsto viralmarketing campaigns [10–12].In that context,the problemunderanalysis isthat of findingthe initialsetofseeds thatwouldmaximize diffusion.Differentapproachestosolving theseedselection problem includeusingcentralitymeasures [13],networkpercolation [14],andpropagationtraces [15],amongmanyothers [16–18]. Someworksintheliteraturetakeanadaptiveapproachtothetopic,basingtheselectionofseedsonadditionalknowledge gatheredduringtheprocess,beiteitherthecurrentstateofadynamicnetworktopology [19],orexpandedpoolofpotential seeds [20,21].Analternativewayofincreasingthediffusioncoverageisspreadingavailableseedsovertime [22,23].Despite thestrategicaspectsofdifferentmethodsofseedselection,thediffusionprocessitselfispurelystochastic.
*
Corresponding author at: Computer Science, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates.E-mailaddress:mjwaniek@nyu.edu(M. Waniek). https://doi.org/10.1016/j.tcs.2020.01.027
However, thereare casesinwhichdiffusiontakesplaceaccordingtothesamerules ofcomplexcontagion,butthatare not purely stochastic (i.e., they have a strategic component) andin which thenetwork itself is not, necessarily, a social network. Take for instancehow regions develop neweconomic activities [24] or start production innew research fields [25]. Inthese casesagents are not embedded in the network, insteadthey are takingactions over a networkedsystem, which captures the relatedness between economic or academic activities. More importantly, not only the choice of the initial state is strategic, but also the whole process of diffusion is strategic. Stochasticity in this context is not at the pairwise interactionsbetweenagents,butinthesuccessrateofagentsinenteringnewactivities.Incaseoftheeconomic development, theprobability ofsuccess while entering a newactivity (represented by a node in the network ofrelated economic activities)increaseswiththenumberofotheralreadydevelopedactivities thatareconnectedtoit.Thus,agents canfailorsucceeddependingontheparticularcontextofeachstrategicaction.
Alshamsi et al. [1] proposedtostudythediversificationofeconomicactivitiesthroughthelensesofastrategicdiffusion model.Inthismodeltheentireprocessofdiffusionisguidedbyastrategicagent.Theagent’sactionsconcerntheselection of a node ina network tobe targeted ateach time step.Hence, the node canbecome activated inthe next stepof the action sequenceoftheagent, ornot.Following existingempiricalfindings [24,26],the modelassumesthatthe activation probability (i.e.,the probability ofsuccess ofthe agent’saction) iscaptured by acomplex contagion process. Thegoal is toactivatetheentirenetwork,startingfromasingleactivenode,intheminimumtime possible.WhileAlshamsi et al. [1] showedforawiderangeofnetworktopologiesthatactivatinganetworkinanoptimalmannerrequiresabalancebetween exploitationandexplorationstrategies,manyquestionsregardingthecomputationalfeasibilityoffindingoptimalstrategies remainedopen.Here,weexploreseveraltheoreticalcomputationalconsiderationsofstrategicdiffusionprocesses.
1.1. Resultsandmethods
Westartbyexploringthecomputationalfeasibilityoffindinganoptimalstrategythatminimizesthetotaldiffusiontime,
i.e., answering the question ofwhatis thebest sequenceof activatinga givennumberofnodesin thenetwork (starting fromaseed node)inthe shortestexpectedtime,whiletakingintoconsiderationthat thetimenecessarytoactivateeach nodeisproportionaltothenumberofactiveneighbors.
WeshowthatthedecisionversionoftheproblemisNP-complete(Theorem3).Weproveitusingareductionfromthe Set Cover problem.Thisobservation indicatesthat developinga polynomialalgorithm forfindingan optimalsequence of strategicdiffusionisnotrealistic,unlessP=NP.
Giventhisdifficulty,weproposean algorithmforcomputingan optimalwayofactivatingagivennumberofnodesin theshortesttimepossible(Algorithm1).Whilethealgorithmtakesexponentialtimetofindthesolution,itstilloutperforms simple exhaustivesearch andcan be usedto compute an optimalsolutionfor networksofmoderatesize. Thealgorithm utilizesthedynamicprogrammingtechniquetofindthefastestwaytoreacheverypossiblestateofnetworkactivation.
Wealsopresenttwo algorithmsfindinganoptimalsolutionforamorerestrictiveclassofnetworks,i.e.,networkswith bounded treewidth andboundedmaximal degree. Theytraverse a tree decompositionofa network andfind an optimal wayofactivatingeitheranentirenetworkora givennumberofnodes,respectively.Asa consequence,we showthat the problemisfixedparameter-tractablewhenparametrizedbythesumofthetreewidthandmaximumdegree.
Asfindinganoptimalsolutionprovestobecomputationallydemanding,weturnourattentiontoassessingthepossibility ofobtaininganefficientapproximationalgorithm.Wefirstinvestigatewhethertwoeffectiveheuristicalgorithmsproposed by Alshamsi et al. [1] have constant approximationratios. Oneofthem isthe greedyalgorithm,i.e., always targetingthe nodewiththehighestprobabilityofactivation.Theotheristhemajorityalgorithm,i.e.,targetingthenodewiththehighest number ofactive neighbors.Weshow that the approximationratiofor bothofthese algorithmsis
(
log n)
(Theorems7and 8). The proofs are based on constructing a sequence of networks where the ratio between total expected time of activationofthesolutionobtainedusingtheheuristicalgorithmandtheoptimalexpectedtimeofactivationgoestoinfinity. Finally, we show that, unless P=NP, there is no way to approximate the problem within a ratio better than ln n, in particular it is impossibleto construct an r-approximationalgorithm for a constant r (Theorem9). We prove this claim by showing a reduction from the Minimum Set Cover problem and using the fact that Minimum Set Cover cannot be approximatedwithinaratioof
(
1−
)
ln n forany>
0,unlessP=NP [27].Organizationofthemanuscript. Theremainderofthearticleisorganizedasfollows.Section2describesthenotationsand computational problemsused in our reductions. Section 3 presentsa formal definitionof the strategic diffusionprocess and the main problemconsidered in our study.In Section 4 we present ourhardness resultforthe decision version of theproblemandwedescribeadynamic programmingalgorithm tofindan optimalwayofstrategicdiffusion.Section4.4
introduces algorithms computingoptimalsolutionfornetworkswithboundedtreewidthandmaximaldegree.Section 4.5
describesourresultsconcerning approximationoftheoptimalsolution.Section 5presentsconclusions andpotentialideas forfuturework.
2. Preliminaries¬ation
2.1. Basicnetworknotation
LetG
= (
V,
E,
W)
denoteanetworkwithweightededges,where V= {
1,
. . . ,
n}
denotesthesetofn nodes,E⊆
V×
Vdenotes thesetofedges and W
∈ R
n×n denotes thematrixwithweights ofedges. Wedenote anedge betweennodesiand j by i j.In thiswork we considernetworksthat are undirected, i.e., we donot discern betweenedges i j and ji.We alsoassumethatnetworksdonotcontainself-loops,i.e.,
∀
i∈Vii∈
/
E.Wedenoteby NG(
i)
thesetofneighbors ofi inG,i.e.,NG
(
i)
= {
j∈
V:
i j∈
E}
.WedenotebydG(
i)
thedegree ofi inG,i.e.,dG(
i)
= |
NG(
i)
|
.Weconsider networkswithweighted edges.We denoteby wi j
∈ R
theweightoftheconnection fromi to j,wewillcallthisvalueinfluence thati hason j.Wedonotassumethattherelationofinfluenceissymmetric,i.e.,itispossiblethat fori j
∈
E wehavewi j=
wji.Unlessstatedotherwise,wewillassumethat∀
i,j∈Vwi j≥
0.Wealsoassumethatifi j∈
/
E thenwi j
=
0.Wedenoteby wi thesumofinfluenceonnodei,i.e., wi=
j∈N(i)wji.Wewilltypicallyassumethat
∀
i∈Vwi>
0.Let
(
V)
denotethesetofallorderedsequences ofelementsfromV without repetitions.Letγ
idenotethei-thelement(node)ofsequence
γ
∈ (
V)
,and|
γ
|
denotethenumberofelementsinγ
∈ (
V)
.Finally,wecallγ
∈ (
V)
afull sequenceif
|
γ
|
= |
V|
andwedenotethesetoffullsequencesby∗
(
V)
.Tomakethenotation morereadable,wewilloftenomitthenetworkitselffromthenotation whenitisclearfromthe context,e.g.,bywritingN
(
i)
insteadofNG(
i)
.Wesometimestreatsequencesassets,whentheorderisnotimportant.Weuse
⊕
todenotetheconcatenationoperationoversequences.2.2. Computationalproblems
InourreductionswewillusetwostandardversionsoftheSetCoverproblem,decisionandcombinatorialoptimization.
Definition1(SetCover [28]).AninstanceoftheSetCoverproblemisdefinedbyauniverseU
= {
u1,
. . . ,
u|U|}
,acollectionofsets
S = {
S1,
. . . ,
S|S|}
suchthat∀
jSj⊂
U ,andanintegerk≤ |
S|
.Thegoalistodeterminewhetherthereexistk elementsof
S
theunionofwhichequalsU .SetCoverisoneoftheclassic21 Karp’sNP-completeproblems.
Theorem1([28]).SetCoverproblemisNP-complete.
Fortheproofoftheapproximationhardnesswewillusetheminimizationversionoftheproblem.
Definition2(MinimumSetCover).AninstanceoftheMinimumSetCoverproblemisdefinedbyauniverseU
= {
u1,
. . . ,
u|U|}
andacollectionofsets S= {
S1,
. . . ,
S|S|}
suchthat∀
jSj⊂
U .Thegoalistofindsubset S∗⊆
S suchthatthe unionof S∗equalsU andthesizeofS∗ isminimal.
For
α
≥
1,anα
-approximationforagiveninstanceofaminimization problemisafeasiblesolutionwhoseobjectiveis withinafactorofα
ofanyoptimalsolution.WewillusethefactthatMinimumSetCoverproblemishardtoapproximate.Theorem2([27]).Foranyfixed
>
0 MinimumSetCovercannotbeapproximatedtowithin(
1−
)
ln n,unlessP=NP.Wenowmovetodefiningthemodelofstrategicdiffusionandthemainoptimizationproblemofourstudy.
3. Problemdefinition
Inthissectionwedescribe theprocessofstrategicnetworkdiffusion,whichisthemainfocusofourstudy,aswellas thecomputationalproblemconcerningit.
Mostdiffusionmodelsare purelystochastic [10,9].In thiswork however,wefocus onthestrategicmodelofdiffusion toaccountforcasesinwhichtheinterconnectionsbetweentargetsaffecttheiractivationtime andthereforechoosingthe orderinwhichnodeswillbetargetedforactivationisstrategicallyplanned.
Inthismodel,theprocess ofdiffusionisdriven bya strategicagent. Atthebeginning oftheprocess onlyone chosen node ofthenetwork,the seednode iS,isactive. Then,theagent chooses asequence
γ
∈ (
V)
that providestheorder inwhichthenodeswillbeactivated.Theprobabilityofsuccessfulactivationofanodei inoneattemptisgivenby:
p
(
i)
= β
j∈N(i)∩Awji wi
αwhere A isthesetofcurrentlyactivenodes(atthebeginningoftheprocessitconsistsonlyoftheseed),and
α
,
β
∈ [
0,
1]
areconstants.Unlessstatedotherwise,wewillassumethatα
= β =
1.Noticethattheexpectedtimeofactivationofnodei withnon-zeroactivationprobabilityis
τ
(
i)
=
p1(i).Byτ
(
γ
)
wewilldenotetheexpectedtimeofactivationofallnodesin thesequenceγ
.This model was proposed by Alshamsi et al. [1] for undirected networkswith unweighted edges. If we assume that
wi j
=
1⇐⇒
i j∈
E thenourmodelisexactlyequivalenttothemodelproposedbyAlshamsi et al. [1].Wenowdefinethemaincomputationalproblemofourstudy.
Definition3(OptimalPartialDiffusionSequence).Thisproblemisdefinedbyatuple
(
G,
iS,
z)
,whereG= (
V,
E,
W)
isagivennetworkwithweightededges,iS
∈
V istheseednodeandz≤
n inthenumberofnodestoactivate.Thegoalistoidentifyγ
∗∈ (
V)
suchthatγ
1∗=
iS,|
γ
∗|
=
z andτ
(
γ
∗)
isminimal.In other words, we intend to find the fastest way to activate z nodes in the network. When z
= |
V|
in the above definition,theproblemissimplycalledOptimal(Full)DiffusionSequence.Remark1.Inthecaseofintegerweightsofpolynomiallength,foranyinstanceoftheOptimalDiffusionSequenceproblem thereexistsapolynomiallyequivalentinstancewithbinaryweights.
Proof. Let
(
G,
iS,
|
V|)
be agiveninstanceoftheOptimalDiffusion Sequence.Inordertoconstructanequivalent instancewith 0
/
1-weights we replaceevery edge i j with a set of wi j parallel 2-paths{
i,
ki,j,1,
j,
. . . ,
i,
ki,j,wi j,
j}
,and we setweightsofthenewedgesto wiki, j,r
=
wki, j,rj=
1, wki, j,ri=
wjki, j,r=
0,forr=
1,
. . . ,
wi j.Given a feasible solution
γ
to the original instance, we obtain a feasible solutionγ
to the constructed instanceby activating all nodes ki,j,r immediatelyafter activatingnode i in the original sequence (notice that the expected time ofactivationofeverysuchnodeki,j,ris1).Thus,ifthevalueofthesolution
γ
ist,thenthevalueoftheγ
ist+
i j∈E wi j+
wji .Givenafeasiblesolution
γ
totheconstructedinstanceofthebinaryversionoftheproblem,we canobtainasolution withalesserorequalvaluebyactivatingallnodeski,j,rinoneblockimmediatelyafteractivatingnodei (again,theexpectedtimeofactivationofnodeki,j,risalways1,whileitcontributestothetimeofactivationofnode j).Forsuchsequence,we
canobtaincorrespondingsolution
γ
totheoriginalinstanceoftheproblembyremovingfromitallnodeski,j,r.Ifthevalueofthesolution
γ
ist,thenthevalueoftheγ
ist−
i j∈Ewi j+
wji.
NotethatthisreductiondoesnotworkingeneralfortheOptimalPartialDiffusionSequenceifz
<
|
V|
.2
4. Computinganoptimalsolution
WenowdescribeourresultsonfindinganefficientwaytocomputeanexactsolutiontotheOptimalDiffusionSequence problem.
4.1. Hardnessoffindinganoptimalsolution
First,weshowthatthedecisionversionofOptimalPartialDiffusionSequenceproblemisNP-completeinageneralcase.
Theorem3.OptimalPartialDiffusionSequenceproblemisNP-complete,evenifallweightsarein
{
0,
1}
.Proof. The decision version of the Optimal Partial Diffusion Sequence problem is the following: given a network G
=
(
V,
E,
W)
, the seed node iS, the number of nodes to activate z, and a value t∗∈ R
+, does there exist a sequence ofnodes
γ
∗∈ (
V)
startingwithiS suchthat|
γ
∗|
=
z andτ
(
γ
∗)
≤
t∗?ThisproblemclearlyisinNP,asgivenasolution,i.e.,asequenceofnodes
γ
∗∈ (
V)
,wecancomputeitsexpectedtime ofactivationinpolynomialtime.ToprovetheNP-hardnessoftheproblemwewillshowareductionfromtheNP-completeSetCoverproblem.
Let
(
U,
S,
k)
beaninstanceoftheSetCover problem.Wedefinenetwork G asfollows(anexampleofsuchnetworkis presentedinFig.1):•
Thesetofnodes: Forevery Si∈
S
,wecreatethreenodes,denotedby Si,qiandqi.Foreveryui∈
U ,wecreateasinglenodeui.Additionally,wecreateasinglenodeiS.
•
Thesetofedges: Foreverynode Si,wecreateanedge SiiS.Forevery nodeqi,we createedges qiSi andqiqi.Finally,foreverynodeuiandevery Sjsuchthatui
∈
Sj wecreateanedgeuiSj.•
Theweightmatrix: ForeveryedgeuiSj wesetweightstowuiSj=
0 andwSjui=
1.ForeveryedgeSiqi wesetweightsto wSiqi
=
0 and wqiSi= |
U||
S|
.Forallotherpairsofnodesconnectedwithanedgewesettheweightsto1.Let z
=
k+ |
U|
+
1 (weintendthat thesolutionsequencewillactivatenodeiS,k nodesinS
andallnodesinU ),andFig. 1. NetworkG constructed
for the proof of Theorem
3. Blue numbers next to edges express their weights. If there are no arrows next to edge i j then wi j=wji. Otherwise weight wi jis denoted next to the arrow pointing towards node j,and weight
wjiis denoted next to the arrow pointing towardsnode i.
(For interpretation of the colors in the figures, the reader is referred to the web version of this article.)
Wewillnowshowthatasolutiontothisinstanceofvalueatmostt∗ correspondstoasolutiontothegiveninstanceofthe SetCoverproblem.
Noticethattheexpectedtimeofactivationofeverynode Siisalways
|
U||
S|
+
1,whilefortheexpectedtimeofactivationofanodeuiwehave
τ
(
ui)
≤ |{
Sj:
ui∈
Sj}|
≤ |
S|
.Noticealsothatneitheranyofthenodesqi noranyofthenodesqicanbeactivatedwheniS istheseednode,astheinfluenceof Si onqi iszero.
First, wewill show that ifthere exists asolution
S
∗ to the giveninstance ofthe SetCover problem, then therealso existsasolutiontotheconstructedinstanceoftheOptimalPartialDiffusionSequenceproblemofvalueatmostt∗.Wecan constructsuch asolutionby firstactivatingevery node Si∈
S
∗ (k nodesactivatedinexpectedtime|
U||
S|
+
1 each)andthenactivatingallnodesui(
|
U|
nodesactivatedinexpectedtimenotexceeding|
S|
each).Suchγ
∗ activatesk+ |
U|
nodesinexpectedtimenotexceedingk
(
|
U||
S|
+
1)
+ |
U||
S|
,thereforeitisasolutiontotheconstructedinstanceoftheOptimal PartialDiffusionSequenceproblemofvalueatmostt∗.Second,tocompletetheproofoftheNP-hardness,wehavetoshowthatifthereexistsasolution
γ
∗ totheconstructed instance ofthe Optimal Partial Diffusion Sequence problemofvalue atmostt∗,then there also exists a solutionto the giveninstanceofthe SetCover problem.Such asolution isS
∗=
γ
∗∩
S
, i.e.,choosing sets Si corresponding to nodes Sioccurringinsequence
γ
∗.Noticethat therecannotbemorethank suchnodes,asactivatingk+
1 nodes Si hasexpectedtime
(
k+
1)(
|
U||
S|
+
1)
>
t∗. Sincethisisthecase, inorderto activatek+ |
U|
nodesother than iS,sequenceγ
∗ hastoactivateallnodesinU .However,toactivateanodeui,wefirsthavetoactivateatleastoneofitsneighbors,i.e., node Sj
such that ui
∈
Sj.Therefore, forevery node ui there mustexist atleastone node Sj∈
γ
∗∩
S such that ui∈
Sj.Hence,S∗
=
γ
∗∩
S
isavalidsolutiontothegiveninstanceoftheSetCoverproblem.Finally,wecanusetheconstructioninRemark1toreplaceeveryedgeqiSibyasetofparallelpathswithedgeweightsin
{
0,
1}
.Notethatsuchareplacementdoesnotchangethevalueoftheobjectiveasthenodesqi,andhencetheintermediatenodesaddedontheparallelpaths,areneveractivated.Thisconcludestheproof.
2
Therefore,thereexistsnopolynomialalgorithmfindingtheoptimalwaytoactivateagivennumberofnodesinthe pro-cessofstrategicdiffusion,unlessP=NP.However,wenowproposeanexponentialalgorithmbasedondynamicprogramming technique.
4.2. Dynamicprogrammingalgorithm
We present an algorithm for computing an optimal solution to the problem using the dynamic programming tech-nique [29]. The mainidea behind dynamic programming isbreaking the mainproblem intoa numberof smaller,easier tosolvesub-problemsinarecursive manner.However,unlike instandard recursionwhereoftenthesamecomputationis repeatedmultipletimes,indynamicprogrammingthesolutiontoeach sub-problemiscomputedonlyonceandstoredin memoryforfutureuse.Overtheyearsthedynamicprogrammingtechniquesweredevelopedinvariousdirections [30–33], howeverouralgorithmisbasedontheoriginalversionofthetechnique.
Noticethatinoursettingtheactivationprobabilityofanygivennodedependsonthesetofactivenodesinthenetwork, howeveritdoesnotdepend ontheorderinwhichthesenodeswere activated.Hence,thesolutionforeach setofactive nodeshavetobecomputedonlyonce.Moreover,solvingthesub-problemoffindinganoptimalwayofactivatingk nodesin thenetworkallowstoefficientlyfindthesolutionfork
+
1 nodes,asthetotalexpectedtimeofactivationcanbeexpressed in a recursive manneras a sumof time ofactivation of the initialk nodes andthe time ofactivation of the last node. The idea of usingsolutions for smaller sub-problemsto solve a larger problem isthe core concept behind the dynamic programming,hencewefinditasuitableoptimizationmethodforsolvingtheOptimalPartialDiffusionSequenceproblem.Thealgorithmisbasedonthefollowingrecurrencerelation,where
τ
∗(
C)
istheminimalexpectedtimeofactivationof asetofnodesC :Algorithm 1: Dynamicprogrammingalgorithmforstrategicdiffusion.
Input: A
weighted network
(V,E,W), a seed node iS∈V ,and the number of nodes to activate
z. Output: Sequenceof activation of
z nodesstarting with
iSwith minimal expected time of activation.1 for C⊆V do 2 τ∗[C]← ∞ 3 τ∗[{iS}]←0 4 γ∗[{iS}]← iS 5 for k=1,. . . ,z−1 do 6 for C⊂V: (|C|=k)∧ (τ∗[C]<∞)do 7 for i∈V: (i∈/C)∧ (N(i)∩C= ∅)do 8 τ← wi j∈N(i)∩Cwji 9 ifτ∗[C]+ τ<τ∗[C∪ {i}]then 10 τ∗[C∪ {i}]←τ∗[C]+ τ 11 γ∗[C∪ {i}]←γ∗[C]⊕ i
12 returnγ∗[arg minC⊆V:|C|=zτ∗[C]]
Fig. 2. Example
of using dynamic programming. Each large orange state represents a possible state of activation of the network, with black nodes
repre-senting active nodes, and white nodes reprerepre-senting inactive nodes. The value of τ∗ indicates the minimal expected time required to reach this state of
activation. Value on each arrow represents the cost of activation of a single node, required to move between states. Red arrows represent optimal sequence of activation, which corresponds to the least expensive path from the initial state to the state where entire network is activated.
τ
∗(
C)
=
⎧
⎪
⎪
⎨
⎪
⎪
⎩
0 if C= {
iS}
min i∈Cτ
∗(
C\ {
i}) +
wi j∈N(i)∩Cwji if|
C| >
1∞
otherwisewhereweassumethatsummationoveremptysetresultsinzeroanddivisionbyzeroresultsin
∞
.Outofsetsofsizeone,onlythesetconsistingoftheseednodehasfinitetimeofactivation.Forsetsthatarelargerthan onewedividetheproblemintocomputingtheoptimaltimeofactivationofallnodesbutthelastone,andthencomputing the time of activation of thelast node. Since inthe strategic diffusionprocess the nodes are activated sequentially,one of the nodes i fromthe set C hasto be activated last, andwe are able to compute its time of activation based onthe informationofwhichothernodesofthenetworkarealreadyactive.Noticethatattemptingtoactivateanodewithoutany activeneighbors(orwithsumofinfluencesfromtheseneighborsequaltozero)willresultinthevalueoftheformulaequal to
∞
.Pseudocode of the dynamic programming algorithm is presented as Algorithm 1, while Fig. 2 presents the intuition behindthealgorithmbyshowinghowitworksonaspecificgraphstructure.
Inentry
τ
∗[
C]
wecomputetheminimalexpectedtimenecessarytoactivateallnodesinsetC ,whileinentryγ
∗[
C]
we keepasequenceofactivationallowingustoachievethisoptimalexpectedtime.Inthek-thexecutionoftheloopinline 5 we computevaluesofentriesinτ
∗ andγ
∗ forsets C suchthat|
C|
=
k+
1.Wedoitby iterating(in loopinline 6)over all setsofnodesofsizek thatcanbeactivatedwhenwestarttheprocess fromtheseednode iS andtheniterate(inloopinline 7) overallpossible nodesthat canbe targeted,i.e., nodeswithnon-zeroprobabilityofactivation,whenthe setof active nodesisC .Inlines 8-11weupdatethebestwayofactivatingnodesinsetC
∪ {
i}
whenactivatingnodesinC firstandactivatingnodei afterwardshaslowerexpectedtimethancurrentlyknownfastestwaytoactivatenodesinC
∪ {
i}
. Asfortheimplementationdetails,τ
∗andγ
∗canbeimplementedashashtables.Inthiscasecheckingwhetherτ
∗[
C]
<
∞
isequivalenttocheckingwhetherτ
∗ containstheentryforC andthereforelines 1-2canbeomitted.Noticealsothatduringthek-thexecutionoftheloopinline 5weonlyneedentriesintables
τ
∗ andγ
∗ forsetsC suchthat
|
C|
=
k.Hence,toreduce memoryrequirements,wecanonlykeepinmemoryentriesfromtablesτ
∗ andγ
∗ fortwo sizes ofsets:theonesforsizek,computedinthepreviousexecutionoftheloop(orinitialized inlines 3-4incaseofthe firstexecution)andtheonesforsizek+
1,beingcomputedinthecurrentexecutionoftheloop.Fig. 3. Decomposition
of graph into biconnected components. Each biconnected component is marked with different color, with cut nodes marked white
and seed nodes marked black. Gray nodes mark stubs added to provide proper influence sum for original cut nodes.
Despite theseoptimization possibilities, the dynamic programmingalgorithm remains exponential, as it considers all subsetsofnodespossibletobeactivated.Ifthenumberofnodestobeactivatedisconstantthenthealgorithmis polyno-mial.
Theorem4.Algorithm1solvestheOptimalPartialDiffusionSequenceproblemintime
O(|
V|
z+1|
E|)
.InSection4.4,wegiveadynamicprogramthatworksmoreefficientlywhenthenetworkhasbothboundeddegreeand boundedtreewidth.Notethatitiscommontousedynamicprogrammingforsolvingoptimizationproblemsongraphswith boundedtreewidth,see,e.g.,[34].
Analternativeapproachtodevelopinganalgorithmoffindinganeffectivewayofactivatingthenetworkwouldbetouse thereinforcementlearningtechniques [35].Reinforcementlearningisbasedontheideaofmakingasequenceofdecisions by finding abalance betweenexploitation (performing thecurrently bestaction toobtain short-term profits) and explo-ration (investigatingother actions in thehope ofa long-termgain). Incaseofstrategic diffusion,the exploitation would be activatinganodewiththehighestactivationprobability,whiletheexploration wouldbe activatinganode withlower activationprobability inordertomakeits neighborseasiertoactivate.Nevertheless,an algorithmbasedonreinforcement learningwouldnotguaranteethatidentifiedactivationsequenceistheoptimalsolution,unlikethepresenteddynamic pro-grammingalgorithm,whichofferssuchcertainty.Hence,weleavethedevelopmentofanalgorithmbasedonreinforcement learningasapotentialfuturework.
Wenowdiscussotherwaysofoptimizationformorerestrictednetworkstructures.
4.3. Decompositionintobiconnectedcomponents
Wewillnowshowthatlookingfortheoptimalwayofactivatingallnodesofthenetworkcanbemadesimplerbyusing decompositionintobiconnectedcomponents.
Cut node ina network (also calledan articulation point)is a node the removalofwhich causes network tofall into twoormoreconnectedcomponents.Byseparatingthenetworkincutnodesweobtainadecompositionintobiconnected components.Biconnectedcomponentisa maximalsubnetwork(intermsofinclusion)suchthat removinganynode from itwillnotdisconnectit.AnexampleofadecompositionintobiconnectedcomponentsisshowninFig.3.Noticethatacopy ofacutnodeappearsineverybiconnectedcomponentitbelongsto.
The strategic diffusion process in every biconnected component can be considered separately. This is because every biconnected components has only one possiblestarting point of the diffusion (being it either the seed node in caseof biconnected componentcontaining itor thecut node closest to theseed node in caseofall other components). Thisis indicatedbyblackcolorofanodeinFig.3.OncethisstartingpointofagivencomponentC isactivated,theactivationstate inotherbiconnectedcomponentsofthenetworkdoesnotaffectactivationisC .
This is because activation probability of a node is only affected by the state of activation of its neighbors and the value of wi (toremind the reader, wi
=
j∈N(i)wji). Only cut nodes haveneighbors inother biconnected components,
henceforallnon-cutnodesthisobservationistrivial.Asforanycutnodei,noticethatitsneighborsinother biconnected componentscanonly beactivatedafteri is activated(asall thepathsbetweenthemandtheseed noderun throughthe cutnode).Thereforetheonlywayinwhichneighborsofacutnodei inotherbiconnectedcomponentsaffecttheactivation probability ofi is addingtothevalue of wi.Toaccountforthisfact,we createadditionalstubsconnectedto copiesofi
Toremindthereader,inthisworkweconsiderundirectednetworks(noticethatthedefinitionofbiconnected compo-nents isdifferent fordirected networks). However, we do not assume that the relation ofinfluence is symmetric,i.e., it is possiblethat fori j
∈
E we have wi j=
wji.Sincewe considerthe decompositionintodisconnectedcomponentsin thecontext ofagivenseednode,thereisnevera disambiguationintermsofwhichedge weightshavetobeused incaseof the cut nodes.To compute the activation probability ofa cut node i, foreach neighbor j onlyweight wji is takeninto
consideration. Whatismore,onlyneighborsof i thatareclosertothe seednode thani canhaveapositive contribution intoitsactivationprobability,astheneighborsofi thatarefurtherawayfromtheseednodecanonlybeactivatedafteri is
activated.Atthesametime,i affectstheactivationtimeofitsneighbor j throughtheweight wi j,anditcanhaveapositive
contributionbothwhen j iscloserandwhenitsfurtherawaytothesourcenodethani.
4.4. Optimalsolutionfornetworkswithboundedtreewidth
We will nowshow a polynomial algorithm that findsan optimal waytoactivate anetwork withboth treewidth and degreeboundedbyconstants.
LetG
= (
V,
E,
W)
beanarbitrarynetwork.Let(
T,
F)
beanetworkwhereeverynodecontains asubsetofnodesfromV ,i.e.,
∀
t∈Tt⊆
V (wewillcalleachsuchsubsetabag).Such(
T,
F)
isatreedecompositionofG (see,e.g.,[36])ifandonlyifthefollowingconditionsaremet:
•
Every nodefromV iscontainedinatleastonebag,i.e.,t∈Tt=
V .•
For everyedgee∈
E thereexistsabagthatcontainsbothendsofe,i.e.,∀
i j∈E∃
t∈T{
i,
j}
⊆
T .•
For everynode v∈
V subnetworkof(
T,
F)
inducedbybagscontaining v isconnected.The treewidth
θ
ofadecompositionisthesize ofits largestbagminus one,i.e.,θ
=
maxt∈T|
t|
−
1.The treewidthofanetwork is theminimumover treewidthsofall its treedecompositions. Aproblemis said tobe fixed-parametertractable
[37],withrespect to parameterk, ifanyinstance ofthe problemofsize N can be solved intime f
(
k)
·
NO(1),forsomecomputable function f
(
·)
.WeshowthattheOptimal StrategicDiffusionproblemisfixed-parametertractablewithrespect tothesumoftreewidthandmaximumdegree.4.4.1. Fulldiffusion
Tobetterexplaintheidea,wefirstgiveanalgorithmforthefulldiffusioncase.
Theorem5.LetG
= (
V,
E,
W)
beanetworkwithmaximaldegreeδ
.Thereexistsanalgorithmthat,givenatreedecompositionof treewidthθ
,findsanoptimalwayofactivatingnodesinV intimeO(θδ(θδ)!
2n)
.In whatfollows,weassume that thetreedecompositionisrootedinsome nodetR,thesameforbottom-up and
top-downorder.Weusethefollowingnotation:
•
c(
t)
denotesthesequenceofchildrenoft;•
γ
|
X denotesthesubsequenceofγ
consistingonlyofthenodesinset X ;•
γ
x denotesthesubsequenceofγ
consistingofallelementsprecedingx;•
γ
x denotesthesubsequenceofγ
consistingofx aswellasallelementsfollowingx;•
TtopdowndenotessequenceofnodesinT inatop-downorder.Inwhatfollowswewillcalltwopermutations
γ
andγ
ofnodesfromG compatible,denotedγ
∼
γ
,ifandonlyifthey activatethenodesthattheyhaveincommoninthesameorder,i.e.,γ
|
γ=
γ
|
γ .Foreachnodeoft thetreedecompositionT wecomputearecordconsistingofthefollowingfields:
• ϒ[
t]
storesallpermutationsofnodesinbagt andtheirneighborsthatcanbeapartofavalidsolutiontotheproblem;•
τ
[
t,
γ
,
i]
storestheexpectedtimeofactivationofeverynodeini∈
t whenactivatedintheordergivenbyγ
;•
τ
∗[
t,
γ
]
storesthesmallestexpectedtime to activateallnodes inthesubnetwork of G inducedby thenodes inthe subtreeofT rootedatt,whenthenodesint andtheirneighborsareactivatedintheordergivenbyγ
.Tocomputetherecordswetraversethetreedecompositioninabottom-uporder(giventheroottR)andforeverynode
t weperformthefollowingsteps:
1. Compute:
ϒ
[
t] =
γ
∈
∗(
t∪
i∈t N
(
i))
: (
iS∈
/
t∨
γ
1=
iS)
∧
∀
γi∈t: γi=iS∃
j<iγ
j∈
N(
γ
i)
∧
∀
t∈c(t)∃
γ∈ϒ[t]γ
∼
γ
;
Algorithm 2: Algorithm reconstructingtheoptimalwayofactivatingall nodesofanetworkwithboundedtreewidth anddegree.
Input: Tree
decomposition
(T,F)of a weighted network (V,E,W), with computed values of τ∗.Output: Sequence
of activation of nodes in
V startingwith
iSwith minimal expected time of activation.1 γ∗← 2 for t∈Ttopdowndo 3 γ←arg minγ∈ϒ[t]:γ∼γ∗τ∗[t,γ] 4 γ← 5 forγi∈γ do 6 ifγi∈/γ∗then 7 γ←γ⊕ γi 8 else 9 γ∗←γγ∗∗ i ⊕γ ⊕γ∗ γ∗ i 10 γ← 11 γ∗←γ∗⊕γ 12 returnγ∗
2. Forevery
γ
∈ ϒ[
t]
andeveryγ
i∈
γ
|
t compute:τ
[
t,
γ
,
γ
i] =
wγi
j<iwγjγi
;
3. Foreveryγ
∈ ϒ[
t]
compute:τ
∗[
t,
γ
] =
γi∈γ|tτ
[
t,
γ
,
γ
i] +
t∈c(t) min γ∈ϒ[t]:γ∼γ⎛
⎝
τ
∗[
t,
γ
] −
γi∈γ|tτ
[
t,
γ
,
γ
i]
⎞
⎠ .
Aftertraversingthetreeinthisfashion,theminimaltimerequiredtoactivatedentirenetworkG startingwithiS canbe
identifiedasminγ∈ϒ[tR]
τ
∗[
tR,
γ
]
.Inorderto reconstructthesequenceallowing toactivateentirenetworkinthe optimaltime,wecanrunAlgorithm2.
Letusnowcommentontheprocedureoffillingtherecords.
In step 1we gather all permutationsof nodesin bagt and their neighbors that can be apart ofa valid solutionto theproblem,accordingtothreeconditions.Thefirstcondition assertsthatifthepermutation containstheseednode,itis activatedastheveryfirstnode.Thesecond conditionassures thatevery nodeinbagt otherthanthesourcenode hasat leastoneactiveneighboratthemomentofactivation(noticethatforeverynodeint allofitsneighborsarepresentinthe permutation).Thethirdconditionprovidesthatthepermutation
γ
allowstoactivateallnodesinthesubtreeoft,i.e.,that foreverychildoft inthetreedecompositionthereexistsatleastonevalidpermutationγ
thatactivatesnodesfromγ
in thesameorderasγ
.Instep 2wesimplycomputethetimeofactivationofeverynodeinbagt,accordingtothedefinitiongiveninSection3, andstoreitintable
τ
.Again,noticethatforevery nodeint allofitsneighborsarepresentinthepermutation,hencewe haveenoughinformationtocomputeitsexpectedtimeofactivation.Finally,instep 3wecompute theoptimaltimeofactivationofallnodesinthesubtreeoft whileusingpermutation
γ
andstore itintableτ
∗.The expressionconsistsofa sumoftime ofactivationofthe nodesinbagt anda sumoverall children oft,whereforeachofthemwe compute theminimumoverall compatiblepermutationsandsubtractthe time ofactivationofnodesint.Notice thatsince(
T,
F)
isthetreedecomposition,theonlypossibleoverlapbetweennodesin thesubtreesofdifferentchildrenoft arenodesinbagt.Otherwisethegraphinducedbybagscontainingsuchoverlapping nodeswould not be connected,andhence one ofthe conditionsof beingthe treedecomposition wouldnot be met for(
T,
F)
.Afterhavingfilledtables
ϒ
,τ
∗ andτ
,wetraverse thetreedecompositionagain, thistime ina top-downorderusing Algorithm2. Weconstructan optimalsolution tothe problemonvariableγ
∗.In line3we selectasγ
thepermutation withminimaltimenecessaryto activateall nodesinthesubtree oft,that iscompatiblewiththesolutionconstructedso far.Inlines4-11wemergeγ
intosequenceγ
∗ usingauxiliaryvariableγ
.As for the time complexity of the algorithm, the most costly operations (both of which are equally expensive) are validatingexistence ofacompatiblesequenceforeverychildof t in thethirdconditioninstep 1andcomputingthetime necessary to activatethe nodesin the subtreesof all children of t inthe second sum instep 3. Forassessing the time complexityofthealgorithmwewillfocusonthecostinstep 3.Sinceeverynodetisachildofatmostoneothernodein thetreedecomposition,foreveryt
∈
T thecomputationofminimumisexecutedO((θδ)!)
times(the numberofdifferent permutationsγ
∈ ϒ[
t]
). Sincethere areO(
n)
nodesinthetreedecomposition,the computationofminimum isexecutedO((θδ)!
n)
times.ThecostofexecutingitonceisO(θδ(θδ)!)
,sincewe havetocheckO((θδ)!)
manypermutationsinϒ
[
t]
and for each of them check whether
γ
|
γ=
γ
|
γ in timeO(θδ)
. Hence, the total time complexity of the algorithm isAlgorithm 3: Algorithmcomputingthevalueof
τ
∗[
t,
γ
,
k]
.Input: Tree
decomposition
(T,F)of a weighted network (V,E,W), with values of τ∗computed so far, node t withsequence of children
c(t)= t1,. . . ,t|c(t)|.Output: The
value of
τ∗[t,γ,k].1 for ti∈ t1,. . . ,t|c(t)|do 2 for m=0,. . . ,k−γ|tdo 3 if i=1 then 4 τ[t,1,γ,m]← min γ∈ϒ[t1]: γ∼γ τ∗[t1,γ,m+ |γ|t|] −γi∈γ|tτ[t1,γ ,γ i] 5 else 6 τ[t,i,γ,m]← min m∈{0,...,m} τ[t,i−1,γ,m]+ min γ∈ϒ[ti]: γ∼γ τ∗[ti,γ,m−m+ |γ|t|] −γi∈γ|tτ[ti,γ,γi] 7 returnγi∈γ |tτ[t,γ,γi]+τ t,|c(t)|,γ,k−γ|t 4.4.2. Partialdiffusion
Wenextpresentanalgorithmforpartialdiffusioninnetworkswithboundedtreewidthandboundeddegree.
Theorem6.LetG
= (
V,
E,
W)
beanetworkwithmaximaldegreeδ
.Thereexistsanalgorithmthat,givenatreedecompositionof treewidthθ
,findsanoptimalwayofactivatingz nodesinV intimeO(θδ(θδ)!
2z2n)
.We useessentially thesame notation asinprevious section.However, in whatfollowswe will calltwo permutations
γ
∈ (
X)
andγ
∈ (
X)
of nodes fromG compatible, denotedγ
∼
γ
,if andonly if they activate thenodes that they haveincommoninthesameorder,i.e.,γ
|
γ=
γ
|
γ ,andagreeonthenon-activatednodes,i.e.,γ
|
X\γ=
γ
|
X\γ=
(thisdifferenceistheresultofconsideringalsopermutationsthatarenotfull,i.e.,permutations
γ
∈ (
X)
suchthatγ
<
|
X|
). Therecordthatwecomputeforeachnodeoft thetreedecompositionT consistingnowofthefollowingfields:• ϒ[
t]
storesallpermutationsofnodesinbagt andtheirneighborsthatcanbeapartofavalidsolutiontotheproblem;•
τ
[
t,
γ
,
i]
storestheexpectedtimeofactivationofeverynodeini∈
t whenactivatedintheordergivenbyγ
;•
τ
∗[
t,
γ
,
k]
denotesthesmallestexpectedtimetoactivatek nodesinthesubnetworkofG inducedbythenodesinthe subtreeofT rootedatt,whenthenodesint andtheirneighborsareactivatedintheordergivenbyγ
.Hence, the only difference in comparison to the full diffusion version is that table
τ
∗ is additionally indexed with the numberofnodestoactivate.Tocomputetherecordswetraversethetreedecompositioninabottom-uporder(giventheroottR)andforeverynode
t weperformthefollowingsteps:
1. Compute:
ϒ
[
t] =
γ
∈ (
t∪
i∈t N
(
i))
: (
iS∈
/
t∨
γ
1=
iS)
∧
∀
γi∈t: γi=iS∃
j<iγ
j∈
N(
γ
i)
∧
∀
t∈c(t)∃
γ∈ϒ[t]γ
∼
γ
;
2. Foreveryγ
∈ ϒ[
t]
andeveryγ
i∈
γ
|
t compute:τ
[
t,
γ
,
γ
i] =
wγij<iwγjγi
;
3. Forevery
γ
∈ ϒ[
t]
andfork= |
γ
|
t|,
. . . ,
z computethevalueofτ
∗[
t,
γ
,
k]
usingAlgorithm3.Aftertraversing the treeinthisfashion, theminimal time requiredto activatez nodes innetwork G starting withiS
canbeidentifiedasminγ∈ϒ[tR]
τ
∗[
tR,
γ
,
z]
.Inordertoreconstructthesequenceallowingtoactivatez nodesintheoptimaltime,wecanrunAlgorithm4.
Letusnowcommentontheprocedureoffillingtherecords.
Steps 1and 2 are almostidentical tothose ofthe algorithmforfull diffusion,withnotable difference beingthat this timeweconsidersequencesthatarenotnecessarilyfull.
Instep 3wecallAlgorithm3tocomputethevalueof
τ
∗[
t,
γ
,
k]
,i.e.,theoptimaltimeofactivatingk nodesinthesubtree oft whileusingpermutationγ
.Tothisend,wevisitallthechildrenoft,intheordert1,
. . . ,
t|c(t)|.Form∈ {
0,
. . . ,
k− |
γ
|
t|}
we compute
τ
[
t,
i,
γ
,
m]
,theminimumexpectedtimerequiredtoactivatem nodesotherthantheonesactivatedint,inthe subnetwork inducedbyunionofthesubtreesrootedatt1,
. . . ,
ti.Wedothisusingthevaluesofτ
computedforchildrenoft precedingti,i.e., fort1
,
. . . ,
ti−1,aswell asthe valuesofτ
∗ computedforti.Intheselection processm denotestheAlgorithm 4: ReconstructPartial
(
t,
γ
∗,
k)
:algorithmreconstructingtheoptimalwayofactivatingk nodesinanetwork withboundedtreewidthanddegree.Input: Tree
decomposition
(T,F)of a weighted network (V,E,W), with computed values of τ∗and τ, a bag t,a sequence
γ∗∈ (V)and aninteger k≤n.
Output: Sequence
of activation of
z nodesin
V startingwith
iSwith minimal expected time of activation.1 if k>0 then 2 γ←arg minγ∈ϒ[t]:γ∼γ∗τ∗[t,γ,k] 3 γ← 4 forγi∈γ do 5 ifγi∈/γ∗then 6 γ←γ⊕ γi 7 else 8 γ∗←γγ∗∗ i ⊕γ ⊕γ∗ γ∗ i 9 γ← 10 γ∗←γ∗⊕γ 11 k←k− |γ|t| 12 for ti∈ t|c(t)|,. . . ,t1do 13 m←arg minm∈{0,...,k} τ[t,i−1,γ,m]+minγ∈ϒ[ti]: γ∼γ τ∗[ti,γ,k−m+ |γ|t|] −γi∈γ|tτ[ti,γ ,γ i] 14 γ∗←ReconstructPartial(ti,γ∗,k−m+ |γ|t|) 15 k←m 16 returnγ∗
inthesubtreerootedinti.Thereturnedvalueoftheminimaltimeofactivationofk nodesinthesubtreeoft whileusing
permutation
γ
iscomputedasthe sumofthetime ofactivation ofnodesfromt and theoptimaltime ofactivatingthe remainingk−
γ
|
tnodesinall|
c(
k)
|
childrenoft.Thetime complexityofthealgorithmis similartothefulldiffusioncase, exceptthatwe havenow thetwoadditional loops,oneoverk instep 3andtheotheroverm inline6ofAlgorithm3,whichcontributeanadditionalfactorof
O(
z2)
to therunningtime.We are unable to compute an optimal solutionin polynomial time in generalcase. We may howeverhope to find a polynomialapproximationalgorithm.Wenowmovetotheanalysisofpossiblewaysofapproximatingtheoptimalsolution. We now move to assessing the possibility of approximating the optimal solution to the Optimal Diffusion Sequence Problem.
4.5. Lowerboundsonheuristicalgorithms
Alshamsi et al. [1] suggesttwoheuristicstrategiesforactivatinganetworkinaprocessofstrategicdiffusion:thegreedy strategy(alwaystargetnodewiththehighestprobabilityofactivation)andthemajoritystrategy(alwaystargetnodewith thehighestnumberofactiveneighbors).
Wenowshowthatneitherofthesestrategiesapproximatesanoptimalsolutiontowithinaconstantratio.
Theorem 7.There exists no constant r
>
1 such that choosing the sequence of activation using the greedy strategy is an r-approximationalgorithm.Inparticular,theapproximationratioofthegreedystrategyis(
logn)
.Proof. Wewillnowshowaseriesofnetworkswheretheapproximationratioofthegreedyalgorithmgoestoinfinity. Foragivenk
∈ N
weconstructanetwork G(
k)
,wherethesetofnodesisV= {
iS,
a1,
. . . ,
ak2,
b1,
. . . ,
bk−1}
andwhere thesetofedgesis:E
=
k2i=1
{
iSai} ∪
k2i=1 k
−1 j=1
{
aibj}.
Weassumethattheweightofeveryedgeis1.ThestructureofnetworkG
(
k)
ispresentedinFig.4.Letx bethenumberofcurrentlyactivatedai nodesandlet y bethenumberofcurrentlyactivatedbi nodes.We have
that
τ
(
ai)
=
y+k1 andwehavethatτ
(
bi)
=
k2
x.
Wewillnowanalyzetwodifferentwaysofactivatingallnodesinnetwork G
(
k)
.First,letusconsiderthegreedy algo-rithm.Lettheindicesofnodesai,aswell astheindicesofnodesbj,beordered accordingtotheorderofactivation,i.e.,nodea1isthefirstactivatednodeai,whilenodeb1 isthefirstactivatednodebi.Noticethatatleastik nodesa hastobe
activatedbeforetheactivationofnodebi.Wewillprovethisclaimbycontradiction.Assumethatbicanhave j
<
ik activeneighborsatthemomentofactivation.Takenodeaj+1 (thefirstnodea activatedafterbi).Atthemomentofactivationof
bi itsprobability ofactivation is kj2,whilethe probabilityofactivation ofaj+1 is
i
Fig. 4. Network G(k)constructed for the proofs of Theorems7and8.
andbi wasactivatedbeforeaj+1,we havetohave kj2
≥
i
k.However,thisisnottruesince j
<
ik.Wehaveacontradiction,thereforeatleastik nodesa hastobeactivatedbeforetheactivationofnodebi.
Becauseofthisatleastk nodesai hastobeactivatedwithonlyoneneighboractive,thenatleastk nodesaihastobe
activatedwithonlytwoneighborsactive,andsoon.Focusingonlyonthetimeofactivationofnodesai wehavethatthe
totalexpectedtimeofactivationforthegreedyalgorithmis:
τ
G(
G(
k))
≥
k2 i=1τ
(
ai)
≥
k j=1 kk j=
k 2H kwhere Hkisthek-thharmonicnumber.
Now, let usconsider an algorithm A, wherewe first activatek nodes ai,then all nodes bi andfinally theremaining
nodesai.Timeofactivationoftheentirenetworkisthen:
τ
A(
G(
k))
=
k2+ (
k−
1)
k+ (
k2−
k)
=
3k2−
2ksinceeachofthefirstk nodesai isactivatedinexpectedtimek (asitonlyhasoneactiveneighbor,namelyiS),eachnode
bi isactivatedinexpectedtimek (asithasdegreek2andk activeneighborsatthemomentofactivation)andeachofthe
remainingk2
−
k nodesaiisactivatedinexpectedtime1.
ConsidersequenceofnetworksG
(
k)
.Forthissequencewehavelim k→∞
τ
G(
G(
k))
τ
∗(
G(
k))
≥
klim→∞τ
G(
G(
k))
τ
A(
G(
k))
≥
lim k→∞ Hk 3= ∞.
Therefore,solutionprovidedbythegreedyalgorithmcanbearbitrarilyworsethantheoptimalsolution.
2
Wealsoshowanalogicalresultforthemajoritystrategy.Theorem 8.There exists no constant r
>
1 such that choosing the sequence of activation using the majority strategy is an r-approximationalgorithm.Inparticular,theapproximationratioofthemajoritystrategyis(
logn)
.Proof. WewillnowshowthatfortheseriesofnetworksconstructedintheproofofTheorem7theapproximationratioof themajorityalgorithmalsogoestoinfinity.
Letx be thenumberofcurrentlyactivatedai nodesandlet y bethenumberofcurrentlyactivatedbi nodes.Wehave
that
τ
(
ai)
=
y+k1 andwehavethatτ
(
bi)
=
k2
x.
LetusconsidertheexpectedactivationtimeofanentirenetworkG
(
k)
usingmajorityalgorithm.Lettheindicesofnodesai,aswell astheindices ofnodes bj, be ordered accordingto the orderof activation,i.e., node a1 isthe first activated node ai,whilenodeb1 isthefirstactivatednodebi.Noticethatnodebiatthemomentofactivationhasatmosti
+
1 activeneighbors.Wewillprovethisclaimbycontradiction.Assumethatbi canhave j
>
i+
1 activeneighborsatthemomentofactivation. Consider numberofactiveneighborsatthe momentofactivationofnode aj (the last nodea activated before
bi).Sincewe areusingthemajorityalgorithm,ithastohaveatleast j
−
1 active neighbors(otherwise nodebi,havingatthemoment j
−
1 activeneighbors,wouldhavebeenchosenbeforenodeaj).However,nodeajcanhaveatmosti<
j−
1activeneighbors,namelynodesb1
,
. . . ,
bi−1 andnodeiS (asnodebi isnotactiveyet).Wehaveacontradiction,thereforebiatthemomentofactivationhasatmosti
+
1 activeneighbors.Focusing onlyonthetimeofactivationofnodesbi wehavethat thetotalexpectedtime ofactivationforthemajority
Fig. 5. NetworkG(X,r)constructed for the proof of Theorem9. Blue numbers next to edges express their weights. If there are no arrows next to edge
i j thenwi j=wji. Otherwise weight wi j is denoted next to the arrow pointing towards node j,
and weight
wji is denoted next to the arrow pointingtowards node i.
τ
C(
G(
k))
≥
k−1 i=1τ
(
bi)
≥
k−1 i=1 k2 i+
1=
k 2(
H k−
1)
whereHkisthek-thharmonicnumber.
Let A bethe alternativealgorithmdescribed inthe proofofTheorem 7.Consider sequenceofnetworks G
(
k)
.Forthis sequencewehave lim k→∞τ
C(
G(
k))
τ
∗(
G(
k))
≥
klim→∞τ
C(
G(
k))
τ
A(
G(
k))
≥
lim k→∞ Hk 3= ∞
Therefore,solutionprovidedbythemajorityalgorithmcanbearbitrarilyworsethantheoptimalsolution.
2
4.6. Inapproximability
Finally,we show that infact the problemcannot be approximatedwithin a ratioof
(
1−
)
ln n for any>
0,unlessP
=
N P .Theorem9.TheOptimalPartialDiffusionSequenceproblemcannotbeapproximatedwithinaratioof
(
1−
)
ln n forany>
0,unlessP
=
N P .Proof. Inordertoprovethetheorem,wewillusetheresultbyDinurandSteurer [27] thattheMinimumSetCoverproblem cannotbeapproximatedwithinaratioof
(
1−
)
ln n forany>
0,unlessP=
N P .LetX
= (
U,
S)
beaninstanceoftheMinimumSetCoverproblem.Toremindthereader,U istheuniverse{
u1,
. . . ,
u|U|}
, whileS isacollection{
S1,
. . . ,
S|S|}
ofsubsetsofU .Thegoalhereistofindsubset S∗⊆
S suchthattheunionofS∗equalsU andthesizeofS∗ isminimal.
First, we will show a function f
(
X)
that based on an instance of the Minimum Set Cover problem X constructs an instanceoftheOptimalPartialDiffusionSequenceproblem.LetnetworkG
(
X)
bedefinedasfollows(anexampleofsuchnetworkispresentedinFig.5):•
Thesetofnodes: Forevery Si∈
S,wecreatethreenodes,denotedby Si,qi andqi.Foreveryui∈
U ,wecreate|
S|
+
1nodes,denotedbyui,1
,
. . . ,
ui,|S|+1.Additionally,wecreateasinglenodeiS.•
Thesetofedges: Forevery node Si,wecreateanedge SiiS.Foreverynode qi,wecreateedgesqiSi andqiqi.Finally,foreverynodeui,i andevery Sjsuchthatui
∈
Sj wecreateanedgeui,iSj.•
Theweightmatrix: Foreveryedge SiqiwesetitsweightstowSiqi=
0 andwqiSi=
α
−
1,whereα
=
z|
S|
λ+1 forsome
λ
>
0.Forevery edgeui,iSj weset itsweightsto wui,iSj=
0 and wSjui,i=
1.Forallother pairsofnodesconnectedwithanedgewesettheirweightsto1.
Tocomplete the constructedinstance ofthe Optimal PartialDiffusion Sequenceproblem, we set theseed node to be
iS andwe setthe numberof nodesto be activatedto z