1 -minimization method for link ﬂow correction

(1)

ContentslistsavailableatScienceDirect

Transportation Research Part B

journalhomepage:www.elsevier.com/locate/trb

1 -minimization method for link ﬂow correction

Penghang Yin

^a

, Zhe Sun

^b

, Wen-Long Jin

^c^,^∗

, Jack Xin

^d

aDepartment of Mathematics, University of California, Los Angeles, CA 90095, USA

bDepartment of Civil and Environmental Engineering, Institute of Transportation Studies, 4040 Anteater Instruction and Research Bldg, University of California, Irvine, CA 92697-3600, USA

cDepartment of Civil and Environmental Engineering, California Institute for Telecommunications and Information Technology, Institute of Transportation Studies, 4038 Anteater Instruction and Research Bldg, University of California, Irvine, CA 92697-3600, USA

dDepartment of Mathematics, University of California, Irvine, CA 92697, USA

a rt i c l e i n f o

Article history:

Received 26 April 2017 Revised 8 August 2017 Accepted 9 August 2017

Keywords:

Link ﬂow correction

1-minimization Flow conservation law Recoverability Exact recovery Correction bound

a b s t ra c t

Acomputationalmethod,basedon1-minimization,isproposed fortheproblemoflink flow correction, when the availabletraffic flowdata onmany linksin aroad network areinconsistentwithrespecttotheflowconservationlaw.Withoutextrainformation,the problemisgenerallyill-posedwhenalargeportionofthelinksensorsareunhealthy.Itis possible,however,tocorrectthecorruptedlinkflowsaccuratelywiththeproposedmethod underarecoverabilityconditionifthereareonlyafewbadsensorswhicharelocatedat certainlinks. We analyticallyidentifythe links thatarerobust tomiscounts and relate themto thegeometricstructureofthe trafficnetworkby introducingtherecoverability conceptandanalgorithmforcomputingit.Therecoverabilityconditionforcorruptedlinks issimplytheassociatedrecoverabilitybeinggreaterthan1.Inamorerealisticsetting,be- sidestheunhealthylinksensors,smallmeasurementnoisesmaybepresentattheother sensors.Under thesamerecoverability condition,ourmethodguaranteestogive anes- timated trafficflow fairlyclose tothe ground-truth dataand leadstoa boundfor the correctionerror.Bothsyntheticandreal-worldexamplesareprovidedtodemonstratethe effectivenessoftheproposedmethod.

1. Introduction

Link volume/ﬂowdata isan importantdata sourcein bothlong-termplanning andshort-term operationapplications.

Theexamplesincludebutarenotlimitedtosignaltiming,tollroadpricing,origin-destinationtripmatrixestimation,trans- portationplanning,traﬃcsafety(e.g.Koonceetal.,2008;Lindsey,2006;McNally,2007;LordandMannering,2010andthe referencestherein).

Theflow conservationina trafficnetworkimpliesthatthetotal in-flowequalsthetotalout-flowateachnon-centroid node.Thecentroidsarenodeswheretraffic originates/isdestined to,andnon-centroidsnodesdenotesalltheother nodes.

Practically,when lookingattrafficflow countsoverasufficientlylong timeperiod(e.g.dailycumulative flow),we expect thatthesumofcumulativelinkflowsenteringthenon-centroidnodeequalsthesumofcumulativelinkflowsleavingit.

Theflowconservationlawisanimportantproperty,whichhasbeenexploitedinmanydifferentapplications.Forexam- ple,thewidelyusedfirst-ordertrafficflowmodel,theLWRmodel(LighthillandWhitham,1955;Richards,1956),isderived

∗ Corresponding author.

E-mail addresses: yph@ucla.edu (P. Yin), zhes@uci.edu (Z. Sun), wjin@uci.edu (W.-L. Jin), jxin@math.uci.edu (J. Xin).

(2)

basedonthe conservationoftraffic.InChen etal.(2009), theauthorsmentioned thata pathflow estimator(PFE) needs reasonablyconsistent linkflows,meaningthattheflowconservationlawshouldbesatisfiedwithina certainerrorbound, toreproducefeasiblepathflowsolutions.

Inpractice,theflow conservationlawcan be violateddueto numerousflow measuringerrors; i.e., theobserved flow countsaregenerallycorruptedandcausedatainconsistencyissues.InSunetal.(2016),thenetworksensorhealthproblem (NSHP)(Sunetal.,2016)isproposed toevaluateindividualsensors’healthindicesbasedonthelevelofflowdataconsis- tency. Assumingflow countingsensors are alreadyinstalled on some ofthe linkswhereatleast onebase setexists, the NSHPtriestofindtheleastinconsistentbasesetthat“minimizesthesumofsquaresofthedifferencesbetweenobserved andcalculatedlinkflows”.Thehealthindexofaspecificsensorisevaluatedbasedonthefrequencythatitappearsinthe leastinconsistentset.

Severalstudies havelooked intothe problemof correcting inconsistent flow data according to flow conservation. To solvea similar problemintransitplanning,Kikuchi etal.(2006)studied thepassenger flow balancingproblemandpro- poseda least square correction methodto adjust the flows, sothat thecounts are conservedand closeto the observed values.vanZuylenandBranston (1982)assumedthatthe observedlink flowsfollowprobability distributions constrained by flow conservation. The study derived the formula for constrained maximum likelihood estimates of the link flows.

Kikuchietal. (2000) examined andcompared sixdifferentmethods to adjustobserved flow rateaccording toflow conservation.Allofthemethodshavethesameconstraintsbutdifferentobjectivefunctions.VanajakshiandRilett(2004)stud- iedflow inconsistencyproblembetweenneighboring upstreamanddownstream loop detectors.Anonlinear optimization problemisproposedtocorrectloopdetectordata,inthecasewhenobserveddataviolatesflowconservation.

Insummary,giventheobservedcumulativeﬂowsondifferentlinks,alloftheexistingﬂowcorrectionmethodsadopted optimizationapproachesthattrytomeetthefollowingprinciples:

• Ensurethatﬂowconservationbefollowedexactlyatallnon-centroidnodesafteradjustmentusingasetofconstraints,

• Preservethe integrityoftheobserveddataasmuchaspossibleby minimizingthedistancebetweenadjustedandob- servedﬂows.

However, all ofthe studies are limited tosimple hypothetical networksornetworks withsimple topologies. Also, no systematicstudyhasbeendoneregardingtheeffectivenessandapplicabilityofthemethods.

Inthis study,we propose a method to estimate the true linkflow from corrupted data on observedlinks aswell as unobservedlinksvia1-minimization.Similarto theexistingmethods,thelink flowcorrectionmethodisalsoformulated asanoptimizationproblemtominimizethedifferencebetweenobservedandestimatedlinkflows.Asanimprovementover theexistingmethods,thenode-basedformulationofflowconservationisintroducedtohandlegeneralroadnetworkwhere linkflowsareonlyobservedonmonitoredlinks,notonalllinksasassumedinmanyexistingstudies.Moreimportantly,we adoptthe1-minimizationmethodfromcompressedsensing(Candèsetal.,2005;2006)toanalyticallyderivethecondition forexact/stablerecovery ofthetruecumulativeflowcounts.The1normistheuniqueconvexsparsitypromotingpenalty.

Thoughit is not differentiable, various eﬃcientscalable numericalmethods exist to date forits minimization (Beckand Teboulle,2009;Boydetal.,2011;Daubechiesetal.,2004;GoldsteinandOsher,2009;YangandZhang,2011)besideslinear programming.Inaddition to1 norm,other non-convexsparsitypromoting penaltyfunctionscanalsobe considered;see Yinetal.(2015);YinandXin(2017); Louetal.(2016) andreferencestherein.Theirminimizationiscomputationallymore expensivethan1,andweshallleavesuchastudyforafuturework.

Therestofthe paperisorganizedasfollows.InSection 2,westate the linkflow correctionproblemformulation,the exactandstablerecoverytheorem,therecoverabilityconditionandtheconnectionwithcompressedsensing.InSection3, weusea toy exampletoillustrate theconditionsforexactandstablelink flowrecovery.InSection 4,we usereal-world loopdetectordataasan applicationforthismethod.Inboththetoyandrealworldexamples,therecoverabilitycondition isverifiedanalytically.TheconcludingremarksareinSection5.

Notations

Letusﬁxsome notations.Rⁿ representsthe realcoordinatespaceof ndimensions. Letx∈Rⁿ,

^x

1:=n

i=1

|

^xi

|

^takes

the1 normofx,and

^x

^denotes^the^Euclidean⁽2)norm.GivenanyindexsetI⊆

{

¹,2,...,n

}

,

|

I

|

^counts^the^number^of

elementsinI;I^c:=

{

¹,2,...,n

} \

IisthecomplementsetofI.x_I∈R^|^I^|consistsoftheelementsinxrestrictedtotheindex setI.0₍_n₎∈Rⁿ denotesthevector containingzeros only,whileI₍_n₎∈Rⁿ^×n denotesthe identitymatrix ofordern.For any matrixA∈R^m^×ⁿ,AisthetransposeofA;A_I∈R^|^I^|^×ⁿisthesubmatrixofArestrictedtotherowindexsetI⊆

{

¹,2,...,m

}

, andA^I∈R^m^×^|^I^| isthesubmatrixofArestrictedtothecolumnindexsetI⊆

{

¹,2,...,n

}

^; ê.g.,Â{1,2}extracts thefirsttwo rowsofA,andA^{1,^2} extracts the first twocolumns ofA.Ker(Â)^:=

{

^x∈Rⁿ:Ax=0₍_m₎

}

^represents ^the^kernel ^space^of^A^,

whileRan(^A)^:=

{

^h∈R^m:h=Axforsomex∈Rⁿ

}

^represents^the^range^space^of^A^.

(3)

2. Methodology 2.1. Problemsetup

Givenatraﬃcnetworkwithnon-centroidnodesonly,thenode-linkincidencematrixA∈Rⁿ^×^l withnbeingthenumber ofnodesandlthenumberoflinks,canbeexpressedas

Ai j=

₋₁ _if_the _j-th_link_is _outgoing_link_of _node_i

1 ifthe j-thlinkis incominglinkof nodei 0 otherwise.

ThenAisalwaysoffull(row)rankasprovedinNg(2012),andtrafficflowdata fˆ∈R^l obeystheflowconservation:

Afˆ=0₍_n₎. (2.1)

Suppose M⊆

{

¹,2,...,l

}

îs ^the ^set ôf^links^whose ^link ^flows âre ôbserved,ând

|

M

|

=m.We call Mas “monitored set”

thereafter.Weassumethat f_M= fˆ_M+e_M∈R^m,

istheobservedinconsistentﬂowdatacorruptedbysensingerrorse_M∈R^m.

Theﬂowcorrectionproblemistoderive anestimateof fˆ,denotedbyf^∗,fromthecorrupteddata f_M.Hereweimpose an underlyingassumption onM forthe ﬂowcorrection problemtobe well-posed.We willneedthe conceptofbaseset introducedinSunetal.(2016).

Assumption2.1. McontainsatleastonebasesetK⊆M,meaningthat

|

K

|

=l−nandA^K^c∈Rⁿ^×ⁿisinvertible.

Fortheconsistent data fˆ_M= f_M(i.e.,e_M=0₍_m₎),ofcourse wehave fˆ_K= f_KsinceK⊆M.Thenthe fˆcanbeuniquely recoveredbyperforming(Ng,2012;Sunetal.,2016):

fˆ_K= f_K and fˆ_K^c=−

(

^A^K^c

)

⁻¹^A^K^fK.

IfMcontainsmorethanonebaseset,the fˆrecoveredintheabovefromdifferent f_K willbeconsistent.

Assumption2.1isthesufficientandnecessaryconditionforthewholelinkflowstobeinferable.Itguaranteesthatthe whole flowdata canbe deduced fromatleastone subset ofthe observedlink flows.Without thisassumption, however, some ofthe link flows cannot be estimated fromavailable data andthe problemis unsolvable (Ng, 2012), whetherthe measuredflowsareconsistentornot.

2.2. Flowcorrectionvia1-minimization

Since A isof fullrow rank,Ker(A) is an (^l−n)-dimensionalsubspace ofR^l.Suppose Z∈R^l^×⁽^l⁻ⁿ⁾ is thematrix whose columnsformabasisofKer(A).Since fˆ∈Ker(^A),wehave

fˆ=Zx, forsomex∈R^l⁻ⁿ.

Asaresult, fˆ_MmustbeoftheformZ_Mxforsomex∈R^l⁻ⁿ.

Remark2.1. ClearlytheexistenceofZisnon-unique,butf^∗isinvarianttothechoiceofZandonlydependsonthestructure ofthetraﬃcnetwork.Indeedf^∗istheoneinRan(Z)whoserestrictiononMhastheleastabsolutedeviationfrom f_M.Sof^∗ onlydependsonRan(Z)whichissameasKer(A).NotethatAisthenode-link matrixuniquelydeterminedby thenetwork structure.

Thefollowingresultnotonlygivesa concreteconstructionofZ,butalsointerpretsx^∗in(2.2) asan estimateof fˆ_K for somebasesetK(notnecessarilyasubsetofM).

Theorem2.1. LetK beany base set.Without loss ofgenerality,suppose A is partitionedas [A^K,A^K^c] withA^K^c ∈Rⁿ^×n being invertible.Then

Z=

I₍_l₋_n₎

−

(

^A^K^c

)

⁻¹^A^K

∈R^l^×⁽^l⁻ⁿ⁾

isabasismatrixofKer(A).Moreover,bychoosingsuchZ,x^∗from(2.2)isanestimateof fˆ_K.

WewillshowtheproofinAppendixC.Ourproposedmethodconsistsofthefollowingtwosteps:

1. Weﬁrstsolvean1-minimizationproblem:

x^∗=argmin

x∈R^l−n

^ZMx−f_M

¹^, ^(2.2)

Thatis, we seekan estimate ofe_M=f_M−fˆ_M in the aﬃne space

{

^fM−Z_Mx:x∈R^l⁻ⁿ

}

^with^the ^least1 norm.The problem(2.2) canbe eﬃcientlysolved bythe alternatingdirectionmethodof multipliers(ADMM)(Boyd etal., 2011);

seeAppendixAfortheimplementationdetails.

(4)

Fig. 1. A Toy Network. The solid links are monitored. The numbers in parentheses denote the ground-truth traﬃc counts.

2. fˆisthenestimatedby

f^∗=Zx^∗. (2.3)

Zx^∗mayhavenon-integerentries,inthiscase,wecanjustperformrounding.

2.3.Connectionswithcompressedsensing

Compressed sensing (Candès et al., 2006; Donoho, 2006) aims to recover a sparse signal (vector) y from an under- determined linearsystemthat generally hasinﬁnitely manysolutions. It enablesrecovery of thesignal y fromfarfewer samplesthan requiredby theNyquist-Shannonsamplingtheorem. Majoringredients of thestandard compressedsensing techniqueinclude

• Sparsity:mostoftheentriesinyarezeros.

• 1-minimization:minimizing

^y

1toexploitthesparsityofy.

Letusreturntotheflowcorrectionproblem,whichisinessenceequivalenttotheestimationofe_M.Inanextremecase, supposeallthesensorsarebad,leadingtolargesensingerrors.Withoutfurtherinformation,itisclearlyimpossibletogeta goodestimateof fˆfrom f_Mbyanymeans.Intuitively,however,reconstructing fîspromisingifmostofthesensorsrecord consistentflowdata.Mathematicallyspeaking,e_Missparse.Theflowcorrectionproblemthuscanbeviewedassparseerror correctionproblem(Candès etal., 2005;Xuetal., 2013), whichissimilarto compressedsensing.Note that,however,the flowcorrectionproblemdeviatesfromthetraditionalcompressedsensingproblem,wherethematrixAwouldberandom.

3. Correctionresults

Notethatour proposedmethoddoesnot take advantageofanyprior informationaboutthe possiblebadsensors. Ap- parentlyonecan notalways hope foragoodestimationf^∗ to fˆ,evenifthereis onlyonebad sensorinthenetwork.For instance,inthenetworkshowninFig.1,ifthesensoronlink1givesverywrongcount,then basicallythereisnowayto reasonablycorrectthiserrorbecauselinks1and2areequivalentinthetopologyofthenetwork.Withthat said,without extrainformation,obtainingagoodestimateof fˆispossibleonlywhenthebadsensorsarelocatedatsomeparticularlinks.

Theselocationstoleratingmiscountaresomehowdeterminedbythenetworkstructure.Inthefollowing,weshallintroduce theconceptofrecoverability.

Deﬁnition3.1. Givenanetworkwithnode-linkincidencematrixAandmonitoredlinksetM,wedeﬁnetherecoverability forthesubsetS⊆Mby

Rec

(

S;A,M

)

:= inf

h∈Ker(A):^hS1=0

^hM\S

1

^hS

1 , (3.4)

whichisafunctionofthesubsetSandalsodeterminedbyboththenetworkstructureAandthemonitoredlinksetM. Sinceforanyh∈Ker(A),itholdsthath=Z

v

forsome

v

_∈_R^l⁻ⁿ_,thenwecanrewrite(3.4)as

Rec

(

^S;^A,^M

)

⁼ ^inf

v∈R^l−n:Z_Sv=0

^ZM\S

v

1

^ZS

v

1

, (3.5)

which resembles the classical Rayleigh quotient for the principal eigenvalue

μ

^of ^the generalized eigenvalue problem (Weinberger,1974):Z_SZ_S

v

=

μ

^Z_M_\_S^ZM\S

v

if2 normreplacesthe1norm.Theoptimizationoftheratiooftwohomoge- neousfunctionsofdegreeonehasbeenstudiedinHeinandBühler(2010),whereaninversepoweriterativealgorithmwas proposed.BasedonHeinandBühler(2010),weproposeaneﬃcientalgorithmtosolveproblem(3.5)whichwillbedetailed inAppendixB.

(5)

3.1. Exactrecovery

Weﬁrst considerthecasewheresome sensorsarebad, whichintroduceinconsistencyoftheﬂow data.Thefollowing Theorem3.1assertsthatwhenthebadsensorsarelocatedatcertainlinksetSwhosesizeisexpectedtobesmall,thenno matterhowlargetheerrorsare,weareabletoexactlyrecover fˆfrom f_M.

Theorem3.1(Exactrecovery). LetS:=

{

ⁱ∈M:e_i=0

}

,whichmeansmiscountsonlyoccuratthelinksetS.IfRec(S;A,M)>

1,thentheestimationf^∗computedby(2.3)isequalto fˆ.Thatis,thelinksinS arerobusttomiscountsifRec(S;A,M)>1. Theproofisomittedhere,sincetheabovetheoremisaspecialcaseofTheorem3.2inSection 3.2.Weremarkthatthe lower boundforRec(S;A,M)ⁱⁿ^therecoverabilityconditionissharp.Indeedthemethodcanfail whenRec(S;A,M)=1, aswillbeseeninthefollowingexample.

Example3.1. Letusconsiderthetraﬃcnetworkassociatedwiththe3×6node-linkincidencematrix A=

₁ ₁ ₋₁ ₋₁ ₀ ₀

0 0 1 0 −1 0

0 0 0 1 1 −1

,

andtheground-truthnetworkﬂow fˆ=

⎡

⎢ ⎢

⎢ ⎣

300 200 300 200 300 500

⎤

⎥ ⎥

⎥ ⎦

asinFig.1,thenodeandlinksarelabeledwiththeirIDwithgroundtruth

linkﬂowsintheparentheses.

ThenTheorem2.1givesthat

Z=

⎡

⎢ ⎢

⎣

−1 0 1

1 0 0

0 1 0

0 −1 1

0 1 0

0 0 1

⎤

⎥ ⎥

⎦

^.

LetthemonitoredlinksetbeM=

{

¹,2,4,5,6

}

,then

Z_M=

⎡

⎢ ⎢

⎣

−1 0 1

1 0 0

0 −1 1

0 1 0

0 0 1

⎤

⎥ ⎥

⎦

^.

Lettheobservationbe f_M=

⎡

⎢ ⎢

⎢ ⎣

f₁ f₂ f₄ f₅ f₆

⎤

⎥ ⎥

⎥ ⎦

⁼

⎡

⎢ ⎢

⎢ ⎣

300 200 200 300 600

⎤

⎥ ⎥

⎥ ⎦

^,î.e.,^theôbserved^link^flowôn^link⁶îsînflated^by¹⁰⁰^due^to^sensorêrror.^So

e_M=f_M−fˆ_M=

⎡

⎢ ⎢

⎢ ⎣

0 0 0 0 100

⎤

⎥ ⎥

⎥ ⎦

^,^S⁼

{

ⁱ∈M:e_i=0

}

=

{

⁶

}

,andM

\

S=

{

¹,2,4,5

}

^.

We can verifyby eitheran analytic approach orAlgorithm2 that therecoverabilityconditionRec(S;A,M)=2>1is satisﬁed.Then Theorem3.1assertsthatf^∗derived from(2.2) and(2.3)mustbe equalto fˆ.It isindeedtruebecausex^∗=

₂₀₀

300 500

,andtherefore

f^∗=Zx^∗=

⎡

⎢ ⎢

⎣

300 200 300 200 300 500

⎤

⎥ ⎥

⎦

⁼

fˆ.

(6)

Comparethisresultwiththegroundtruthlinkﬂows,wecanconcludethattheerrorsarecompletelyeliminated.

Remark3.1. Wehavetworemarksbelow.

• Withoutknowingthecountatlink3,i.e.,M=

{

¹,2,4,5,6

}

,theproposedmethodwouldfailexactrecoveryifthecount was corrupted at any other link except link 6. Take link 1 for example, it is easy to check that Rec(

{

¹

}

;A,M)=1. Therefore,link1isnotguaranteedtoberobust tomiscount byourtheory.Indeedthisisthecaseasmentionedinthe beginningofthissection.

• Supposelink3wasalsomonitored,i.e.,M=

{

¹,2,...,6

}

,thenanycountingerroratoneofthelinks3,4,5and6could beaccuratelycorrected.

3.2.Stablerecovery

Inamorerealisticsetting,we assumethatall theelementsine_Mare non-zeros,yetmostofthemarerelativelysmall comparedwiththeother few.Thisrefers toapproximatesparsityincompressedsensing.Inthiscase,itisstill possiblefor f^∗tobecloseenoughto fˆ.Inanotherword,theestimationerrorsareboundedfromaboveinthiscase.

Theorem3.2(Stability). ForanyS⊆M,ifRec(S;A,M)=

α

>1,thenf^∗computedby(2.3)obeys

^f^∗⁻^f^ˆ

¹^≤

λ ( α

^,^A,^M

)

^eM\S

¹^, ^(3.6)

forsomeconstant

λ

(

α

,A,M)>0dependingonly on

α

^,^A ^and M.Moreover,

λ

(

α

,A,M) ^decreasesⁱⁿ

α

^,^meaning^that^larger

recoverabilityleadstohighercorrectionaccuracy.

Inview of(3.6),f^∗is agoodestimationif

^eM\S

1 issmall.Onthe otherhand,theestimation errordoesnot relyon e_S.Theorem3.1isessentiallyacorollaryofTheorem3.2inthespecialcase

^eM\S

1=0.TheproofofTheorem3.2willbe giveninAppendixC,inwhichwederiveanexplicitexpressionfortheconstantfactor

λ

(

α

,A,M)^.

Example3.2. WeconsiderthesamesettingasinExample3.1exceptthattheother observeddatacontainssmallsensing

noisebesidesthelargecorruptionatlink6.Speciﬁcally,let f_M=

⎡

⎢ ⎢

⎢ ⎣

302 201 198 301 600

⎤

⎥ ⎥

⎥ ⎦

^and^e^M⁼ ^f^M⁻^f^ˆ^M⁼

⎡

⎢ ⎢

⎢ ⎣

302 201 198 301 600

⎤

⎥ ⎥

⎥ ⎦

⁻

⎡

⎢ ⎢

⎢ ⎣

300 200 200 300 500

⎤

⎥ ⎥

⎥ ⎦

⁼

⎡

⎢ ⎢

⎢ ⎣

2 1

−2 1 100

⎤

⎥ ⎥

⎥ ⎦

^.

AgainwetakeS=

{

⁶

}

^.^Since^Rec(S)=2>1,itisassertedbyTheorem3.2thatthe1 normoftheestimationerror

^f^∗− fˆ

1iscomparableto

^eM\S

1=2+1+2+1=6.

Thisistrue,asweobtainthatx^∗=

₂₀₁

303 503

by(2.2), f^∗=

⎡

⎢ ⎢

⎢ ⎣

302 201 303 200 303 503

⎤

⎥ ⎥

⎥ ⎦

,and

^f^∗−fˆ

1=12.

NotethattheoriginalcountingerroratLink6is100,insharpcontrasttotheerroraftercorrectionwhichisjust3.

4. Testexamples

Inthissection,weprovidebothsyntheticandreal-worldexamplestodemonstrateeffectivenessofourproposedmethod.

4.1. Asyntheticnetwork

Fig.2showsaparallel highwaynetwork (Hu etal.,2009; Ng,2012) with9nodesand18linksamongwhich15links aremonitored.We createtheground-truthandobservedﬂowdata andlist theminTable1,wherethe estimationerrors equal the differences betweenthe estimated and ground-truthvalues, and the percentage differencesequal the relative differencesbetweentheestimatedandobservedvalues.Thedataonlinks3,10and14areunobservable.Theyaremarked by “N/A” inthetable andby dashed linein theplot.The recorded dataon links6and16 are severelycorrupted, while theotherdata containsmallnoise.SobasicallyM=

{

¹,2,4,5,6,7,8,9,11,12,13,15,16,17,18

}

^andS=

{

⁶,16

}

^.^It^is^clear

thatourestimationbyAlgorithm1isfairlyclosetotheground-truth,andthemiscountsonlinks6and16aresuccessfully detected. In fact,we can check by Algorithm2 that the recoverability conditionRec(S;A,M)=1.5>1holds. Therefore, Theorem3.2providesguaranteeforourcorrectionresult.

(7)

Fig. 2. A parallel highway network.

Table 1

Computational results for Example 1. Links with corrupted data are labeled with ^∗.

Link ID Ground-truth Observation Estimation Estimation error Percentage difference

1 10 0 0 0 9950 9950 −10 0.0%

2 70 0 0 0 69887 69887 −113 0.0%

3 80 0 0 N/A 7953 −47 N/A

4 20 0 0 1997 1997 −3 0.0%

5 150 0 0 15010 15104 104 0.6%

6 ^∗ 550 0 0 39751 54783 −217 37.8%

7 20 0 0 0 20043 20043 43 0.0%

8 30 0 0 3014 3014 14 0.0%

9 70 0 0 6977 6977 23 0.0%

10 90 0 0 N/A 9009 9 N/A

11 50 0 0 5045 5046 46 0.0%

12 480 0 0 47770 47771 −229 0.0%

13 25500 25397 25505 5 0.4%

14 1500 N/A 1515 15 N/A

15 20 0 0 0 20 0 0 0 20 0 0 0 0 0.0%

16 ^∗ 330 0 0 45302 32817 −183 −27.6%

17 45500 45912 45505 5 −0.9%

18 34500 34332 34332 −168 0.0%

Fig. 3. A road network on I-405 northbound in the city of Irvine.

4.2. Areal-worldexample

The daily cumulative ﬂow data in thisexample is from Caltrans Performance Measurement System(PeMS) database, collectedonI-405 northboundinthecityofIrvine,onApril28,2016.Thenetworkhas18linksand9nodesasillustrated inFig.3.Theloopdetectorsareinstalledonalllinksexceptforlinks3,13,and14,whicharerepresentedbydashed lines.

ThelinksarelabeledwiththeirIDsandcorrespondingobservedﬂowsintheparentheses.

The estimatedlink flows by (2.2) and(2.3) are compared withthe observedlink flows inTable 2,where unobserved linksflowsaremarkedby“N/A”.Ourcorrectionresultshowsthatthepercentagedifferenceatlink6ismuchlargerthanall otherlinks. Sincethereisnoground-truthdataavailable inthisexample,we cannotcheckthecorrectionqualitydirectly.

(8)

Table 2

Computational results for Example 2.

Link ID Observation Estimation Difference Percentage difference

1 123714 123714 0 0.0%

2 4835 4835 0 0.0%

3 N/A 128549 N/A N/A

4 15479 15479 0 0.0%

5 105748 113070 7322 6.9%

6 11127 13661 2534 22.8%

7 127073 126731 −342 −0.3%

8 16194 16194 0 0.0%

9 110997 110537 −460 −0.4%

10 2809 2757 −52 −1.9%

11 113002 113295 293 0.3%

12 10941 10941 0 0.0%

13 N/A 124236 N/A N/A

14 N/A 139715 N/A N/A

15 124437 124322 −115 −0.1%

16 15393 15393 0 0.0%

17 113411 113413 2 0.0%

18 10907 10909 2 0.0%

However,link6isﬂaggedasunhealthysensorbyPeMS,whichisconsistentwithourestimation.Ontheotherhand,iflink 6isindeedtheonlyunhealthysensor,thequalityoftheestimatedlinkﬂowlistedinTable2isguaranteedbyTheorem3.2, in which we have M=

{

¹,2,4,5,6,7,8,9,10,11,12,15,16,17,18

}

^and S=

{

⁶

}

^. ^It ^can ^be ^veriﬁed ^that ^the recoverability conditionRec(S;A,M)=2>1holds.

5. Conclusion

Inthisstudy,wesystematicallystudiedthelinkflowcorrectionprobleminatrafficnetworkbasedonflowconservation.

Theproblemisformulated asan 1-minimizationproblem, inwhichthedifferencesbetweentheestimatedandobserved linkflows areminimized. Weintroduced therecoverabilityconceptforasubsetoflinksandspecificallyderived therecov- erabilityconditionforexactlyretrieving themissingdata: whencertain sensors aremalfunctioning, nomatter howlarge theerrorsare, thegroundtruth flowcan be exactlyrecovered. Thatis,some linksarerobust tomiscounts. Furthermore, whensmallerrorsarepresentinobservedlink flows,theestimationerrorbound isfoundsuch thatwe canestimate the linkflows thatarecloseenoughtoground-truthundertherecoverabilitycondition.Wealsoshowedanefficientalgorithm forcomputingrecoverability.

For the real-world example in Section 4.2, the percentage differences between estimated and observed values in Table2 seemto be higherforlinks (e.g.,link5) closer tothe one (link 6in thiscase)with an unhealthysensor.In the future we will be interested in exploring the relationship between the percentage differences and the distances to un- healthysensors.However,forthesyntheticexampleinSection4.1,sucharelationisnotsoobviousinTable1;thissuggests thatthe1 normcaneffectivelypreventerrorspreading,andasensorclosetoanunhealthysensordoesnothavetohavea relativelylargepercentagedifference.

Afew morefollow-upstudytopicscan be interesting boththeoretically andpractically.Inaddition tothe 1 norm,it willbeinterestingtoinvestigatethefeasibilityandefficiencyofothersparsitypromotingpenaltyfunctionsforformulating andsolvingtheflowcorrection problem.Therecoverabilitydefinedin(3.4)iscentraltotheflowcorrectionproblem,asit determineswhetherexact recovery ispossibleornot (see Theorem3.1) andalsothe errorboundin stablerecovery (see (3.6)). In the futurewe will be interestedin examiningwith Algorithm2 how theroad network’s structure impacts the recoverabilityofasubset oflinks,andsuch astudycouldprovideguidelinesforinstallingflowcountingsensorsespecially inalarge-scalenetwork.

Acknowledgments

YinandXinwerepartiallysupportedbyNSFgrantsDMS-1522383andIIS-1632935.YinwasalsosupportedbyONRgrant N00014-16-1-7157.Wewouldliketothanktherefereesfortheirconstructivecomments.

AppendixA. ADMMforsolving(2.2)

Thefollowingalternatingdirectionmethodofmultipliers(ADMM)(Boyd etal., 2011)isanthresholding-basediterative algorithm.InAlgorithm1,z∈R^m andu∈R^m areauxiliaryvariables.’shrink’istheso-calledsoft-thresholdingoperatoron R^m.Foranyz∈R^mandr>0,shrink(z,r)performscomponent-wiseoperationonzgivenby

(

^shrink

(

z,r

) )

i=sign

(

^zi

)

^max

{|

^zi

|

⁻^r, ⁰

}

^, ⁱ⁼¹^,^.^.^.^,^m.

(9)

Algorithm1 ADMMforsolvingmin_x_∈_R_l−n

^ZMx−f_M

1. Input:Z_M, f_M,

δ

>0

Initialize:x⁽⁰⁾, z⁽⁰⁾, u⁽⁰⁾ fori=0,1,...,k₁−1do

x⁽ⁱ⁺¹⁾=(^Z_M^ZM)⁻¹^Z_M(^fM+z⁽ⁱ⁾−u⁽ⁱ⁾) z⁽ⁱ⁺¹⁾=shrink(^ZMx⁽ⁱ⁺¹⁾−f_M+u⁽ⁱ⁾,¹_δ) u⁽ⁱ⁺¹⁾=u⁽ⁱ⁾+Z_Mx⁽ⁱ⁺¹⁾−z⁽ⁱ⁺¹⁾−f_M endfor

Output:x^∗=x⁽^k¹⁾

Thealgorithmstopsaftersomemaximumnumberofiterations.

AppendixB.Aninversepoweralgorithmforsolving(3.5)

WepresentAlgorithm2tosolvethefollowingoptimizationproblem(3.5): vmin∈R^l−n

^ZM\S

v

¹

^ZS

v

¹ ^.

Theoutput

λ

^∗îs^the^computedrecoverabilityin(3.5),i.e.,Rec(S;A,M)^.^Note^thatⁱⁿÂlgorithm²^,ûpdating^vûnder^the

Algorithm2 Aninversepoweralgorithm(HeinandBühler,2010)forsolving(3.5). Input:Z_M_\_S, Z_S

Initialize:

v

⁽⁰⁾,

λ

⁽⁰⁾= ^Z^M_Z^\^S^v⁽⁰⁾¹

Sv⁽⁰⁾1

fori=0,1,...,k₂−1do

v

⁽ⁱ⁺¹⁾=argmin_v

^ZM\S

v

1−

λ

⁽ⁱ⁾ ^Z_S^sign(^ZS

v

⁽ⁱ⁾),

v

^subject^to

v

≤1

λ

⁽ⁱ⁺¹⁾= ^Z^M_Z^\^S^v⁽ⁱ⁺¹⁾¹

Sv⁽ⁱ⁺¹⁾1

endfor Output:

λ

^∗=

λ

⁽^k²⁾

unitballconstraintisnon-trivialandrequiresextraeffort.WewriteanADMMsolverforthissubprobleminAlgorithm3be- low.

AppendixC. Technicalproofs

ProofofTheorem2.1. ToproveZ=

I₍_l₋_n₎

−(^A^K^c)⁻¹^A^K

∈R^l^×⁽^l⁻ⁿ⁾givesabasisofKer(A),itsuﬃcestoshowthat

Algorithm3 ADMMforupdatingv.

Input:Z_M_\_S, Z_S,b=

λ

⁽ⁱ⁾^Z_S^sign(^ZS

v

⁽ⁱ⁾₎fromAlgorithm2,and

δ

>0 Initialize:

v

⁽⁰⁾,z⁽⁰⁾,u⁽⁰⁾

for j=0,1,...,k₃−1do

v

⁽^j+1⁾=(^Z_M_\_S^Z_M_\_S)⁻¹

Z_M_\_S(^z⁽^j⁾+^u⁽_δ^j⁾)+_δ^b

v

⁽^j⁺¹⁾= ^v_v⁽₍^j+1_j+1⁾₎ if

v

⁽^j⁺¹⁾

>1

z⁽^j⁺¹⁾=shrink(^Z_M_\_S

v

⁽^j⁺¹⁾+^u⁽_δ^j⁾,¹_δ) u⁽^j⁺¹⁾=u⁽^j⁾+

δ

(^z⁽^j⁺¹⁾−Z_M_\_S

v

⁽^j⁺¹⁾₎

endfor

Output:

v

⁽ⁱ⁺¹⁾inAlgorithm2

1. AZ=Oisazeromatrix.ItistruesinceAZ=[A^K,A^K^c]

I₍_l₋_n₎

−(^A^K^c)⁻¹^A^K

=A^K−A^K^c(^A^K^c)⁻¹^A^K=O.

2. Zhasfullrank,i.e.,rank(^Z)=l−n.Thisisalsotruebecause,ononehandrank(^Z)≤l−n,ontheotherhand,rank(^Z)≥ rank(^I(l−n))=l−nsinceI₍_l₋_n₎isasubmatrixofZ.

(10)

Thenby(2.3),wehave f^∗=

I₍l−n)

−

(

^A^K^c

)

⁻¹^A^K

x^∗=

x^∗

−

(

^A^K^c

)

⁻¹^A^K^x^∗

.

Since fˆ=

_ˆ

f^K fˆ^K^c

,weconcludethatx^∗isanestimateof fˆ^K.

ProofofTheorem3.2. Suppose f^∗= fˆ+

v

,since f^∗, fˆ∈Ran(^Z),thenwehave

v

∈Ran

(

^Z

)

.

Moreover,since f_M^∗ =Z_Mx^∗and fˆ_M∈Ran(^ZM),(2.2)impliesthat

^fM^∗ −f_M

¹⁼

^ZMx^∗−f_M

¹^≤

^f^ˆM−f_M

¹^. ^(5.7)

Bearinmindthate_M= f_M−fˆ_Misthesensingerror,soontherighthandsideof(5.7),

^f^ˆM−f_M

¹⁼

^eM

¹⁼

^eS

¹⁺

^eM\S

¹^, ^(5.8)

andonthelefthandside,

^f_M^∗ −f_M

1=

(

^f^ˆ+

v ₎

_M−f_M

1=

v

_M−e_M

1

=

v

S−e_S

1+

v

_M_\_S−e_M_\_S

1

≥

^eS

1−

v

_S

1+

v

_M_\_S

1−

^eM\S

1 (5.9)

In(5.9),weusedthetriangleinequalityfor1norm.Combining(5.7),(5.8),and(5.9),wehave

^eS

¹⁺

^eM\S

¹^≥

^eS

¹⁻

v

_S

¹⁺

v

_M_\_S

¹⁻

^eM\S

¹

or

2

^eM\S

1≥ −

v

_S

1+

v

_M_\_S

1 (5.10)

Bytheassumptionthat

Rec

(

S;A,M

)

:= inf

h∈Ker(^A)^:^hS1=0

^hM\S

¹

^hS

¹ ⁼

α

^>¹^,

wehave

α

^hS

1≤

^hM\S

1 holdsforallh∈Ker(^A)=Ran(^Z)^.^Since^v∈Ran(Z)asaforementioned,wehavefurther

v

_M

1≥ (¹+

α

)

v

_S

1,thenitfollowsfrom(5.10)that

2

^eM\S

¹^≥

v

_M

¹⁻²

v

_S

¹^≥

(

¹− 2

1+

α ⁾ v

_M

¹^,

andthus

(

^f^∗−fˆ

)

M

1=

v

M

1≤2

( α

⁺¹

)

α

−1

^eM\S

1. (5.11)

Inwhatfollows,wederiveanupperboundfor

(^f^∗−fˆ)M^c

1.Withoutlossgenerality, supposeA=[A^K, A^K^c]withK⊆ Mbeinganybaseset.Sincebothf^∗and fˆobeyﬂowconservation,wehave

0₍_n₎=A

(

^f^∗−fˆ

)

=[A^K, A^K^c]

(

^f^∗⁻^f^ˆ

)

K

(

^f^∗−fˆ

)

K^c

,

whichgives

(

^f^∗−fˆ

)

K^c=−

(

^A^K^c

)

⁻¹^A^K

(

^f^∗−fˆ

)

K.

SinceK⊆M,wehaveM^c⊆K^c.Therefore,(^f^∗−fˆ)Kis containedin(^f^∗−fˆ)M,and(^f^∗−fˆ)M^c is containedin(^f^∗−fˆ)K^c. Usingtheabovefacts,wehave