Theoretical Computer Science
www.elsevier.com/locate/tcs
A genetically modified Hoare logic
G. Bernot a,∗, J.-P. Comet a,∗, Z. Khalis a, A. Richard a, O. Roux b,∗
a University Côte d'Azur, I3S laboratory, UMR CNRS 7271, CS 40121, 06903 Sophia Antipolis Cedex, France
b LS2N UMR CNRS 6004, BP 92101, 1 rue de la Noë, 44321 Nantes Cedex 3, France
ARTICLE INFO

Article history:
Received 8 March 2017
Received in revised form 2 October 2017
Accepted 3 February 2018
Available online xxxx
Communicated by L. Kari

Keywords: Hoare logic; Gene regulatory networks; Thomas networks; Parameter identification; Soundness and completeness

ABSTRACT

An important problem when modelling gene networks lies in the identification of parameters, even when using a discrete framework such as the one of René Thomas. We present in this article a new approach based on Hoare logic to generate constraints on parameter values. Specifications of observed behaviours play a role comparable to programs in the classical Hoare logic, and deduced weakest preconditions characterize the sets of all compatible parameterizations, expressed as constraints on parameters. Besides being natural and simple, our Hoare logic approach is remarkably powerful and, among others, it allows one to express external interventions of the biologist during experiments such as knockouts. In supplementary materials, we give a proof of soundness of our Hoare logic for gene networks as well as a proof of completeness and decidability based on the notion of the weakest precondition.

© 2018 Published by Elsevier B.V.
1. Introduction
Different frameworks for studying the behaviour of gene networks in a systematic way have been proposed. Among them, ordinary differential equations played an important role, which however mostly lead to numerical simulations. Besides, the abstraction procedure of René Thomas [1], approximating sigmoid functions by step functions, makes it possible to describe the qualitative dynamics of gene networks as paths in a finite state space. Nevertheless this qualitative description of the dynamics is still governed by a set of parameter values, which, although becoming small integers, remain difficult to deduce from classical experimental knowledge. In this context, we are interested in the exhaustive search of parameter values that are consistent with specifications formalizing the experimentally observed behaviours of gene regulatory networks. In addition an important quality of our approach, which is not addressed by other formalisms, is to take into account external interventions of the biologists during the experiments (e.g. knockouts).

Several works were undertaken with the objective to identify the parameters. The application of temporal logic to gene regulatory networks was presented in [2,3], then constraint programming was used in [4,5]. In this paper, we present a somewhat unexpected application of formal methods to biology through a new approach based on Hoare logic [6] and its associated weakest precondition calculus [7] that generates constraints on parameters. The formalism on which we decided to apply this idea is the one of René Thomas because it is now universally recognized as the reference framework for discrete modelling of gene networks. The key point of our proposal is to define a language able to capture the actual traces observed by molecular biologists during a set of experiments (either at the transcriptomic or proteomic level [8]).
We have designed a language which is expressive enough to specify sets of observed traces as well as external interventions during the biological experiments, while preserving the completeness of a corresponding extended Hoare logic. Since this method avoids building the complete state graph, it results in a powerful technique to find out the constraints representing the set of consistent parameterizations with a tangible gain for computation time. Indeed, the weakest precondition proof strategy which extracts the constraints goes through the trace specification syntax but is independent of the size of the gene network.

∗ Corresponding authors. E-mail addresses: bernot@unice.fr (G. Bernot), comet@unice.fr (J.-P. Comet), Olivier.Roux@irccyn.ec-nantes.fr (O. Roux).
https://doi.org/10.1016/j.tcs.2018.02.003
0304-3975/© 2018 Published by Elsevier B.V.
The paper is organized as follows. The basic concepts of classical Hoare logic and its associated Dijkstra weakest precondition are briefly recalled in Section 2. The classical formal definitions for Thomas discrete gene regulatory networks are recalled in Section 3. Section 4 gives our definition of (genetically modified) Hoare triples, including the assertion language and the trace specification language. In Section 5, an extended Hoare logic for gene networks is defined for Thomas discrete models. In Section 6, the small example of the incoherent feedforward loop of type 1 (made popular by Uri Alon in [9,10]) highlights the whole process of our approach to find out the suitable parameter values. Section 7 sketches the previously existing methods for formal identification of discrete parameters in gene network models. We conclude in Section 8. Supplementary materials provide the mathematical semantics of these extended Hoare triples, a proof of soundness of our Hoare logic for gene networks, and a proof of completeness and decidability.
2. Basics of Hoare logic
The Hoare logic is a formal system for reasoning about the correctness of imperative programs. In [6], Tony Hoare introduced the notation "$\{P\}\ p\ \{Q\}$" to mean "If the assertion $P$ (precondition) is satisfied before performing the program $p$ and if the program terminates, then the assertion $Q$ (postcondition) will be satisfied afterwards." This constitutes de facto a specification of the program under the form of a triple, called the Hoare triple. In [7], Edsger Dijkstra has defined an algorithm taking the postcondition $Q$ and the program $p$ as input and computing the weakest precondition $P_0$ that ensures $Q$ if $p$ terminates. In other words, weakest means that the Hoare triple $\{P_0\}\ p\ \{Q\}$ is satisfied and that for any precondition $P$, $\{P\}\ p\ \{Q\}$ is satisfied if and only if $P \Rightarrow P_0$ is semantically satisfied. Notice that weakest precondition means that it does not contain any useless condition, so it means that the set of states that satisfy the weakest precondition is the largest one.

The basic idea is to stamp the sequential steps of a program with assertions that are inferred according to the instruction they surround. Within the following inference rules, $p$, $p_1$ and $p_2$ stand for programs, $P$, $P_1$, $P_2$, $I$ and $Q$ stand for first-order assertions on the variables of the program, $v$ stands for a variable of the imperative program, and $Q[v \leftarrow expr]$ means that $expr$ is substituted for each free occurrence of $v$ in $Q$:

Assignment:
\[ \{Q[v \leftarrow expr]\}\ v := expr\ \{Q\} \]
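Sequential composition:
\[ \frac{\{P_2\}\ p_2\ \{Q\} \qquad \{P_1\}\ p_1\ \{P_2\}}{\{P_1\}\ p_1;p_2\ \{Q\}} \]

Conditional branching:
\[ \frac{\{P_1\}\ p_1\ \{Q\} \qquad \{P_2\}\ p_2\ \{Q\}}{\{(e \wedge P_1) \vee (\neg e \wedge P_2)\}\ \text{if } e \text{ then } p_1 \text{ else } p_2\ \{Q\}} \]

Iteration:
\[ \frac{\{e \wedge I\}\ p\ \{I\} \qquad \neg e \wedge I \Rightarrow Q}{\{I\}\ \text{while } e \text{ with } I \text{ do } p\ \{Q\}} \]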
Empty program:
\[ \frac{P \Rightarrow Q}{\{P\}\ \varepsilon\ \{Q\}} \]
(where $\varepsilon$ stands for the empty program)

The Iteration rule deserves some comments. The assertion $I$ is called the loop invariant and it is well known that finding the weakest loop invariant (if any) is undecidable in general [11,12]. So, Tony Hoare asks the programmer to give a loop invariant explicitly (with $I$). There are approaches to help finding loop invariants such as the iterative approach adopted in ASTREE [13] (abstract interpretation [14]).
Some authors prefer the following iteration rule
\[ \frac{\{e \wedge I\}\ p\ \{I\}}{\{I\}\ \text{while } e \text{ with } I \text{ do } p\ \{\neg e \wedge I\}} \]
that requires the application of the empty program rule to become equivalent to our version. By doing so, these authors highlight the fact that within a program, each while instruction carries its own (sub)specification and it can consequently be proved apart from the rest of the program.

From the standard set of Hoare logic rules, the following proof strategy builds a proof tree that computes the weakest precondition [7].
Definition 2.1. (Dijkstra Backward strategy). Let $\{P\}\ p\ \{Q\}$ be a Hoare triple. We call backward strategy the proof strategy defined inductively on $p$ as follows:
1. If $p$ is of the form $p_1;p_2$ where $p_2$ is made of a single instruction, then apply the Sequential composition rule.
2. If $p$ is a single instruction, then apply the corresponding rule (Iteration rule, Conditional branching rule or Assignment rule).
3. Only after steps 1 and 2 have fully treated $p$, i.e. when all instructions have been treated, apply the Empty program rule.
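Fig. 1. The graphical representation of a gene regulatory graph $R = (V, M, E_V, E_M)$ with $V = \{x, y\}$, the bounds of $x$ and $y$ are respectively 2 and 1, $M = \{\mu_1, \mu_2, \mu_3\}$, $\varphi_{\mu_1}$ is $((x \geq 2) \wedge \mu_3)$, $\varphi_{\mu_2}$ is $(x \geq 1)$, $\varphi_{\mu_3}$ is $\neg(y \geq 1)$.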
Notice that, these three items being mutually exclusive, the backward strategy generates a unique proof tree. (In addition, the remaining leaves of the proof tree must be handled using first-order logic and arithmetic knowledge.)

By doing so, the precondition $P_0$ obtained just before applying the last Empty program rule is the weakest precondition.
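To make the backward strategy concrete, here is a minimal Python sketch of the weakest precondition computation for the loop-free fragment (assignment, sequential composition, conditional branching, empty program). The tuple encoding of programs, the name `wp`, and the representation of assertions as Python predicates over a state dictionary are assumptions made for illustration only; they are not taken from [7]. Loops are left out because they need an explicit invariant.

```python
from typing import Callable, Dict

State = Dict[str, int]
Pred = Callable[[State], bool]

# Hypothetical program encoding (illustration only):
#   ("skip",)                     empty program
#   ("assign", v, expr)           v := expr, with expr a function of the state
#   ("seq", p1, p2)               p1 ; p2
#   ("if", cond, p1, p2)          if cond then p1 else p2

def wp(prog, post: Pred) -> Pred:
    """Weakest precondition of `prog` with respect to `post`, computed backward."""
    kind = prog[0]
    if kind == "skip":
        return post
    if kind == "assign":
        _, var, expr = prog
        # Q[v <- expr]: evaluate Q in the state where v holds the value of expr.
        return lambda s: post({**s, var: expr(s)})
    if kind == "seq":
        _, p1, p2 = prog
        # Treat the last instruction first, then the rest of the program.
        return wp(p1, wp(p2, post))
    if kind == "if":
        _, cond, p1, p2 = prog
        w1, w2 = wp(p1, post), wp(p2, post)
        return lambda s: (cond(s) and w1(s)) or (not cond(s) and w2(s))
    raise ValueError(f"unknown instruction: {kind!r}")

# x := x + 1 ; y := 2 * x   with postcondition y = 10
program = ("seq",
           ("assign", "x", lambda s: s["x"] + 1),
           ("assign", "y", lambda s: 2 * s["x"]))
pre = wp(program, lambda s: s["y"] == 10)

# if x < 0 then y := -x else y := x   with postcondition y = 3
abs_program = ("if", lambda s: s["x"] < 0,
               ("assign", "y", lambda s: -s["x"]),
               ("assign", "y", lambda s: s["x"]))
pre_abs = wp(abs_program, lambda s: s["y"] == 3)
```

In this sketch `pre` holds exactly on the states where `x` equals 4, which is the result of substituting backward through the two assignments. Representing assertions semantically (as predicates) rather than syntactically keeps the example short, at the price of not producing a printable formula.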
According to Stephen Cook [15], the Hoare logic is complete assuming that each loop invariant in the program is the weakest loop invariant with respect to the condition computed just at the right of its while statement. More technically, a program with a while statement is of the form: "$p_1$; while $e$ with $I$ do $p$; $p_2$." The Dijkstra backward strategy computes inductively the weakest precondition $Q_2$ such that, after the execution of $p_2$, the postcondition is satisfied. So $Q_2$ becomes the postcondition of the while statement. Cook's result is then valid when the invariant $I$ is the weakest condition that ensures $Q_2$ if the program exits from the while statement. All in all, Cook's result means that the Hoare triple $\{P\}\ p\ \{Q\}$ is correct if and only if $P \Rightarrow P_0$ is semantically satisfied. So the full completeness of the Hoare logic depends on two things: a sufficient expressive power to express all the previously mentioned weakest loop invariants and the existence of a first-order proof tree for $P \Rightarrow P_0$ whenever it is semantically satisfied. Technically, this relies on the expressiveness of the chosen underlying assertion language [16].

The most striking feature of the backward strategy for Hoare logic is that, owing to very simple sequences of syntactic formula manipulations, we capture the mathematical semantics of a program within first-order logic. Nevertheless, it is worth noticing that we only address partial correctness since Hoare logic does not give any proof of the termination of the program (while instructions may induce infinite loops).
3. Basics of discrete gene regulatory network models
This section presents the formal framework based on the discrete modelling method of René Thomas [17,18] and introduced in [19]. As shown in Fig. 1, a gene regulatory graph is visualized as a labelled directed graph in which vertices are either variables (within circles) or multiplexes (within rectangles). Variables abstract genes or their products, and multiplexes contain propositional formulas that encode situations in which a group of variables (inputs of multiplexes) influence the evolution of some variables (outputs of multiplexes). In the figure the simple multiplex $\mu_2$ expresses that the variable $x$ can help the activation of the variable $y$ when its state is at least equal to 1. In general, multiplexes can represent combined biological phenomena, one of the simplest being the formation of complexes (in which case the formula would simply contain a conjunction). In the figure, the situation of $\mu_1$ is a little more elaborate: it reflects an auto-activation of $x$ at level 2 which is controlled by $\mu_3$. Because $\mu_3$ contains a negation, $\mu_1$ does not model a positive cooperation of $x$ and $y$: the auto-activation of $x$ is inhibited by $y$.
So, in this example, there are three qualitatively interesting intervals of expression levels for $x$: an interval called 0, where $x$ can neither act on $y$ nor on itself, an interval called 1, where $x$ can act on $y$ and never on itself, and an interval called 2, where $x$ can act on $y$ as well as on itself provided that $\mu_3$ is satisfied. From the biological point of view, there is a threshold (i.e. a given number of intracellular molecules produced by $x$) such that $x$ is unable (resp. able) to act on its target gene if its expression level is under (resp. over) the threshold. We say that the bound of $x$ is $b_x = 2$ and similarly there are only 2 qualitatively interesting intervals for $y$, so the bound of $y$ is $b_y = 1$. In general, this labelled directed graph is formally defined as follows.
Definition 3.1. A gene regulatory graph with multiplexes is a tuple $R = (V, M, E_V, E_M)$ satisfying the following conditions:
• $V$ and $M$ are disjoint sets, whose elements are called variables and multiplexes respectively.
• $G = (V \cup M, E_V \cup E_M)$ is a labelled directed graph such that:
– Edges of $E_V$ start from a variable and end at a multiplex, and edges of $E_M$ start from a multiplex and end at either a variable or a multiplex.
– Every directed cycle of $G$ contains at least one variable.
– Every variable $v$ of $V$ is labelled by a positive integer $b_v$ called the bound of $v$.
– Every multiplex $m$ of $M$ is labelled by a formula $\varphi_m$ belonging to the language $L_m$ inductively defined by:
  − If $v \to m$ belongs to $E_V$ and $s \in \mathbb{N}$, then $v \geq s$ is an atom of $L_m$.
  − If $m' \to m$ belongs to $E_M$ then $m'$ is an atom of $L_m$.
  − If $\varphi$ and $\psi$ belong to $L_m$ then $\neg\varphi$, $(\varphi \wedge \psi)$ and $(\varphi \vee \psi)$ also belong to $L_m$.

Fig. 2. State graph obtained according to Definition 3.5, following Fig. 1 and arbitrarily assuming that $K_x = 0$, $K_{x,\mu_1} = 2$, $K_y = 0$ and $K_{y,\mu_2} = 1$.

All in all, the discrete values of a variable $x$ abstract intervals of quantity of molecules produced by $x$ within the cell. These intervals are obtained by sorting the activation thresholds of $x$ on its targets. Consequently only the knowledge of the order of the thresholds is useful and not their actual values. The multiplexes use these abstract levels in order to encode specific biological knowledge into formulas that define the conditions under which the regulation positively acts on its targets. If there is no specific knowledge about cooperation over a given target, there is one multiplex per regulating gene acting on this target, whose formula is reduced to an atom.
Successive multiplexes can be combined by flattening their formulas:
Definition 3.2. The flattened version of a formula $\varphi_m$, denoted $\overline{\varphi_m}$, is obtained by recursively substituting each occurrence of a multiplex $m'$ in $\varphi_m$ by its formula $\varphi_{m'}$ (this recursive process of substitutions is well defined because $G$ has no directed cycle with only multiplexes).

In Fig. 1, the flattened formula $\overline{\varphi_{\mu_1}}$ is $(x \geq 2) \wedge \neg(y \geq 1)$. As a result of the flattening transformation, all the atoms of a flattened formula are of the form $v \geq s$.
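The flattening of Definition 3.2 can be sketched in a few lines of Python. The nested-tuple encoding of formulas and the names `flatten`, `holds` and `FIG1` are hypothetical choices made for this illustration; the three formulas are the ones given in the caption of Fig. 1. The helper `holds` evaluates a flattened formula at a state by replacing variables with their values.

```python
# Hypothetical formula encoding (illustration only):
#   ("geq", v, s)          atom  v >= s
#   ("mux", m)             atom referring to the multiplex m
#   ("not", f), ("and", f, g), ("or", f, g)

def flatten(formula, formulas):
    """Recursively replace every multiplex atom by the formula of that multiplex."""
    kind = formula[0]
    if kind == "geq":
        return formula
    if kind == "mux":
        return flatten(formulas[formula[1]], formulas)
    if kind == "not":
        return ("not", flatten(formula[1], formulas))
    if kind in ("and", "or"):
        return (kind, flatten(formula[1], formulas), flatten(formula[2], formulas))
    raise ValueError(f"unknown connective: {kind!r}")

def holds(formula, state):
    """Truth value of a flattened formula at a state (a dict variable -> level)."""
    kind = formula[0]
    if kind == "geq":
        return state[formula[1]] >= formula[2]
    if kind == "not":
        return not holds(formula[1], state)
    if kind == "and":
        return holds(formula[1], state) and holds(formula[2], state)
    if kind == "or":
        return holds(formula[1], state) or holds(formula[2], state)
    raise ValueError(f"not a flattened formula: {formula!r}")

# Multiplex formulas of Fig. 1.
FIG1 = {
    "mu1": ("and", ("geq", "x", 2), ("mux", "mu3")),
    "mu2": ("geq", "x", 1),
    "mu3": ("not", ("geq", "y", 1)),
}

flat_mu1 = flatten(FIG1["mu1"], FIG1)
flat_mu2 = flatten(FIG1["mu2"], FIG1)
state = {"x": 2, "y": 1}
mu1_active = holds(flat_mu1, state)   # x >= 2 holds but not(y >= 1) fails
mu2_active = holds(flat_mu2, state)   # x >= 1 holds
```

The recursion terminates for the same reason the definition is well formed: the graph has no directed cycle made only of multiplexes.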
A state is obviously an assignment of integer values to the variables $v$ of $V$ within the intervals $[0, b_v]$. According to a given state, by replacing variables by their values, $\overline{\varphi_m}$ becomes a propositional formula whose atoms are the results of the integer inequalities.

Definition 3.3. (States $\eta$, satisfaction relation $\models_N$ and resources $\rho$). Let $N$ be a grn and $V$ be its set of variables. A state of $N$ is a function $\eta : V \to \mathbb{N}$ such that $\eta(v) \leq b_v$ for all $v \in V$. Let $L$ be the set of propositional formulas whose atoms are of the form $v \geq s$ with $v \in V$ and $s \in \mathbb{N}^*$. The satisfaction relation $\models_N$ between a state $\eta$ of $N$ and a formula $\varphi$ of $L$ is inductively defined by:
• If $\varphi$ is an atom of the form $v \geq s$, then $\eta \models_N \varphi$ if $\eta(v) \geq s$.
• If $\varphi \equiv \psi_1 \wedge \psi_2$ then $\eta \models_N \varphi$ if $\eta \models_N \psi_1$ and $\eta \models_N \psi_2$; and we proceed similarly for the other connectives.
Given a variable $v \in V$, a multiplex $m \in N^-(v)$ (where $N^-(v)$ is the set of multiplexes $m$ such that $m \to v$ belongs to the interaction graph of $N$) is a resource of $v$ at state $\eta$ if $\eta \models_N \overline{\varphi_m}$. The set of resources of $v$ at state $\eta$ is
\[ \rho(\eta, v) = \{\, m \in N^-(v) \mid \eta \models_N \overline{\varphi_m} \,\}. \]

According to Fig. 1, at the state where $\eta(x) = 2$ and $\eta(y) = 1$, $\overline{\varphi_{\mu_2}}$ is satisfied and consequently $\mu_2$ is the only resource of $y$. On the contrary $\overline{\varphi_{\mu_1}}$ is false and consequently the set of resources of $x$ is empty.

The equilibrium toward which the expression level of a gene $v$ is attracted only depends on its set $\omega$ of resources. The interval number between 0 and $b_v$ containing this equilibrium is classically denoted $K_{v,\omega}$ [20,21,17,22,2,19].

Definition 3.4. A gene regulatory network (grn for short) is a tuple $N = (V, M, E_V, E_M, K)$ satisfying the following conditions:
• $R = (V, M, E_V, E_M)$ is a gene regulatory graph with multiplexes,
• $K = \{K_{v,\omega}\}$ is a family of integers indexed by $v \in V$ and $\omega \subset N^-(v)$, where $N^-(v)$ is the set of multiplexes $m$ such that $m \to v$ is an edge of $E_M$. Each $K_{v,\omega}$ must satisfy $0 \leq K_{v,\omega} \leq b_v$.

A usual abuse of notation is the following: we write $K_v$ instead of $K_{v,\emptyset}$ and we write $K_{v,m_1 m_2 \ldots}$ instead of $K_{v,\{m_1, m_2, \ldots\}}$.

At a given state $\eta$, each variable $v$ tries to evolve in the direction of parameter $K_{v,\rho(\eta,v)}$. Hence, at state $\eta$, $v$ can increase if $\eta(v) < K_{v,\rho(\eta,v)}$, it can decrease if $\eta(v) > K_{v,\rho(\eta,v)}$, and $v$ is stable if $\eta(v) = K_{v,\rho(\eta,v)}$.

In Fig. 2, at the state $(2,1)$, we have $K_x = 0 < \eta(x) = 2$ and $K_{y,\mu_2} = \eta(y) = 1$, but $(0,1)$ is not a successor state of $(2,1)$ because the protein degradation occurs one protein after the other and consequently the concentration level of $x$ cannot jump from 2 to 0. Consequently $(1,1)$ is the next state.

At $(1,0)$, both $K_x = 0 < \eta(x) = 1$ and $K_{y,\mu_2} = 1 > \eta(y) = 0$, but the probability for $x$ and $y$ to cross their threshold exactly at the same time is null [20,21,17,22,2,19].$^{1}$ Consequently, there are two possible next states: $(0,0)$ if $x$ crosses its threshold first and $(1,1)$ if $y$ crosses its threshold first.

So, Thomas method assumes that variables evolve asynchronously and by unit steps toward their respective target levels:
Definition 3.5 (State graph). Let $N = (V, M, E_V, E_M, K)$ be a grn. The state graph of $N$ is the directed graph $\mathcal{S}$ whose set of vertices is the set of states of $N$, and such that there exists an edge (called transition) $\eta \to \eta'$ if one of the following conditions is satisfied:
• For all variables $v \in V$ we have $\eta(v) = K_{v,\rho(\eta,v)}$, and then $\eta' = \eta$.
• There exists $v \in V$ such that $\eta(v) \neq K_{v,\rho(\eta,v)}$, and
\[ \eta'(v) = \begin{cases} \eta(v) + 1 & \text{if } \eta(v) < K_{v,\rho(\eta,v)} \\ \eta(v) - 1 & \text{if } \eta(v) > K_{v,\rho(\eta,v)} \end{cases} \qquad \text{and} \qquad \forall u \neq v,\ \eta'(u) = \eta(u). \]

For each variable $v$ such that $\eta(v) \neq K_{v,\rho(\eta,v)}$, there is a transition allowing $v$ to evolve ($\pm 1$) toward its focal level $K_{v,\rho(\eta,v)}$. Every outgoing transition of $\eta$ is supposed to be possible, so that there is non-determinism as soon as $\eta$ has several outgoing transitions. Fig. 2 represents a complete state graph.

4. Syntax of Hoare triples for gene networks
In order to formalize known information about a gene network, we introduce in this section a language to express propertiesofstates(assertions)andalanguagetoexpresspropertiesofstatetransitions(tracespecifications).
4.1. Assertions for discrete models of gene networks
Definition 4.1 (Terms and assertions). Let $N = (V, M, E_V, E_M, K)$ be a grn. The well formed terms for $N$ are inductively defined by:
• Each integer $n \in \mathbb{N}$ constitutes a well formed term.
• For each variable $v \in V$, the name of the variable $v$, considered as a symbol, constitutes a well formed term.
• Similarly, for each $v \in V$ and for each subset $\omega$ of $N^-(v)$, the symbol $K_{v,\omega}$ constitutes a well formed term.
• If $t$ and $t'$ are well formed terms then $(t + t')$ and $(t - t')$ are also well formed terms.

Let $N = (V, M, E_V, E_M, K)$ be a grn. The assertions for $N$ are inductively defined by:
• If $t$ and $t'$ are well formed terms then $(t = t')$, $(t < t')$, $(t > t')$, $(t \leq t')$ and $(t \geq t')$ are atomic assertions for $N$.
• If $\varphi$ and $\psi$ are assertions for $N$ then $\neg\varphi$, $(\varphi \wedge \psi)$, $(\varphi \vee \psi)$ and $(\varphi \Rightarrow \psi)$ are also assertions for $N$.

A state $\eta$ of the network $N$ satisfies an assertion $\varphi$ if and only if its interpretation is valid in $\mathbb{Z}$, after substituting each variable $v$ by $\eta(v)$ and each symbol $K_{v,\omega}$ by its value according to the family $K$. We note $\eta \models_N \varphi$.

Moreover, conventionally, we denote "$\top$" the tautology (e.g. "$1 = 1$").

4.2. Trace specifications for discrete models of gene networks
When biologists observe the dynamics of gene expression levels along a set of experiments, they extract, as a direct experimental knowledge, some sets of observed traces (see Fig. 3). It is consequently of first interest to see these sets of observations as basic elements for the specification of gene networks.
$^{1}$ Indeed, biologically, each threshold corresponds to a precise number of molecules produced by $x$ or $y$ respectively in the cell. So, there is a probability 0 for the degradation to make the number of $x$-molecules cross the $x$-threshold exactly at the same time as a new molecule produced by $y$ makes the $y$-threshold crossed (a sufficiently precise time scale will distinguish the two events).

Definition 4.2 (Trace specifications). Let $N = (V, M, E_V, E_M, K)$ be a grn. The set of trace specifications for $N$ is inductively defined by:
• For each $v \in V$ and $n \in [0, b_v]$ the expressions $v+$, $v-$ and $v := n$ are atomic trace specifications (respectively increase, decrease or assignment).
• If $e$ is an assertion for $N$, then the expression $assert(e)$ is an atomic trace specification.
• If $p_1$ and $p_2$ are trace specifications then $(p_1; p_2)$ is also a trace specification (sequential composition). Moreover the sequential composition is associative, so that we can write $(p_1; p_2; \cdots; p_n)$ without intermediate parentheses.
• If $p$ is a trace specification and if $e$ and $I$ are assertions for $N$, then (while $e$ with $I$ do $p$) is also a trace specification. The assertion $I$ is called the invariant of the while loop.
• If $p_1$ and $p_2$ are trace specifications then $\forall(p_1, p_2)$ and $\exists(p_1, p_2)$ are also trace specifications (quantifiers). Moreover the quantifiers are associative and commutative, so that we can write $\forall(p_1, p_2, \cdots, p_n)$ and $\exists(p_1, p_2, \cdots, p_n)$ as useful abbreviations.

Conventionally, we denote:
• $\varepsilon$ (called the empty trace) the trace specification $assert(\top)$.
• If $e$ then $p_1$ else $p_2$ (called conditional branching) the trace specification $\exists(assert(e); p_1,\ assert(\neg e); p_2)$, where $p_1$ and $p_2$ are any trace specifications and $e$ is an assertion for $N$.

Intuitively, $v+$ (resp. $v-$) means that the biologist has observed that the expression level of variable $v$ is increasing by one unit (resp. decreasing by one unit). $v := n$ means that the biologist has set the concentration level for gene $v$ to the value $n$ during the experiment (e.g. $v := 0$ for a knockout or $v := b_v$ for a saturation of the product of $v$). $assert(e)$ allows one to express a property of the current state without change of state. Sequential composition allows one to concatenate two trace specifications. The loop invariant $I$, as in classical Hoare logic, is a way to handle an unknown number of trace repetitions: it will facilitate proofs of Hoare triples. Finally it becomes possible to group together several trace specifications thanks to the quantifiers $\forall$ and $\exists$. These intuitions are formalized as follows via a binary relation between states and sets of states.

Notation 4.3. For a state $\eta$, a variable $v$ and $i \in [0, b_v]$, we note $\eta[v \leftarrow i]$ the state $\eta'$ such that $\eta'(v) = i$ and for all $u \neq v$, $\eta'(u) = \eta(u)$.

Definition 4.4 (Mathematical semantics of a trace specification). Let $N = (V, M, E_V, E_M, K)$ be a grn, let $\mathcal{S}$ be the state graph of $N$ whose set of vertices is denoted $S$ and let $p$ be a trace specification for $N$. The binary relation $\overset{p}{\leadsto}$ is the smallest subset of $S \times \mathcal{P}(S)$ such that, for any state $\eta$:
1. If $p$ is the atomic expression $v+$, then let us consider the state $\eta' = \eta[v \leftarrow (\eta(v) + 1)]$: If $\eta \to \eta'$ is a transition of $\mathcal{S}$ then $\eta \overset{p}{\leadsto} \{\eta'\}$.
2. If $p$ is the atomic expression $v-$, then let us consider the state $\eta' = \eta[v \leftarrow (\eta(v) - 1)]$: If $\eta \to \eta'$ is a transition of $\mathcal{S}$ then $\eta \overset{p}{\leadsto} \{\eta'\}$.
3. If $p$ is the atomic expression $v := i$, then $\eta \overset{p}{\leadsto} \{\eta[v \leftarrow i]\}$.
4. If $p$ is of the form $assert(e)$, if $\eta \models_N e$, then $\eta \overset{p}{\leadsto} \{\eta\}$.
5. If $p$ is of the form $\forall(p_1, p_2)$: If $\eta \overset{p_1}{\leadsto} E_1$ and $\eta \overset{p_2}{\leadsto} E_2$ then $\eta \overset{p}{\leadsto} (E_1 \cup E_2)$.
6. If $p$ is of the form $\exists(p_1, p_2)$: If $\eta \overset{p_1}{\leadsto} E_1$ then $\eta \overset{p}{\leadsto} E_1$, and if $\eta \overset{p_2}{\leadsto} E_2$ then $\eta \overset{p}{\leadsto} E_2$.
7. If $p$ is of the form $(p_1; p_2)$: If $\eta \overset{p_1}{\leadsto} F$ and if $\{E_e\}_{e \in F}$ is an $F$-indexed family of state sets such that $e \overset{p_2}{\leadsto} E_e$, then $\eta \overset{p}{\leadsto} (\bigcup_{e \in F} E_e)$.
8. If $p$ is of the form (while $e$ with $I$ do $p_0$):
• If $\eta \not\models_N e$ then $\eta \overset{p}{\leadsto} \{\eta\}$.
• If $\eta \models_N e$ and $\eta \overset{p_0; p}{\leadsto} E$ then $\eta \overset{p}{\leadsto} E$.

Detailed comments about this definition can be found in Supplementary materials Appendix A.
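As an illustration of Definition 4.4, the following Python sketch enumerates, for the network of Fig. 1 with the parameter values of Fig. 2, all the sets $E$ such that $\eta \overset{p}{\leadsto} E$ for a loop-free trace specification $p$. The tuple encoding of traces and the names `sem`, `focal`, `freeze`, `PHI`, `PREDECESSORS` and `K` are assumptions made for this example; while loops are deliberately omitted.

```python
from itertools import product

# Flattened multiplex formulas of Fig. 1, written as predicates on a state.
PHI = {
    "mu1": lambda s: s["x"] >= 2 and not s["y"] >= 1,
    "mu2": lambda s: s["x"] >= 1,
}
PREDECESSORS = {"x": ["mu1"], "y": ["mu2"]}          # N^-(v)
# Parameter values of Fig. 2, indexed by (variable, set of resources).
K = {
    ("x", frozenset()): 0, ("x", frozenset({"mu1"})): 2,
    ("y", frozenset()): 0, ("y", frozenset({"mu2"})): 1,
}

def focal(s, v):
    """Parameter K_{v, rho(s, v)} toward which v is attracted at state s."""
    omega = frozenset(m for m in PREDECESSORS[v] if PHI[m](s))
    return K[(v, omega)]

def freeze(s):
    """Hashable form of a state, so that states can be stored in sets."""
    return tuple(sorted(s.items()))

# Hypothetical trace encoding: ("inc", v), ("dec", v), ("set", v, i),
# ("assert", pred), ("forall", p1, p2), ("exists", p1, p2), ("seq", p1, p2).
def sem(p, s):
    """List of all sets E (frozensets of frozen states) such that s ~>_p E."""
    kind = p[0]
    if kind in ("inc", "dec"):
        v, k = p[1], focal(s, p[1])
        if kind == "inc" and s[v] < k:
            return [frozenset({freeze({**s, v: s[v] + 1})})]
        if kind == "dec" and s[v] > k:
            return [frozenset({freeze({**s, v: s[v] - 1})})]
        return []                                    # no such transition
    if kind == "set":
        return [frozenset({freeze({**s, p[1]: p[2]})})]
    if kind == "assert":
        return [frozenset({freeze(s)})] if p[1](s) else []
    if kind == "exists":
        return sem(p[1], s) + sem(p[2], s)
    if kind == "forall":
        return [e1 | e2 for e1 in sem(p[1], s) for e2 in sem(p[2], s)]
    if kind == "seq":
        results = []
        for f in sem(p[1], s):
            # One admissible set E_e must be chosen for every state e of F.
            choices = [sem(p[2], dict(e)) for e in f]
            for pick in product(*choices):
                results.append(frozenset().union(*pick))
        return results
    raise ValueError(f"unsupported trace specification: {kind!r}")

after_dec = sem(("dec", "x"), {"x": 2, "y": 1})
either = sem(("exists", ("dec", "x"), ("inc", "y")), {"x": 1, "y": 0})
both = sem(("forall", ("dec", "x"), ("inc", "y")), {"x": 1, "y": 0})
```

From the state $(1,0)$ the existential trace yields two singleton sets, one per possible successor, whereas the universal trace yields a single set containing both successors, which mirrors items 5 and 6 of the definition.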
4.3. Hoare triples

Similarly to Section 2, two assertions and one trace specification are used to constitute a Hoare triple for gene networks.
Definition 4.5. A Hoare triple for a grn $N$ is an expression of the form $\{P\}\ p\ \{Q\}$ where $P$ and $Q$ are assertions for $N$, called pre- and post-condition respectively, and $p$ is a trace specification for $N$.

In practice $P$ can describe a set of states where cells have been synchronised at the beginning of the experiment, for example all states for which the variable $v$ has value zero ($P \equiv (v = 0)$), the trace specification $p$ describes biologically observed dynamic processes, for example increase of the expression level of $v$ ($p \equiv v+$), and the postcondition also describes observations at the end of the experiment, for example all states for which the variable $v$ has value one ($Q \equiv (v = 1)$), and so on.

Fig. 3. A classical example of normalised expression profiles for three Boolean genes $a$, $b$ and $c$ resulting from an experimental campaign. Thresholds for each gene are tuned according to biological knowledge. Then the trace specification for this figure is $b-;\ a+;\ c+;\ a-;\ b+$.

More precisely we show in Fig. 3 a classical representation of expression profiles obtained after an experimental campaign. From our numerous case studies, it is a good heuristic to consider by default equidistributed thresholds (e.g. a threshold of 0.5 for Boolean genes). If necessary, some thresholds are tuned after discussing with biologists. Then, successive crossings between a gene profile and its threshold give directly the trace specification. In practice when two crossings are very close, a $\exists$ statement is used ($\exists(x+; y+,\ y+; x+)$) and the other primitives of trace specifications are often introduced in order to mix together and generalise several observed trace specifications.

Whether or not the triple is satisfied by a given gene network $N$ will depend on its state transition graph, thus it will depend on the parameter values in $K$.
Definition 4.6 (Semantics of a Hoare triple). Let $N = (V, M, E_V, E_M, K)$ be a grn and let $\mathcal{S}$ be the state graph of $N$ whose set of vertices is denoted $S$. A Hoare triple $\{P\}\ p\ \{Q\}$ is satisfied if and only if:
For all $\eta \in S$ satisfying $P$, there exists $E$ such that $\eta \overset{p}{\leadsto} E$ and for all $\eta' \in E$, $\eta'$ satisfies $Q$.
See Supplementary materials Appendix A for more details.

5. A Hoare logic for discrete models of gene networks
In this section, we define our genetically modified Hoare logic by giving the rule for each constructor of trace specifications (Definition 4.2). First, let us introduce a few conventional names to denote formulas that will be intensively used.
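Notation 5.1. For each variable $v$ of a grn $N$, we conventionally use the following notations:
1. For each subset $\omega$ of $N^-(v)$ we denote by $\Phi_v^\omega$ the following formula
\[ \Phi_v^\omega \equiv \Big( \bigwedge_{m \in \omega} \overline{\varphi_m} \Big) \wedge \Big( \bigwedge_{m \in N^-(v) \setminus \omega} \neg\overline{\varphi_m} \Big) \]
where $N^-(v) \setminus \omega$ stands for the complementary subset of $\omega$ in $N^-(v)$.
From Definition 3.3, for all states $\eta$, $\eta \models_N \Phi_v^\omega$ if and only if $\omega = \rho(\eta, v)$, that is, $\omega$ is the set of resources of $v$ at state $\eta$. Consequently, for each $v$, there exists a unique $\omega$ such that $\eta \models_N \Phi_v^\omega$.
2. We denote by $\Phi_v^+$ the following formula
\[ \Phi_v^+ \equiv \bigwedge_{\omega \subset N^-(v)} \big( \Phi_v^\omega \Longrightarrow K_{v,\omega} > v \big) \]
From Definition 3.5, we have $\eta \models_N \Phi_v^+$ if and only if there is a transition $(\eta \to \eta[v \leftarrow v + 1])$ in the state graph $\mathcal{S}$, that is, if and only if the variable $v$ can increase.
3. We denote by $\Phi_v^-$ the following formula
\[ \Phi_v^- \equiv \bigwedge_{\omega \subset N^-(v)} \big( \Phi_v^\omega \Longrightarrow K_{v,\omega} < v \big) \]
Similarly, $\eta \models_N \Phi_v^-$ if and only if the variable $v$ can decrease from the state $\eta$ in the state graph $\mathcal{S}$.
See Section 6 where examples of these formulas are given.

Our Hoare logic for discrete models of gene networks is then defined by the following inference rules, where $v$ is a variable of the grn and $k \in [0, b_v]$.

1. Rules encoding Thomas discrete dynamics.

Increase:
\[ \{\, \Phi_v^+ \wedge Q[v \leftarrow v + 1] \,\}\ v+\ \{Q\} \]
Decrease:
\[ \{\, \Phi_v^- \wedge Q[v \leftarrow v - 1] \,\}\ v-\ \{Q\} \]

2. Rules coming from Hoare logic. These rules are similar to the ones given in Section 2. Obvious rules for the expression $assert(\Phi)$, and for the quantifiers, are added:

Assert:
\[ \{\Phi \wedge Q\}\ assert(\Phi)\ \{Q\} \]
Universal quantifier:
\[ \frac{\{P_1\}\ p_1\ \{Q\} \qquad \{P_2\}\ p_2\ \{Q\}}{\{P_1 \wedge P_2\}\ \forall(p_1, p_2)\ \{Q\}} \]
Existential quantifier:
\[ \frac{\{P_1\}\ p_1\ \{Q\} \qquad \{P_2\}\ p_2\ \{Q\}}{\{P_1 \vee P_2\}\ \exists(p_1, p_2)\ \{Q\}} \]
Assignment:
\[ \{Q[v \leftarrow k]\}\ v := k\ \{Q\} \]
Sequential composition:
\[ \frac{\{P_1\}\ p_1\ \{P_2\} \qquad \{P_2\}\ p_2\ \{Q\}}{\{P_1\}\ p_1; p_2\ \{Q\}} \]
Iteration:
\[ \frac{\{e \wedge I\}\ p\ \{I\} \qquad \neg e \wedge I \Rightarrow Q}{\{I\}\ \text{while } e \text{ with } I \text{ do } p\ \{Q\}} \]
Empty trace:
\[ \frac{P \Rightarrow Q}{\{P\}\ \varepsilon\ \{Q\}} \]

3. Boundary axiom asserting that all values stay between their bounds, for each $v \in V$ and $\omega \subset N^-(v)$:
\[ 0 \leq v \ \wedge\ v \leq b_v \ \wedge\ 0 \leq K_{v,\omega} \ \wedge\ K_{v,\omega} \leq b_v \]

Remark 5.2.
• $(\Phi_v^+ \Rightarrow v < b_v)$ can be deduced from the boundary axioms: $\Phi_v^+$ implies that for $\omega$ corresponding to the current set of resources, $K_{v,\omega} > v$ and, using the boundary axiom $K_{v,\omega} \leq b_v$, we get $v < b_v$.
• Similarly, we have $(\Phi_v^- \Rightarrow v > 0)$.
These implications will be used in Section 6.

The conditional branching rule of the standard Hoare logic has not been reproduced here because the trace specification (if $e$ then $p_1$ else $p_2$) is a shorthand for $\exists(assert(e); p_1,\ assert(\neg e); p_2)$. The conditional branching rule remains sound.

We prove in Supplementary materials Appendix B that this modified Hoare logic is sound and complete and we show that the weakest loop invariants can always be computed. This implies the decidability of the (partial) correctness of any genetically modified Hoare triple. More precisely, the proof strategy called backward strategy, already described at the end of Section 2, also applies here: it automatically computes the loop invariants and the weakest precondition $W$ of the Hoare triple $\{P\}\ p\ \{Q\}$, and the implication $P \Rightarrow W$ is decidable.

Similarly to classical Hoare logic which reflects a partial correctness of imperative programs, the previous definition does not imply termination of while loops.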
6. Illustrative examples

6.1. Alon's interpretation of the incoherent feedforward loop of type 1

In [9,10] Uri Alon and co-workers have studied the most common in vivo patterns involving at most four genes. Among them, even without considering feedback loops such as in [23], there are interesting patterns whose dynamics is less obvious than it seems. In particular they have emphasized the incoherent feedforward loop of type 1. It is composed of a transcription factor $a$ that activates a second transcription factor $c$, and both $a$ and $c$ regulate a gene $b$. The gene $a$ is an activator of $b$ whereas the gene $c$ is an inhibitor of $b$. There is a "short" positive action of $a$ on $b$ and a "long" negative action via $c$: $a$ activates $c$ which inhibits $b$. The left hand side of Fig. 4 shows such a feedforward loop. Supposing that both thresholds of actions of $a$ are equal leads to a Boolean network since, in that case, the variable $a$ can take only the value 0 ($a$ has no action) or 1 ($a$ activates both $b$ and $c$). The right hand side of the figure shows the corresponding grn with multiplexes: $\sigma$ encodes the "short" action of $a$ on $b$, whilst $l$ followed by $\lambda$ constitutes the "long" action.

Fig. 4. (Left) Boolean "incoherent feedforward loop of type 1" according to Uri Alon. (Right) Corresponding grn $N = (V, M, E_V, E_M, K)$. $V = \{a, b, c\}$ with $b_a = b_b = b_c = 1$. $M = \{l, \lambda, \sigma\}$, $\varphi_l \equiv (a \geq 1)$, $\varphi_\lambda \equiv (\neg(c \geq 1))$, $\varphi_\sigma \equiv (a \geq 1)$. $K = \{K_a, K_c, K_{c,l}, K_b, K_{b,\sigma}, K_{b,\lambda}, K_{b,\sigma\lambda}\}$.

Classical interpretation: Uri Alon and many biologists have in mind that if $a$ is equal to 0 for a sufficiently long time, both $b$ and $c$ will also be equal to 0, because $b$ and $c$ need $a$ as a resource in order to reach the state 1. They also have in mind that the function of this feedforward loop is to ensure a transitory activity of $b$ that signals when $a$ has switched from 0 to 1. The idea is that $a$ activates the productions of $b$ and $c$, and then $c$ stops the production of $b$.

In the following subsections, we revisit this claim via four different trace specifications, and we prove formally that the claim is only valid under some constraints on the parameters of the network, and only under the assumption that $b$ starts its activity before $c$.
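For a network this small, the kind of parameter constraint discussed in this section can be cross-checked by brute force. The Python sketch below enumerates the $2^7$ Boolean parameterizations of the grn of Fig. 4 and keeps those for which the trace $b+;\ c+;\ b-$ can be followed from the state $a = 1$, $b = 0$, $c = 0$ and ends with $b = 0$. It is only an illustrative check under an assumed encoding (the names `RESOURCES`, `parameter_names`, `focal`, `run` and `compatible` are hypothetical); explicit enumeration is precisely what the Hoare logic approach is designed to avoid on larger networks.

```python
from itertools import product

# Flattened multiplex formulas of Fig. 4, grouped by the variable they regulate.
RESOURCES = {
    "a": {},
    "b": {"sigma": lambda s: s["a"] >= 1, "lambda": lambda s: not s["c"] >= 1},
    "c": {"l": lambda s: s["a"] >= 1},
}

def parameter_names():
    """All indices (v, omega) of the family K, with omega a subset of N^-(v)."""
    names = []
    for v, muxes in RESOURCES.items():
        ordered = sorted(muxes)
        for mask in product([False, True], repeat=len(ordered)):
            omega = frozenset(m for m, keep in zip(ordered, mask) if keep)
            names.append((v, omega))
    return names

def focal(K, s, v):
    """Parameter K_{v, rho(s, v)} for the parameterization K at state s."""
    omega = frozenset(m for m, phi in RESOURCES[v].items() if phi(s))
    return K[(v, omega)]

def run(K, start, trace):
    """Follow a sequence of unit steps; return the final state, or None
    as soon as one step is not a transition of the state graph."""
    s = dict(start)
    for v, delta in trace:
        k = focal(K, s, v)
        if (delta == +1 and s[v] < k) or (delta == -1 and s[v] > k):
            s[v] += delta
        else:
            return None
    return s

names = parameter_names()                       # the 7 parameters of Fig. 4
start = {"a": 1, "b": 0, "c": 0}                # precondition P
trace = [("b", +1), ("c", +1), ("b", -1)]       # b+ ; c+ ; b-

compatible = []
for values in product([0, 1], repeat=len(names)):
    K = dict(zip(names, values))
    end = run(K, start, trace)
    if end is not None and end["b"] == 0:       # postcondition Q0
        compatible.append(K)
```

Under this encoding, every surviving parameterization has $K_{b,\sigma\lambda} = 1$, $K_{c,l} = 1$ and $K_{b,\sigma} = 0$, while the four remaining parameters are unconstrained because the precondition fixes $a = 1$.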
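6.2. Is a transitory production of b possible?

The simple popular idea that $b$ is activated and then the activation of $c$ inhibits $b$ is specified by the Hoare triple $\{P\}\ P_1\ \{Q_0\}$ where $P \equiv (a = 1 \wedge b = 0 \wedge c = 0)$, $P_1 \equiv (b+;\ c+;\ b-)$ and $Q_0 \equiv (b = 0)$. The backward strategy using our genetically modified Hoare logic on this example gives the following successive conditions.

• The weakest precondition obtained through the last expression "$b-$" is $\Phi_b^- \wedge Q_0[b \leftarrow b - 1]$ (Decrease rule):
\[ \begin{cases} \Phi_b^{\emptyset} \Rightarrow K_b < b \\ \Phi_b^{\sigma} \Rightarrow K_{b,\sigma} < b \\ \Phi_b^{\lambda} \Rightarrow K_{b,\lambda} < b \\ \Phi_b^{\sigma\lambda} \Rightarrow K_{b,\sigma\lambda} < b \\ b - 1 = 0 \end{cases} \ \equiv\ \begin{cases} (\neg\neg(c \geq 1) \wedge \neg(a \geq 1)) \Rightarrow K_b < b \\ (\neg\neg(c \geq 1) \wedge (a \geq 1)) \Rightarrow K_{b,\sigma} < b \\ (\neg(c \geq 1) \wedge \neg(a \geq 1)) \Rightarrow K_{b,\lambda} < b \\ (\neg(c \geq 1) \wedge (a \geq 1)) \Rightarrow K_{b,\sigma\lambda} < b \\ b - 1 = 0 \end{cases} \]
which simplifies as
\[ Q_1 \equiv \begin{cases} b = 1 \\ ((c \geq 1) \wedge (a < 1)) \Longrightarrow K_b = 0 \\ ((c \geq 1) \wedge (a \geq 1)) \Longrightarrow K_{b,\sigma} = 0 \\ ((c < 1) \wedge (a < 1)) \Longrightarrow K_{b,\lambda} = 0 \\ ((c < 1) \wedge (a \geq 1)) \Longrightarrow K_{b,\sigma\lambda} = 0 \end{cases} \]

• Then, the weakest precondition obtained through the expression "$c+$" is $\Phi_c^+ \wedge Q_1[c \leftarrow c + 1]$:
\[ \begin{cases} \neg(a \geq 1) \Rightarrow K_c > c \\ a \geq 1 \Rightarrow K_{c,l} > c \\ b = 1 \\ ((c + 1 \geq 1) \wedge (a < 1)) \Rightarrow K_b = 0 \\ ((c + 1 \geq 1) \wedge (a \geq 1)) \Rightarrow K_{b,\sigma} = 0 \\ ((c + 1 < 1) \wedge (a < 1)) \Rightarrow K_{b,\lambda} = 0 \\ ((c + 1 < 1) \wedge (a \geq 1)) \Rightarrow K_{b,\sigma\lambda} = 0 \end{cases} \]
which simplifies as
\[ Q_2 \equiv \begin{cases} c = 0 \\ a < 1 \Rightarrow K_c = 1 \\ a \geq 1 \Rightarrow K_{c,l} = 1 \\ b = 1 \\ a < 1 \Rightarrow K_b = 0 \\ a \geq 1 \Rightarrow K_{b,\sigma} = 0 \end{cases} \]
using the boundary axioms and Remark 5.2.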
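• Lastly, the weakest precondition obtained through the first "$b+$" of the trace is $\Phi_b^+ \wedge Q_2[b \leftarrow b + 1]$ which simplifies as $Q_3$, the conjunction of:
$a < 1 \Rightarrow K_{b,\lambda} = 1$
$a \geq 1 \Rightarrow K_{b,\sigma\lambda} = 1$
$c = 0$
$a < 1 \Rightarrow K_c = 1$
$a \geq 1 \Rightarrow K_{c,l} = 1$
$b = 0$
$a < 1$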