Biochemical Programs and Mixed Analog-Digital Algorithms in the Cell
François Fages
Inria Saclay – Ile de France EPI Lifeware
France
http://lifeware.inria.fr/
Outline
1. Formal biochemical reaction systems as a programming language
2. Functional specifications: example of the MAPK signaling network module
3. Computability and computational complexity of analog circuits
4. Biochemical compiler
5. Examples of function/program compilation into biochemical reactions
6. Conclusion and perspectives
MAPK Input/Output Function
Dose-response diagrams alias Bifurcation diagrams
biocham: load(library:examples/mapk/mapk).
biocham: dose_response('E1',1.0e-6,1e-4,200,'PP_K').
MAPK has (at least) the function of an analog/digital converter in the cell.
How to program $\frac{x^n}{c + x^n}$ with reactions?
Hill function $\frac{x^n}{c + x^n}$
[Huang Ferrell 96 PNAS]
n ≈ 4.9 at 3rd level
n ≈ 1.7 at 2nd level
n = 1 at 1st level (Michaelis-Menten)
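To make the effect of the Hill coefficient n concrete, here is a minimal Python sketch. It assumes the Hill function has the form x^n / (c + x^n), with c a positive constant; the function names are hypothetical and this is not the MAPK model itself. It computes the ratio between the inputs giving 90% and 10% of the maximal response, which for this form equals 81^(1/n): the larger n, the more switch-like (digital) the response.

```python
def hill(x, n, c=1.0):
    """Assumed Hill function x^n / (c + x^n), for x >= 0 and c > 0."""
    xn = x ** n
    return xn / (c + xn)


def dose_for_response(r, n, c=1.0):
    """Input x at which hill(x, n, c) equals r, for 0 < r < 1."""
    return (c * r / (1.0 - r)) ** (1.0 / n)


def switch_ratio(n, c=1.0):
    """Ratio of the inputs giving 90% and 10% of the maximal response."""
    return dose_for_response(0.9, n, c) / dose_for_response(0.1, n, c)


if __name__ == "__main__":
    # Hill coefficients quoted on the slide for the three MAPK levels.
    for n in (1.0, 1.7, 4.9):
        print(f"n = {n}: x90/x10 = {switch_ratio(n):.2f}")
```

A smaller ratio means a narrower input range separates the "off" and "on" regimes, which is the analog/digital conversion discussed above.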
How to Program a function f : 𝑅 → 𝑅 with Reactions ?
First idea: GPAC f(t) General Purpose Analog Computer [Shannon 41]
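Inputs: t, x, y, z, …
Circuit built from
1. Constant unit
2. Sum unit
3. Product unit
4. Integral ∫ 𝑥 𝑑𝑦 unit
Reaction encodings of the units (arrows labeled with their rates):
Product: $x + y \xrightarrow{k \cdot x \cdot y} x + y + z$, $z \xrightarrow{k \cdot z} \emptyset$
$dz/dt = k(xy - z)$, $= 0$ when $z = xy$
Integral: $x \xrightarrow{x} x + z$
$dz/dt = x$, $z = \int x \, dt$
Sum: $x \xrightarrow{k \cdot x} x + z$, $y \xrightarrow{k \cdot y} y + z$, $z \xrightarrow{k \cdot z} \emptyset$
$dz/dt = k(x + y - z)$, $= 0$ when $z = x + y$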
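The behaviour of these units can be checked numerically. The following Python sketch integrates the three ODEs above with a simple explicit Euler scheme for constant inputs x and y. The rate constant k = 10, the step size and the function name are illustrative assumptions, not values from the slides.

```python
def simulate_units(x, y, k=10.0, t_end=5.0, dt=1e-3):
    """Euler integration of the sum, product and integral units.

    x and y are held constant; all outputs start at 0.
    Returns (z_sum, z_prod, z_int) at time t_end.
    """
    z_sum = z_prod = z_int = 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        z_sum += dt * k * (x + y - z_sum)    # dz/dt = k(x + y - z)
        z_prod += dt * k * (x * y - z_prod)  # dz/dt = k(xy - z)
        z_int += dt * x                      # dz/dt = x
    return z_sum, z_prod, z_int


if __name__ == "__main__":
    z_sum, z_prod, z_int = simulate_units(2.0, 3.0)
    print(z_sum, z_prod, z_int)
```

The sum and product outputs only reach x + y and xy asymptotically, at a speed set by k, whereas the integral unit is exact at every time. This asymptotic convergence is one source of the accumulated errors along a circuit.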
GPAC Design of the Cosine Oscillator
cos(t) is the solution of the ODE y''(t) = -y(t) with y(0)=1 and y'(0)=0
1. GPAC circuit:
2. Rewriting in pseudo reactions
biocham: present(y,1).
biocham: compile_wgpac(y::integral integral-1*y).
present(x2,-1).
fast*[x2]*[y] for _=[x2+y]=>x1.
fast*[x1]for x1=>_.
_=[x1]=>x0.
_=[x0]=>y.
• Problem: negative concentration values (x2 = -1)
• Problem: GPAC circuits with no solution, or with several
• The errors for + and * accumulate along the circuit; this may not scale up.
• Better design?
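The oscillator ODE itself is easy to check outside Biocham. The sketch below, in Python with a hand-written fourth-order Runge-Kutta step (an illustrative choice of integrator, not the one used by Biocham), rewrites y'' = -y as the first-order system y' = v, v' = -y with y(0) = 1, v(0) = 0.

```python
import math


def rk4_step(f, state, dt):
    """One classical Runge-Kutta step for state' = f(state)."""
    k1 = f(state)
    k2 = f([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = f([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = f([s + dt * k for s, k in zip(state, k3)])
    return [s + dt / 6.0 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def oscillator(state):
    """y'' = -y written as y' = v, v' = -y."""
    y, v = state
    return [v, -y]


def simulate_cosine(t_end, dt=1e-3):
    """Integrate from y(0) = 1, y'(0) = 0 and return y(t_end)."""
    state = [1.0, 0.0]
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(oscillator, state, dt)
    return state[0]


if __name__ == "__main__":
    print(simulate_cosine(2.0), math.cos(2.0))
```

Both y and v take negative values along the trajectory, which is exactly the difficulty raised above when the variables are read as concentrations.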
Computable Analysis
Church-Turing thesis: single notion of computability, universal Turing machines
Definition. A real number r is computable if there exists a Turing machine with
Input: precision $p \in \mathbb{N}$
Output: rational number $q \in \mathbb{Q}$ with $|r - q| < 2^{-p}$
Example. Rational numbers, limits of computable Cauchy sequences: π, e, …
Definition. A real function $f: \mathbb{R} \to \mathbb{R}$ is computable if there exists a Turing machine that computes f(x) with an oracle for x.
Example. Polynomials, trigonometric functions, …
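As an illustration of the definition of a computable real number, the following Python sketch plays the role of the Turing machine for r = e: given a precision p, it returns a rational q with |e - q| < 2^-p. It uses exact rational arithmetic and the series e = Σ 1/k!, whose remainder after the term 1/k! is below 2/(k+1)!. The function name is hypothetical.

```python
from fractions import Fraction


def approx_e(p):
    """Return a rational q with |e - q| < 2**-p, for an integer p >= 0."""
    bound = Fraction(1, 2 ** p)
    total = Fraction(0)
    term = Fraction(1)  # invariant: term == 1/k!
    k = 0
    while True:
        total += term
        # Remainder sum_{j>k} 1/j! is below 2/(k+1)! = 2*term/(k+1).
        if 2 * term / (k + 1) < bound:
            return total
        k += 1
        term /= k


if __name__ == "__main__":
    q = approx_e(30)
    print(q, float(q))
```

The same pattern (a program mapping a requested precision to a rational approximation) is what the oracle provides for the input x in the definition of a computable real function.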
Counter-example. ⌈x⌉ is not computable (undecidable on x=1.000…)
GPAC-generable & GPAC-computable Functions
Definitions by Polynomial Initial Value Problems (PIVP) instead of GPAC circuits:
Definition. [Graça Costa 03 complexity] A real function $f: \mathbb{R} \to \mathbb{R}$ is PIVP-generable if there exist a vector of polynomials $p \in \mathbb{R}^n[\mathbb{R}^n]$, initial values $y(0)$ and a function $y: \mathbb{R} \to \mathbb{R}^n$ such that $y'(t) = p(y(t))$ and $f(t) = y_1(t)$.
Example. Oscillator cos(t)
Definition. [Graça Costa 03 complexity] A real function $f: \mathbb{R} \to \mathbb{R}$ is PIVP-computable if there exist vectors of polynomials $p \in \mathbb{R}^n[\mathbb{R}^n]$ and $q \in \mathbb{R}^n[\mathbb{R}]$ and a function $y: \mathbb{R} \to \mathbb{R}^n$ such that $y'(t) = p(y(t))$, $y(0) = q(x)$ and $\lim_{t \to \infty} y_1(t) = f(x)$.
Example. MAPK function PP_K(RAS)
Theorem. [Bournez Campagnolo Graça Hainry 06] A real function is computable iff it is PIVP-computable.
Theorem. [Pouly 15 PhD thesis] A real function is computable in Ptime iff it is PIVP-computable with a trajectory of polynomial length.
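A small worked instance may help with the definition of PIVP-computability. The Python sketch below uses a hypothetical two-variable PIVP, chosen for illustration and not taken from the slides: y1' = y2² - (1 + y2²)·y1 and y2' = 0, with initial values q(x) = (0, x). Its first component converges to f(x) = x² / (1 + x²). The Runge-Kutta integrator, horizon and step size are assumptions of the sketch.

```python
def rk4_step(f, state, dt):
    """One classical Runge-Kutta step for state' = f(state)."""
    k1 = f(state)
    k2 = f([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = f([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = f([s + dt * k for s, k in zip(state, k3)])
    return [s + dt / 6.0 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def pivp(state):
    """Polynomial vector field p(y): y2 stores the input x and stays constant."""
    y1, y2 = state
    return [y2 * y2 - (1.0 + y2 * y2) * y1, 0.0]


def compute_f(x, t_end=30.0, dt=1e-2):
    """Approximate lim y1(t) by integrating from y(0) = q(x) = (0, x)."""
    state = [0.0, x]
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(pivp, state, dt)
    return state[0]


if __name__ == "__main__":
    for x in (0.0, 0.5, 2.0):
        print(x, compute_f(x), x * x / (1.0 + x * x))
```

Here the input enters only through the initial condition and the result is read as a limit, as in the definition; how long one must integrate to reach a given precision is the kind of quantity measured by the trajectory-length notion of complexity.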
Biochemically Computable Functions 1/2
Theorem. (positive systems) Any PIVP-computable function can be encoded by a PIVP on $\mathbb{R}^+$ of double dimension, preserving polynomial time complexity.
Proof idea. Encode $y_i \in \mathbb{R}$ by $y_i^+, y_i^- \in \mathbb{R}^+$ such that $y_i = y_i^+ - y_i^-$ at each time.
Let $p_i(y_1^+, y_1^-, \dots, y_n^+, y_n^-) = p_i[y_j := y_j^+ - y_j^-]$ and $p_i = p_i^+ - p_i^-$.
$y_i^{+\prime} = p_i^+ - f_i \, y_i^+ y_i^-$, $y_i^+(0) = \max(0, y_i(0))$
$y_i^{-\prime} = p_i^- - f_i \, y_i^+ y_i^-$, $y_i^-(0) = \max(0, -y_i(0))$
where the $f_i$ are positive-coefficient polynomials such that $f_i > \max(p_i^+, p_i^-)$.
Implementation by biochemical reactions?
• Fast annihilation reactions: $y_i^+ + y_i^- \xrightarrow{f_i} \emptyset$
• n-ary catalytic synthesis reactions for each monomial $m_{i,j}^+$ in $p_i^+$, $m_{i,j}^-$ in $p_i^-$:
$M_{i,j}^+ \xrightarrow{m_{i,j}^+} y_i^+ + M_{i,j}^+$
$M_{i,j}^- \xrightarrow{m_{i,j}^-} y_i^- + M_{i,j}^-$
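The positive encoding can be tried on the cosine oscillator y' = z, z' = -y. In the Python sketch below each variable is split into a positive and a negative part, and a constant annihilation rate FAST = 10 is used for simplicity. That constant, the integrator and all names are assumptions of this sketch; the theorem allows polynomial rates f_i. The difference y+ - y- should follow cos(t) while every component stays nonnegative.

```python
import math

FAST = 10.0  # assumed constant annihilation rate (illustrative)


def rk4_step(f, state, dt):
    """One classical Runge-Kutta step for state' = f(state)."""
    k1 = f(state)
    k2 = f([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = f([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = f([s + dt * k for s, k in zip(state, k3)])
    return [s + dt / 6.0 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def dual_rail(state):
    """y' = z gives p_y+ = z+, p_y- = z-; z' = -y gives p_z+ = y-, p_z- = y+."""
    yp, ym, zp, zm = state
    ann_y = FAST * yp * ym  # annihilation y+ + y- -> nothing
    ann_z = FAST * zp * zm  # annihilation z+ + z- -> nothing
    return [zp - ann_y, zm - ann_y, ym - ann_z, yp - ann_z]


def simulate(t_end, dt=1e-3):
    """Return (y+ - y-, smallest component value seen) at time t_end."""
    state = [1.0, 0.0, 0.0, 0.0]  # y(0) = 1 is encoded as y+ = 1, y- = 0
    lowest = 0.0
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(dual_rail, state, dt)
        lowest = min(lowest, min(state))
    return state[0] - state[1], lowest


if __name__ == "__main__":
    value, lowest = simulate(6.0)
    print(value, math.cos(6.0), lowest)
```

The annihilation terms cancel in the difference y+ - y-, so they do not change the encoded value; their only role is to keep both parts from growing without bound.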
Biochemically Computable Functions 2/2
Theorem. (binary reactions) [Carothers Parker Sochacki Warne 2005]
Any PIVP can be encoded by a PIVP of degree ≤ 2.
Proof idea. Introduce a variable $v_{i_1,\dots,i_n}$ for each monomial $y_1^{i_1} \cdots y_n^{i_n}$.
We have $y_1 = v_{1,0,\dots,0}$, $y_2 = v_{0,1,0,\dots,0}$, …
$y_i'$ is of degree one in the $v_{i_1,\dots,i_n}$.
$v_{i_1,\dots,i_n}' = \sum_{k=1}^{n} i_k \, v_{i_1,\dots,i_k - 1,\dots,i_n} \, y_k'$ is of degree at most 2.
(Trade high dimension for low degrees…)
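A one-variable example shows the construction at work. For the cubic ODE y' = -y³ with y(0) = 1 (a hypothetical example, not from the slides), introduce v2 = y² and v3 = y³. The rule above gives y' = -v3, v2' = 2·y·y' = -2·y·v3 and v3' = 3·v2·y' = -3·v2·v3, a system of degree 2 in three variables. The Python sketch below integrates it and compares with the closed form y(t) = (1 + 2t)^(-1/2).

```python
def rk4_step(f, state, dt):
    """One classical Runge-Kutta step for state' = f(state)."""
    k1 = f(state)
    k2 = f([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = f([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = f([s + dt * k for s, k in zip(state, k3)])
    return [s + dt / 6.0 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def quadratic_system(state):
    """Degree-2 encoding of y' = -y^3 with v2 = y^2 and v3 = y^3."""
    y, v2, v3 = state
    return [-v3, -2.0 * y * v3, -3.0 * v2 * v3]


def simulate(t_end, dt=1e-3):
    """Integrate from y(0) = 1, so v2(0) = v3(0) = 1; return [y, v2, v3]."""
    state = [1.0, 1.0, 1.0]
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(quadratic_system, state, dt)
    return state


if __name__ == "__main__":
    y, v2, v3 = simulate(2.0)
    print(y, (1.0 + 2.0 * 2.0) ** -0.5, v2 - y * y, v3 - y ** 3)
```

The auxiliary variables track the monomials they stand for along the whole trajectory, so the degree is lowered at the cost of two extra dimensions.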
Theorem. Elementary reaction systems on finite sets of molecules (no polymerization) are Turing-complete under the differential semantics.
Reaction systems are not Turing-complete in the discrete semantics (Petri Nets) unless adding unbounded complexation (polymerization) [Cardelli Zavattaro 2008] or unbounded membranes [Busi Gorrieri 2006, Paun 2000, Berry Boudol 92].
Biochemical Oscillator/Function Cosine
biocham: compile_from_expression(cos,time,cos(t)).
present(cos(t)_p,1).
_=[z_p_2]=>cos(t)_p.
_=[z_m_2]=>cos(t)_m.
_=[cos(t)_m]=>z_p_2.
_=[cos(t)_p]=>z_m_2.
biocham: present(x_p, 4).
biocham: compile_from_expression(cos,x,cos(x)).
present(cos(x)_p, 1).
_=[g_m]=>g_p. _=[g_m+cos(x)_m]=>z_p_4.
_=[x_p]=>g_p. _=[g_p+cos(x)_p]=>z_p_4.
_=[g_p]=>g_m. _=[x_p+cos(x)_m]=>z_p_4.
_=[x_m]=>g_m. _=[x_m+cos(x)_p]=>z_p_4.
_=[g_m+z_p_4]=>cos(x)_p. _=[g_m+cos(x)_p]=>z_m_4.
_=[g_p+z_m_4]=>cos(x)_p. _=[g_p+cos(x)_m]=>z_m_4.
_=[x_m+z_m_4]=>cos(x)_p. _=[x_m+cos(x)_m]=>z_m_4.
_=[x_p+z_p_4]=>cos(x)_p. _=[x_p+cos(x)_p]=>z_m_4.
_=[g_m+z_m_4]=>cos(x)_m. _=[x_p+cos(x)_p]=>z_m_4.
_=[g_p+z_p_4]=>cos(x)_m. _=[x_m+z_p_4]=>cos(x)_m.
Compilation of MAPK Sigmoid Function
• Manual solution:
Elegant, but exponential time complexity at x=0 (exponential amplitude on auxiliary component y1)
• Generic compiler-generated solution:
biocham: compile_from_expression(id*id/(1+id*id),time,sigmoid).
59 reactions on 20 species
Polynomial time, yet one component diverges
• MAPK: 30 reactions, no divergence
Logical Preconditions on Reactions
[Senum Riedel 2010]
Conjunction
Disjunction
Negation: better implementation with absence indicator
Implementation of Sequentiality and Program Control Flow by Absence Indicator
[Huang Jiang Huang Cheng 2012 ICCAD]
[Huang Huang Chiang Jiang Fages 2013 IWBDA]
Specification and Compilation of… Life
while true {growing; replication; verification; mitosis}
Compilation of sequentiality and loop with program control variables: 50 reactions
Cyclin D, Cyclin E, Cyclin A, Cyclin B [Paul Nurse 2001]
Cyclins as necessary markers for implementing sequentiality
Enzymatic Implementation in non-living Protocells
Formal synthesis/degradation reactions as enzymatic activation reactions
DNA-free microfluidic vesicles
[Franck Molina, CNRS Sys2diag]
Differential Diagnosis of Diabetes
[Courbet Amar Fages Renard Molina 2017]
Conclusion
• Binary reaction systems over a finite set of molecules (without polymerization) are Turing-complete under the differential semantics
– PIVP definition of computable function
– Notion of computational complexity as trajectory length of stabilizing PIVPs
• Generic biochemical compiler of i/o functions
– Input: function specification by PIVP, mixed digital-analog program
– Output: system of binary reactions with mass action law kinetics
– Exact characterization of the result for an ideal fluid implementation
• The natural MAPK program has a better complexity than our generated code
– Comparison to natural circuits (BioModels repository)
– Code optimization? Refactoring? Alternative schemes?
• Formal ground for modeling evolution: Circuit ⟷ Function
– Probably Approximately Correct [Valiant 2013]
↻ mutation
Conclusion of the Course on some Hot Research Topics and Thesis Subjects
1. Compilation of controllers and programs in biochemical reactions
– Design methods for digital and analog systems
– Microfluidic implementation of artificial reaction programs in protocells
2. Machine learning models from traces
– Probably Approximately Correct learning
– Artificial evolution of reaction systems
3. Feedback loops in reaction systems
– Thomas’ rules for multi-stationarity in reaction systems
– Implementation of Soliman’s theorem
4. Model reduction
– Tropical algebra methods for identifying slow/fast regimes
– Symbolic computation
5. Multi-scale stochastic and deterministic models
– Hybrid stochastic-deterministic simulation
– Model reduction