HAL Id: hal-01968453
https://hal.archives-ouvertes.fr/hal-01968453
Submitted on 2 Jan 2019
HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
Eric Bapteste, Philippe Huneman
To cite this version:
Eric Bapteste, Philippe Huneman. Towards a Dynamic Interaction Network of Life to unify and expand the evolutionary theory. BMC Biology, BioMed Central, 2018, 16 (1), ⟨10.1186/s12915-018-0531-6⟩. ⟨hal-01968453⟩
OPINION
Open Access
Towards a Dynamic Interaction Network of Life to unify and expand the evolutionary theory
Eric Bapteste1,2* and Philippe Huneman3
Abstract
The classic Darwinian theory and the Synthetic evolutionary theory and their linear models, while invaluable to study the origins and evolution of species, are not primarily designed to model the evolution of organisations, typically that of ecosystems, nor that of processes. How could evolutionary theory better explain the evolution of biological complexity and diversity? Inclusive network-based analyses of dynamic systems could retrace interactions between (related or unrelated) components. This theoretical shift from a Tree of Life to a Dynamic Interaction Network of Life, which is supported by diverse molecular, cellular, microbiological, organismal, ecological and evolutionary studies, would further unify evolutionary biology.
Keywords: Evolutionary biology, Interactions, Theoretical biology, Tree of Life, Web of Life
Deciphering diversity through evolution
The living world is nested and multilevel, involves multiple agents and changes at different timescales. Evolutionary biology tries to characterize the dynamics responsible for such complexity to decipher the processes accounting for the past and extant diversity observed in molecules (namely, genes, RNA, proteins), cellular machineries, unicellular and multicellular organisms, species, communities and ecosystems. In the 1930s and 1940s, a unified framework to handle this task was built under the name of Modern Synthesis [1]. It encompassed Darwin’s idea of evolution by natural selection as an explanation for diversity and adaptation and Mendel’s idea of particulate inheritance, giving rise to population and quantitative genetics, a theoretical frame that corroborated Darwin’s hypothesis of the paramount power of selection for driving adaptive evolution [2]. This framework progressively aggregated multiple disciplines: behavioural ecology, microbiology, paleobiology, etc. Overall, this classic framework considers that the principal agency of evolution is natural selection of favourable variations, and that those variations are constituted by random mutations and recombination in a Mendelian population. The processes of microevolution, modelled by population and quantitative genetics, are likely to be extrapolated to macroevolution [3]. To this extent, models that focus on one or two loci are able to capture much of the evolutionary dynamics of an organism, even though in reality many interdependencies between thousands of loci (epistasis, dominance, etc.) occur as the basis of the production and functioning of a phenotypic trait. Among forces acting on populations and modelled by population geneticists, natural selection is the one that shapes traits as adaptations and the design of organisms; adaptive radiation then explains much of the diversity; and common descent from adapted organisms explains most of the commonalities across living forms (labelled homologies), and allows for classifying living beings into phylogenetic trees. Evolution is gradual because the effects of mutations are generally small, large ones being most likely to be deleterious as theorized by Fisher’s geometric model [4].
* Correspondence: eric.bapteste@upmc.fr
1 Sorbonne Universités, UPMC Université Paris 06, Institut de Biologie Paris-Seine (IBPS), F-75005 Paris, France
2 CNRS, UMR7138, Institut de Biologie Paris-Seine, F-75005 Paris, France
Full list of author information is available at the end of the article
Many theoretical divergences surround this core view: not everyone agrees that evolution is change in allele frequencies, or that population genetics captures the whole of the evolutionary process, or that the genotypic viewpoint (tracking the dynamics of genes as ‘replicators’ [5] or the strategy ‘choices’ of organisms as fitness-maximizing agents [6]) should be favoured to understand evolution. Nevertheless, it has been a powerful enough framework to drive successful research programs on speciation, adaptation, phylogenies, evolution of sex, cooperation, altruism, mutualism, etc., and to incorporate apparent challenges such as neutral evolution [7], acknowledgement of constraints on variation [8], or the recent theoretical turn from genetics to genomics following the completion of the Human Genome Project [9]. Causation is here overall conceived of as a linear causal relation of a twofold nature: from the genotype to the phenotype (assuming of course environmental parameters), and from the environment to the shaping of organisms via natural selection. For instance, in the classic case of evolution of peppered moths in urban forests at the time of the industrial revolution, trees became darkened with soot, and then natural selection favored darker morphs as ‘fitter’ ones, due to their being less easily detected by predator birds, resulting in a relative increase in frequency of the darker morphs in the population [10].
© Bapteste et al. 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Yet in the last 15 years biologists and philosophers of biology have regularly questioned the genuinely unifying character of this Synthesis, as well as its explanatory accuracy [11]. Those criticisms questioned notably the set of objects privileged by the Modern Synthesis, arguably too gene-centered [12], and its key explanatory processes, since niche construction [13], lateral gene transfer [14,15], phenotypic plasticity [16,17], and mass extinction [18] could, for example, be added [11]. Usually these critiques emphasize aspects rooted in a particular biological discipline: lateral gene transfer from microbiology, plasticity from developmental biology, mass extinction from paleobiology, ecosystem engineering from functional ecology, etc. There were also recurring claims for novel transdisciplinary fields: evo-eco-devo [19], investigating the evolutionary dynamics of host and microbe associations (forming combinations often referred to as holobionts), evolutionary cell biology [20], or microbial endocrinology [21], among others. This latter discipline aims at understanding the evolved interactions between microbial signals and host development. Indeed, it is compelling for evolutionary biologists to decipher how such multi-species interactions became established (namely, whether they involved specific microbial species and molecules, and whether they evolved independently in different host lineages).
Evolutionary biology is thus currently undergoing various theoretical debates concerning the proper frame to formulate it [11,22–24]. Here, we introduce an original solution which moves this debate forward, acknowledging that nothing on Earth evolves and makes sense in isolation, thereby challenging the key assumption of the Modern Synthesis framework that targeting the individual gene or organism (even when in principle knowing that it is part of a set of complex interactions) allows us to capture evolution in all its dimensions. Since the living world evolves as a dynamic network of interactions, we argue that evolutionary biology could become a science of evolving networks, which would allow biologists to explain organisational complexity, while providing a novel way to reframe and to unify evolutionary biology.
Biology is regulated by networks
Networks at the molecular level
Although numerous studies have focused on the functions of individual genes, proteins and other molecules, it is increasingly clear that each of these functions belongs to complex networks of interactions. Starting at the molecular scale, the importance of a diversity of molecular agents, such as (DNA-based) genes and their regulatory sequences, RNAs and proteins, is well recognized. Importantly, in terms of their origins and modes of evolution, these agents are diverse. Genes are replicated across generations, via the recruitment of bases along a DNA template, thereby forming continuous lineages, affected by Darwinian evolution. By contrast, proteins are reconstructed by recruitment of amino acids at the ribosomal machinery. There is no physical continuity between generations of proteins, and thus no possibility for these agents to directly accumulate beneficial mutations [25]. Moreover, all these molecular entities are compositionally complex, in the sense that they are made of inherited or reassembled parts. E pluribus unum: genes and proteins are (often) conglomerates of exons, (possibly) introns [26–28], and domains [29–31]. Similar claims can be made about composite molecular systems, such as CRISPR and Casposons [32,33], etc. This modular organisation has numerous consequences: among them, genes can be nested within genes [34]; proteins congregate in larger complexes [35]. Importantly, this modularity is not the mere result of a divergence from a single ancestral form, but also involves combinatorial processes and molecular tinkering of available genetic material [36–38]. The coupling and decoupling of molecular components can operate randomly, as in cases of presuppression proposed to neutrally lead to large molecular complexes [39–41]. Presuppression, also known as constructive neutralism, is a process that generates complexity by mechanically increasing dependencies between interacting molecules, in the absence of positive selection. When a deleterious mutation affects one molecular partner, existing properties of another molecule with which the mutated molecule already interacted can compensate for its partner’s defect. Presuppression operates like a ratchet, since the likelihood of restoring the original independence between molecules (by reverting the deleterious mutation) is lower than the likelihood of moving away from this original state (by accumulating other mutations). Molecular associations can also evolve under constraints [42], eventually reinforcing the relationships between molecular partners, as suggested [...]
Consistently, interconnectedness is a striking feature of the molecular world [46,47]. Genes belong to regulatory networks with feedback loops [48]. Proteins belong to protein–protein interaction networks. This systemic view contrasts with former atomistic views assigning one function to one gene. First, it is not always correct that a gene produces only one protein, as shown by alternative splicing. Second, it is also unlikely that a protein performs one function, because no protein acts alone. Rather, biological traits result from co-production processes. This is nicely illustrated by the actual process of translation, during which both proteins and DNA necessarily interact, allowing for the collective reproduction of these two types of molecular agents. How these different components became so tightly integrated is a central issue for explaining evolution. Understanding how the molecular world functions and evolves therefore requires analysing molecular organisation and the evolution of the architecture of interaction networks, especially since this structure can partly explain molecular reactions [46, 47, 49, 50]. Thus, systems biologists search for common motifs in molecular interaction networks from different organisms, such as feed-forward loops, assuming that some of these recurring patterns, because they affect different gene or protein sets, may reflect general rules and constraints affecting the construction and evolution of biological organisations [46].
Focusing evolutionary explanations on the structure of the interactions between genes rather than on the primary sequence of the genes is fundamentally different from sequencing genes and inferring history from their sequences alone; one could think here of the case of explaining gene activation/repression. Comparative works on molecular interaction networks show that interactions affect the evolution of the molecules composing networks, which means that beyond compositional complexity, organisational complexity must be modeled to understand biological evolution [46, 51–54]. Before the analysis of complex networks, compensatory sets of elements, such as groups of sub-functional paralogous genes [55], or groups of genes with presuppressed mutations [39, 40], already stressed the evolutionary interdependence of molecules. However, compensatory interactions between agents, each of which is poorly adapted on its own, ran counter to the intuition that natural selection will eliminate dysfunctional individual entities. Their recognition invites one to consider Earth as possibly populated by unions of individually dysfunctional agents rather than by the fittest survivors within individual lineages, possibly since early life, according to Woese’s theory on progenotes, namely communities of interacting protocells unable to sustain themselves alone, evolving via massive lateral genetic exchanges [56].
At the molecular level, it is reasonable to assume that processes resulting from interactions of a diversity of intertwined agents offer a crucial explanans of biological complexity. Rather than ‘one agent, one action’, it would be more accurate to consider ‘a relationship between agents, one action’ as the modus operandi of life. Multiple drivers, of different nature, contribute to the evolution of these interactions: among others, gene co-expression/co-regulation [57], sometimes mediated by transposons [58–61]; the evolutionary origin of the genes [62]; and also physical and chemical laws, as well as the presence of targeting machineries that constrain and regulate diffusion processes in the cell. These types of relationships described at the molecular level are also recovered at other levels of biological organisation.
Networks at the cellular level
Similar conclusions have been reached at the cellular level, also crucial for understanding life history. All prokaryotes and protists are unicellular organisations, and the cell is a fundamental building block of multicellular organisms. Cells must constantly evaluate the states of their inner and outer environments in order to adjust their gene expression and react accordingly [46]. This involves regulatory, transduction, developmental, and protein interaction networks, etc. Cells are built upon inner networks of interacting components, and involved in or affected by a diversity of exchanges, influences and modes of communication (namely, genetic, energetic, chemical and electrical modes). Microbiology has gone a long way toward unraveling these processes since its heyday of pure culture studies, a fruitful reductionist approach now complemented by environmental studies. The latter further revealed that cells compete and cooperate with, and even compensate for, each other within mono- or multispecific microbiomes [63, 64]. Both types of microbiomes have a fundamental commonality: they produce collective properties and co-constructed phenotypes (Fig. 1) evolving at the interface between cells. Such properties cannot be understood without considering networks of influences: the oscillatory growth of biofilms of Bacillus subtilis cannot be deduced from the analyses of the complete genomes of these clones, but requires modeling metabolic co-dependence within a monogenic community affected by a delayed feedback loop, involving chemical and electrical signals [65,66].
Furthermore, many cellular agents show a relative lack of autonomy. In nature, some groups of prokaryotes display complementary genomes with incomplete metabolic pathways, consistent with the black queen hypothesis, which predicts that our planet is populated by groups of (inter)dependent microbes [67, 68]. More precisely, this hypothesis predicts the loss of a costly function, encoded [...] function becomes dispensable at the individual level, since it is achieved by other individuals that produce (usually leaky) public goods in sufficient amount to support the equilibrium of the community. Thus, gene losses in some cells are compensated by leaks of substrates from other cells, formerly encoded by the lost genes. Some microbes experience division of labor [69]. Symbionts and endosymbionts depend on their hosts. The ‘kill the winner’ theory [70] further challenges the notion that the microbial world is a world of fit cellular individuals. This theory stresses a collective process via which viruses mechanically mostly attack cells that reproduce faster and thus regulate bacterial populations, the latter sustaining their diversity because these populations are composed of individual prokaryotic cells that make a suboptimal use of a diversity of resources. Thus, cells belong to networks that affect their growth and survival, which might explain why most bacteria cannot be grown in pure culture. They only truly thrive within communities, whose global genetic instructions are spread over several genetically incomplete microbes.
Accounting for these internal and external cellular networks requires considering processes that are not central in the synthetic evolutionary theory. Typically, the notion that cellular evolution makes jumps, because new components and processes (such as metabolic pathways) are acquired from outside a given cellular lineage, contrasts with more gradual accounts of biological change, like accounts based on point mutations affecting genes already present in the lineage. Because saltations (macromutations) are essential evolutionary outcomes of introgressive processes, via the combination of components from different lineages, no complete picture of evolution can be provided without these jumps, which are naturally modeled by networks. Indeed, genetic information has been flowing both vertically and horizontally between prokaryotes for over 3.5 billion years [71–77], and possibly earlier, according to Woese, who proposed that our universal ancestor was not an entity but a process, that is, genetic and energetic exchanges within protocellular communities [56]. Remarkably, this latter case indicates that network modeling could help to tackle a fundamental issue in evolutionary biology: modeling the evolution of biological processes that emerge from interactions between biological entities. Since these interactions can be represented by a network, the evolution of these interactions, describing the evolution of biological processes, can then be represented by dynamic networks. Likewise, eukaryogenesis rested on the co-construction of a novel type of cell, as a result of the endosymbiosis of a bacterium within an archaeon [78–80]. Later, the evolution of photosynthetic protists emerged from endosymbioses involving unicellular eukaryotes and cyanobacteria, or various lineages of protists, namely in secondary and tertiary endosymbioses [81]. Such endosymbioses, and their outcomes as illustrated in our work [82, 83], are also naturally modeled using networks.
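To make the idea of a dynamic network concrete, the following is a minimal sketch, not the authors' method. It assumes that each time slice can be summarised as a set of undirected interactions, and that evolution of the process is read off as the interactions gained or lost between consecutive slices. The component names and the helper functions `edge` and `interaction_changes` are hypothetical and chosen only for illustration.

```python
from typing import FrozenSet, List, Set, Tuple

Edge = FrozenSet[str]  # an undirected interaction between two components
Slice = Set[Edge]      # all interactions observed in one time slice


def edge(a: str, b: str) -> Edge:
    """Build an undirected edge so that (a, b) and (b, a) compare equal."""
    return frozenset((a, b))


def interaction_changes(slices: List[Slice]) -> List[Tuple[Slice, Slice]]:
    """For each pair of consecutive slices, return (gained, lost) interactions."""
    changes = []
    for before, after in zip(slices, slices[1:]):
        changes.append((after - before, before - after))
    return changes


# Hypothetical toy history (illustrative names, not data from the article):
# no interaction, then one endosymbiosis-like association, then a second one.
t0: Slice = set()
t1: Slice = {edge("archaeon", "bacterium")}
t2: Slice = {edge("archaeon", "bacterium"), edge("eukaryote", "cyanobacterium")}

changes = interaction_changes([t0, t1, t2])
for step, (gained, lost) in enumerate(changes, start=1):
    print(f"slice {step}: gained={len(gained)} lost={len(lost)}")
```

A tree could only record the divergence of each lineage; here the object that changes over time is the set of interactions itself, which is the point the paragraph above makes about modeling processes.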
Moreover, the long-term impact of these introgressive processes on cellular evolution should not be underestimated. For instance, endosymbiosis does not merely introduce new cellular lineages, it also favors the evolution of chimeric structures and chimeric processes within cells [83–91]. Such intertwining cannot be modeled using a single genealogical tree, which only recapitulates cellular divergence from a last common ancestor. Even though cells always derive from other cells, a full cellular history cannot be reduced to the history of some cellular components that are assumed to track the history of cellular division [92]. In particular, phylogenetic analyses of informational genes cannot be the only clue to understanding the origins of cellular diversity, since these genes do not reflect how cells are organized, how they gather their energy, and how they interact with each other. Analyzing the co-construction side of evolution requires enhanced models: understanding eukaryotic evolution requires mixed considerations of cellular architecture, population genetics and energetics, which go beyond classic phylogenetic models, which not so long ago were still prone to considering three primary domains of life [93–95].
Although invoking multiple agents rather than a single ancestor in evolutionary explanations might appear to contradict the famous Ockham’s razor [96], it does so only superficially when it is likely that many cells are co-constructed, especially in the context of a web of life. Enhanced models including intra- and extracellular interactions appear necessary to understand cellular complexity, including the predictable disappearance of traits (and processes), namely the convergent gene loss of mitochondria and plastids [97] by a process called dedarwinification [98,99].
Fig. 1. An example of co-construction, the case of holobionts. The left circle represents the set of traits associated with a host, the right circle represents the set of traits associated with its microbial communities; the intersected area represents traits that are produced jointly as a result of the interaction between hosts and microbes. When this area becomes large or when co-constructed traits are remarkable, they cannot be correctly explained under a simple model treating hosts and microbes in isolation. This scheme holds for different types of partners [...]
Networks beyond the cellular level
Studies of multicellular organisms (we will focus on animals) have led to similar general findings. Understanding animal traits and their evolution requires analyzing the relationships between a multiplicity of agents belonging to different levels of biological organisation, possibly nested, some of which co-construct animals and guarantee their complete lifecycle [100]. Because no sterile organism lives on Earth, animal development, health and survival depend on microbes. Granted, bacteria can often legitimately be seen as part of the environmental demands in an evolutionary model focused on the host’s lineage; or sometimes bacteria and host could also be considered as part of a coevolution process, with no need to posit the whole as a unit of selection [101]. However, asking ‘who is the beneficiary of the symbiosis as the result of evolution?’ may in some cases lead to the recognition that bacteria and host evolved together and were selected together [102]. More generally, while some microbes contribute to animals’ lives possibly as a result of host-derived selection, others contribute as a result of selectively neutral processes (like microbial priming [103]) [101, 104]. These interactions produce communication networks within the animal body: chemical information circulates between the animal brain and the gut microbiome. These interactions also result in communication and interaction networks between individuals. In some animal lineages, the microbiome affects social behaviors; for instance, fermenting microbes inform about the sex and reproductive status in hyena [105]. Components of the microbiome also affect mating choice [106], reproductive isolation and possibly speciation.
Consequently, the microbiome now appears as an essential component of animal studies [107]. Microbiome studies, the significance of which is overstated in some respects, nevertheless have shown that the evolutionary intertwining between many metazoa and commensal or symbiotic bacteria could not be neglected anymore and black-boxed in favor of purely host gene-centered evolutionary models. And the associations between hosts and microbes do not need to be units of selection to be part of the recent insights that support the novel theoretical framework proposed here. Their interplay imposes reconfigurations of practices, theories and disciplines [108]. As a result of our improved insight into evolution, zoology and immunology [109] become theaters of new ecological considerations [110], sometimes strangely qualified as Lamarckian [111, 112], because animals can recruit environmental microbes and transmit them (with a non-null heritability [113]) to their progeny. Therefore, nuclear gene inheritance alone may provide too narrow a perspective to account for the evolution of all animal traits; as an example, aphid body color depends on animal genetics and the presence of Rickettsiella [114]. Population genetics gets included in a broader community genetics, which also considers transmission of microbes and their genes [108, 114]. The use of gnotobiotic and transbiotic animals becomes a new experimental standard to analyze multigenomic collectives without counterparts in modern synthesis theories. These collectives harbor morphological, physiological, developmental, ecological, behavioral and evolutionary features [115–119] that are not purely constructed by animal genes, but rather appear to be co-constructed at the genetic and metabolic interface between the microbial and macrobial worlds, while the content of the respective animal genomes only provides incomplete instructions. Understanding animal evolution requires understanding the interaction networks between components from which these taxa evolved, and the networks to which these taxa still belong.
In ecology, an analogous turn towards network thinking has been promoted since the 1990s with the general acceptance of the notions of metapopulations [120] and then metacommunities [121]. These views suggest that the dynamics of ecological biodiversity is not so much located within a community of species but rather in a metacommunity, which can be thought of as a network of communities exchanging species, while targeting one community blinds one to what genuinely accounts for biodiversity and ecosystem functioning [122].
This quick overview provides evidence that networks are at the origin of the genes of unicellular and multicellular organisms and central for their functions. The living world is a world of ‘and’ and ‘co-’. From division of labor and compensations, to dependencies and co-constructions, etc.: interactions (which only begin to be deciphered) are everywhere in biology. Thus, explaining the actual features of biodiversity requires explaining how multiple processes, interface phenomena (co-construction of biological features, niche construction, metabolic cooperation, co-infection and co-evolution) and organisations (for instance, from molecular pathways to organisms and ecosystems) arose from interacting components, and how these processes, phenomena and organisations may have been sustained and transformed on Earth.
Reframing evolutionary explanations from the scaffolded evolution perspective
Introducing a classification of interacting components
While classic evolutionary models, prompted by [...] entities diverge in relative independence, it appears important to show how a diversity of components, which may not be related, interact and produce various evolutionary patterns.
The notion of scaffolding [124], which describes how one entity continues an event initiated by another entity, and relies on it up to the point that at some timescale it becomes dependent upon it for further evolution, appears as a fundamental relationship to describe the evolution of life. We propose scaffolding should become more central in explanations of evolution because no components from the biological world are actually able to reproduce, or persist, alone (Fig. 2). Each entity influences or is influenced by something external to it, and is consequently part of a process. Scaffolding thus defines the causal backbone of collective evolution. It describes the historical continuity between temporal slices of interaction networks, since any evolutionary stage relies on previously achieved networks and organisations. Therefore, describing the evolution of interactions requires explanations to address the following issues: what scaffolds what, what transforms the environment of what, and are these influences reciprocal? Characterizing the types of components that, together, have evolutionary importance through their potential interaction is therefore a central step to expanding evolutionary theory.
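The three questions above (what scaffolds what, what transforms the environment of what, and whether influences are reciprocal) can be phrased as queries on a directed graph. The sketch below is only one possible encoding, under the assumption that each scaffolding relation is recorded as a (scaffold, scaffolded) pair. The example pairs and the functions `build_graph` and `reciprocal_pairs` are hypothetical and not taken from the article.

```python
from collections import defaultdict

# Hypothetical scaffolding relations, written as (scaffold, scaffolded) pairs.
SCAFFOLDS = [
    ("host", "gut_microbe"),
    ("gut_microbe", "host"),
    ("ribosome", "protein"),
]


def build_graph(pairs):
    """Map each component to the set of components it scaffolds."""
    graph = defaultdict(set)
    for source, target in pairs:
        graph[source].add(target)
    return graph


def reciprocal_pairs(graph):
    """Return the unordered pairs whose influence runs in both directions."""
    found = set()
    for source, targets in graph.items():
        for target in targets:
            # .get avoids inserting new keys while the graph is being iterated
            if source in graph.get(target, set()):
                found.add(frozenset((source, target)))
    return found


graph = build_graph(SCAFFOLDS)
print(sorted(graph["host"]))                               # what does the host scaffold?
print([sorted(pair) for pair in reciprocal_pairs(graph)])  # which influences are reciprocal?
```

Directed edges are used here because scaffolding, as described in the text, is asymmetric by default; reciprocity is then a property to be tested rather than assumed.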
We propose that a first distinction can be made between obligate and facultative components. Suppressing the former impacts the course and eventually the reproduction of the process to which they contribute (Fig. 3), whereas facultative components do not hold such a crucial role, and may simply be involved by chance. A second distinction is whether the components are biotic (genes, proteins, organisms…) or abiotic (such as minerals, environmental and cultural artefacts). Abiotic components can be recruited from the environment or be shaped by biological processes [125]. They can also alter the evolution of the biotic components; for example, environmental change can drive genetic and organismal evolution and selection. The history of life clearly depends on the interplay of both types of components. Biotic components, however, deserve a specific focus. Some of them form lineages (for instance, genes replicate), while others do not (for instance, proteins are reconstructed). Finally, interacting replicated components can be further classified into fraternal components, when they share a close last common ancestor (e.g. in kin selection cases), and egalitarian components, when they belong to distinct lineages (as an example, think of the evolution of chimeric genes by fusion and shuffling [29,45,126]) [63].
Introducing dynamic interaction networks
Biodiversity usually evolves from interactions between the diverse types of components described above. For example, metalloproteases emerge from the interaction between reconstructed biotic components (proteins) and a metal ion. Regulatory networks involve biotic components that can be either replicated (i.e. genes and promoters) or reconstructed (i.e. proteins). Protein interaction networks intertwine reconstructed egalitarian biotic components, which means proteins that are not homologous. Evolutionary transitions such as eukaryogenesis result from the interweaving of biotic components (cells) from multiple lineages. Holobionts evolve from interactions between egalitarian biotic components (macrobial hosts and microbial communities) and possibly abiotic components, such as the mineral termite mounds, or the volatile chemicals produced by the microbial communities of hyenas [105].
Fig. 2. Different types of scaffolding, at four levels of biological organisation. a Functional interactions at the molecular level. b Introgression and vertical descent at the cellular level. c Co-construction at the multicellular level. d Niche-construction and physico-chemical interactions at the eco-systemic level.
Taking collectives of interacting components as central objects of study in evolutionary biology invites us to expand the methods of this field. It encourages developing statistical approaches or inference methods beyond those operating under the very common assumption that biological components are independent. Therefore, we propose to represent interactions between components in the form of networks in which components are nodes and their interactions (of various sorts) are edges. These networks are conceptually simple objects. They can be described as adjacency lists of interactions, in the form ‘component A interacts with component B, at time t (when such a temporal precision is known)’. Such dynamic interaction networks could become more central representations and analytical frameworks, and serve as a common explanans for various disciplines in an expanded evolutionary theory. Importantly, because these networks embed both abiotic and biotic, related and unrelated components (like viruses, cells and rocks), they should not be conflated with phylogenetic networks, but recognized as a more inclusive object of study (Fig. 4). Where phylogenies describe relationships, networks can describe organisations. How such organisations evolve could for example be described by identifying evolutionary stages, that is, sets of components and of their interactions simultaneously present in the network (Fig. 4). Investigating the evolution of an ecosystem corresponds to studying the succession of evolutionary stages in such networks and detecting possible regularities (in the sense that some evolutionary stages would fully or partly reiterate over time) or hinting at rules or constraints (like architectural contingencies [127, 128] or principles of organisations [46]) on the recruitment, reproduction and heritability of their components.
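The adjacency-list representation and the notion of evolutionary stage described above can be illustrated with a minimal Python sketch. The component names, the time points and the helper functions `evolutionary_stages` and `reiterated_stages` are hypothetical, introduced here only for illustration; treating a stage as "reiterated" only when it is strictly identical to an earlier one is a simplifying assumption (the text also allows partial reiteration).

```python
from collections import defaultdict

# Hypothetical interaction records: (component A, component B, time t).
# The component names and time points are illustrative only.
interactions = [
    ("virus", "cell", 1),
    ("cell", "rock", 1),
    ("cell", "rock", 2),
    ("cell", "metal_ion", 2),
    ("virus", "cell", 3),
    ("cell", "rock", 3),
]

def evolutionary_stages(records):
    """Group interactions by time point.

    Returns {t: (set of components, set of undirected edges)}, i.e. the
    components and interactions simultaneously present at time t.
    """
    edges_at = defaultdict(set)
    for a, b, t in records:
        edges_at[t].add(frozenset((a, b)))
    stages = {}
    for t, edges in edges_at.items():
        nodes = set().union(*edges)
        stages[t] = (nodes, edges)
    return stages

def reiterated_stages(stages):
    """Return pairs of time points (t1 < t2) whose stages are identical."""
    times = sorted(stages)
    return [
        (t1, t2)
        for i, t1 in enumerate(times)
        for t2 in times[i + 1:]
        if stages[t1] == stages[t2]
    ]

stages = evolutionary_stages(interactions)
print(reiterated_stages(stages))
```

In this toy example the stage observed at time 1 recurs at time 3, which is the kind of regularity the succession of stages could reveal.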
Thus, we suggest that evolutionary biology could be reframed as a science of evolving networks, because such a shift would allow inclusive, multilevel studies of a larger body of biological and abiotic data, via approaches from network sciences.
Fig. 3. Classification of major types of components in evolving systems. A process/collective cannot be completed in the absence of obligate components, whereas facultative components do not affect the outcome of the process/function of the collective. Biotic components are biological, material products, whereas abiotic components are environmental, geological, chemical, physical or cultural artefacts. Replicated components are produced by replication, which implies a physical continuity between ancestral and descendent components; they undergo a paradigmatic Darwinian evolution. Reconstructed components are reproduced without direct physical continuity, and cannot directly accumulate beneficial mutations. Fraternal components belong to the same lineage, whereas egalitarian components belong to different lineages.
Fig. 4. An evolving interaction network. Nodes are components (circles are full when the component is biotic). Thick black edges represent interactions between these components. The network topology evolves as nodes or their connections change. Dashed edges represent the phylogenetic ancestry of lineage-forming components.
Concrete strategies to enhance network-based evolutionary analyses
Enhancing network-based evolutionary analyses, beyond the now classic research program of phylogenetic networks, could consolidate comparative analyses in the nascent field of evolutionary systems biology [129, 130], as illustrated by examples based on molecular networks. Network construction/gathering constitutes the first step of such analyses. This involves first defining nodes of the network, namely components suspected to be involved in a given system, and edges, namely qualitative (or quantitative, when weighted) interactions between these entities. Many biological interaction networks (gene co-expression networks (GCNs), gene regulatory networks (GRNs), metabolic networks, protein–protein interaction networks (PPIs), etc. [46]) are already known for some species, or can be inferred [131–136]. For example, GCNs offer an increasingly popular resource to study the evolution of biological pathways [137], as well as to reveal conservation and divergence in gene regulation [138]. GCNs are already used for micro-evolution studies, as in the case of fine-grained comparisons of expression variations between orthologous genes across closely related species, and for the analysis of minor evolutionary and ecological transitions, such as changes of ploidy [139, 140], adaptation to salty environments [141] or drugs [142], or the effects of plant domestication [143, 144]. Likewise, GRNs are starting to be used in micro-evolution and phenotypic plasticity studies [145]. Understanding the dynamics of GRNs appears critical to inferring the evolution of organismal traits, in particular during metazoan [146–148], plant [149] and fungal [150] evolution. We suggest that PPI, GCN and GRN studies could become mainstream and also be conducted at (much) larger evolutionary and temporal scales, to analyze additional, major, transitions.
Fig. 5. Workflow of the evolutionary analysis of interaction networks. From left to right: triangles represent components of interaction networks, edges between triangles represent interactions between these components. Interaction networks are first constructed/inferred, then their nodes and edges are colored to produce evolutionary colored networks (ECNs) that represent both the topological and the evolutionary properties of the networks. ECNs can be investigated individually by graph decomposition and centrality analyses, or several ECNs can be compared by graph alignment. The two types of comparisons can return conserved subgraphs that allow understanding of the dynamics of interaction networks, meaning when different sets of interactions (hence processes) evolved, and whether these interactions were evolutionarily stable. Ancient and Contemporary refer to the relative age of the sub-graphs, identifying new clade-specific relationships (here called refinement); introgression indicates that a component, and the relationship it entertains with the rest of the network, was inferred to result from a lateral acquisition.
Based on these established networks, two major types of evolutionary analyses (network-decomposition and graph-matching; Fig. 5b) can readily be developed further by evolutionary biologists. More precisely, the above-mentioned kinds of biological networks could be systematically turned into what we call evolutionary colored biological networks (ECNs). In ECNs, each node of a given biological network is colored to reflect one or several evolutionary properties. For example, in molecular networks, nodes correspond to molecular sequences (genes, RNA, proteins) that belong to homologous families, whose phylogenetic distribution across host species allows us to date them [137, 151–156]. The ‘age’ of the family at the node can thus become one evolutionary color (Fig. 5a). Likewise, several processes affecting the evolution of a molecular family (selection, duplication, transfer, and divergence in primary sequence) can be inferred by classic phylogenetic analyses or, as we proposed, by analyses of sequence similarity networks [157]. Such studies provide additional evolutionary colors (like quantitative measures: intensity of selection, rates of duplication, transfer, and percentage of divergence), which can be associated with nodes in ECNs [139, 149, 154, 158–161]. Thus, ECNs contain both topological information, characteristic of the biological network under investigation, and evolutionary information: what node belongs to a family prone to duplication, divergence, or lateral transfer, as well as when this family arose. Combining these two types of information in a single graph allows us to test specific hypotheses regarding evolution.
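To make the idea of an 'age' color concrete, the following Python sketch dates each family by the most recent clade shared by all species in which it is found, then attaches that age to the nodes of a toy network. The taxonomy, the family distributions and the functions `family_age` and `color_network` are assumptions made for illustration; this is a crude proxy, not the dating procedure of the cited studies.

```python
# Hypothetical lineage paths, from the root to each species (illustrative only).
lineage = {
    "human": ["cellular_life", "eukaryotes", "metazoa"],
    "fly": ["cellular_life", "eukaryotes", "metazoa"],
    "yeast": ["cellular_life", "eukaryotes", "fungi"],
    "e_coli": ["cellular_life", "bacteria"],
}

# Hypothetical gene families and the species in which homologues are found.
family_distribution = {
    "famA": {"human", "fly", "yeast", "e_coli"},
    "famB": {"human", "fly", "yeast"},
    "famC": {"human", "fly"},
}

# A toy interaction network (e.g. a PPI) whose nodes are gene families.
edges = [("famA", "famB"), ("famB", "famC"), ("famA", "famC")]

def family_age(species):
    """Most recent clade shared by all species carrying the family.

    A crude proxy for the 'age' color: it ignores gene loss and lateral
    transfer, both of which a real dating analysis would have to consider.
    """
    paths = [lineage[s] for s in species]
    shared = []
    for ranks in zip(*paths):
        if len(set(ranks)) != 1:
            break
        shared.append(ranks[0])
    return shared[-1]

def color_network(edges, distribution):
    """Return an ECN as (node colors, edge list)."""
    nodes = {n for edge in edges for n in edge}
    colors = {n: {"age": family_age(distribution[n])} for n in nodes}
    return colors, list(edges)

colors, ecn_edges = color_network(edges, family_distribution)
for node in sorted(colors):
    print(node, colors[node]["age"])
```

Further colors (for example a duplication rate or a measure of selection) could be stored in the same per-node dictionary, so that topology and evolutionary properties remain in a single object.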
Using ECNs, it is first fruitful to test whether (or which of) these evolutionary colors correlate with topological properties of the ECNs [162–164]. The null hypothesis that nodes’ centrality, i.e. nodes’ positions in the network, is correlated neither with the age nor with the duplicability, transferability or divergence of the molecular entities represented by these nodes can be tested. Rejection of this hypothesis would hint at processes that affect the topology of biological networks or are affected by the network topology. For example, considering degree in networks, proteins with more neighbors are less easily transferred [163]; highly expressed genes, which are more connected in GCNs, evolve slower than weakly expressed genes [165]; and genes with lower degrees have higher duplicability in yeast, worm and flies [166]. Considering position in networks, node centrality correlates with evolutionary conservation [136], gene eccentricity correlates with level of gene expression and dispensability [167], and proteins interacting with the external environment have higher average duplicability than proteins localized within intracellular compartments [168]. Additionally, network structure gives a clue to evolution since old proteins have more interactions than new ones [169, 170]. Generalizing these disparate studies could help to understand the dynamics of biological networks, in other words how the architecture, the nodes and edges of present day networks, evolved and whether their changes involved random or biased sets of nodes and edges or follow general models of network growth with detectable drivers.
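One possible way to test the null hypothesis above is sketched below in Python: node degree is compared with the 'age' color using a Spearman rank correlation, and a permutation test estimates how often a correlation at least as strong would arise by chance. The choice of degree as the centrality measure, of Spearman's coefficient and of a permutation test are assumptions of this sketch, not methods prescribed by the text; the network and ages are invented.

```python
import random

# Hypothetical ECN: undirected edges and an 'age' color per node
# (larger value = older family). All values are illustrative only.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("a", "e"),
         ("b", "c"), ("b", "d"), ("c", "e"), ("f", "a")]
age = {"a": 6, "b": 5, "c": 4, "d": 3, "e": 2, "f": 1}

def degrees(edges):
    """Number of neighbors of each node (degree centrality, unnormalized)."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg

def ranks(values):
    """Average ranks (ties share the mean of their positions)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on ranks."""
    return pearson(ranks(x), ranks(y))

def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided p-value for the null of no association between x and y."""
    rng = random.Random(seed)
    observed = abs(spearman(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(spearman(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

deg = degrees(edges)
nodes = sorted(deg)
x = [deg[n] for n in nodes]
y = [age[n] for n in nodes]
rho = spearman(x, y)
p = permutation_p_value(x, y)
print(f"rho={rho:.2f}, p={p:.3f}")
```

The same scheme would apply to other colors (duplicability, transferability, divergence) and to other centrality measures, at the cost of correcting for multiple tests.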
This focus would complement a classic tree-based view. For instance, under the reasonable working hypothesis that pairs of connected nodes of a given age reflect an interaction between nodes that may have arisen at that time [154,171], ECNs can easily be decomposed into sub-networks, featuring processes of different ages (that is, sets of nodes of a given age, e.g. sets of interacting genes). This strategy allows identification of conserved network patterns, possibly under strong selective pressure [159]. Constructing and exploiting ECNs from bacteria, archaea, and eukaryotes thus has the potential to define conserved ancestral sets of relationships between components, allowing evolutionary biologists to infer aspects of the early biological networks of the last common ancestor of eukaryotes, archaea and bacteria and even of the last universal common ancestor of cells. Assuming that some of these topological units correspond to functional units [172], especially for broadly conserved subgraphs [138, 149, 152, 166, 173–182], would allow network decompositions to propose sets of important processes associated with the emergence of major lineages.
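The decomposition of an ECN by age could look like the following Python sketch, which keeps only edges joining nodes of the same age and returns the connected groups obtained for each age. The node names, the ages and the function `decompose_by_age` are hypothetical; requiring strictly equal ages at both ends of an edge is one simple reading of the working hypothesis, chosen here for illustration.

```python
from collections import defaultdict

# Hypothetical ECN: node ages (in arbitrary units; larger = older) and edges.
age = {"g1": 3, "g2": 3, "g3": 3, "g4": 2, "g5": 2, "g6": 1}
edges = [("g1", "g2"), ("g2", "g3"), ("g3", "g4"),
         ("g4", "g5"), ("g5", "g6"), ("g1", "g6")]

def decompose_by_age(edges, age):
    """Split an ECN into sub-networks of same-age connected nodes.

    Only edges whose two endpoints share the same age are kept, following
    the working hypothesis that such an interaction may date back to that age.
    Returns {age: list of connected components (as sorted node lists)}.
    """
    adjacency = defaultdict(set)
    for u, v in edges:
        if age[u] == age[v]:
            adjacency[u].add(v)
            adjacency[v].add(u)

    seen = set()
    result = defaultdict(list)
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        result[age[start]].append(sorted(component))
    return dict(result)

subnetworks = decompose_by_age(edges, age)
print(subnetworks)
```

Here the oldest sub-network groups g1, g2 and g3, a younger one groups g4 and g5, and g6 has no partner of its own age and so belongs to no sub-network.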
Moreover, graph-matching of ECNs allows several complementary analyses. First, for interaction networks, such as GRNs, whose sets of components and edges evolve rapidly [183–185], it becomes relevant to analyze where in the network such changes occur in addition to (simply) tracking conserved sets of components and edges. Whereas the latter can test to what extent conservation of the interaction networks across higher taxa supports generalizations made from a limited number of model species [186], the former allows us to test a general hypothesis: are there repeated types of network changes? For example, does network modification primarily affect nodes with particular centralities, as exemplified by terminal processes [187], or modules? Systematizing these analyses would provide new insights into whether the organisation principles of biological networks changed when major lineages evolved or remained conserved. In terms of the ECN, can the same model of graph evolution explain the topology of ECNs from different lineages? The null hypothesis would be that these major transitions left no common traces in biological networks. An alternative hypothesis would be that the biological networks convergently became more complex (more connected and larger) during these transitions to novel life forms. Indeed, analyses conducted on a few taxa have reported quantifiable and qualifiable modifications in biological networks (in response to environmental challenges [188], during ecological transitions [189] or as niche specific adaptations [190]). More systematic graph-matching [191–193] and motif analyses, comparing the topology of ECNs from multiple
that major lineages are enriched in particular motifs (either modules of colored nodes and edges, or specific topological features, such as feed-forward loops [46] or bow-ties [194]). It would also allow identification of functionally equivalent components across species, namely different genes with similar neighbors in different species [176].
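A much simplified form of the comparison between ECNs from two species is sketched below in Python. It assumes that a one-to-one orthology map between nodes is already known, which sidesteps the hard part of graph-matching (finding the correspondence); the networks, the map and the function `compare_networks` are hypothetical.

```python
# Hypothetical networks for two species; node names are illustrative.
edges_sp1 = [("a1", "b1"), ("b1", "c1"), ("c1", "d1")]
edges_sp2 = [("a2", "b2"), ("b2", "c2"), ("b2", "x2")]

# Hypothetical one-to-one orthology map from species 1 to species 2.
# d1 has no orthologue in species 2, and x2 none in species 1.
orthologue = {"a1": "a2", "b1": "b2", "c1": "c2"}

def as_edge_set(edges):
    """Undirected edges as a set of frozensets."""
    return {frozenset(e) for e in edges}

def compare_networks(edges1, edges2, mapping):
    """Project network 1 onto network 2 and classify the edges.

    Returns (conserved, specific_to_1, specific_to_2), with edges of
    network 1 expressed in network 2's node names where a mapping exists.
    """
    projected = set()
    unmapped = set()
    for u, v in edges1:
        if u in mapping and v in mapping:
            projected.add(frozenset((mapping[u], mapping[v])))
        else:
            unmapped.add(frozenset((u, v)))
    target = as_edge_set(edges2)
    conserved = projected & target
    specific_to_1 = (projected - target) | unmapped
    specific_to_2 = target - projected
    return conserved, specific_to_1, specific_to_2

conserved, only_1, only_2 = compare_networks(edges_sp1, edges_sp2, orthologue)
print(sorted(sorted(e) for e in conserved))
```

The conserved edges are candidates for ancient, stable interactions, while the species-specific edges show where in the network changes occurred.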
While inferences on conserved sets of nodes and edges in ECNs are likely to be robust (since the patterns are observed in multiple species), missing data (missing nodes and edges) constitute a recognized challenge, especially for the interpretation of what will appear in ECN studies as the most versatile (least conserved) parts of the biological networks. The issue of missing data, however, is not specific to network-based evolutionary analyses, and should be tackled, as with other comparative approaches, by the development and testing of imputation methods [195–197]. Moreover, issues of missing data can also be addressed by the production of high-coverage -omics datasets in simple systems, allowing for (nearly) exhaustive representations of the entities and their interactions (e.g. PPIs, GCNs and GRNs within a cell, or metabolic networks within a species-poor ecosystem). This kind of data would allow testing for the existence of selected emergent ecosystemic properties (like carbon fixation), as stated by the ITSNTS hypothesis [198]. For instance, deep coverage time series of metagenomic/metatranscriptomic data from a simple microbial ecosystem, coupled with environmental measures such as carbon fixation, could produce enough data to allow the evolutionary coloring of nodes of metabolic networks. Comparing ECNs representing, at each time point, the origin and abundance of the lineages hosting the enzymes involved in carbon fixation could test whether some combinations of lineages are repeated over time, and whether the components (e.g. genes and lineages) vary, whereas carbon fixation is maintained in the ecosystem, which would suggest that this process evolves irrespective of the nature of the interacting components.
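The comparison across time points could be approached as in the following Python sketch, which measures the turnover of the lineages hosting carbon-fixation enzymes, lists the lineage combinations that recur, and checks how stable the measured function remains. All data are invented, and the functions and the use of a Jaccard index and of a relative range are assumptions of this sketch rather than an established protocol.

```python
# Hypothetical time series: at each time point, the lineages hosting
# carbon-fixation enzymes and a measured fixation rate (arbitrary units).
time_series = [
    {"t": 0, "lineages": {"L1", "L2"}, "fixation": 10.0},
    {"t": 1, "lineages": {"L2", "L3"}, "fixation": 9.5},
    {"t": 2, "lineages": {"L3", "L4"}, "fixation": 10.2},
    {"t": 3, "lineages": {"L1", "L2"}, "fixation": 9.8},
]

def jaccard(a, b):
    """Similarity of two sets: shared members over all members."""
    return len(a & b) / len(a | b)

def turnover(series):
    """Compositional change (1 - Jaccard) between consecutive time points."""
    return [1 - jaccard(prev["lineages"], cur["lineages"])
            for prev, cur in zip(series, series[1:])]

def repeated_combinations(series):
    """Lineage combinations observed at more than one time point."""
    seen = {}
    for point in series:
        seen.setdefault(frozenset(point["lineages"]), []).append(point["t"])
    return {combo: times for combo, times in seen.items() if len(times) > 1}

def relative_range(values):
    """(max - min) / mean: a crude measure of how stable a quantity is."""
    mean = sum(values) / len(values)
    return (max(values) - min(values)) / mean

changes = turnover(time_series)
stability = relative_range([p["fixation"] for p in time_series])
repeats = repeated_combinations(time_series)
print(changes, round(stability, 3), repeats)
```

High turnover together with a small relative range of the fixation rate would be the pattern suggesting that the process persists irrespective of which lineages carry it out.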
Finally, entities from different levels of biological organisation (domains, genes, genomes, lineages, etc.) could also be studied together in a single network framework, by integrating them into multipartite networks [199]. Recently, our studies and others (see [200] and references therein) have demonstrated that various patterns in multipartite graphs can be used to detect and test combinatorial (introgressive) and gradual evolution (by vertical descent) affecting genes and genomes. Decomposing multipartite networks into twins and articulation points could for example then be used to represent and analyze the evolution of complex composite molecular systems, such as CRISPR, or the dynamics of invasions of hairpins in genomes [201].
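As an illustration, the Python sketch below builds a small bipartite graph linking genomes to the gene families they contain, then extracts twins and articulation points. It assumes the usual graph-theoretic definitions (twins are nodes sharing exactly the same neighbors; an articulation point is a node whose removal disconnects the graph), which may differ in detail from those used in the cited work; the data and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical bipartite graph: genomes on one side, gene families on the other.
genome_content = {
    "genome1": {"fam1", "fam2", "fam3"},
    "genome2": {"fam1", "fam2", "fam3"},
    "genome3": {"fam3", "fam4"},
}

def build_adjacency(content):
    """Undirected adjacency sets for the genome/family bipartite graph."""
    adjacency = defaultdict(set)
    for genome, families in content.items():
        for family in families:
            adjacency[genome].add(family)
            adjacency[family].add(genome)
    return dict(adjacency)

def twin_classes(adjacency):
    """Groups of at least two nodes sharing exactly the same neighbors."""
    by_neighbors = defaultdict(list)
    for node, neighbors in adjacency.items():
        by_neighbors[frozenset(neighbors)].append(node)
    return sorted(sorted(group) for group in by_neighbors.values()
                  if len(group) > 1)

def count_components(adjacency, removed=None):
    """Number of connected components, optionally ignoring one node."""
    seen = {removed} if removed is not None else set()
    count = 0
    for start in adjacency:
        if start in seen:
            continue
        count += 1
        stack = [start]
        seen.add(start)
        while stack:
            node = stack.pop()
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
    return count

def articulation_points(adjacency):
    """Nodes whose removal increases the number of connected components.

    Brute force (one traversal per node): fine for a sketch, too slow for
    large graphs, where a linear-time DFS-based algorithm would be used.
    """
    baseline = count_components(adjacency)
    return sorted(node for node in adjacency
                  if count_components(adjacency, removed=node) > baseline)

adjacency = build_adjacency(genome_content)
print(twin_classes(adjacency))
print(articulation_points(adjacency))
```

In this toy graph, fam1 and fam2 are twins because they occur in exactly the same genomes, and fam3 is an articulation point because it is the only family connecting genome3 to the other two genomes.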
Further justifications for a shift toward network thinking
Enlargement of evolutionary biology
Focusing evolutionary explanations and theories on collectives of interacting components, which may be under selection, facilitate selection, or condition arrangements through neutral processes [39, 40, 202], and representing these scaffolding relationships using networks with biotic and abiotic components and a diversity of edges representing a diversity of interaction types would be an enlargement. Enlargements, which express the need to consider structures that are more general than what already exists, have already occurred within evolutionary theory, when simplifications from population genetics were relaxed with respect to the original formalization in the Modern Synthesis [203], to account for within-genome interaction [9], gene–environment covariance [204], parental effects [205], and extended fitness through generations [206]. Enlargement also occurred when reticulations representing introgressions were added to the evolutionary tree.
Interestingly, replacing standard linear models in evolutionary theory with network approaches would transcend several traditional axes structuring the debates in evolutionary biology. For instance, scaffolded evolution, the idea that evolution relies on what came before, is orthogonal to the distinction between vertical and horizontal descent, since both tree-like and introgressive evolution are particular cases of scaffolding. Scaffolded evolution is also orthogonal to the distinction between gradual and saltational evolution. Likewise, scaffolded evolution is orthogonal to the debates over the actual roles of adaptations vs neutral processes. Selection is a key mode of evolution of collectives but not the only one. The processes involved in the forming and evolution of collectives are not even restricted to the key processes of the Modern Synthesis (drift, selection, mutation and migration) but embrace interactions such as facilitation (namely antagonistic interactions between two species that allow a third species to prosper by restraining one of its predators or parasites [207]), presuppression [39,40], etc. Consequently, some evolutionary concepts may become more important than they currently are to explain evolution. For example, contingency, which means the dependence of an evolutionary chain of events upon an event that itself is contingent, in the sense that it cannot be understood as a selective response to environmental changes [18,208,209], is often associated with extraordinary events, like mass decimation. Contingency could come to be seen as a less extraordinary mode of evolution in the history of life, since the ordinary course of evolution might include many cases of contingent events, that is, associations of entities in a transient collective, including any scaffolds, associations that are not necessarily selective responses or the outcomes of
Likewise, adopting a broader ontology could affect how evolutionary theorists think about evolution. Population thinking and tree-thinking came after essentialist conceptions of the living world, when populations and lineages were recognized as central objects of evolutionary studies [210]. A shift towards collectives and scaffolded evolution might encourage a similar development: the emergence of an openly pluralistic processual thinking, consistent with Carl Woese’s proposal to reformulate our view of evolution in terms of complex dynamic systems [211].
Further unifying the evolutionary theory
Using a network-based approach to analyse dynamic systems also permits explanations that rely purely on statistical properties [212] or on topological or graph theoretical properties [213, 214] besides standard explanations devoted to unravelling mechanisms responsible for a phenomenon. Moreover, because of the inclusiveness of the network model, disciplines already recognized for their contribution to evolutionary theory (microbiology, ecology, cell biology, genetics, etc.) could become even more fully part of an interdisciplinary research program on evolution, effectively addressing current issues, consistent with the repeated calls for transdisciplinary collaborations [19–21, 215]. Disciplines that were not central in the Modern Synthesis (chemistry, physics, geology, oceanography, cybernetics or linguistics) could aggregate with evolutionary biology. Since a diversity of components gets connected by a diversity of edges in networks featuring collectives, as a result of a diversity of drivers, several explanatory strategies could be combined to analyze evolution. This extension to seemingly foreign fields makes sense when the components/processes studied by these other disciplines are evolutionarily or functionally related to biotic components and processes (either as putative ancestors of biological components and processes, like the use of a proton gradient in cells, which possibly derived from geological processes affecting early life [216], or as descendants of biological systems, e.g. technically synthesized life forms, which have a potential to alter the future course of standard biological evolution).
Remarkably, this mode of unification of diverse scientific disciplines would be original: the integration would not be a unification in the sense of logical positivism [217], namely reducing a theory to a theory with more basic laws, or a theory with a larger extension. It would be a piecemeal [218] unification. Some aspects would be unified through a specific kind of graph modeling (because some interactions, namely mechanical, chemical, ecological ones, and a range of time scales are privileged in a set of theories), while other theories might be unified by other graph properties (like different types of edges and components). For example, the fermentation hypothesis for mammalian chemical communication could be analyzed in a multipartite network framework, which would involve nodes corresponding to individual mammals, nodes corresponding to microbes, and nodes corresponding to odorous metabolites. Nodes corresponding to mammals could either be colored to reflect an individual’s properties (its lineage, social position, gender, sexual availability), or these nodes could be connected by edges that reflect these shared properties, which defines a first host subnetwork. This host subnetwork can itself be further connected to a second subnetwork, namely the microbial subnetwork in which nodes representing microbes, colored by phylogenetic origins, could be connected to reflect microbial interactions (gene transfer, competition, metabolic cooperation, etc.). Connections between the host and microbial subnetworks could simply be made by drawing edges between nodes representing individuals hosting microbes, and nodes representing these microbes. Moreover, nodes representing mammals and nodes representing microbes could be connected to nodes representing odorous metabolites to show what odours are associated with what combinations of hosts and microbes. Elaborating this network in a piecemeal fashion would involve cooperation between chemists, microbiologists, zoologists and evolutionary biologists.
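The multipartite network outlined above can be sketched in a few lines of Python. Every node name, attribute and edge below is hypothetical, as are the functions `node_type` and `odours_by_host`; as a simplification, metabolites are linked here only to microbes, and the odours associated with a host are derived by following host-microbe and microbe-metabolite edges.

```python
# Hypothetical node sets, keyed by node type. All names are illustrative.
nodes = {
    "host": {"mammal_A": {"social_position": "dominant"},
             "mammal_B": {"social_position": "subordinate"}},
    "microbe": {"microbe_X": {"origin": "lineage_1"},
                "microbe_Y": {"origin": "lineage_2"}},
    "metabolite": {"odour_1": {}, "odour_2": {}},
}

# Typed edges: (node, node, interaction type).
edges = [
    ("mammal_A", "mammal_B", "same_lineage"),
    ("microbe_X", "microbe_Y", "metabolic_cooperation"),
    ("mammal_A", "microbe_X", "hosts"),
    ("mammal_A", "microbe_Y", "hosts"),
    ("mammal_B", "microbe_X", "hosts"),
    ("microbe_X", "odour_1", "produces"),
    ("microbe_Y", "odour_2", "produces"),
]

def node_type(name):
    """Return the part (host, microbe or metabolite) a node belongs to."""
    for kind, members in nodes.items():
        if name in members:
            return kind
    raise KeyError(f"unknown node: {name}")

def odours_by_host(edges):
    """Odours reachable from each host through the microbes it carries."""
    produced, carried = {}, {}
    for u, v, kind in edges:
        if kind == "produces":
            produced.setdefault(u, set()).add(v)
        elif kind == "hosts":
            carried.setdefault(u, set()).add(v)
    return {
        host: set().union(*(produced.get(m, set()) for m in microbes))
        for host, microbes in carried.items()
    }

# Every edge must connect declared nodes; node_type raises otherwise.
edge_types = {(node_type(u), node_type(v)) for u, v, _ in edges}
profiles = odours_by_host(edges)
for host in sorted(profiles):
    print(host, sorted(profiles[host]))
```

Each discipline could contribute its own layer of nodes and edges to such a structure, which is one way to picture the piecemeal elaboration described in the text.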
Of note, the use of integrated networks could pragmatically address a deep concern for evolutionary studies, by connecting phenomena that occur at different timescales: development and evolution [219] or ecology and evolution [220]. Considering transient collectives (thus processes) as stable entities at a given timescale, when these collectives change much more slowly than the process in which they take part, amounts to a focus on interactions occurring at a given timescale by treating the slower dynamics as stable edges/nodes. Then, various parts of the networks embody distinct timescales, which may provide a new form of timescale integration, working out the merging of timescales from the viewpoint of the model, and with resources intrinsic to the model itself. The reason for this is that a node in an interaction network Ni, describing processes relevant at a timescale i, can itself be seen as the outcome of another (embedded) interaction network Nj, unfolding at a timescale j. This nestedness typically occurs when the node in Ni represents a collective process, involving components that evolve sufficiently slowly with respect to the system considered at the timescale i to figure as an entity, a node in Ni. In the case of a PPI network Ni, each node conventionally represents a protein, but the evolution of each protein could be further analysed as the result of mutation, duplication, fusion and shuffling