Towards a Dynamic Interaction Network of Life to unify and expand the evolutionary theory

(1)

HAL Id: hal-01968453

https://hal.archives-ouvertes.fr/hal-01968453

Submitted on 2 Jan 2019

HAL is a multi-disciplinary open access

archive for the deposit and dissemination of

sci-entific research documents, whether they are

pub-lished or not. The documents may come from

teaching and research institutions in France or

abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est

destinée au dépôt et à la diffusion de documents

scientifiques de niveau recherche, publiés ou non,

émanant des établissements d’enseignement et de

recherche français ou étrangers, des laboratoires

publics ou privés.

Eric Bapteste, Philippe Huneman

To cite this version:

Eric Bapteste, Philippe Huneman. Towards a Dynamic Interaction Network of Life to unify and

expand the evolutionary theory. BMC Biology, BioMed Central, 2018, 16 (1),

�10.1186/s12915-018-0531-6�. �hal-01968453�

(2)

1

O P I N I O N

Open Access

2

Towards a Dynamic Interaction Network of

3

Life to unify and expand the evolutionary

4

theory

5

Q1 678

Eric Bapteste

1,2*

and Philippe Huneman

3

9 Abstract

10 The classic Darwinian theory and the Synthetic 11 evolutionary theory and their linear models, while 12 invaluable to study the origins and evolution of species, 13 are not primarily designed to model the evolution of 14 organisations, typically that of ecosystems, nor that of 15 processes. How could evolutionary theory better 16 explain the evolution of biological complexity and 17 diversity? Inclusive network-based analyses of dynamic 18 systems could retrace interactions between (related or 19 unrelated) components. This theoretical shift from a 20 Tree of Life to a Dynamic Interaction Network of Life, 21 which is supported by diverse molecular, cellular, 22 microbiological, organismal, ecological and evolutionary 23 studies, would further unify evolutionary biology. 24 Keywords: Evolutionary biology, Interactions,

25 Theoretical biology, Tree of Life, Web of Life 26 Deciphering diversity through evolution

27 The living world is nested and multilevel, involves

mul-28 tiple agents and changes at different timescales.

Evolu-29 tionary biology tries to characterize the dynamics

30 responsible for such complexity to decipher the

pro-31 cesses accounting for the past and extant diversity

ob-32 served in molecules (namely, genes, RNA, proteins),

33 cellular machineries, unicellular and multi-cellular

or-34 ganisms, species, communities and ecosystems. In the

35 1930s and 1940s, a unified framework to handle this task

36 was built under the name of Modern Synthesis [1]. It

37 encompassed Darwin’s idea of evolution by natural

selec-38 tion as an explanation for diversity and adaptation and

* Correspondence:eric.bapteste@upmc.fr

1

Q7 Sorbonne Universités, UPMC Université Paris 06, Institut de Biologie Paris-Seine (IBPS), F-75005 Paris, France

2_{CNRS, UMR7138, Institut de Biologie Paris-Seine, F-75005 Paris, France}

Full list of author information is available at the end of the article

39

Mendel’s idea of particular inheritance, giving rise to

40

population and quantitative genetics, a theoretical frame

41

that corroborated Darwin’s hypothesis of the paramount

42

power of selection for driving adaptive evolution [2].

43

This framework progressively aggregated multiple

disci-44

plines: behavioural ecology, microbiology, paleobiology,

45

etc. Overall, this classic framework considers that the

46

principal agency of evolution is natural selection of

47

favourable variations, and that those variations are

con-48

stituted by random mutations and recombination in a

49

Mendelian population. The processes of microevolution,

50

modelled by population and quantitative genetics, are

51

likely to be extrapolated to macroevolution [3]. To this

52

extent, models that focus on one or two loci are able to

53

capture much of the evolutionary dynamics of an

organ-54

ism, even though in reality many interdependencies

be-55

tween thousands of loci (epistasis, dominance, etc.)

56

occur as the basis of the production and functioning of a

57

phenotypic trait. Among forces acting on populations

58

and modelled by population geneticists, natural selection

59

is the one that shapes traits as adaptations and the

de-60

sign of organisms; adaptive radiation then explains much

61

of the diversity; and common descent from adapted

62

organisms explains most of the commonalities across

63

living forms (labelled homologies), and allows for

classi-64

fying living beings into phylogenetic trees. Evolution is

65

gradual because the effects of mutations are generally

66

small, large ones being most likely to be deleterious as

67

theorized by Fisher’s geometric model [4].

68

Many theoretical divergences surround this core view:

69

not everyone agrees that evolution is change in allele

fre-70

quencies, or that population genetics captures the whole

71

of the evolutionary process, or that the genotypic

view-72

point— tracking the dynamics of genes as ‘replicators’ [5]

73

or the strategy ‘choices’ of organisms as fitness

maximiz-74

ing agents [6]— should be favoured to understand

evolu-75

tion. Nevertheless, it has been a powerful enough

76

framework to drive successful research programs on

© Bapteste et al. 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

(3)

77 speciation, adaptation, phylogenies, evolution of sex,

78 cooperation altruism, mutualism, etc., and incorporate

ap-79 parent challenges such as neutral evolution [7],

acknow-80 ledgement of constraints on variation [8], or the recent

81 theoretical turn from genetics to genomics following the

82 achievement of the Human Genome Program [9].

Caus-83 ation is here overall conceived of as a linear causal relation

84 of a twofold nature: from the genotype to the phenotype

85 (assuming of course environmental parameters), and from

86 the environment to the shaping of organisms via natural

87 selection. For instance, in the classic case of evolution of

88 peppered moths in urban forests at the time of the

indus-89 trial revolution, trees became darkened with soot, and

90 then natural selection favored darker morphs as ‘fitter’

91 ones, due to their being less easily detected by predator

92 birds, resulting in a relative increase in frequency of the

93 darker morphs in the population [10].

94 Yet in the last 15 years biologists and philosophers of

95 biology have regularly questioned the genuinely unifying

96 character of this Synthesis, as well as its explanatory

ac-97 curacy [11]. Those criticisms questioned notably the set

98 of objects privileged by the Modern Synthesis, arguably

99 too gene-centered [12], and its key explanatory

100 processes, since niche construction [13], lateral gene

101 transfer [14,15], phenotypic plasticity [16,17], and mass

102 extinction [18] could, for example, be added [11].

103 Usually these critiques emphasize aspects rooted in a

104 particular biological discipline: lateral gene transfer from

105 microbiology, plasticity from developmental biology,

106 mass extinction from paleobiology, ecosystem

engineer-107 ing from functional ecology, etc. There were also

108 recurring claims for novel transdisciplinary fields:

evo-109 eco-devo [19], investigating the evolutionary dynamics

110 of host and microbe associations (forming combinations

111 often referred to as holobionts), evolutionary cell biology

112 [20], or microbial endocrinology [21], among others.

113 This latter discipline aims at understanding the evolved

114 interactions between microbial signals and host

develop-115 ment. Indeed, it is compelling for evolutionary biologists

116 to decipher how such multi-species interactions became

117 established (namely, whether they involved specific

mi-118 crobial species and molecules, and whether they evolved

119 independently in different host lineages).

120 Evolutionary biology is thus currently undergoing

vari-121 ous theoretical debates concerning the proper frame to

122 formulate it [11,22–24]. Here, we introduce an original

123 solution which moves this debate forward,

acknowledg-124 ing that nothing on Earth evolves and makes sense in

125 isolation, thereby challenging the key assumption of the

126 Modern Synthesis framework that targeting the

individ-127 ual gene or organism (even when in principle knowing

128 that it is part of a set of complex interactions) allows us

129 to capture evolution in all its dimensions. Since the

liv-130 ing world evolves as a dynamic network of interactions,

131

we argue that evolutionary biology could become a

sci-132

ence of evolving networks, which would allow biologists

133

to explain organisational complexity, while providing a

134

novel way to reframe and to unify evolutionary biology.

135

Biology is regulated by networks

136

Networks at the molecular level

137

Although numerous studies have focused on the functions

138

of individual genes, proteins and other molecules, it is

in-139

creasingly clear that each of these functions belongs to

140

complex networks of interactions. Starting at the

molecu-141

lar scale, the importance of a diversity of molecular agents,

142

such as (DNA-based) genes and their regulatory

se-143

quences, RNAs and proteins, is well recognized.

Import-144

antly, in terms of their origins and modes of evolution,

145

these agents are diverse. Genes are replicated across

gen-146

erations, via the recruitment of bases along a DNA

tem-147

plate, thereby forming continuous lineages, affected by

148

Darwinian evolution. By contrast, proteins are

recon-149

structed by recruitment of amino acids at the ribosomal

150

machinery. There is no physical continuity between

gener-151

ations of proteins, and thus no possibility for these agents

152

to directly accumulate beneficial mutations [25].

153

Moreover, all these molecular entities are compositionally

154

complex, in the sense that they are made of inherited or

155

reassembled parts. E pluribus unum: genes and proteins

156

are (often) conglomerates of exons, (eventually) introns

157

[26–28], and domains [29–31]. Similar claims can be

158

made about composite molecular systems, such as

159

CRISPR and Casposons [32,33], etc. This modular

organ-160

isation has numerous consequences: among them, genes

161

can be nested within genes [34]; proteins congregate in

162

larger complexes [35]. Importantly, this modularity is not

163

the mere result of a divergence from a single ancestral

164

form, but also involves combinatorial processes and

mo-165

lecular tinkering of available genetic material [36–38]. The

166

coupling and decoupling of molecular components can

167

operate randomly, as in cases of presuppression proposed

168

to neutrally lead to large molecular complexes [39–41].

169

Presuppression, also known as constructive neutralism, is

170

a process that generates complexity by mechanically

in-171

creasing dependencies between interacting molecules, in

172

the absence of positive selection. When a deleterious

mu-173

tation affects one molecular partner, existing properties of

174

another molecule with which the mutated molecule

175

already interacted can compensate for its partner defect.

176

Presuppression operates like a ratchet, since the likelihood

177

to restore the original independency between molecules

178

(by reverting the deleterious mutation) is lower than the

179

likelihood to move away from this original state (by

accu-180

mulating other mutations). Molecular associations can

181

also evolve under constraints [42], eventually reinforcing

182

the relationships between molecular partners, as suggested

183

(4)

184 Consistently, interconnectedness is a striking feature

185 of the molecular world [46,47]. Genes belong to

regula-186 tory networks with feedback loops [48]. Proteins belong

187 to protein–protein interaction networks. This systemic 188 view contrasts with former atomistic views assigning one

189 function to one gene. First, it is not always correct that a

190 gene produces only a protein, in the case of alternative

191 splicing. Second, it is also unlikely that a protein

192 performs one function, because no protein acts alone.

193 Rather, biological traits result from co-production

pro-194 cesses. This is nicely illustrated by the actual process of

195 translation, during which both proteins and DNA

neces-196 sarily interact, allowing for the collective reproduction of

197 these two types of molecular agents. How these different

198 components became so tightly integrated is a central

199 issue for explaining evolution. Understanding how the

200 molecular world functions and evolves therefore

re-201 quires analysing molecular organisation and the

evolu-202 tion of the architecture of interaction networks,

203 especially since this structure can partly explain

204 molecular reactions [46, 47, 49, 50]. Thus, systems

205 biologists search for common motifs in molecular

inter-206 action networks from different organisms, such as

feed-207 forward loops, assuming that some of these recurring

208 patterns, because they affect different gene or protein sets,

209 may reflect general rules and constraints affecting the

con-210 struction and evolution of biological organisations [46].

211 Focusing evolutionary explanations on the structure of

212 the interactions between genes rather than on the

pri-213 mary sequence of the genes is fundamentally different

214 from sequencing genes and inferring history from their

215 sequences alone; one could think here of the case of

216 explaining gene activation/repression. Comparative

217 works on molecular interaction networks show that

in-218 teractions affect the evolution of the molecules

compos-219 ing networks, which means that beyond compositional

220 complexity, organisational complexity must be modeled

221 to understand biological evolution [46, 51–54]. Before

222 the analysis of complex networks, compensatory sets of

223 elements, such as groups of sub-functional paralogous

224 genes [55], or groups of genes with pressupressed

muta-225 tions [39, 40], already stressed the evolutionary

inter-226 dependence of molecules. However, compensatory

227 interactions between agents, each of them being by

228 themselves poorly adapted, ran counter to the intuition

229 that natural selection will eliminate dysfunctional

indi-230 vidual entities. Their recognition invites one to consider

231 Earth as possibly populated by unions of individually

232 dysfunctional agents rather than by the fittest survivors

233 within individual lineages, possibly since early life,

ac-234 cording to Woese’s theory on progenotes, namely com-235 munities of interacting protocells unable to sustain

236 themselves alone, evolving via massive lateral genetic

ex-237 changes [56].

238

At the molecular level, it is reasonable to assume that

239

processes resulting from interactions of a diversity of

240

intertwined agents offer a crucial explanans of biological

241

complexity. Rather than‘one agent, one action’, it would

242

be more accurate to consider ‘a relationship between

243

agents, one action’ as the modus operandi of life.

244

Multiple drivers, of different nature, contribute to the

245

evolution of these interactions: among others, gene

co-246

expression/co-regulation [57], sometimes mediated by

247

transposons [58–61]; the evolutionary origin of the

248

genes [62]; and also physical and chemical laws, as well

249

as the presence of targeting machineries that constrain

250

and regulate diffusion processes in the cell. These types

251

of relationships described at the molecular level are also

252

recovered at other levels of biological organisations.

253

Networks at the cellular level

254

Similar conclusions have been reached at the cellular

255

level, also crucial for understanding life history. All

pro-256

karyotes and protists are unicellular organisations, and

257

the cell is a fundamental building block of multicellular

258

organisms. Cells must constantly evaluate the states of

259

their inner and outer environments, i.e. to adjust their

260

gene expression and react accordingly [46]. This involves

261

regulatory, transduction, developmental, and protein

262

interaction networks, etc. Cells are built upon inner

net-263

works of interacting components, and involved in or

af-264

fected by a diversity of exchanges, influences and modes

265

of communications (namely, genetic, energetic, chemical

266

and electrical modes). Microbiology has gone a long way

267

toward unraveling these processes since its heyday of

268

pure culture studies, a fruitful reductionist approach

269

now complemented by environmental studies. These

lat-270

ter further unraveled that cells compete and cooperate

271

with, and even compensate for each other, within

mono-272

or multispecific microbiomes [63, 64]. Both types of

273

microbiomes have a fundamental commonality: they

274

produce collective properties and co-constructed

pheno-275

types (Fig. 1) evolving at the interface between cells. F1

276

Such properties cannot be understood without

consider-277

ing networks of influences: the oscillatory growth of

bio-278

films of Bacillus subtilis cannot be deduced from the

279

analyses of the complete genomes of these clones, but

280

requires modeling metabolic co-dependence within a

281

monogenic community affected by a delayed feedback

282

loop, involving chemical and electrical signals [65,66].

283

Furthermore, many cellular agents show a relative lack

284

of autonomy. In nature, some groups of prokaryotes

dis-285

play complementary genomes with incomplete metabolic

286

pathways, consistent with the black queen hypothesis,

287

which predicts that our planet is populated by groups of

288

(inter)dependent microbes [67, 68]. More precisely, this

289

hypothesis predicts the loss of a costly function, encoded

290

(5)

291 function becomes dispensable at the individual level,

292 since it is achieved by other individuals that produce

293 (usually leaky) public goods in sufficient amount to

sup-294 port the equilibrium of the community. Thus, gene

295 losses in some cells are compensated by leaks of

sub-296 strates from other cells, formerly encoded by the lost

297 genes. Some microbes experience labor division [69].

298 Symbionts and endosymbionts depend on their hosts.

299 The ‘kill the winner’ theory [70] further challenges the

300 notion that the microbial world is a world of fit cellular

301 individuals. This theory stresses a collective process via

302 which viruses mechanically mostly attack cells that

re-303 produce faster and thus regulate bacterial populations,

304 these latter sustaining their diversity because these

pop-305 ulations are comprised of individual prokaryotic cells

306 that make a suboptimal use of a diversity of resources.

307 Thus, cells belong to networks that affect their growth

308 and survival, which might explain why most bacteria

309 cannot be grown in pure culture. They only truly thrive

310 within communities, whose global genetic instructions

311 are spread over several genetically incomplete microbes.

312 Accounting for these internal and external cellular

net-313 works requires considering processes that are not central

314 in the synthetic evolutionary theory. Typically, the

no-315 tion that cellular evolution makes jumps, because new

316 components and processes (such as metabolic pathways)

317 are acquired from outside a given cellular lineage,

con-318 trasts with more gradual accounts of biological change,

319 like accounts based on point mutations affecting genes

320 already present in the lineage. Because saltations

321 (macromutations) are essential evolutionary outcomes of

322 introgressive processes, via the combination of

323

components from different lineages, no complete picture

324

of evolution can be provided without these jumps, which

325

are naturally modeled by networks. Indeed, genetic

326

information has been flowing both vertically and

hori-327

zontally between prokaryotes for over 3.5 billion years

328

[71–77], and possibly earlier, according to Woese, who

329

proposed that our universal ancestor was not an entity

330

but a process, that is, genetic and energetic exchanges

331

within protocellular communities [56]. Remarkably, this

332

latter case indicates that network modeling could help

333

to tackle a fundamental issue in evolutionary biology:

334

modeling the evolution of biological processes that

335

emerge from interactions between biological entities.

336

Since these interactions can be represented by a

net-337

work, the evolution of these interactions, describing the

338

evolution of biological processes, can then be

repre-339

sented by dynamic networks. Likewise, eukaryogenesis

340

rested on the co-construction of a novel type of cell, as a

341

result of the endosymbiosis of a bacteria within an

342

archaeon [78–80]. Later, the evolution of photosynthetic

343

protists emerged from endosymbioses involving

unicel-344

lular eukaryotes and cyanobacteria, or various lineages

345

of protists, namely in secondary and tertiary

endosymbi-346

oses [81]. Such endosymbioses, and their outcomes as

il-347

lustrated in our work [82, 83], are also naturally

348

modeled using networks.

349

Moreover, the long-term impact of these introgressive

350

processes on cellular evolution should not be

underesti-351

mated. For instance, endosymbiosis does not merely

intro-352

duce new cellular lineages, it also favors the evolution of

353

chimeric structures and chimeric processes within cells

354

[83–91]. Such intertwining cannot be modeled using a

sin-355

gle genealogical tree, which only recapitulates cellular

di-356

vergence from a last common ancestor. Even though cells

357

always derive from other cells, a full cellular history cannot

358

be reduced to the history of some cellular components

359

that are assumed to track the history of cellular division

360

[92]. In particular, phylogenetic analyses of informational

361

genes cannot be the only clue to understanding the origins

362

of cellular diversity, since these genes do not reflect how

363

cells are organized, how they gather their energy, and how

364

they interact with each other. Analyzing the

co-365

construction side of evolution requires enhanced models:

366

understanding eukaryotic evolution requires mixed

con-367

siderations of cellular architecture, population genetics

368

and energetics, which go beyond classic phylogenetic

369

models, which not so long ago were still prone to

consid-370

ering three primary domains of life [93–95].

371

Although invoking multiple agents rather than a single

372

ancestor in evolutionary explanations might appear to

373

contradict the famous Ockham’s razor [96], it does so

374

only superficially when it is likely that many cells are

co-375

constructed, especially in the context of a web of life.

376

Enhanced models including intra- and extracellular

f1:1 Fig. 1. An example of co-construction, the case of holobionts. The left f1:2 circle represents the set of traits associated with a host, the right circle f1:3 represents the set of traits associated with its microbial communities; f1:4 the intersected area represents traits that are produced jointly as a f1:5 result of the interaction between hosts and microbes. When this area f1:6 becomes large or when co-constructed traits are remarkable, they f1:7 cannot be correctly explained under a simple model treating hosts and f1:8 microbes in isolation. This scheme holds for different types of partners f1:9

(6)

377 interactions appear necessary to understand cellular

378 complexity, including the predictable disappearance of

379 traits (and processes), namely the convergent gene loss

380 of mitochondria and plastids [97] by a process called

381 dedarwinification [98,99].

382 Networks beyond the cellular level

383 Studies of multicellular organisms—we will focus on ani-384 mals—have led to similar general findings. Understand-385 ing animal traits and their evolution requires analyzing

386 the relationships between a multiplicity of agents

be-387 longing to different levels of biological organisation,

388 eventually nested, some of which co-constructs animals

389 and guarantees their complete lifecycle [100]. Because

390 no sterile organism lives on Earth, animal development,

391 health and survival depend on microbes. Granted,

bac-392 teria can often legitimately be seen as part of the

envir-393 onmental demands in an evolutionary model focused on

394 the host’s lineage; or sometimes bacteria and host could 395 also be considered as part of a coevolution process, with

396 no need to posit the whole as a unit of selection [101].

397 However, asking‘who is the beneficiary of the symbiosis

398 as the result of evolution?’ may in some cases lead to the 399 recognition that bacteria and host evolved together and

400 were selected together [102]. More generally, while some

401 microbes contribute to animals’ lives possibly as a result 402 of host-derived selection, others contribute as a result of

403 selectively neutral processes (like microbial priming

404 [103]) [101, 104]. These interactions produce

communi-405 cation networks within the animal body: chemical

infor-406 mation circulates between the animal brain and the gut

407 microbiome. These interactions also result in

communi-408 cation and interaction networks between individuals. In

409 some animal lineages, the microbiome affects social

be-410 haviors, for instance fermenting microbes inform about

411 the gender and reproductive status in hyena [105].

Com-412 ponents of the microbiome also affect mating choice

413 [106], reproductive isolation and possibly speciation.

414 Consequently, the microbiome now appears as an

essen-415 tial component of animal studies [107]. Microbiome

416 studies, the significance of which is overstated in some

417 respects, nevertheless have shown that the evolutionary

418 intertwining between many metazoa and commensal or

419 symbiotic bacteria could not be neglected anymore and

420 black-boxed in favor of purely host gene-centered

evolu-421 tionary models. And the associations between hosts and

422 microbes do not need to be units of selection to be part

423 of the recent insights that support the novel theoretical

424 framework proposed here. Their interplay imposes

425 reconfigurations of practices, theories and disciplines

426 [108]. As a result of our improved insight into evolution,

427 zoology and immunology [109] become theaters of new

428 ecological considerations [110], sometimes strangely

429 qualified as Lamarckian [111, 112], because animals can

430

recruit environmental microbes and transmit them (with a

431

non-null heritability [113]) to their progeny. Therefore,

432

nuclear gene inheritance alone may provide too narrow a

433

perspective to account for the evolution of all animal

434

traits; as an example, aphid body color depends on animal

435

genetics and the presence of Rickettsiella [114]. Population

436

genetics gets included in a broader community genetics,

437

which also considers transmission of microbes and their

438

genes [108, 114]. The use of gnotobiotic and transbiotic

439

animals becomes a new experimental standard to analyze

440

multigenomic collectives without counterparts in modern

441

synthesis theories. These collectives harbor morphological,

442

physiological, developmental, ecological, behavioral and

443

evolutionary features [115–119] that are not purely

con-444

structed by animal genes, but rather appear to be

co-445

constructed at the genetic and metabolic interface

446

between the microbial and macrobial worlds, while the

447

content of the respective animal genomes only provides

448

incomplete instructions. Understanding animal evolution

449

requires understanding the interaction networks between

450

components from which these taxa evolved, and the

net-451

works to which these taxa still belong.

452

In ecology, an analogous turn towards network

think-453

ing has been promoted since the 1990s with the general

454

acceptance of the notions of metapopulations [120] and

455

then metacommunities [121]. These views suggest that

456

the dynamics of ecological biodiversity is not so much

457

located within a community of species but rather in a

458

metacommunity, which can be thought of as a network

459

of communities exchanging species, while targeting one

460

community blinds one to what genuinely accounts for

461

biodiversity and ecosystem functioning [122].

462

This quick overview provides evidence that networks

463

are at the origin of the genes of unicellular and

multicel-464

lular organisms and central for their functions. The

liv-465

ing world is a world of ‘and’ and ‘co-’. From division of

466

labor and compensations, to dependencies and

co-467

constructions, etc.: interactions (which only begin to be

468

deciphered) are everywhere in biology. Thus, explaining

469

the actual features of biodiversity requires explaining

470

how multiple processes, interface phenomena

(co-con-471

struction of biological features, niche construction,

472

metabolic cooperation, co-infection and co-evolution)

473

and organisations (for instance, from molecular

path-474

ways to organisms and ecosystems) arose from

interact-475

ing components, and how these processes, phenomena

476

and organisations may have been sustained and

trans-477

formed on Earth.

478

Reframing evolutionary explanations from the

479

scaffolded evolution perspective

480

Introducing a classification of interacting components

481

While classic evolutionary models, prompted by

482

(7)

483 entities diverge in relative independence, it appears

im-484 portant to show how a diversity of components, which

485 may not be related, interact and produce various

evolu-486 tionary patterns.

487 The notion of scaffolding [124], which describes how

488 one entity continues an event initiated by another entity,

489 and relies on it up to the point that at some timescale it

490 becomes dependent upon it for further evolution,

ap-491 pears as a fundamental relationship to describe the

evo-492 lution of life. We propose scaffolding should become

493 more central in explanations of evolution because no

494 components from the biological world are actually able

495 to reproduce, or persist, alone (Fig.

F2 2). Each entity

influ-496 ences or is influenced by something external to it, and is

497 consequently part of a process. Scaffolding thus defines

498 the causal backbone of collective evolution. It describes

499 the historical continuity between temporal slices of

500 interaction networks, since any evolutionary stage relies

501 on previously achieved networks and organisations.

502 Therefore, describing the evolution of interactions

re-503 quires explanations to address the following issues: what

504 scaffolds what, what transforms the environment of what,

505 and are these influences reciprocal? Characterizing the

506 types of components that, together, have evolutionary

im-507 portance through their potential interaction is therefore a

508 central step to expanding evolutionary theory.

509 We propose that a first distinction can be made

be-510 tween obligate and facultative components. Suppressing

511 the former impacts the course and eventually the

512 reproduction of the process to which they contribute

513 (Fig.

F3 3), whereas facultative components do not hold

514 such a crucial role, and may simply be involved by

515 chance. A second distinction is whether the components

516 are biotic (genes, proteins, organisms…) or abiotic (such 517 as minerals, environmental, cultural artefacts). Abiotic

518 components can be recruited from the environment or

519 be shaped by biological processes [125]. They can also

520

alter the evolution of the biotic components, for

ex-521

ample, environmental change can drive genetic and

or-522

ganismal evolution and selection. The history of life

523

clearly depends on the interplay of both types of

compo-524

nents. Biotic components, however, deserve a specific

525

focus. Some of them form lineages (for instance, genes

526

replicate), while others do not (for instance, proteins are

527

reconstructed). Finally, interacting replicated

compo-528

nents can be further classified into fraternal components

529

when they share a close last common ancestor (e.g. in

530

kin selection cases), and egalitarian components, when

531

they belong to distinct lineages (as an example, think of

532

the evolution of chimeric genes by fusion and shuffling

533

[29,45,126]) [63].

534

Introducing dynamic interaction networks

535

Biodiversity usually evolves from interactions between

536

the diverse types of components described above. For

537

example, metalloproteases emerge from the interaction

538

between reconstructed biotic components (proteins) and

539

a metal ion. Regulatory networks involve biotic

compo-540

nents that can be either replicated (i.e. genes and

pro-541

moters) or reconstructed (i.e. proteins). Protein

542

interaction networks intertwine reconstructed egalitarian

543

biotic components, which means proteins that are not

544

homologous. Evolutionary transitions such as

eukaryo-545

genesis result from the interweaving of biotic

compo-546

nents (cells) from multiple lineages. Holobionts evolve

547

from interactions between egalitarian biotic components

548

(macrobial hosts and microbial communities) and

pos-549

sibly abiotic components, such as the mineral termite

550

mounds, or the volatile chemicals produced by the

mi-551

crobial communities of hyenas [105].

552

Taking collectives of interacting components as central

553

objects of study in evolutionary biology invites us to

ex-554

pand the methods of this field. It encourages developing

555

statistical approaches or inference methods beyond those

a

b

c

d

f2:1 Fig. 2. Different types of scaffolding, at four levels of biological organisations. a Functional interactions at the molecular level. b Introgression and f2:2 vertical descent at the cellular level. c Co-construction at the multicellular level. d Niche-construction and physico-chemical interactions at the f2:3 eco-systemic level

(8)

556 operating under the very common assumption that

bio-557 logical components are independent. Therefore, we

558 propose to represent interactions between components

559 in the form of networks in which components are nodes

560 and their interactions (of various sorts) are edges. These

561 networks are conceptually simple objects. They can be

562 described as adjacency lists of interactions, in the form

563 ‘component A interacts with component B, at time t 564 (when such a temporal precision is known)’. Such dy-565 namic interaction networks could become more central

566 representations and analytical frameworks, and serve as

567 a common explanans for various disciplines in an

ex-568 panded evolutionary theory. Importantly, because these

569 networks embed both abiotic and biotic, related and

un-570 related components (like viruses, cells and rocks), they

571 should not be conflated with phylogenetic networks, but

572

recognized as a more inclusive object of study (Fig. 4). F4

573

Where phylogenies describe relationships, networks can

574

describe organisations. How such organisations evolve

575

could for example be described by identifying

evolution-576

ary stages, that is, sets of components and of their

inter-577

actions simultaneously present in the network (Fig. 4).

578

Investigating the evolution of an ecosystem corresponds

579

to studying the succession of evolutionary stages in such

580

networks and detecting possible regularities—in the

581

sense that some evolutionary stages would fully or partly

582

reiterate over time—or hinting at rules or constraints

583

(like architectural contingencies [127, 128] or principles

584

of organisations [46]) on the recruitment, reproduction

585

and heritability of their components.

586

Thus, we suggest that evolutionary biology could be

587

reframed as a science of evolving networks, because

f3:1 Fig. 3. Classification of major types of components in evolving systems. A process/collective cannot be completed in the absence of obligate f3:2 components, whereas facultative components do not affect the outcome of the process/function of the collective. Biotic components are f3:3 biological, material products, whereas abiotic components are environmental, geological, chemical, physical or cultural artefacts. Replicated f3:4 components are produced by replication, which implies a physical continuity between ancestral and descendent components; they undergo a f3:5 paradigmatic Darwinian evolution. Reconstructed components are reproduced without direct physical continuity, and cannot directly accumulate f3:6 beneficial mutations. Fraternal components belong to the same lineage, whereas egalitarian components belong to different lineages

f3_:7

f4:1 Fig. 4. An evolving interaction network. Nodes are components (circles are full when the component is biotic). Thick black edges represent f4:2 interactions between these components. The network topology evolves as nodes or their connection change. Dashed edges represent the f4:3 phylogenetic ancestry of lineage-forming components

(9)

588 such a shift would allow inclusive, multilevel studies of a

589 larger body of biological and abiotic data, via approaches

590 from network sciences.

591 Concrete strategies to enhance network-based

592 evolutionary analyses

593 Enhancing network-based evolutionary analyses, beyond

594 the now classic research program of phylogenetic

net-595 works, could consolidate comparative analyses in the

596 nascent field of evolutionary systems biology [129, 130],

597 as illustrated by examples based on molecular networks.

598 Network construction/gathering constitutes the first step

599 of such analyses. This involves first defining nodes of the

600 network, namely components suspected to be involved

601 in a given system, and edges, namely qualitative (or

602 quantitative, when weighted) interactions between these

603 entities. Many biological interaction networks (gene

co-604 expression networks (GCNs), gene regulatory networks

605 (GRNs), metabolic networks, protein–protein interaction 606 networks (PPIs), etc. [46]) are already known for some

607 species, or can be inferred [131–136]. For example,

608 GCNs offer an increasingly popular resource to study

609 the evolution of biological pathways [137], as well as to

610

reveal conservation and divergence in gene regulation

611

[138]. GCNs are already used for micro-evolution

stud-612

ies, as in the case of fine-grained comparisons of

expres-613

sion variations between orthologous genes across closely

614

related species, and for the analysis of minor

evolution-615

ary and ecological transitions, such as changes of ploidy

616

[139, 140], adaptation to salty environments [141] or

617

drugs [142], or the effects of plant domestication

618

[143, 144]. Likewise, GRNs are starting to be used in

619

micro-evolution and phenotypic plasticity studies

620

[145]. Understanding the dynamics of GRNs appears

621

critical to inferring the evolution of organismal traits,

622

in particular during metazoan [146–148], plant [149] and

623

fungal [150] evolution. We suggest that PPI, GCN and

624

GRN studies could become mainstream and also be

con-625

ducted at (much) larger evolutionary and temporal scales,

626

to analyze additional, major, transitions.

627

Based on these established networks, two major types

628

of evolutionary analyses (network-decomposition and Q8 629

graph-matching; Fig.5b) can be easily further developed F5

630

by evolutionary biologists. More precisely, the

above-631

mentioned kinds of biological networks could be

system-632

atically turned into what we call evolutionary colored

f5:1 Fig. 5. Workflow of the evolutionary analysis of interaction networks. From left to right: triangles represent components of interaction networks, f5:2 edges between triangles represent interactions between these components. Interaction networks are first constructed/inferred, then their nodes f5:3 and edges are colored to produce evolutionary colored networks (ECNs) that represent both the topological and the evolutionary properties of f5:4 the networks. ECNs can be investigated individually by graph decomposition and centrality analyses, or several ECNs can be compared by graph f5:5 alignment. The two types of comparisons can return conserved subgraphs that allow understanding of the dynamics of interaction networks, f5:6 meaning when different sets of interactions (hence processes) evolved, and whether these interactions were evolutionarily stable. Ancient and f5:7 Contemporary refer to the relative age of the sub-graphs, identifying new clade-specific relationships (here called refinement); introgression f5:8 indicates that a component, and the relationship it entertains with the rest of the network, was inferred to result from a lateral acquisition f5_:9

(10)

633 biological networks (ECNs). In ECNs, each node of a

634 given biological network is colored to reflect one or

sev-635 eral evolutionary properties. For example, in molecular

636 networks, nodes correspond to molecular sequences

637 (genes, RNA, proteins) that belong to homologous

638 families that phylogenetic distribution across host

spe-639 cies allows us to date [137, 151–156]. The ‘age’ of the

640 family at the node can thus become one evolutionary

641 color (Fig. 5a). Likewise, several processes affecting the

642 evolution of a molecular family (selection, duplication,

643 transfer, and divergence in primary sequence) can be

in-644 ferred by classic phylogenetic analyses or, as we

pro-645 posed, by analyses of sequence similarity networks [157].

646 Such studies provide additional evolutionary colors (like

647 quantitative measures: intensity of selection, rates of

du-648 plication, transfer, and percentage of divergence), which

649 can be associated with nodes in ECNs [139, 149, 154,

650 158–161]. Thus, ECNs contain both topological

informa-651 tion, characteristic of the biological network under

investi-652 gation, as well as evolutionary information: what node

653 belongs to a family prone to duplication, divergence, or

654 lateral transfer, as well as when this family arose.

Combin-655 ing these two types of information in a single graph allows

656 us to test specific hypotheses regarding evolution.

657 Using ECNs, it is first fruitful to test whether (or

658 which of ) these evolutionary colors correlates with

topo-659 logical properties of the ECNs [162–164]. The null

hy-660 pothesis that nodes’ centrality, e.g. nodes’ positions in 661 the network, is neither correlated with the age nor with

662 the duplicability, transferability or divergence of the

mo-663 lecular entities represented by these nodes can be tested.

664 Rejection of this hypothesis would hint at processes that

665 affect the topology of biological networks or are affected

666 by the network topology. For example, considering

de-667 gree in networks, proteins with more neighbors are less

668 easily transferred [163], highly expressed genes, more

669 connected in GCNs, evolve slower than weakly

670 expressed genes [165], and genes with lower degrees

671 have higher duplicability in yeast, worm and flies [166].

672 Considering position in networks, node centrality

corre-673 lates with evolutionary conservation [136], gene

eccen-674 tricity correlates with level of gene expression and

675 dispensability [167], and proteins interacting with the

676 external environment have higher average duplicability

677 than proteins localized within intracellular

compart-678 ments [168]. Additionally, network structure gives a clue

679 to evolution since old proteins have more interactions

680 than new ones [169, 170]. Generalizing these disparate

681 studies could help to understand the dynamics of

bio-682 logical networks, in other words how the architecture,

683 the nodes and edges of present day networks, evolved

684 and whether their changes involved random or biased

685 sets of nodes and edges or follow general models of

net-686 work growth with detectable drivers.

687

This focus would complement a classic tree-based

688

view. For instance, under the reasonable working

hy-689

pothesis that pairs of connected nodes of a given age

re-690

flect an interaction between nodes that may have arisen

691

at that time [154,171], ECNs can easily be easily

decom-692

posed into sub-networks, featuring processes of different

693

ages (that is, sets of nodes of a given age, e.g. sets of

694

interacting genes). This strategy allows identification of

695

conserved network patterns, possibly under strong

se-696

lective pressure [159]. Constructing and exploiting ECNs

697

from bacteria, archaea, and eukaryotes thus has the

po-698

tential to define conserved ancestral sets of relationships

699

between components, allowing evolutionary biologists to

700

infer aspects of the early biological networks of the last

701

common ancestor of eukaryotes, archaea and bacteria

702

and even of the last universal common ancestor of cells.

703

Assuming that some of these topological units

corres-704

pond to functional units [172], especially for broadly

705

conserved subgraphs [138, 149, 152, 166, 173–182],

706

would allow network decompositions to propose sets of

707

important processes associated with the emergence of

708

major lineages.

709

Moreover, graph-matching of ECNs allows several

710

complementary analyses. First, for interaction networks,

711

such as GRNs, whose sets of components and edges

712

evolve rapidly [183–185], it becomes relevant to analyze

713

where in the network such changes occur in addition to

714

(simply) tracking conserved sets of components and

715

edges. Whereas the latter can test to what extent

conser-716

vation of the interaction networks across higher taxa

717

supports generalizations made from a limited number of

718

model species [186], the former allows us to test a

gen-719

eral hypothesis: are there repeated types of network

720

changes? For example, does network modification

721

primarily affect nodes with particular centralities, as

722

exemplified by terminal processes [187], or modules?

723

Systematizing these analyses would provide new insights

724

into whether the organisation principles of biological

725

networks changed when major lineages evolved or

726

remained conserved. In terms of the ECN, can the same

727

model of graph evolution explain the topology of ECNs

728

from different lineages? The null hypothesis would be

729

that these major transitions left no common traces in

730

biological networks. An alternative hypothesis would be

731

that the biological networks convergently became more

732

complex (more connected and larger) during these

tran-733

sitions to novel life forms. Indeed, analyses conducted

734

on a few taxa have reported quantifiable and qualifiable

735

modifications in biological networks (in response to

en-736

vironmental challenges [188], during ecological

transi-737

tions [189] or as niche specific adaptations [190]). More

738

systematic graph-matching [191–193] and motif

ana-739

lyses, comparing the topology of ECNs from multiple

740

(11)

741 that major lineages are enriched in particular motifs

742 (either modules of colored nodes and edges, or specific

743 topological features, such as feed-forward loops [46] or

744 bow-ties [194]). It would also allow identification of

745 functionally equivalent components across species,

746 namely different genes with similar neighbors in

differ-747 ent species [176].

748 While inferences on conserved sets of nodes and edges

749 in ECNs are likely to be robust (since the patterns are

750 observed in multiple species), missing data (missing

751 nodes and edges) constitute a recognized challenge,

es-752 pecially for the interpretation of what will appear in

753 ECN studies as the most versatile (least conserved) parts

754 of the biological networks. The issue of missing data,

755 however, is not specific to network-based evolutionary

756 analyses, and should be tackled, as with other

compara-757 tive approaches, by the development and testing of

im-758 putation methods [195–197]. Moreover, issues of

759 missing data can also be addressed by the production of

760 high coverage -omics datasets in simple systems,

allow-761 ing for (nearly) exhaustive representations of the entities

762 and their interactions (i.e. PPIs, GCNs and GRNs within

763 a cell, or metabolic networks within a species poor

eco-764 system). This kind of data would allow testing for the

765 existence of selected emergent ecosystemic properties

766 (like carbon fixation), as stated by the ITSNTS

hypoth-767 esis [198]. For instance, deep coverage time series of

768 metagenomic/metatranscriptomic data coupled with

en-769 vironmental measures from a simple microbial

ecosys-770 tem, such as carbon fixation, could produce enough data

771 to allow the evolutionary coloring of nodes of metabolic

772 networks. Comparing ECNs representing, at each time

773 point, the origin and abundance of the lineages hosting

774 the enzymes involved in carbon fixation could test

775 whether some combinations of lineages are repeated

776 over time, and whether the components (e.g. genes and

777 lineages) vary, whereas carbon fixation is maintained in

778 the ecosystem, which would suggest that this process

779 evolves irrespective of the nature of the interacting

780 components.

781 Finally, entities from different levels of biological

or-782 ganisation (domains, genes, genomes, lineages, etc.)

783 could also be studied together in a single network

frame-784 work, by integrating them into multipartite networks

785 [199]. Recently, our studies and others (see [200] and

786 references therein) have demonstrated that various

pat-787 terns in multipartite graphs can be used to detect and

788 test combinatorial (introgressive) and gradual evolution

789 (by vertical descent) affecting genes and genomes.

790 Decomposing multipartite networks into twins and

791 articulation points could for example then be used to

792 represent and analyze the evolution of complex

compos-793 ite molecular systems, such as CRISPR, or the dynamics

794 of invasions of hairpins in genomes [201].

795

Further justifications for a shift toward network

796

thinking

797

Enlargement of evolutionary biology

798

Focusing evolutionary explanations and theories on

collec-799

tives of interacting components, which may be under

se-800

lection, facilitate selection, or condition arrangements

801

through neutral processes [39, 40, 202], and representing

802

these scaffolding relationships using networks with biotic

803

and abiotic components and a diversity of edges

represent-804

ing a diversity of interaction types would be an

enlarge-805

ment. Enlargements, as expressing the need to consider

806

structures that are more general than what already exists,

807

have already occurred within evolutionary theory, when

808

simplifications from population genetics were relaxed with

809

respect to the original formalization in the Modern

810

Synthesis [203], to account for within-genome interaction

811

[9], gene–environment covariance [204], parental effects

812

[205], and extended fitness though generations [206]. It

813

also occurred when reticulations representing

introgres-814

sions were added to the evolutionary tree.

815

Interestingly, replacing standard linear models in

evolu-816

tionary theory with network approaches would transcend

817

several traditional axes structuring the debates in

evolu-818

tionary biology. For instance, scaffolded evolution, the idea

819

that evolution relies on what came before, is orthogonal to

820

the distinction between vertical and horizontal descent,

821

since both tree-like and introgressive evolution are

par-822

ticular cases of scaffolding. Scaffolded evolution is also

or-823

thogonal to the distinction between gradual and

824

saltational evolution. Likewise, scaffolded evolution is

or-825

thogonal to the debates between the actual role of

adapta-826

tions vs neutral processes. Selection is a key mode of

827

evolution of collectives but not the only one. The

pro-828

cesses involved in the forming and evolution of collectives

829

are not even restricted to the key processes of the Modern

830

Synthesis (drift, selection, mutation and migration) but

831

embrace interactions such as facilitation—namely

antag-832

onistic interactions between two species that allow a third

833

species to prosper by restraining one of its predators or

834

parasites [207], presuppression [39,40], etc. Consequently,

835

some evolutionary concepts may become more important

836

than they currently are to explain evolution. For example,

837

contingency, which means the dependence of an

evolu-838

tionary chain of events upon an event that itself is

contin-839

gent, in the sense that it can’t be understood as a selective

840

response to environmental changes [18,208,209], is often

841

associated with extraordinary events, like mass

decima-842

tion. Contingency could come to be seen as a less

extraor-843

dinary mode of evolution in the history of life, since the

844

ordinary course of evolution might include many cases of

845

contingent events, that is, associations of entities in a

tran-846

sient collective, including any scaffolds—associations that

847

are not necessarily selective responses or the outcomes of

848

(12)

849 Likewise, adopting a broader ontology could affect

850 how evolutionary theorists think about evolution.

Popu-851 lation thinking and tree-thinking came after essentialist

852 conceptions of the living words, when populations and

853 lineages were recognized as central objects of

evolution-854 ary studies [210]. A shift towards collectives and

scaf-855 folded evolution might encorage a similar development:

856 the emergence of an openly pluralistic processual

think-857 ing, consistent with Carl Woese’s proposal to reformu-858 late our view of evolution in terms of complex dynamic

859 systems [211].

860 Further unifying the evolutionary theory

861 Using a network-based approach to analyse dynamic

sys-862 tems also permits explanations that rely purely on

statis-863 tical properties [212] or on topological or graph

864 theoretical properties [213, 214] besides standard

expla-865 nations devoted to unravelling mechanisms responsible

866 for a phenomenon. Moreover, because of the

inclusive-867 ness of the network model, disciplines already

recog-868 nized for their contribution to evolutionary theory

869 (microbiology, ecology, cell biology, genetics, etc.) could

870 become even more part of an interdisciplinary research

871 program on evolution, effectively addressing current

is-872 sues, consistent with the repeated calls for

transdisci-873 plinary collaborations [19–21, 215]. Disciplines that

874 were not central in the Modern Synthesis—chemistry, 875 physics, geology, oceanography, cybernetics or

linguis-876 tics—could aggregate with evolutionary biology. Since a 877 diversity of components gets connected by a diversity of

878 edges in networks featuring collectives, as a result of a

879 diversity of drivers, several explanatory strategies could

880 be combined to analyze evolution. This extension to

881 seemingly foreign fields makes sense when the

compo-882 nents/processes studied by these other disciplines are

883 evolutionarily or functionally related to biotic

compo-884 nents and processes (either as putative ancestors of

bio-885 logical components and processes, like the use of a

886 proton gradient in cells, which possibly derived from

887 geological processes affecting early life [216], or as

de-888 scendants of biological systems, e.g. technically

synthe-889 sized life forms, which have a potential to alter the

890 future course of standard biological evolution).

891 Remarkably, this mode of unification of diverse

scien-892 tific disciplines would be original: the integration would

893 not be a unification in the sense of logical positivism

894 [217]—namely reducing a theory to a theory with more

895 basic laws, or a theory with a larger extension. It would

896 be a piecemeal [218] unification. Some aspects would be

897 unified through a specific kind of graph modeling

898 (because some interactions, namely mechanical,

chem-899 ical, ecological ones, and a range of time scales are

privi-900 leged in a set of theories), while other theories might be

901 unified by other graph properties (like different types of

902

edges and components). For example, the fermentation

903

hypothesis for mammalian chemical communication

904

could be analyzed in a multipartite network framework,

905

which would involve nodes corresponding to individual

906

mammals, nodes corresponding to microbes, and nodes

907

corresponding to odorous metabolites. Nodes

corre-908

sponding to mammals could either be colored to reflect

909

an individual’s properties (its lineage, social position,

910

gender, sexual availability), or these nodes could be

con-911

nected by edges that reflect these shared properties,

912

which defines a first host subnetwork. This host

subnet-913

work can itself be further connected to a second

subnet-914

work, namely the microbial subnetwork in which nodes

915

representing microbes, colored by phylogenetic origins,

916

could be connected to reflect microbial interactions

917

(gene transfer, competition, metabolic cooperation, etc.).

918

Connections between the host and microbial

subnet-919

works could simply be made by drawing edges between

920

nodes representing individuals hosting microbes, and

921

nodes representing these microbes. Moreover, nodes

922

representing mammals and nodes representing microbes

923

could be connected to nodes representing odorous

me-924

tabolites to show what odours are associated with what

925

combinations of hosts and microbes. Elaborating this

926

network in a piecemeal fashion would involve

cooper-927

ation between chemists, microbiologists, zoologists and

928

evolutionary biologists.

929

Of note, the use of integrated networks could

prag-930

matically address a deep concern for evolutionary

stud-931

ies, by connecting phenomena that occur at different

932

timescales: development and evolution [219] or ecology

933

and evolution [220]. Considering transient collectives

934

(thus processes) as stable entities at a given time-scale,

935

when these collectives change much more slowly than

936

the process in which they take part, amounts to a focus

937

on interactions occurring at a given time scale by

treat-938

ing the slower dynamics as stable edges/nodes. Then,

939

various parts of the networks embody distinct

time-940

scales, which may provide a new form of timescale

inte-941

gration, working out the merging of timescales from the

942

viewpoint of the model, and with resources intrinsic to

943

the model itself. The reason for this is that a node in an

944

interaction network Ni, describing processes relevant at

945

a time scale i, can itself be seen as the outcome of

an-946

other (embedded) interaction network Nj, unfolding at a

947

time scale j. This nestedness typically occurs when the

948

node in Ni represents a collective process, involving

949

components that evolve sufficiently slowly with respect

950

to the system considered at the time scale i to figure as

951

an entity, a node in Ni. In the case of a PPI network Ni,

952

each node conventionally represents a protein, but the

953

evolution of each protein could be further analysed as

954

the result of mutation, duplication, fusion and shuffling

955