Fidelity of parent-offspring transmission and the evolution of social behavior in structured populations


HAL Id: hal-01484829

https://hal.archives-ouvertes.fr/hal-01484829

Submitted on 7 Mar 2017


To cite this version:

Florence Débarre. Fidelity of parent-offspring transmission and the evolution of social behavior in structured populations. Journal of Theoretical Biology, Elsevier, 2017, 420, pp. 26-35. ⟨10.1016/j.jtbi.2017.02.027⟩. ⟨hal-01484829⟩



F. Débarre

Centre Interdisciplinaire de Recherche en Biologie (CIRB), Collège de France, CNRS UMR 7241 - Inserm U1050, 11, Place Marcelin Berthelot, 75231 Paris Cedex 05, France

ORCID: 0000-0003-2497-833X

Abstract

The theoretical investigation of how spatial structure affects the evolution of social behavior has mostly been done under the assumption that parent-offspring strategy transmission is perfect, i.e., for genetically transmitted traits, that mutation is very weak or absent. Here, we investigate the evolution of social behavior in structured populations under arbitrary mutation probabilities. We consider populations of fixed size N, structured such that in the absence of selection, all individuals have the same probability of reproducing or dying (neutral reproductive values are all the same). Two types of individuals, A and B, corresponding to two types of social behavior, are competing; the fidelity of strategy transmission from parent to offspring is tuned by a parameter µ. Social interactions have a direct effect on individual fecundities. Under the assumption of small phenotypic differences (implying weak selection), we provide a formula for the expected frequency of type-A individuals in the population, and deduce conditions for the long-term success of one strategy against another. We then illustrate our results with three common life-cycles (Wright-Fisher, Moran Birth-Death and Moran Death-Birth), and specific population structures (graph-structured populations). Qualitatively, we find that some life-cycles (Moran Birth-Death, Wright-Fisher) prevent the evolution of altruistic behavior, confirming previous results obtained with perfect strategy transmission. We also show that computing the expected frequency of altruists on a regular graph may require knowing more than just the graph’s size and degree.

Keywords: mutation, relatedness, altruism, evolutionary graph theory


1 Introduction

Most models of the evolution of social behavior in structured populations study the outcome of competition between individuals having different strategies and assume that strategy transmission from parents to their offspring is almost perfect (i.e., when considering genetic transmission, that mutation is either vanishingly small or absent). This is for instance illustrated by the use of fixation probabilities to assess evolutionary success (e.g., Rousset & Billiard, 2000; Rousset, 2003; Nowak et al., 2004; Nowak, 2006; Ohtsuki et al., 2006). Yet, mutation has been shown to affect the evolutionary fate of social behavior (Frank, 1997; Tarnita et al., 2009) and is, more generally, a potentially important evolutionary force. Here, we explore the role of imperfect strategy transmission (genetic or cultural) from parents to offspring on the evolution of social behavior, when two types of individuals, with different social strategies, are competing. We are interested in evaluating the long-term success of one strategy over another.

A population in which mutation is not close (or equal) to zero will spend a non-negligible time in mixed states (i.e., in states where both types of individuals are present), so instead of fixation probabilities, we need to consider long-term frequencies to assess evolutionary success (Tarnita et al., 2009; Wakano & Lehmann, 2014; Tarnita & Taylor, 2014). We will say that a strategy is favored by selection when its expected frequency is larger than what it would be in the absence of selection.

Obviously, lowering the fidelity of parent-offspring strategy transmission (e.g., by increasing the probability of mutation) reduces the relative role played by selection. But in a spatially structured population, the fidelity of parent-offspring strategy transmission also affects the spatial clustering of different strategies, and in particular whether individuals that interact with each other have the same strategy or not; this effect takes place even in the absence of selection. Consequently, the impact of imperfect strategy transmission may differ according to how the population is structured.

In this study, we consider populations such that, in the absence of selection (when social interactions have no effect on fitness), all individuals have equal chances of reproducing, and equal chances of dying. In other words, in such a population of size N, the neutral reproductive value of each site is 1/N (Taylor, 1990; Maciejewski, 2014; Tarnita & Taylor, 2014). We provide a formula that gives the long-term frequency of a social strategy in any such population, for arbitrary mutation rates, and for any life-cycle (provided population size remains equal to N). This formula is a function of the probabilities that pairs of individuals are identical by descent. These probabilities are obtained by solving a linear system of equations, and we present explicit solutions for population structures with a high level of symmetry (structures that we call “n-dimensional graphs”). We finally illustrate our results with widely used updating rules (Moran models, Wright-Fisher model) and specific population structures.

2 Models and Methods

2.1 Population structures

We consider a population of fixed size $N$, where each individual inhabits a site corresponding to the node of a graph $\mathcal{D}$; each site hosts exactly one individual. The edges of the graph, $\{d_{ij}\}_{1 \le i,j \le N}$, define where individuals can send their offspring to. We consider graphs $\mathcal{D}$ that are connected, i.e., such that following the edges of the graph, we can go from any node to any other node (potentially via other nodes). This simply means that there are no completely isolated subpopulations. Another graph, $\mathcal{E}$, with the same nodes as graph $\mathcal{D}$ but with edges $\{e_{ij}\}_{1 \le i,j \le N}$, defines the social interactions between the individuals; $\mathcal{E}$ can be the same graph as $\mathcal{D}$, but does not have to be (Taylor et al., 2007a; Ohtsuki et al., 2007; Débarre et al., 2014). The edges of the two graphs can be weighted (i.e., $d_{ij}$ and $e_{ij}$ can take any non-negative value) and directed (i.e., we can have $d_{ij} \neq d_{ji}$ or $e_{ij} \neq e_{ji}$ for some sites $i$ and $j$). For instance, dispersal in a subdivided population is represented by a weighted graph (the probability of sending offspring to a site in the same deme as the parent is different from the probability of sending offspring to a site in a different deme). Finally, we denote by $\mathbf{D}$ and $\mathbf{E}$ the adjacency matrices of the dispersal and interaction graphs, respectively ($\mathbf{D} = \{d_{ij}\}_{1 \le i,j \le N}$, $\mathbf{E} = \{e_{ij}\}_{1 \le i,j \le N}$).

Regular dispersal graphs. In this study, we focus on dispersal graphs that are regular, i.e., such that for all sites $i$, the sum of the edges to $i$ and the sum of the edges from $i$ are both equal to $\nu$:

$$\sum_{j=1}^{N} d_{ij} = \sum_{j=1}^{N} d_{ji} = \nu, \tag{1}$$

where $\nu$ is called the degree of the graph when the graph is unweighted. All the graphs depicted in the article (figures 1 and 3) satisfy eq. (1), and are therefore regular. Note that there is no specific constraint on the interaction graph $\mathcal{E}$.
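The regularity condition of eq. (1) is easy to test numerically on an adjacency matrix. The following is a minimal sketch, not taken from the article: the function names `is_regular` and `cycle_graph` are hypothetical, and the cycle graph is only an assumed example of a regular structure.

```python
import numpy as np

def is_regular(D, tol=1e-12):
    """Check eq. (1): every row sum and every column sum of D equals the same nu.

    Returns a pair (regular, nu), where nu is the first row sum.
    """
    D = np.asarray(D, dtype=float)
    out_sums = D.sum(axis=1)  # sum_j d_ij, edges leaving site i
    in_sums = D.sum(axis=0)   # sum_j d_ji, edges arriving at site i
    nu = out_sums[0]
    ok = np.allclose(out_sums, nu, atol=tol) and np.allclose(in_sums, nu, atol=tol)
    return bool(ok), float(nu)

def cycle_graph(N):
    """Adjacency matrix of an unweighted, undirected cycle on N >= 3 nodes."""
    D = np.zeros((N, N))
    for i in range(N):
        D[i, (i + 1) % N] = 1.0
        D[i, (i - 1) % N] = 1.0
    return D

regular, nu = is_regular(cycle_graph(12))
```

A star graph, in which one hub is connected to all other nodes, would fail this check because the hub and the leaves have different edge sums.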

More detailed results are then obtained for regular graphs that display some level of symmetry, which we now describe:


Transitive dispersal graphs. A transitive graph is such that for any two nodes $i$ and $j$ of the graph, there is an isomorphism that maps $i$ to $j$ (Taylor et al., 2007a,b): the graph looks the same from every node. In other words, the dispersal graph is transitive when it is “homogeneous” (sensu Taylor et al., 2007a), i.e., when all nodes have exactly the same properties in terms of dispersal. In figure 1, graphs (b)–(e) are transitive. On the other hand, all the nodes of graph (a) are different (for instance, node 9 is in a triangle while node 12 is not), so this regular graph is not transitive.

Transitive undirected dispersal graphs. A graph is undirected if for any two nodes $i$ and $j$, the weight of the edge from $i$ to $j$ is equal to the weight of the edge from $j$ to $i$ (i.e., there is no need to use arrows when drawing the edges of the graph). The dispersal graph is undirected when for all sites $i$ and $j$, $d_{ij} = d_{ji}$. In figure 1, graphs (b), (c), (e) are both transitive and undirected.

Figure 1: Examples of regular graphs of size 12. The graphs on the first line are unoriented and unweighted graphs of degree $\nu = 3$; graph (d) is oriented, graph (e) is weighted. (a) is the Frucht graph, and has no symmetry. Graphs (b) and (d) are one-dimensional, graphs (c) and (e) are two-dimensional (see main text).

“n-dimensional” dispersal graphs. We call “n-dimensional graphs” transitive graphs whose nodes can be relabelled with $n$-long indices, such that the graph is unchanged by circular permutation of the indices in each dimension (see eq. (2)). We denote by $\mathcal{N}$ the ensemble of node indices: $\mathcal{N} = \{0, \dots, N_1 - 1\} \times \cdots \times \{0, \dots, N_n - 1\}$, with $\prod_{k=1}^{n} N_k = N$; numbering is done modulo $N_k$ in dimension $k$. Then for all indices $i$, $j$ and $k$ of $\mathcal{N}$, node labeling is such that for all edges (modulo the size of each dimension),

$$d_{ij} = d_{i+k,\,j+k}. \tag{2}$$

In figure 1, graphs (b) and (d) are 1-dimensional: we can label their nodes such that the adjacency matrices are circulant. Graphs (c) and (e) are 2-dimensional: the adjacency matrices are block-circulant, with each block being circulant. In 1(c), one dimension corresponds to the angular position of a node ($N_1 = 6$ positions), and the other dimension to the radial position of a node ($N_2 = 2$ positions, inner or outer hexagon). In 1(e), one dimension corresponds to the horizontal position of a node ($N_1 = 4$ positions) and the other to the vertical position of a node ($N_2 = 3$ positions). Condition eq. (2) may sound strong, but is satisfied for the regular population structures classically studied, like stepping-stones (e.g., cycle graphs, lattices), or island models (Taylor, 2010; Taylor et al., 2011).
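For a one-dimensional structure, eq. (2) says that the adjacency matrix is circulant under the chosen labeling. The sketch below checks this by brute force; it is an illustration only, the function name `satisfies_eq2_1d` is hypothetical, and it tests a given labeling rather than searching for a labeling that works.

```python
import numpy as np

def satisfies_eq2_1d(D):
    """Check d_ij == d_{i+k, j+k} (indices taken modulo N) for every shift k."""
    D = np.asarray(D)
    N = D.shape[0]
    for k in range(1, N):
        # shifted[i, j] == D[(i + k) % N, (j + k) % N]
        shifted = np.roll(np.roll(D, -k, axis=0), -k, axis=1)
        if not np.array_equal(shifted, D):
            return False
    return True

N = 6
# Undirected cycle with nodes labeled in order around the ring: circulant.
cycle = np.zeros((N, N), dtype=int)
for i in range(N):
    cycle[i, (i + 1) % N] = 1
    cycle[i, (i - 1) % N] = 1

# Same graph, but with the labels of nodes 1 and 2 swapped: no longer circulant.
perm = [0, 2, 1, 3, 4, 5]
scrambled = cycle[np.ix_(perm, perm)]

ordered_ok = satisfies_eq2_1d(cycle)
scrambled_ok = satisfies_eq2_1d(scrambled)
```

The second case shows why the definition speaks of relabelling: the scrambled matrix describes the same cycle, but eq. (2) only holds once the nodes are indexed appropriately.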

2.2 Types of individuals and social interactions

There are two types (A and B) of individuals in the population, corresponding to two strategies of social behavior. There are no mixed strategies: an individual of type A plays strategy A, and individuals do not change strategies. The indicator variable $X_i$ represents the type of the individual present at site $i$: $X_i$ is equal to 1 if the individual at site $i$ is of type A, and $X_i$ is equal to 0 otherwise ($X_i = \mathbb{1}_A(i)$). An $N$-long vector $X$ gathers the identities of all individuals in the population, and $\overline{X}$ is the population average of $X$ ($\overline{X} = \sum_{i=1}^{N} X_i / N$). The ensemble of all possible states is $\Omega = \{0, 1\}^N$.

Individuals in the population reproduce asexually. Fecundities are affected by social interactions, and are gathered in an $N$-long vector $f$. We assume that the genotype-phenotype map is such that the two types A and B are close in phenotype space: the individual living at site $i$ expresses a phenotype $\delta X_i$, with $\delta \ll 1$ (a feature called “$\delta$-weak selection” by Wild & Traulsen (2007)).

An individual’s fecundity depends on the phenotypes of the individuals it interacts with and on its own phenotype ($\delta X_i$ for the individual at site $i$). Without loss of generality, we can write the fecundity of the individual living at site $i$ as

$$f_i(X, \delta) = \phi_i\left(e_{1i}\,\delta X_1, \dots, e_{li}\,\delta X_l, \dots, e_{Ni}\,\delta X_N;\ \delta X_i\right), \tag{3a}$$

where the first $N$ arguments correspond to the potential interactants (an individual can also be interacting with itself if $e_{ii} > 0$, which can occur for instance with a common good), and the last, $(N+1)$th, argument is the phenotype of the focal individual.


Scaling fecundities such that the baseline in the absence of selection is equal to 1, a first-order expansion of eq. (3a) yields

$$f_i(X, \delta) = 1 + \delta \left( \sum_{l=1}^{N} e_{li} X_l\, \partial^{(l)} \phi_i(0, \dots, 0; 0) + X_i\, \partial^{(N+1)} \phi_i(0, \dots, 0; 0) \right) + O(\delta^2), \tag{3b}$$

where $\partial^{(n)} \phi_i$ represents the partial derivative of $\phi_i$ with respect to its $n$th element.

We do not need to specify a particular shape for $\phi_i$; the only assumption that we make is that it does not matter where the interactions actually take place, only that they do take place, and that it does not matter either where the focal individual is (i.e., there are no external sources of heterogeneity affecting individual fecundities). So for all $i$ and $l$, $1 \le i, l \le N$, we can write $\partial^{(l)} \phi_i(0, \dots, 0; 0) = b$ and $-c = \partial^{(N+1)} \phi_i(0, \dots, 0; 0)$. Then eq. (3b) becomes

$$f_i(X, \delta) = 1 + \delta \left( b \sum_{l=1}^{N} e_{li} X_l - c\, X_i \right) + O(\delta^2). \tag{3c}$$

In other words, no matter the choice of the fecundity functions $\phi$, provided that only the phenotypes of the individuals and their interactants matter, at the first order in $\delta$ we only need two parameters, $b$ and $c$, to characterize the fecundity functions.
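To make the indexing of eq. (3c) concrete, the first-order fecundities can be computed for all sites at once. This is a sketch under stated assumptions: the $O(\delta^2)$ terms are dropped, the function name `fecundities` is hypothetical, and the 4-node cycle used as interaction graph is only an example.

```python
import numpy as np

def fecundities(X, E, b, c, delta):
    """First-order fecundities of eq. (3c), neglecting O(delta^2) terms.

    X is the 0/1 state vector and E the interaction adjacency matrix, with
    E[l, i] = e_li the weight of the benefit that l provides to i.
    """
    X = np.asarray(X, dtype=float)
    E = np.asarray(E, dtype=float)
    received = E.T @ X  # received[i] = sum_l e_li X_l
    return 1.0 + delta * (b * received - c * X)

# Example: 4-node cycle, a single type-A individual at site 0.
E = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
f = fecundities([1, 0, 0, 0], E, b=2.0, c=1.0, delta=0.1)
```

In this example the altruist at site 0 pays the cost, its two neighbors (sites 1 and 3) receive the benefit, and site 2 stays at the baseline fecundity of 1.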

Our results will be valid for any $b$ and $c$, but throughout the article, we will consider the case where $b > 0$ and $c > 0$, so that type-A individuals are “altruists” providing benefits ($b$) and paying a cost ($c$), and we will seek to understand the impact of imperfect strategy transmission on the frequency of altruists.

Finally, we note that when δ = 0, all individuals in the population, whichever their type, have the same fecundity: the trait is then neutral.

2.3 Reproduction and strategy transmission

The expected number of successful offspring established at site $j$ at the next time step, descending from the individual who is living at site $i$ at the current time step, is denoted by $B_{ji}(f(X, \delta))$, written $B_{ji}$ for simplicity. “Successful offspring” of a focal individual means individuals who descend from this focal individual and who are alive and established at the start of the next time step. Because there is exactly one individual per site, $0 \le B_{ji} \le 1$. We assume that $B_{ji}$ does not depend on external factors such as temporal fluctuations independent of the state of the population.

Individuals imperfectly transmit their strategy to their offspring. We do not specify the nature of this transmission (it can be genetic, it can be vertical cultural transmission), but we use for simplicity the term “mutation” to characterize transmission failure. Mutation occurs with probability $\mu$, $0 < \mu \le 1$; when mutation occurs, the offspring are of type A with probability $p$ and of type B otherwise ($0 < p < 1$). For instance, under this mutation scheme, the offspring of an individual of type A is also of type A with probability $1 - \mu + \mu p$ (Taylor et al., 2007b; Nowak et al., 2010; Tarnita & Taylor, 2014). The parameter $p$ controls the asymmetry of mutation, and it is also the expected frequency of type-A individuals in the absence of selection (i.e., when $\delta = 0$). Although the use of the word “mutation” hints at a genetic transmission of the trait, this framework can also describe vertical cultural transmission, so $\mu$ does not have to be small. The mutation probability, however, cannot be zero; if it were, the all-A and all-B states would be absorbing: we would end up either with only type-A or only type-B individuals in the population, and we would not be able to define a stationary distribution of population states. For similar reasons, $p$ can be neither 0 nor 1.

We denote by $D_i(f(X, \delta))$ (or $D_i$ for simplicity) the probability that the individual living at site $i$ is dead at the beginning of the next time step, given that the population is currently in state $X$. This probability of death at site $i$ can be expressed as a function of the probabilities of birth and establishment of offspring at site $i$, summing over the locations $j$ of the potential parents:

$$D_i = \sum_{j=1}^{N} B_{ij}. \tag{4}$$

There is exactly one individual per site, so at a given site $i$, there can be at most one successfully established offspring at each time step, and $0 \le D_i \le 1$. On the other hand, the expected number of offspring of the parent currently living at site $i$ is $0 \le \sum_{j=1}^{N} B_{ji} \le N$.

Finally, we are considering dispersal graphs such that in the absence of selection ($\delta = 0$), all individuals have the same probability of reproducing, and all individuals have the same probability of dying, meaning that all sites in the population have the same reproductive value $1/N$ (Taylor, 1990; Caswell, 2001; Lieberman et al., 2005; Maciejewski, 2014; Allen et al., 2015); this implies that for all sites $i$

$$\sum_{j=1}^{N} B_{ji}(f(X, 0)) = B^* = D_i(f(X, 0)). \tag{5}$$

The parameter $B^*$ is the expected number of offspring produced by an individual during a time step in the absence of selection ($\delta = 0$); it is the same for all individuals, but the value taken by $B^*$ depends on the life-cycle that is considered.


2.4 Life-cycles

Most of our results are derived without specifying a life-cycle (also called “updating rule”). In the Illustrations section, we will give specific examples using classical life-cycles: Moran models (Birth-Death and Death-Birth), with exactly one birth and one death during a time step, and the Wright-Fisher model, where all adults die and are replaced by new individuals at the end of a time step.

2.5 Simulations

Stochastic simulations, coded in C, were run to numerically confirm the analytical results. For each combination of parameters, a simulation was run for $4 \times 10^{9}$ generations, the state of the population being sampled every 400 generations, where one generation corresponds to $N$ time steps with the Moran updating, and 1 time step with the Wright-Fisher updating. A set of parameters corresponded to a choice of updating rule, of population structure, of mutation probability ($\mu \in \{0.01, 0.025, 0.05, 0.1, 0.15, 0.2\}$) and of mutation bias ($p \in \{0.3, 0.5\}$).
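To give an idea of what one time step of such a simulation involves, here is a short Python sketch of a Moran Birth-Death step with imperfect transmission. It is not the author's C code: the function name `moran_bd_step`, the 12-node cycle, the parameter values and the very short run length are all assumptions chosen for illustration, and fecundities use the first-order form of eq. (3c).

```python
import numpy as np

def moran_bd_step(X, D, E, b, c, delta, mu, p, rng):
    """One Birth-Death time step; returns the new 0/1 state vector (a copy)."""
    N = X.size
    f = 1.0 + delta * (b * (E.T @ X) - c * X)        # first-order fecundities
    parent = rng.choice(N, p=f / f.sum())            # birth: proportional to fecundity
    site = rng.choice(N, p=D[parent] / D[parent].sum())  # dispersal along the D graph
    if rng.random() < mu:
        child = int(rng.random() < p)                # mutation: type A with probability p
    else:
        child = X[parent]                            # faithful transmission
    new_X = X.copy()
    new_X[site] = child                              # death: the resident is replaced
    return new_X

# Assumed example: 12-node cycle used for both dispersal and interactions.
N = 12
D = np.zeros((N, N))
for i in range(N):
    D[i, (i + 1) % N] = D[i, (i - 1) % N] = 1.0
E = D.copy()

rng = np.random.default_rng(1)
X = rng.integers(0, 2, size=N)
freqs = []
for t in range(2000):  # far shorter than the runs described above
    X = moran_bd_step(X, D, E, b=3.0, c=1.0, delta=0.005, mu=0.1, p=0.5, rng=rng)
    freqs.append(X.mean())
mean_freq = float(np.mean(freqs))
```

Averaging the frequency of type A over time, as done with `freqs`, is how a simulation estimates the long-term frequency; a run this short only illustrates the mechanics and is not long enough for a precise estimate.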

3 Results

Expected frequency of type-A individuals in the population

We describe here the key steps of the computation of the expected frequency of type-A individuals in the population and refer the reader to Appendix A for mathematical details.

We denote by $\Omega$ the set of all possible states of the population ($\Omega = \{0, 1\}^N$). No state is absorbing (thanks to mutation, a lost strategy can always reappear), and all states are accessible. We denote by $\xi(X, \delta, \mu)$ the stationary distribution of population states, i.e., the probability that, after a long enough number of time steps, the population is in state $X$, in a model with strength of selection (phenotype differences) $\delta$ and mutation probability $\mu$. Notation $\mathbb{E}[\,\cdot\,]$ denotes expectation; for instance, the expected state of the population is $\mathbb{E}[X] = \sum_{X \in \Omega} X\, \xi(X, \delta, \mu)$. The expected frequency of type-A individuals in the population, denoted by $\mathbb{E}[\overline{X}]$, can be computed considering what happens during one time step. Given state $X$ of the population, at the end of the time step, the state of the individual living at site $i$ depends on whether it has survived during the time step (first term within the brackets of eq. (6)), and, if it has been replaced, on the type of the newly established offspring (second term within the brackets); we then take the expectation over all population states, and obtain:

$$\mathbb{E}[\overline{X}] = \sum_{X \in \Omega} \frac{1}{N} \sum_{i=1}^{N} \left[ (1 - D_i)\, X_i + \sum_{j=1}^{N} B_{ij} \left( X_j (1 - \mu) + \mu p \right) \right] \xi(X, \delta, \mu). \tag{6}$$

This is the expected frequency of type-A individuals in the population. For instance, if we run a simulation of the model for a very long time, the average over time of the frequency of type-A individuals will provide an estimation of $\mathbb{E}[\overline{X}]$; this quantity does not depend on the initial state of the population.
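As a consistency check, eq. (6) can be evaluated in the absence of selection to recover the statement that $p$ is the neutral expected frequency. The following worked step is a sketch that uses only eq. (5): when $\delta = 0$, $D_i = B^*$ and $\sum_{i=1}^{N} B_{ij} = B^*$ for every state $X$. Writing $\mathbb{E}_0$ for the expectation under $\xi(X, 0, \mu)$,

$$
\begin{aligned}
\mathbb{E}_0[\overline{X}]
 &= (1 - B^*)\, \mathbb{E}_0[\overline{X}] + B^* (1 - \mu)\, \mathbb{E}_0[\overline{X}] + B^* \mu\, p \\
\Longrightarrow \quad \mu B^*\, \mathbb{E}_0[\overline{X}] &= \mu B^*\, p
\quad \Longrightarrow \quad \mathbb{E}_0[\overline{X}] = p ,
\end{aligned}
$$

where the last step requires $\mu > 0$ and $B^* > 0$.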

We then assume that selection is weak, i.e., $\delta$ is small, and write a first-order expansion of eq. (6) that contains derivatives of $\xi$, $D_i$ and $B_{ij}$ with respect to $\delta$. For the last two, we further use the chain rule with the variables $f_k$, which represent the fecundity of the individual living at site $k$. In doing so, we let appear quantities that are the expectations of the state of pairs of sites when no selection is acting (i.e., when $\delta = 0$; we call these “neutral expectations” and $\xi(X, 0, \mu)$ is called the neutral stationary distribution):

$$P_{jk} = \sum_{X \in \Omega} X_j X_k\, \xi(X, 0, \mu) = \mathbb{E}_0\left[ X_j X_k \right]. \tag{7}$$

The fact that these neutral expectations appear in our equations does not mean that selection is initially not acting and then “turned on”: selection is acting all the time, but it is weak because phenotypic differences are small ($\delta \ll 1$). At the first order in $\delta$, we can ignore the effect of selection on the expected state of pairs of sites, and this is why we only need neutral expectations (eq. (7)).

Eventually, we deduce that the expected frequency of individuals of type A in the population can be written as

$$
\begin{aligned}
\mathbb{E}[\overline{X}] \approx p + \frac{\delta}{\mu B^* N} \Bigg[
 & b \Bigg( \sum_{j,k,l=1}^{N} \Bigg( \sum_{i=1}^{N} (1 - \mu)\, \partial_{f_k} B_{ij} - \partial_{f_k} D_j \Bigg) e_{lk} P_{jl} + \mu \sum_{i,j,k,l=1}^{N} \partial_{f_k} B_{ij}\, e_{lk}\, p^2 \Bigg) \\
 - {} & c \Bigg( \sum_{j,k=1}^{N} \Bigg( \sum_{i=1}^{N} (1 - \mu)\, \partial_{f_k} B_{ij} - \partial_{f_k} D_j \Bigg) P_{jk} + \mu \sum_{i,j,k=1}^{N} \partial_{f_k} B_{ij}\, p^2 \Bigg) \Bigg],
\end{aligned}
\tag{8}
$$

with $P$ as defined in eq. (7), $\partial_{f_k}$ being a shorthand notation for $\left. \frac{\partial}{\partial f_k} \right|_{\delta = 0}$, and $\sum_{i,j,k,l=1}^{N}$ being a compact way of writing $\sum_{i=1}^{N} \sum_{j=1}^{N} \sum_{k=1}^{N} \sum_{l=1}^{N}$. Eq. (8) is an approximation at the first order in $\delta$ (we neglect terms in $\delta^2$ and higher). A weak mutation approximation of eq. (8) is presented in Appendix A.4.

Eq. (8) is still implicit, because we need to evaluate the $P_{ij}$ terms, which we now do.


Expected state of pairs of sites at neutrality

We recall that $P_{ij}$, defined in eq. (7), is also the probability that both sites $i$ and $j$ are occupied by individuals of type A, at neutrality (i.e., when $\delta = 0$). Under vanishing mutation ($\mu \to 0$), convenient connections can be made between identity in state and identity-by-descent (Cockerham & Weir, 1993; Rousset & Billiard, 2000), and then with coalescence times (Slatkin, 1991, 1993; Rousset, 2004; Allen et al., 2012). Here as well, we can characterize $P_{ij}$ in terms of probabilities of identity-by-descent, $Q_{ij}$. Two individuals at sites $i$ and $j$ are said to be identical by descent (IBD) if they share a common ancestor and if no mutation occurred in their lineages since this common ancestor (Kimura & Crow, 1964; note though that the original definition is with an infinite allele model, where each mutation creates a new allele). If two individuals are IBD, then they are both of type A with probability $p$, the expected state of a single individual at neutrality. If two individuals are not IBD, then they are both of type A with probability $p^2$. Simplifying, we obtain

$$P_{ij} = p^2 + Q_{ij}\, p (1 - p) \tag{9}$$

(Rousset & Billiard, 2000; Allen & Nowak, 2014) (see Appendix B.1 for more details). Eq. (9) is also valid when $i = j$. So we can work with IBD relationships.

To find the probabilities of identity-by-descent, we first write the probability that two individuals at sites $i$ and $j$ are IBD given the state $X$ of the population at the previous time step, and then take the expectation of this conditional probability. We can still do so without specifying the way the population is updated (using notation as in Allen et al. (2015)), and the resulting equation is presented in Appendix B.1, eq. (B.1). This equation can also be adapted to specific updating rules, as shown in the Illustrations section (details of the calculations are provided in Appendix B).

Keeping in mind that $Q_{ij} = Q_{ji}$ and that $Q_{ii} = 1$, we then have to solve a linear system of $N(N-1)/2$ equations to obtain explicit formulas for all the $Q_{ij}$ terms, for any regular graph. More explicit formulas for $Q_{ij}$ can be found for regular graphs, and in particular for n-dimensional graphs, as we will see in the Illustrations section. Finally, we can gather all probabilities of identity by descent in a matrix $\mathbf{Q} = \{Q_{ij}\}_{1 \le i,j \le N}$.

Back to the expected frequency of type-A individuals

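Using the relationship between the expected state of pairs of sites $P_{ij}$ and probabilities of identity-by-descent $Q_{ij}$ (eq. (9)), we can rewrite eq. (8) as follows (see Appendix A.5 for details):

$$
\begin{aligned}
\mathbb{E}[\overline{X}] \approx p + \frac{\delta\, p (1 - p)}{\mu B^* N} \Bigg[
 & b \Bigg( \sum_{j,k,l=1}^{N} \Bigg( \sum_{i=1}^{N} (1 - \mu)\, \partial_{f_l} B_{ij} - \partial_{f_l} D_j \Bigg) e_{kl}\, Q_{jk} \Bigg) \\
 - {} & c \Bigg( \sum_{j,k=1}^{N} \Bigg( \sum_{i=1}^{N} (1 - \mu)\, \partial_{f_k} B_{ij} - \partial_{f_k} D_j \Bigg) Q_{jk} \Bigg) \Bigg],
\end{aligned}
\tag{10}
$$

where as before $\partial_{f_k}$ is a shorthand notation for $\left. \frac{\partial}{\partial f_k} \right|_{\delta = 0}$, and the sums are written in a compact way.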

Interpretation. For each focal individual at site $k$, we consider the influence that this individual can have on an identical-by-descent individual at site $j$ ($Q_{jk}$), by affecting the production of new identical-by-descent individuals by $j$ ($(1 - \mu) \sum_{i=1}^{N} B_{ij}$), or $j$’s survival ($D_j$). This can occur because of intrinsic changes (the cost of being social, $c$) in the fecundity of individual $k$ ($\partial_{f_k}$), and because the focal $k$ provides a benefit to an individual $l$ ($b\, e_{kl}$), where $l$ is $j$ itself or another individual in the population, changing $l$’s fecundity ($\partial_{f_l}$), with repercussions on $j$. Finally, we note that the factor associated to $(-c)$ is non-negative (see Appendix A.6).

Structure parameter. We say that a strategy is favored if its frequency at the mutation-selection-drift equilibrium is higher than what it would be in the absence of selection. For type A, this translates into $\mathbb{E}[\overline{X}] > p$. With eq. (10), this condition becomes

$$
\underbrace{\frac{\sum_{j,k,l=1}^{N} \left( \sum_{i=1}^{N} (1 - \mu)\, \partial_{f_l} B_{ij} - \partial_{f_l} D_j \right) e_{kl}\, Q_{jk}}{\sum_{j,k=1}^{N} \left( \sum_{i=1}^{N} (1 - \mu)\, \partial_{f_k} B_{ij} - \partial_{f_k} D_j \right) Q_{jk}}}_{\kappa}\; b - c > 0. \tag{11}
$$

Hence, a single parameter, $\kappa$, summarizes, for a given life-cycle, the structure of the population and the effect of mutation (Tarnita et al., 2009; Taylor & Maciejewski, 2012); $\kappa$ is interpreted as a scaled coefficient of relatedness that includes the effect of competition (Lehmann & Rousset, 2010).

Alternative formulation. The presence of $\mu$ in the denominator in eq. (10) might look ominous, given that our equation is meant to be valid for any mutation probability. However, we note that the probabilities of identity by descent can be written

$$Q_{ij} = 1 + \mu\, \tilde{Q}_{ij}, \tag{12}$$


since in the limit $\mu \to 0$, all individuals in the population are identical by descent. If we now replace $Q_{ij}$ using eq. (12) in eq. (10), recalling that the size of the population is fixed (eq. (4)), we obtain

$$
\begin{aligned}
\mathbb{E}[\overline{X}] \approx p + \frac{\delta\, p (1 - p)}{B^* N} \Bigg[
 & b \Bigg( \sum_{j,k,l=1}^{N} \Bigg( \sum_{i=1}^{N} (1 - \mu)\, \partial_{f_l} B_{ij} - \partial_{f_l} D_j \Bigg) e_{kl}\, \tilde{Q}_{jk} - \sum_{i,j,k,l=1}^{N} \partial_{f_l} B_{ij}\, e_{kl} \Bigg) \\
 - {} & c \Bigg( \sum_{j,k=1}^{N} \Bigg( \sum_{i=1}^{N} (1 - \mu)\, \partial_{f_k} B_{ij} - \partial_{f_k} D_j \Bigg) \tilde{Q}_{jk} - \sum_{i,j,k=1}^{N} \partial_{f_k} B_{ij} \Bigg) \Bigg].
\end{aligned}
\tag{13}
$$

This confirms that the dangerous-looking denominator $\mu$ in eq. (10) is not problematic, even for small mutation probabilities. The sums $\sum_{i,j=1}^{N} B_{ij}$ correspond to the total number of births in the population during one time step, which is independent of the composition of the population in the life-cycles that we consider as examples (so the last terms on each line of eq. (13) will disappear).
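The cancellation of $\mu$ can be followed term by term. The sketch below treats the $b$ part of eq. (10); the $c$ part works the same way. Substituting $Q_{jk} = 1 + \mu \tilde{Q}_{jk}$ splits the sum into a part proportional to $\mu \tilde{Q}_{jk}$, from which the factor $\mu$ cancels directly, and a part where $Q_{jk}$ is replaced by 1. For the latter, eq. (4) gives $\sum_{j} D_j = \sum_{i,j} B_{ji} = \sum_{i,j} B_{ij}$, so that

$$
\begin{aligned}
\sum_{j,k,l=1}^{N} \Bigg( \sum_{i=1}^{N} (1 - \mu)\, \partial_{f_l} B_{ij} - \partial_{f_l} D_j \Bigg) e_{kl}
 &= \sum_{k,l=1}^{N} e_{kl} \Bigg( (1 - \mu) \sum_{i,j=1}^{N} \partial_{f_l} B_{ij} - \sum_{i,j=1}^{N} \partial_{f_l} B_{ij} \Bigg) \\
 &= - \mu \sum_{i,j,k,l=1}^{N} \partial_{f_l} B_{ij}\, e_{kl} .
\end{aligned}
$$

Dividing by $\mu$ yields the term $- \sum_{i,j,k,l} \partial_{f_l} B_{ij}\, e_{kl}$ of eq. (13), with no $\mu$ left in the denominator.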

4 Illustrations

4.1 Updating rules

The results presented so far were valid for any updating rule, provided it is such that population size remains equal to $N$. We now express the expected frequency of type-A individuals for specific updating rules, commonly used in studies on the evolution of altruistic behavior in structured populations: the Moran model and the Wright-Fisher model. Under a Moran model (Moran, 1962), exactly one individual dies and one individual reproduces during one time step; hence, at neutrality, $B^* = 1/N$ ($B^*$ was defined in eq. (5)). The order of the two events matters, so two updating rules are distinguished (Ohtsuki & Nowak, 2006; Ohtsuki et al., 2006): Birth-Death and Death-Birth. In both cases, payoffs are computed at the start of each time step, before anything happens.

4.1.1 Moran model, Birth-Death

Any regular graph. Under a Birth-Death (BD) updating, an individual $j$ is chosen to reproduce with a probability equal to its relative fecundity in the population ($f_j / \sum_{l=1}^{N} f_l$); then its offspring disperses at random along the $\mathcal{D}$ graph, and so replaces another individual $i$ with a probability $d_{ji} / \nu$, so that

$$
B_{ij} = \frac{f_j}{\sum_{l=1}^{N} f_l}\, \frac{d_{ji}}{\nu}, \quad \text{and} \quad D_j = \sum_{i=1}^{N} B_{ji} = \frac{\sum_{i=1}^{N} f_i\, d_{ij}}{\nu \sum_{l=1}^{N} f_l}. \tag{14}
$$
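Eq. (14) translates directly into a matrix computation, which also makes it easy to verify the neutral property of eq. (5) with $B^* = 1/N$. The sketch below is an illustration only; the function name `birth_death_B` and the example graph and fecundities are assumptions.

```python
import numpy as np

def birth_death_B(f, D):
    """Matrix B of eq. (14): B[i, j] = (f_j / sum_l f_l) * d_ji / nu.

    B[i, j] is the expected number of offspring of parent j established at site i.
    """
    f = np.asarray(f, dtype=float)
    D = np.asarray(D, dtype=float)
    nu = D.sum(axis=1)[0]  # common row/column sum of a regular graph
    return (f / f.sum())[None, :] * D.T / nu

# Assumed example: undirected cycle on 6 nodes.
N = 6
D = np.zeros((N, N))
for i in range(N):
    D[i, (i + 1) % N] = D[i, (i - 1) % N] = 1.0

B_neutral = birth_death_B(np.ones(N), D)
offspring_per_parent = B_neutral.sum(axis=0)  # sum_i B_ij, one value per parent j
death_prob = B_neutral.sum(axis=1)            # D_i = sum_j B_ij, eq. (4)

# With unequal fecundities, death probabilities depend on the neighbors' fecundities.
B_sel = birth_death_B([1.2, 1.0, 1.0, 1.0, 1.0, 1.0], D)
```

At neutrality every parent has $1/N$ expected offspring and every site has death probability $1/N$; with unequal fecundities the total number of births per time step is still exactly one.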


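Note that with this updating rule, the probability of dying $D_j$ depends on the composition of the population. With these probabilities of reproducing and dying, eq. (10) becomes

$$
\begin{aligned}
\mathbb{E}[\overline{X}] \approx p + \frac{\delta\, p (1 - p)}{\mu} \Bigg[
 & b \Bigg( \sum_{k,l=1}^{N} \frac{1 - \mu}{N}\, e_{kl}\, Q_{lk} - \sum_{j,k,l=1}^{N} \left( \frac{d_{lj}}{N \nu} - \frac{\mu}{N^2} \right) e_{kl}\, Q_{jk} \Bigg) \\
 - {} & c \Bigg( 1 - \mu - \sum_{j,k=1}^{N} \left( \frac{d_{kj}}{N \nu} - \frac{\mu}{N^2} \right) Q_{jk} \Bigg) \Bigg],
\end{aligned}
\tag{15a}
$$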

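or, using matrix notation,

$$
\begin{aligned}
\mathbb{E}[\overline{X}] \approx p + \frac{\delta\, p (1 - p)}{\mu} \Bigg[
 & b \Bigg( \underbrace{\frac{1 - \mu}{N}\, \mathrm{Tr}(\mathbf{E} \cdot \mathbf{Q})}_{\beta^{BD}_{D}} - \underbrace{\left( \frac{\mathrm{Tr}(\mathbf{E} \cdot \mathbf{D} \cdot \mathbf{Q})}{N \nu} - \frac{\mu\, \mathrm{Tr}(\mathbf{E} \cdot \mathbf{1}_{N \times N} \cdot \mathbf{Q})}{N^2} \right)}_{\beta^{BD}_{I}} \Bigg) \\
 - {} & c \Bigg( \underbrace{1 - \mu}_{\gamma^{BD}_{D}} - \underbrace{\left( \frac{\mathrm{Tr}(\mathbf{D} \cdot \mathbf{Q})}{N \nu} - \frac{\mu\, \mathrm{Tr}(\mathbf{1}_{N \times N} \cdot \mathbf{Q})}{N^2} \right)}_{\gamma^{BD}_{I}} \Bigg) \Bigg],
\end{aligned}
\tag{15b}
$$

where $\mathrm{Tr}(\mathbf{M})$ denotes the trace of a matrix $\mathbf{M}$, i.e., the sum of its diagonal elements, and $\mathbf{1}_{N \times N}$ is the $N$-by-$N$ matrix of ones. Each of the factors associated to the $b$ and $(-c)$ terms contains direct (subscript $D$) effects, discounted by indirect effects (subscript $I$).

Recall that since we moved from a description with the expected state of pairs of sites ($P_{ij}$, eq. (8)) to a description with probabilities of identity-by-descent ($Q_{ij}$, eq. (10)), we interpret the different terms in terms of survival and production of identical-by-descent offspring.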
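The four terms of eq. (15b) can be evaluated with a few matrix products once the matrix of IBD probabilities is known. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the matrix `Q_all_ibd` is not a solution of the IBD recurrence but the limiting case in which every pair is IBD, used only because the result can be checked by hand.

```python
import numpy as np

def bd_terms(D, E, Q, mu):
    """Direct and indirect terms of eq. (15b) for Birth-Death updating."""
    D, E, Q = (np.asarray(M, dtype=float) for M in (D, E, Q))
    N = D.shape[0]
    nu = D.sum(axis=1)[0]
    ones = np.ones((N, N))
    beta_D = (1.0 - mu) / N * np.trace(E @ Q)
    beta_I = np.trace(E @ D @ Q) / (N * nu) - mu * np.trace(E @ ones @ Q) / N**2
    gamma_D = 1.0 - mu
    gamma_I = np.trace(D @ Q) / (N * nu) - mu * np.trace(ones @ Q) / N**2
    return beta_D, beta_I, gamma_D, gamma_I

def expected_frequency_bd(D, E, Q, b, c, delta, mu, p):
    """First-order expected frequency of type A, eq. (15b)."""
    beta_D, beta_I, gamma_D, gamma_I = bd_terms(D, E, Q, mu)
    return p + delta * p * (1.0 - p) / mu * (b * (beta_D - beta_I) - c * (gamma_D - gamma_I))

# Assumed example: cycle on 6 nodes, interactions along the dispersal graph.
N = 6
D = np.zeros((N, N))
for i in range(N):
    D[i, (i + 1) % N] = D[i, (i - 1) % N] = 1.0

Q_all_ibd = np.ones((N, N))  # limiting case, NOT a solution of the IBD recurrence
beta_D, beta_I, gamma_D, gamma_I = bd_terms(D, D, Q_all_ibd, mu=0.1)
freq = expected_frequency_bd(D, D, Q_all_ibd, b=3.0, c=1.0, delta=0.01, mu=0.1, p=0.3)
```

In this limiting case the direct and indirect terms cancel exactly (here both β terms equal $(1 - \mu)\nu$ and both γ terms equal $1 - \mu$), so the predicted frequency is $p$: if everyone is equally related to everyone else, the indirect competition effects offset the direct effects entirely.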

Interpretation. The direct effect term $\beta^{BD}_{D}$ corresponds to the additional identical-by-descent (hereafter, IBD) offspring ($1 - \mu$) produced by interacting with IBD individuals ($\mathrm{Tr}(\mathbf{E} \cdot \mathbf{Q}) / N$). When there is only one type of interactant (for instance, neighbors on a lattice, or members of the same group), $\mathrm{Tr}(\mathbf{E} \cdot \mathbf{Q}) / N$ can be described as the relatedness to social interactants times the number of interactants. But when there are different types of interactants (e.g., on a non-symmetric structure like figure 1(a), or when there are weights on the interaction graph $\mathcal{E}$), then we cannot talk of “a” relatedness, and instead consider an averaged relatedness, weighted by the interaction graph $\mathcal{E}$.

Social interactions also have indirect consequences ($\beta^{BD}_{I}$). First, a focal $k$ that helps ($e_{kl}$) an individual $l$ who can send offspring ($d_{lj}$) to a site $j$ occupied by an individual IBD to the focal ($Q_{jk}$) indirectly affects the survival of that individual $j$ ($\mathrm{Tr}(\mathbf{E} \cdot \mathbf{D} \cdot \mathbf{Q}) / (N \nu)$); since this is about survival, there is no $\mu$ involved here. The second term of $\beta^{BD}_{I}$ corresponds to competitors $l$ whose increased fecundity (thanks to interactions with a focal $k$, $e_{kl}$) could indirectly reduce the birth rate of individuals $j$ IBD to the focal ($Q_{jk}$), but whose fecundity increase was “wasted” by the production of non-IBD offspring ($\mu$).

The terms associated with $(-c)$ have a similar interpretation. Here, we consider the consequences of the cost of being social, i.e., of the reduction in a focal individual $k$'s fecundity. The direct effect $\gamma^{BD}_{D}$ corresponds to this reduction of fecundity and its impact on the production of IBD individuals ($1-\mu$). The indirect effects $\gamma^{BD}_{I}$ are due i) to the indirect changes in the survival of an individual $j$ IBD to a focal individual $k$ ($Q_{jk}$), who is less likely to be replaced by the offspring of $k$ ($d_{kj}$) since $k$ is less fecund, and ii) to the increased relative fecundity of individuals $j$ IBD to the focal $k$ ($Q_{jk}$), “wasted” by the production of non-IBD offspring ($\mu$).

When transmission is almost perfect ($\mu \to 0$), we recover Grafen & Archetti (2008)’s result that the competition neighborhood under a Birth-Death updating is one dispersal step away (hence the $\mathbf{D}$ terms in eq. (15b)). Decreasing the fidelity of parent-offspring transmission, by increasing $\mu$, not only changes probabilities of identity by descent ($\mathbf{Q}$), but also the kind of competition to take into account. This is because social interactions affect both the birth and death of individuals, and the issue of transmission fidelity only concerns reproduction, not survival.

Probabilities of identity by descent With the Birth-Death updating rule, the probabilities of identity by descent satisfy, for any $i$ and $j \neq i$,

$$
Q_{ij} = \frac{1-\mu}{2\nu} \sum_{k=1}^{N} \left( d_{kj} Q_{ki} + d_{ki} Q_{kj} \right)
\tag{16}
$$

(see Appendix B.2 for details on the derivation). For generic regular graphs, we have to solve a system of $N(N-1)/2$ equations to find the probabilities of identity by descent.
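As an illustration of how the system in eq. (16) can be solved numerically, here is a minimal Python sketch. It is not the procedure used in the article: it assumes a plain fixed-point iteration (which contracts when $\mu > 0$ on a regular graph), and the function names `cycle_graph` and `moran_ibd`, the example graph (a cycle of 6 nodes) and the parameter values are all hypothetical choices made for the example.

```python
import numpy as np

def cycle_graph(n):
    """Adjacency matrix of an undirected cycle with n >= 3 nodes (degree nu = 2)."""
    d = np.zeros((n, n))
    for i in range(n):
        d[i, (i + 1) % n] = 1.0
        d[i, (i - 1) % n] = 1.0
    return d

def moran_ibd(d, nu, mu, tol=1e-13, max_iter=100_000):
    """Iterate eq. (16) to convergence, holding Q_ii = 1 (assumed scheme)."""
    n = d.shape[0]
    q = np.eye(n)
    for _ in range(max_iter):
        # (Q^T D)_ij = sum_k d_kj Q_ki  and  (D^T Q)_ij = sum_k d_ki Q_kj
        q_new = (1.0 - mu) / (2.0 * nu) * (q.T @ d + d.T @ q)
        np.fill_diagonal(q_new, 1.0)
        if np.max(np.abs(q_new - q)) < tol:
            return q_new
        q = q_new
    raise RuntimeError("fixed-point iteration did not converge")

# Hypothetical example: cycle of N = 6 nodes, mutation probability mu = 0.1
D = cycle_graph(6)
Q = moran_ibd(D, nu=2, mu=0.1)
```

For larger graphs, rewriting eq. (16) as a sparse linear system and solving it directly would probably be preferable to iterating.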

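Transitive undirected graphs When the graph is transitive and undirected, probabilities of identity by descent verify

$$
\mathbf{Q} = \mu \lambda'_{M} \left( \mathbf{I}_N - \frac{1-\mu}{\nu}\, \mathbf{D} \right)^{-1},
\tag{17}
$$

where $\mathbf{I}_N$ is the identity matrix, and $\mu\lambda'_{M}$ is such that $Q_{ii} = 1$ for all $i$ (the $M$ index stands for “Moran”). With eq. (17), eq. (15) simplifies into

$$
\mathbb{E}\big[\overline{X}\big] \approx p + \frac{\delta\, p(1-p)}{N} \Bigg[ b \left( \frac{-2+\mu}{1-\mu}\, \mathrm{Tr}(\mathbf{E}\cdot\mathbf{Q}) + \frac{\lambda'_{M}\, \mathrm{Tr}(\mathbf{E})}{1-\mu} + \frac{\mathrm{Tr}(\mathbf{E}\cdot\mathbf{1}_{N\times N})\, \lambda'_{M}}{N} \right) - c \left( N\, \frac{-2+\mu}{1-\mu} + \frac{\lambda'_{M}\, N}{1-\mu} + \lambda'_{M} \right) \Bigg].
\tag{18}
$$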

The term $\mathrm{Tr}(\mathbf{E})/N$ corresponds to social interactions with oneself; it is usually considered as null in the case of pairwise interactions, but is not for common-good types of interactions (when benefits are pooled and then redistributed).

We show in Appendix C.1.3 that the sum of the other two terms associated with the benefits $b$ is negative or zero. So unless interactions with oneself are strong (large $\mathrm{Tr}(\mathbf{E})/N$), the factor modulating the effect of benefits $b$ is non-positive.

We noticed previously that the factor associated with $(-c)$ is non-negative; consequently, when interactions with oneself are small, the expected frequency of altruists cannot be greater than what it would be in the absence of selection (i.e., $\mathbb{E}[\overline{X}] \leq p$).

Evaluating probabilities of identity by descent in transitive regular graphs still requires the inversion of an $N$-by-$N$ matrix (eq. (17)), which can limit applications. Results are simpler in graphs that match our definition of “n-dimensional graphs”; they depend on the dimensionality $n$ of the graph and are presented in Appendix B.2.
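The matrix inversion in eq. (17) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `moran_ibd_transitive` and the example graph (an undirected cycle of 8 nodes, which is transitive with $\nu = 2$) are hypothetical, and the normalisation simply rescales the inverse so that $Q_{ii} = 1$, as required in the text.

```python
import numpy as np

def moran_ibd_transitive(d, nu, mu):
    """Eq. (17): Q = mu * lambda'_M * (I_N - (1 - mu)/nu * D)^(-1), with Q_ii = 1."""
    n = d.shape[0]
    g = np.linalg.inv(np.eye(n) - (1.0 - mu) / nu * d)
    # On a transitive graph all diagonal entries of g are equal,
    # so 1 / g[0, 0] plays the role of mu * lambda'_M.
    scale = 1.0 / g[0, 0]
    return scale * g, scale / mu   # (Q, lambda'_M)

# Hypothetical example: undirected cycle with N = 8 nodes (nu = 2), mu = 0.05
N, nu, mu = 8, 2, 0.05
D = np.zeros((N, N))
for i in range(N):
    D[i, (i + 1) % N] = D[i, (i - 1) % N] = 1.0

Q, lam_M = moran_ibd_transitive(D, nu, mu)
```

The cost of the dense inverse grows roughly as $N^3$, which is one way to read the remark that the inversion can limit applications.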

4.1.2 Moran model, Death-Birth

Any regular graph Under a Death-Birth (DB) updating, the individual who is going to die is chosen first, uniformly at random ($i$ is chosen with probability $1/N$). Then, all individuals produce offspring, and one of them (one offspring of parent $j$ wins with probability $f_j d_{ji} / \sum_{l=1}^{N} f_l d_{li}$) replaces the individual chosen to die. When $d_{ii} \neq 0$, one needs to clarify whether the individual chosen to die reproduces before dying or not; here we assume that this is the case, but some alternative formulations do not. Under this updating rule, we have

$$
D_j = \frac{1}{N}, \quad \text{and} \quad B_{ij} = \frac{1}{N}\, \frac{f_j d_{ji}}{\sum_{l=1}^{N} f_l d_{li}}.
\tag{19}
$$
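The probabilities in eq. (19) are easy to tabulate for a given set of fecundities. The Python sketch below is an illustration only: the function name `death_birth_probabilities`, the 5-node cycle and the fecundity values are hypothetical, and the array `d` is assumed to store $d_{ji}$ at position `d[j, i]`.

```python
import numpy as np

def death_birth_probabilities(f, d):
    """Eq. (19): B[i, j] = (1/N) * f_j d_ji / sum_l f_l d_li, and D_j = 1/N."""
    n = len(f)
    weights = (d * f[:, None]).T               # weights[i, j] = f_j * d_ji
    b = weights / weights.sum(axis=1, keepdims=True) / n
    death = np.full(n, 1.0 / n)
    return b, death

# Hypothetical example: cycle of N = 5 nodes with unequal fecundities
N = 5
D = np.zeros((N, N))
for i in range(N):
    D[i, (i + 1) % N] = D[i, (i - 1) % N] = 1.0
f = np.array([1.0, 1.1, 0.9, 1.0, 1.2])

B, death = death_birth_probabilities(f, D)
```

Each row of `B` sums to $1/N$, the probability that site $i$ is the one being replaced; dropping the $1/N$ factor gives the Wright-Fisher form of eq. (22).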

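With these probabilities, eq. (10) becomes

$$
\mathbb{E}\big[\overline{X}\big] \approx p + \frac{\delta\, p(1-p)(1-\mu)}{\mu} \Bigg[ b \Bigg( \sum_{k,l=1}^{N} \frac{e_{kl} Q_{lk}}{N} - \sum_{i,j,k,l=1}^{N} \frac{d_{ji} d_{li}}{N\nu^2}\, e_{kl} Q_{jk} \Bigg) - c \Bigg( 1 - \sum_{i,j,k=1}^{N} \frac{d_{ji} d_{ki}}{N\nu^2}\, Q_{jk} \Bigg) \Bigg],
\tag{20a}
$$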

or, using matrix form,

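$$
\mathbb{E}\big[\overline{X}\big] \approx p + \frac{\delta\, p(1-p)}{\mu} \Bigg[ b \Bigg( \overbrace{\frac{1-\mu}{N}\, \mathrm{Tr}(\mathbf{E}\cdot\mathbf{Q})}^{\beta^{DB}_{D}} - \overbrace{\frac{1-\mu}{N\nu^2}\, \mathrm{Tr}\big(\mathbf{E}\cdot\mathbf{D}\cdot\mathbf{D}^{T}\cdot\mathbf{Q}\big)}^{\beta^{DB}_{I}} \Bigg) - c \Bigg( \underbrace{1-\mu}_{\gamma^{DB}_{D}} - \underbrace{\frac{1-\mu}{N\nu^2}\, \mathrm{Tr}\big(\mathbf{D}\cdot\mathbf{D}^{T}\cdot\mathbf{Q}\big)}_{\gamma^{DB}_{I}} \Bigg) \Bigg],
\tag{20b}
$$

where $^{T}$ denotes transposition.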

Interpretation We can again identify direct and indirect effects of benefits and costs; the direct effects (subscript $D$) are the same as for the Birth-Death updating rule, but the indirect effects (subscript $I$) differ. First, the indirect effects reflect the fact that competitors are now two dispersal steps away (Grafen & Archetti, 2008; Débarre et al., 2014). Indeed, under a Death-Birth updating rule, individuals $j$ and $k$ are competing for a site $i$ whose occupant has just been chosen to die if both $j$ and $k$ can send their offspring to $i$; this depends on $d_{ji} d_{ki}$, leading to the $\mathbf{D}\cdot\mathbf{D}^{T}$ products in eq. (20). Second, with a Death-Birth updating, social interactions do not affect the probability of dying, so we only take into account effects on reproduction, and we can factor in the $(1-\mu)$ terms.

Probabilities of identity by descent With the Death-Birth model as defined above, the system of equations for the probabilities of identity by descent at neutrality is the same as in eq. (16).

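Transitive undirected graphs When the graph is transitive and undirected, eq. (17) still holds and eq. (20) simplifies into

$$
\mathbb{E}\big[\overline{X}\big] \approx p + \frac{\delta\, p(1-p)}{N} \Bigg[ b \left( \frac{-2+\mu}{1-\mu}\, \mathrm{Tr}(\mathbf{E}\cdot\mathbf{Q}) + \lambda'_{M}\, \frac{\mathrm{Tr}(\mathbf{E}\cdot\mathbf{D})}{\nu} + \frac{\lambda'_{M}}{1-\mu}\, \mathrm{Tr}(\mathbf{E}) \right) - c \left( N\, \frac{-2+\mu}{1-\mu} + \lambda'_{M}\, \frac{\mathrm{Tr}(\mathbf{D})}{\nu} + \frac{\lambda'_{M}\, N}{1-\mu} \right) \Bigg].
\tag{21}
$$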

Now, even in the absence of self-interactions (i.e., even when $\mathrm{Tr}(\mathbf{E}) = 0$), the term associated with $b$ can be positive. As is the case in the absence of mutation (Ohtsuki et al., 2007; Taylor et al., 2007a; Débarre et al., 2014), a key role is played by $\mathrm{Tr}(\mathbf{E}\cdot\mathbf{D})/(\nu N)$, which is higher the more the $\mathbf{D}$ and $\mathbf{E}$ graphs overlap; if we also scale the interaction graph $\mathbf{E}$ to control for the number of interactants, then a lower degree makes $\mathrm{Tr}(\mathbf{E}\cdot\mathbf{D})/(\nu N)$ higher, and thereby increases the expected frequency of altruists in the population.

We also note that eq. (21) (Death-Birth) and eq. (18) (Birth-Death) become the same when $\mathbf{D}/\nu = \mathbf{1}_{N\times N}/N$, i.e., when the dispersal graph is the complete graph (with self-loops), in other words when the population is unstructured.

This reflects the fact that in the Birth-Death updating, the individual who reproduces is chosen among all individuals of the population, while in the Death-Birth updating, the individual who reproduces is chosen locally, among the neighbors of the individual who just died. This scaling persists with arbitrary fidelity of parent-offspring transmission $\mu$.
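The equivalence of the two Moran updating rules in an unstructured population can be checked numerically on the general trace expressions of eqs. (15b) and (20b). The Python sketch below is a sanity check under stated assumptions, not part of the article: the function names `bd_terms` and `db_terms`, the population size, the random interaction matrix and the value of $\mu$ are hypothetical, and the off-diagonal value `q` is the one obtained by solving eq. (16) by hand for the complete graph with self-loops ($\nu = N$).

```python
import numpy as np

def bd_terms(E, D, Q, nu, mu):
    """Factors multiplying b and (-c) inside the brackets of eq. (15b) (Birth-Death)."""
    n = Q.shape[0]
    ones = np.ones((n, n))
    beta = ((1 - mu) / n * np.trace(E @ Q)
            - (np.trace(E @ D @ Q) / (n * nu)
               - mu * np.trace(E @ ones @ Q) / n**2))
    gamma = ((1 - mu)
             - (np.trace(D @ Q) / (n * nu)
                - mu * np.trace(ones @ Q) / n**2))
    return beta, gamma

def db_terms(E, D, Q, nu, mu):
    """Factors multiplying b and (-c) inside the brackets of eq. (20b) (Death-Birth)."""
    n = Q.shape[0]
    beta = ((1 - mu) / n * np.trace(E @ Q)
            - (1 - mu) / (n * nu**2) * np.trace(E @ D @ D.T @ Q))
    gamma = (1 - mu) - (1 - mu) / (n * nu**2) * np.trace(D @ D.T @ Q)
    return beta, gamma

# Hypothetical unstructured population: D is the all-ones matrix, so nu = N
N, mu = 5, 0.2
D = np.ones((N, N))
E = np.random.default_rng(0).random((N, N))   # arbitrary interaction weights
q = (1 - mu) / (1 + (N - 1) * mu)             # off-diagonal solution of eq. (16) here
Q = np.full((N, N), q)
np.fill_diagonal(Q, 1.0)

beta_bd, gamma_bd = bd_terms(E, D, Q, N, mu)
beta_db, gamma_db = db_terms(E, D, Q, N, mu)
```

With these inputs the Birth-Death and Death-Birth factors coincide, in line with the statement above; on a structured graph they generally would not.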

4.1.3 Wright-Fisher

Under a Wright-Fisher model, generations are non-overlapping: all adults produce offspring, then all adults die and the offspring disperse and compete for establishment, so that

$$
D_j = 1, \quad \text{and} \quad B_{ij} = \frac{f_j d_{ji}}{\sum_{l=1}^{N} f_l d_{li}}.
\tag{22}
$$

In a Wright-Fisher model, at neutrality, $B^{*} = 1$ (the entire population is renewed at each generation; in a Moran model we had $B^{*} = 1/N$). Since eq. (22) differs from its Moran Death-Birth equivalent (eq. (19)) only by a factor $1/N$, we end up with the same equation as eq. (20) for the expected frequency of type-A individuals in the population. The difference between the Moran Death-Birth and Wright-Fisher life-cycles however lies in the evaluation of probabilities of identity by descent.

Probabilities of identity by descent Under a Wright-Fisher model, the entire population is replaced, so the equation is different from the one obtained under a Moran model; probabilities of identity by descent of two different individuals satisfy ($i \neq j$)

$$
Q_{ij} = (1-\mu)^2 \sum_{k,l=1}^{N} \frac{d_{ki}}{\nu}\, \frac{d_{lj}}{\nu}\, Q_{kl}
\tag{23}
$$

(see Appendix B.3 for details of the derivation). In short, two individuals are identical by descent if their parents were, and if neither offspring is a mutant ($(1-\mu)^2$).
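The recursion in eq. (23) can be iterated numerically in the same way as its Moran counterpart. The Python sketch below is an illustration only: the fixed-point scheme, the function name `wright_fisher_ibd` and the example (a cycle of 6 nodes without self-loops, $\mu = 0.1$) are assumptions made for the example.

```python
import numpy as np

def wright_fisher_ibd(d, nu, mu, tol=1e-13, max_iter=100_000):
    """Iterate eq. (23) to convergence, holding Q_ii = 1 (assumed scheme)."""
    n = d.shape[0]
    disp = d / nu                       # disp[k, i] = d_ki / nu
    q = np.eye(n)
    for _ in range(max_iter):
        # (disp^T Q disp)_ij = sum_{k,l} (d_ki/nu) Q_kl (d_lj/nu)
        q_new = (1.0 - mu) ** 2 * (disp.T @ q @ disp)
        np.fill_diagonal(q_new, 1.0)
        if np.max(np.abs(q_new - q)) < tol:
            return q_new
        q = q_new
    raise RuntimeError("fixed-point iteration did not converge")

# Hypothetical example: cycle of N = 6 nodes (nu = 2), no self-loops
N = 6
D = np.zeros((N, N))
for i in range(N):
    D[i, (i + 1) % N] = D[i, (i - 1) % N] = 1.0

Q = wright_fisher_ibd(D, nu=2, mu=0.1)
```

On this even cycle, the entries of `Q` between nodes at odd distance (neighbors in particular) stay at zero, while nodes two steps apart have positive values; this is the checkerboard effect discussed in section 4.2.2.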

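Undirected transitive graphs When the dispersal graph is undirected ($\mathbf{D} = \mathbf{D}^{T}$) and transitive, the probabilities of identity by descent verify

$$
\mathbf{Q} = \mu \lambda'_{WF} \left( \mathbf{I}_N - \frac{(1-\mu)^2}{\nu^2}\, \mathbf{D}\mathbf{D} \right)^{-1},
\tag{24}
$$

with $\mu\lambda'_{WF}$ such that for all $i$, $Q_{ii} = 1$, and the $WF$ index stands for “Wright-Fisher”. With this, the expected frequency of type-A individuals becomes

$$
\mathbb{E}\big[\overline{X}\big] \approx p + \frac{\delta\, p(1-p)}{N(1-\mu)} \Big[ b \big( (-2+\mu)\, \mathrm{Tr}(\mathbf{E}\cdot\mathbf{Q}) + \lambda'_{WF}\, \mathrm{Tr}(\mathbf{E}) \big) - c \big( N(-2+\mu) + N \lambda'_{WF} \big) \Big].
\tag{25}
$$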

We can immediately see the difference with the Moran Death-Birth case (eq. (21)), caused by a different equation for the probabilities of identity by descent $\mathbf{Q}$. Crucially missing in eq. (25) is the positive term $\lambda'_{M}\, \mathrm{Tr}(\mathbf{E}\cdot\mathbf{D})/(N\nu)$ of eq. (21): without it, the factor associated with the benefits $b$ is negative unless interactions with oneself ($\mathrm{Tr}(\mathbf{E})$) are strong enough, as was the case with the Moran Birth-Death updating.

As for the Moran model, evaluating probabilities of identity by descent in undirected transitive graphs (eq. (24)) involves the computation of the inverse of an $N$-by-$N$ matrix. More explicit results can be obtained for “n-dimensional graphs”; they are presented in Appendix B.3.

4.2 Specific population structures

All numerical examples given in this section are derived with b > 0 and c > 0, so type-A individuals can be called altruists.

As an illustration, we explore the impact of mutation on the expected proportion of type-A individuals in graph-structured populations, in which the same graph defines dispersal and interactions among individuals (Lieberman et al., 2005; Hindersin & Traulsen, 2015; McAvoy & Hauert, 2015), so that $\mathbf{E} = \mathbf{D}$.

When the graph is undirected and transitive, the equations for the expected frequency of altruists (type-A individuals) can be further simplified; the formulas are given in the Appendix (eq. (C.8) and eq. (C.12)). Under a Wright-Fisher updating, eq. (25) cannot be much further simplified.

4.2.1 Small graphs

For regular graphs of small size, the probabilities of identity by descent can be calculated directly using eq. (16) (Moran model) or eq. (23) (Wright-Fisher). In figure 2, we show the value of $\mathbb{E}[\overline{X}]$ on three regular graphs that have the same size ($N = 12$) and the same degree ($\nu = 3$), and we consider three common life-cycles in populations of fixed size (Moran Death-Birth, Moran Birth-Death, Wright-Fisher). We compare the prediction based on eq. (8) (curves) to the outputs of stochastic simulations (points); comparable results are obtained with other values of the mutation bias $p$ (see figure S1). For all life-cycles, increasing the mutation probability $\mu$ makes $\mathbb{E}[\overline{X}]$ closer to its value at the mutation-drift equilibrium ($p$). The curves corresponding to different structures are almost indistinguishable under a Moran model (figures 2(a) and (b)), although the curve corresponding to the graph with no symmetry (red, squares) is a bit less similar. In the Wright-Fisher model (figure 2(c)) however, the effects of the three structures are clearly different, even when $\mu$ becomes very small: knowing only the size ($N$) and degree ($\nu$) of a regular graph is not enough in this case to precisely predict the expected frequency of altruists in the population. This is because the $\lambda'_{WF}$ terms greatly differ between the three graphs that we tested, all the more when $\mu \to 0$, while the values of $\lambda'_{M}$ for the three structures remained close to each other.

4.2.2 Large graphs: variations on a circle

When the number of nodes gets larger, we have to concentrate on graphs with a high level of symmetry. Here we will consider 1-dimensional graphs (graphs whose nodes can be relabelled to satisfy eq. (2)) that are undirected, and hence that can be categorised as undirected transitive graphs. For simplicity, we can consider a circle graph, such that the nodes are arranged on a circle, and each node is connected to its two neighbors only. Here, we assume that the number of nodes is infinite: $N \to \infty$. As previously, a given node hosts exactly one individual (see figure 3(a)).

[Figure 2. Top row: population structures (legend). Panels: (a) Death-Birth, (b) Birth-Death, (c) Wright-Fisher; horizontal axis: Mutation (µ); vertical axis: E[X].]

Figure 2: Expected frequency of type-A individuals $\mathbb{E}[\overline{X}]$, depending on population structure (legend on the first line), updating rule ((a): Moran Death-Birth, (b): Moran Birth-Death, (c): Wright-Fisher), and mutation probability $\mu$ (horizontal axis): comparison between the theoretical prediction (curves) and the outcomes of numerical simulations (points). The horizontal dotted gray line corresponds to $p$, the expected frequency of type-A individuals when there is no selection (i.e., when $\delta = 0$). Other parameters: $\delta = 0.005$, $p = 1/2$, $b = 8$, $c = 1$.

Under a Moran model, using eq. (B.12b), we find for $\mu > 0$

$$
\lambda'_{M} = \frac{\sqrt{2-\mu}}{\sqrt{\mu}},
\tag{26a}
$$

and, although the quantity is not needed to compute $\mathbb{E}[\overline{X}]$ under a Moran model (see eq. (C.8) and eq. (C.12)), the probability of identity by descent between two neighbors on the circle is given by

$$
Q_{M} = \frac{1 - \sqrt{(2-\mu)\,\mu}}{1-\mu},
\tag{26b}
$$

and we recover the formula presented in, e.g., Allen et al. (2012) (see Appendix B.2.4 for details). This result is plotted in figure 3(c). We however need to note that the first-order approximation for $\mathbb{E}[\overline{X}]$ fails when both $\mu \to 0$ and $N \to \infty$: this is because the integral behind eq. (26a) does not converge when $\mu \to 0$. Similarly, for instance, the first-order approximation for the probability that two neighbors are identical by descent, $1 - \mu(N-1)$, which was obtained by Taylor et al. (2007a), fails when $N$ is too large compared to $\mu$.
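The closed forms in eqs. (26a) and (26b) can be compared with eq. (17) evaluated on a large but finite cycle. The Python sketch below is an illustration under stated assumptions: the function names, the cycle size (201 nodes) and the value $\mu = 0.1$ are hypothetical, and the finite cycle is only a stand-in for the infinite circle (the agreement is expected to degrade as $\mu \to 0$, in line with the caveat above).

```python
import math
import numpy as np

def q_neighbors_moran_circle(mu):
    """Eq. (26b): IBD probability between neighbors on the infinite circle."""
    return (1.0 - math.sqrt((2.0 - mu) * mu)) / (1.0 - mu)

def lambda_moran_circle(mu):
    """Eq. (26a)."""
    return math.sqrt(2.0 - mu) / math.sqrt(mu)

def moran_finite_cycle(n, mu):
    """Eq. (17) on a finite cycle of n nodes (nu = 2), normalised so Q_ii = 1."""
    d = np.zeros((n, n))
    for i in range(n):
        d[i, (i + 1) % n] = d[i, (i - 1) % n] = 1.0
    g = np.linalg.inv(np.eye(n) - (1.0 - mu) / 2.0 * d)
    q_neighbors = g[0, 1] / g[0, 0]
    lam = 1.0 / (mu * g[0, 0])          # from mu * lambda'_M * g[0, 0] = 1
    return q_neighbors, lam

mu = 0.1
q_inf, lam_inf = q_neighbors_moran_circle(mu), lambda_moran_circle(mu)
q_fin, lam_fin = moran_finite_cycle(201, mu)
```

For this value of $\mu$ the finite-cycle and infinite-circle values agree closely; shrinking $\mu$ at fixed size is a simple way to see where the large-$N$ formula stops being a good description of a finite graph.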

Under a Wright-Fisher updating, the probability of identity by descent between neighbors is equal to 0. This is because all individuals reproduce at each time step, and their offspring can only establish on the node on the left or on the right of their parent, so that relatedness cannot build up (a feature called checkerboard effect by Grafen & Archetti, 2008). This checkerboard effect is also the reason why $\lambda'_{WF}$ differed among the small graphs that we tested; for instance, under a Wright-Fisher updating $\mathbf{Q}$ does not converge to $\mathbf{1}_{N\times N}$ when $\mu \to 0$ with the graph depicted in figure 1(c) while it does for graph 1(b).

We can however modify the graph to allow for establishment in the parent’s node: with probability $(1-m)$ the offspring remain where the parent was, otherwise they move to the right or the left-hand side node (with probability $m/2$ for each; see figure 3(b)). In this case, we find the following probability of identity by descent between neighbors:

$$
Q_{WF} = \frac{ \mu(2-\mu) + 2(1-\mu)^2 m(1-m) - \sqrt{ \mu(2-\mu)\, \big( \mu + 2m(1-\mu) \big) \big( 2 - \mu - 2m(1-\mu) \big) } }{ 2(1-\mu)^2 m(1-m) }.
\tag{27}
$$

(See Appendix B.3.4 for details; the corresponding value of $\lambda'_{WF}$ is given in eq. (B.48b).) $Q_{WF}$ is undefined for $\mu = 0$ or $m = 1$, and $\lim_{\mu \to 0} Q_{WF} = 1$, but $\lim_{m \to 1} Q_{WF} = 0$. The result is plotted in figure 3(d) for different values of the emigration probability $m$.
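Eq. (27) can be evaluated directly and compared with eq. (24) on a large finite cycle with self-loops. The Python sketch below is an illustration only: the function names, the cycle size (201 nodes) and the parameter values are hypothetical, and the matrix $\mathbf{D}/\nu$ of eq. (24) is assumed to be replaced by the dispersal probabilities described above ($1-m$ on the self-loop, $m/2$ towards each neighbor).

```python
import math
import numpy as np

def q_neighbors_wf_circle(mu, m):
    """Eq. (27): neighbors' IBD probability, infinite circle with self-loops."""
    a = mu * (2.0 - mu)
    k = 2.0 * (1.0 - mu) ** 2 * m * (1.0 - m)
    root = math.sqrt(a * (mu + 2.0 * m * (1.0 - mu))
                     * (2.0 - mu - 2.0 * m * (1.0 - mu)))
    return (a + k - root) / k

def q_neighbors_wf_finite(n, mu, m):
    """Eq. (24) on a finite cycle of n nodes, using dispersal probabilities
    (1 - m on the self-loop, m/2 to each neighbor) in place of D / nu."""
    disp = (1.0 - m) * np.eye(n)
    for i in range(n):
        disp[i, (i + 1) % n] += m / 2.0
        disp[i, (i - 1) % n] += m / 2.0
    g = np.linalg.inv(np.eye(n) - (1.0 - mu) ** 2 * (disp @ disp))
    return g[0, 1] / g[0, 0]            # normalisation so that Q_ii = 1

mu, m = 0.1, 0.5
q_closed = q_neighbors_wf_circle(mu, m)
q_numeric = q_neighbors_wf_finite(201, mu, m)
```

Evaluating `q_neighbors_wf_circle` for small $\mu$, or for $m$ close to 1, reproduces the two limits stated above (values close to 1 and close to 0, respectively).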

[Figure 3. Panels: (a) Circle graph; (b) circle graph with self-loops (weight $1-m$ on the self-loop, $m/2$ towards each neighbor); (c) Moran updating; (d) Wright-Fisher updating, for $m = 0.5, 0.75, 0.9, 0.999$. In (c) and (d), horizontal axis: Mutation (µ); vertical axis: probability that neighbors are IBD.]

Figure 3: Circle graphs, without (a) or with self-loops ((b); the weight of the self-loop is $1-m$), and probability that two neighbors on the graph are identical by descent, as a function of the mutation probability $\mu$, for the Moran updating on an infinite circle graph (c), and for the Wright-Fisher updating on an infinite circle graph with self-loops (d). In (d), emigration probabilities $m$ take values 0.5, 0.75, 0.9, 0.999 (increasingly lighter curves).

5 Discussion

While most studies on the evolution of cooperation assume an almost perfect fidelity of strategy transmission from parent to offspring, here, we explored the effect of arbitrary mutation on the evolution of social behavior in structured populations. We provide a formula (eq. (10)) that gives the expected frequency of a given strategy, for any life-cycle, any fidelity of parent-offspring strategy transmission, and that is valid in populations of fixed size that are such that the reproductive values of all sites are equal (i.e., when all individuals have the same fecundity, they all have the same chance of actually reproducing). The formula depends on the probability of identity by descent of pairs of individuals, and we show how to compute those in general.

Identity by descent and expected state of pairs of sites

The effects of social interactions depend on the actual types of the individuals who interact. With imperfect strategy transmission from parents to their offspring ($\mu > 0$), common ancestry does not guarantee that two individuals are of the same type. The concept of identity by descent, as we use it in this article, adds to common ancestry the condition that no mutation has occurred in the two individuals’ lineages since the common ancestor (Kimura & Crow, 1964; Taylor et al., 2007b), and hence guarantees that the two individuals are of the same type. Two individuals that are not IBD can be treated independently, and we can hence relate the probability that the individuals at two sites $i$ and $j$ are IBD to their expected state (see our eq. (9), Allen & Nowak (2014), or also Rousset &