• Aucun résultat trouvé

Integrating single marker tests in genome scans for selection : the local score approach

N/A
N/A
Protected

Academic year: 2021

Partager "Integrating single marker tests in genome scans for selection : the local score approach"

Copied!
3
0
0

Texte intégral

(1)

HAL Id: hal-01798027

https://hal.archives-ouvertes.fr/hal-01798027

Submitted on 23 May 2018

HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

Integrating single marker tests in genome scans for selection : the local score approach

María Inés Fariello Rico, Simon Boitard, Sabine Mercier, Magali San Cristobal

To cite this version:

María Inés Fariello Rico, Simon Boitard, Sabine Mercier, Magali San Cristobal. Integrating single

marker tests in genome scans for selection : the local score approach. Mathematical and Computational

Evolutionary Biology, Jun 2017, Porquerolles, France. pp. 1-1, 2017. �hal-01798027�

(2)

Open Archive TOULOUSE Archive Ouverte (OATAO)

OATAO is an open access repository that collects the work of Toulouse researchers and makes it freely available over the web where possible.

This is an author-deposited version published in : http://oatao.univ-toulouse.fr/

Eprints ID : 19674

To cite this version : Fariello Rico, María Inés and Boitard, Simon and Mercier, Sabine and San Cristobal, Magali Integrating single marker tests in genome scans for selection : the local score

approach. (2017) In: Mathematical and Computational Evolutionary Biology, 12 June 2017 - 16 June 2017 (Porquerolles, France).

(Unpublished)

Any correspondence concerning this service should be sent to the repository

administrator: [email protected]

(3)

Integrating single marker tests in genome scans for selection : the local score approach

María Inés Fariello 1 , Simon Boitard 2 , Sabine Mercier 3 , Magali San Cristobal 4

(1) Univ. de la República, Montevideo, Uruguay. (2) GenPhySE, INRA Toulouse, France. (3) IMT, Univ. Toulouse, France. (4) Dynafor, INRA Toulouse, France.

M OTIVATION

• Detect genomic regions with high genetic differentiation between populations, signatures of adaptive selection.

Single-marker statistics have a large variance and ignore LD (Linkage Disequilinrium).

Haplotype-based tests require individual data, not availale when sequencing DNA pools (Pool-seq).

Window-based tests: how to choose window size? Statistical sig- nificance of a window?

Local score: detects regions statistically enriched in markers with high genetic differentiation, without defining fixed windows.

D EFINITION

• For each marker m, score X m = − log10(p m ) − ξ (black points), p m p-value of a test for selection (i.e. rejecting neutrality).

Cumulate scores using the Lindley process (solid line) h 0 = 0, h m = max(0, h m 1 + X m )

Excursions above 0 of the Lindley process, indicate genomic re- gions enriched in high scores / low p-values (green interval).

• Here p m is the p-value of the FLK test (Bonhomme et al , 2010).

I MPORTANT FEATURES

• One single tunning parameter: ξ, p-value threshold in log10 scale.

High values put emphasis on high scores: strong selection.

Low values put emphasis on extended regions: recent selection.

→ ξ = 1 recommended to optimize detection power.

Statistical significance of an excursion depends on:

→ the number of markers in the sequence (M )

→ the auto-correlation of scores (ρ).

Two new approaches to compute it, as a function of M and ρ:

analytical formula: valid if single-marker p-values are uniform under neutrality.

re-sampling approach: valid for all datasets.

S IMULATION RESULTS

Two divergent populations, one neutral and one under selection.

• Methods: single-marker test (FLK), haplotype-based test (hapFLK, Fariello et al , 2013), window-based FLK tests (mean, diffx, prob >2), local score, detection threshold computed from our re-sampling ap- proach (SLgT) or neutral simulations (SLgE).

• Observed type I error 6% for SLgT, 5% for other tests.

F=0.2 sequence

Detection P ow er SLgT _1 SLgE _1 mean difFix prop >2 hapFLK FLK

0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0

F=0.4 sequence

SLgT _1 SLgE _1 mean difFix prop >1 hapFLK FLK

0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0

● Sel. coeff

0.10 0.50 1.0

F=0.2 chip

Detection P ow er SLgT _1 SLgE _1 mean difFix prop >2 hapFLK FLK

0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0

F=0.4 chip

SLgT _1 SLgE _1 mean difFix prop >1 hapFLK FLK

0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0

A PPLICATION TO DIVERGENT QUAIL LINES

• Divergent selection on behaviour, pool-seq from each line (G50).

• Chromosome 1: much clearer signal with the local score.

• Ten significant regions genome-wide, with relevant candidate genes related to autistic disorders or behavorial traits in Humans.

C ONCLUSIONS

• The local score accounts for LD whithout individual genotypes.

Statistical significance of candidate regions easy to compute.

Increased detection power compared to single-marker, window- based or haplotype-based tests.

• Can be applied to any single-marker test providing p-values.

Références

Documents relatifs

We hypothesised that (i) the pristine forest will be characterised by greater species richness of testate amoe- bae than similar but less pristine taiga regions due to higher

Cette base recense non seulement les publications académiques archivées dans RERO DOC, mais également les références des publications dites «profes-sionnelles» et «médias»:

The paper [55] establishes limit theorems for a class of stochastic hybrid systems (continuous deterministic dynamic coupled with jump Markov processes) in the fluid limit (small

Her research interests include sensor networks, component model and software architecture for distributed multimedia applications, quality of service. Philippe Roose is

Since 1992, we have been carrying out an upland rice breeding program for Latin American and the Caribbean based on the improvement of synthetic populations of broad genetic

The current process of generating the implementation view in IASA depends on the specified topology, the communication mode of the access points, the chosen implementation

We find that these thresh- olds mainly depend on bacteria quantitative traits such as nutrient con- sumption ability, growth conversion factor, death rate, mutation (forward or