Patrocles: a database of polymorphic miR-mediated gene regulation in vertebrates

Academic year: 2021

(1)

Patrocles: a database of polymorphic miR-mediated gene regulation in vertebrates

Denis Baurain

Samuel Hiard

Wouter Coppieters

Carole Charlier

Michel Georges

(2)

Polymorphic miR-mediated gene regulation

[Figure: miR biogenesis and silencing pathway. Labels: host gene (?), pri-miR, Drosha complex, pre-miR (nucleus); Exportin5; Dicer, helicase, miR/miR*, miRNP, mature miR, mRNA 3’-UTR with poly(A) tail (cytoplasm). Three compartments are marked: Targets (1), miRs (100s), Silencing machinery (overall effect).]

DNA Sequence Polymorphisms: DSPs

Considerable sequence space is devoted to miR-mediated gene regulation (targets, miRs, silencing machinery)

DSPs in silencing components are likely to contribute to (complex) phenotypic variation including disease

Proof in animals: Texel sheep, Clop et al. (2006) Nat. Genet. 38:813-818

Suggestions in humans: Sethupathy & Collins (2008) TIG 24:489-497

http://www.patrocles.org/

Mining public databases for SNPs and other DSPs in the 3 sequence compartments

(4)

Patrocles - Overview

[Diagram: data sources feeding Patrocles]

miRBase: miRs, 8nt motifs
UCSC: alignments
Ensembl: 3’-UTRs, SNPs
SymAtlas: gene expr.
GEO, DGV, HapMap, 1000 genomes: CNVs, gene expr., genotypes, allele freqs
Literature: 8nt motifs, miR expr., CNVs, eQTL, machinery

Updates and Synchronization...

Currently, 7 species: human, chimp, mouse, rat, dog, cow, chicken

(6)

                                 human                mouse
3’-UTRs
  genes                          24,319               21,911
  sequence space                 26,261,732           21,634,548
SNPs in 3’-UTRs
  total                          136,159              126,589
  known ancestral allele         114,305 (83.9%)      111,178 (87.8%)
target site motifs
  X-octamers                     540                  540
  miR                            676                  484
  miR*                           170                  117
  L-octamers                     683                  466
  X OR L-octamers                1,164                948
  X AND L-octamers               59                   58
target sites in 3’-UTRs
  X-targets                      323,833              267,644
  L-targets                      375,054              219,392
  X OR L-targets                 661,187              455,620
  X AND L-targets                37,700               31,416
  conserved X AND L-targets      10,425 (27.7%)       9,436
  conserved X NOT L-targets      64,010 (22.4%)       57,154
  conserved L NOT X-targets      30,290 (9.0%)        19,595
  sequence space                 4,072,176 (15.5%)    2,674,395 (12.4%)

(8)

Targets – Methods

2 collections of 8nt motifs

X-targets: 540 8nt motifs (mammals)
  conserved in 3’-UTRs, putative miR target sites
  Xie et al. (2005) Nature 434:338-345

L-targets: 683 8nt motifs (human)
  rc(2-8nt)+A from mature miRs in miRBase
  Lewis et al. (2005) Cell 120:15-20

2 collections of 7nt motifs (from L-targets)
  7mer-A1
  7mer-m8
  Friedman et al. (2009) Genome Res. 19:92-105
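
The rc(2-8nt)+A rule for L-target motifs can be sketched as follows. This is a minimal illustration, not the Patrocles pipeline itself: the function names are hypothetical, and the 7mer-m8 and 7mer-A1 definitions (match to miR nt 2-8, and match to nt 2-7 followed by A) follow the usual seed-match convention, which is assumed here rather than stated on the slide. The example miR is hsa-miR-29b-2*, written 5'→3' from the sequence shown in the alignment example of this deck.

```python
# Map each RNA base to its complement (assumption: input uses A, C, G, U only).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_dna(rna):
    """Reverse-complement an RNA string and return it in DNA letters."""
    return rna.upper().translate(COMPLEMENT)[::-1].replace("U", "T")


def seed_match_motifs(mature_mir):
    """Derive target-site motifs from a mature miR given 5'->3'."""
    seed_2_8 = mature_mir[1:8]  # miR nucleotides 2-8
    seed_2_7 = mature_mir[1:7]  # miR nucleotides 2-7
    return {
        "8mer": reverse_complement_dna(seed_2_8) + "A",     # rc(2-8nt)+A
        "7mer-m8": reverse_complement_dna(seed_2_8),        # rc(2-8nt)
        "7mer-A1": reverse_complement_dna(seed_2_7) + "A",  # rc(2-7nt)+A
    }


if __name__ == "__main__":
    motifs = seed_match_motifs("CUGGUUUCACAUGGUGGCUUAG")  # hsa-miR-29b-2*
    for name, motif in motifs.items():
        print(name, motif)
```

For hsa-miR-29b-2* the 8mer comes out as GAAACCAA, which is the octamer attributed to that miR in the alignment example.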

(11)

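Targets - Concordance between X and L target site motifs

[Figure: overlap between the X and L motif collections. X: 540 8mers, 577 7mers, 554 6mers. L: 683 8mers, 1265 7mers, 1448 6mers. Concordance values shown: 91% and 40%.]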

(12)

Targets - Conserved vs.

(13)

Targets

1. human: A  ...TTTGGTG[A]AACCAAC... => ancestral allele
   human: G  ...TTTGGTG[G]AACCAAC... => derived allele
   chimp     ...TTTGGTG[A]AACCAAC... => sibling species
2. rat       ...TTTGGTG[A]AACAAAC...
   mouse     ...CTTGGTG[A]AACAAAC...
3. dog       ...TTTGGTG[A]AACTAAC...
   cow       ...TTTGGTG[A]AACTAAC...

(3/3)                 TTTGGTG[A]
(3/3)                  TTGGTG[A]A
(3/3)                   TGGTG[A]AA
(3/3)                    GGTG[A]AAC
(2/3) not in dog/cow      gtg[a]aacc
(2/3) not in dog/cow       tg[a]aacca
(2/3) not in dog/cow        g[a]aaccaa => hsa-miR-29b-2*
(2/3) not in dog/cow         [a]aaccaac
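
The octamer scan in this example can be sketched as below: every 8nt window that overlaps the SNP is enumerated on each allele and looked up in a motif collection. This is an illustration under stated assumptions, not the Patrocles implementation. The function names are hypothetical, the motif collection is reduced to the single octamer that the slide attributes to hsa-miR-29b-2*, and the cross-species conservation counts are not reproduced.

```python
def spanning_kmers(seq, snp_index, k=8):
    """Return every k-mer window of seq that covers the 0-based snp_index."""
    first = max(0, snp_index - k + 1)
    last = min(snp_index, len(seq) - k)
    return [seq[start:start + k] for start in range(first, last + 1)]


def motif_hits(seq, snp_index, motifs):
    """List (octamer, miR name) pairs for windows found in the motif table."""
    return [(kmer, motifs[kmer])
            for kmer in spanning_kmers(seq, snp_index)
            if kmer in motifs]


# Human sequences from the slide; the SNP sits at 0-based index 7.
ANCESTRAL = "TTTGGTGAAACCAAC"
DERIVED = "TTTGGTGGAACCAAC"
SNP_INDEX = 7

# Toy motif table (assumption: one entry only, taken from the slide).
MOTIFS = {"GAAACCAA": "hsa-miR-29b-2*"}

if __name__ == "__main__":
    print(spanning_kmers(ANCESTRAL, SNP_INDEX))
    print("ancestral:", motif_hits(ANCESTRAL, SNP_INDEX, MOTIFS))
    print("derived:  ", motif_hits(DERIVED, SNP_INDEX, MOTIFS))
```

With these inputs the ancestral allele carries the hsa-miR-29b-2* octamer and the derived allele does not, i.e. the derived allele destroys the putative site.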

(14)

Targets - Patrocles SNPs - Methods

1. human: A  ...TTTGGTGAAACCAAC... => ancestral allele
                    |||||||||
     3'-GAUUCGGUGGUACACUUUGGUC-5' => hsa-miR-29b-2*
                    |||.|||||
   human: G  ...TTTGGTGGAACCAAC... => derived allele
   chimp     ...TTTGGTGAAACCAAC... => sibling species
2. rat       ...TTTGGTGAAACAAAC...
   mouse     ...CTTGGTGAAACAAAC...
3. dog       ...TTTGGTGAAACTAAC...
   cow       ...TTTGGTGAAACTAAC...

(3/3)                 TTTGGTG[A]
...
(2/3) not in dog/cow        g[a]aaccaa => hsa-miR-29b-2*
(2/3) not in dog/cow         [a]aaccaac

pSNP classes

site \ allele   anc    der    ?
conserved       DC     (CC)   P
not cons.       DNC    CNC
+S
+W7C / S7C
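
The class grid on this slide can be read as a small decision rule, sketched below. The expansion of the abbreviations is an assumption: DC as "destroyed conserved" matches the wording used on the co-expression slide, while DNC, CC, CNC (destroyed non-conserved, created conserved, created non-conserved) and P (ancestral allele unknown) are inferred from the grid layout. The S, W7C and S7C classes are not modelled, and the function name is hypothetical.

```python
def psnp_class(site_allele, ancestral_allele, conserved):
    """Assign a pSNP class from the allele carrying the target site.

    site_allele      -- allele on which the target site is present
    ancestral_allele -- inferred ancestral allele, or None if unknown
    conserved        -- True if the target site is conserved across species
    """
    if ancestral_allele is None:
        return "P"  # assumed meaning: polymorphic, direction unknown
    if site_allele == ancestral_allele:
        # Site present on the ancestral allele: the derived allele destroys it.
        return "DC" if conserved else "DNC"
    # Site present on the derived allele only: the derived allele creates it.
    return "CC" if conserved else "CNC"


if __name__ == "__main__":
    print(psnp_class("A", "A", conserved=True))    # site lost on derived allele
    print(psnp_class("G", "A", conserved=False))   # site gained on derived allele
    print(psnp_class("A", None, conserved=True))   # ancestral allele unknown
```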

(15)

Targets - Patrocles SNPs - Results

                 human                     mouse
pSNP class       Xie         Lewis         Xie         Lewis
total            20,679      26,719        19,657      17,505
DC+CC            1,546+50    959+58        951+102     496+65
DNC              7,392       10,328        7,732       7,250
CNC              9,006       11,244        8,545       7,573
P                1,944       3,295         2,290       2,065
S                741         837           37          56

# destructions = # creations

(16)

Targets - Patrocles SNPs

Evidence for purifying selection

SNP shuffling in 3’-UTR sequence space with preservation of trinucleotide context
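
One way to picture the shuffling step is sketched below: each SNP is moved to a randomly chosen position that has the same trinucleotide context (the SNP base plus its two neighbours). This is a simplified assumption about how such a null model might be built, not the authors' procedure. The function name is hypothetical, a single sequence stands in for the whole 3’-UTR sequence space, and positions are drawn with replacement.

```python
import random
from collections import defaultdict


def shuffle_snps(seq, snp_positions, rng):
    """Relocate each SNP to a random site with the same trinucleotide context.

    seq           -- sequence string standing in for the 3'-UTR space
    snp_positions -- 0-based interior positions (1 .. len(seq) - 2)
    rng           -- random.Random instance, so runs can be reproduced
    """
    # Index every interior position by its trinucleotide context.
    by_context = defaultdict(list)
    for i in range(1, len(seq) - 1):
        by_context[seq[i - 1:i + 2]].append(i)

    shuffled = []
    for pos in snp_positions:
        context = seq[pos - 1:pos + 2]
        # The original position is always a candidate, so the list is never empty.
        shuffled.append(rng.choice(by_context[context]))
    return shuffled


if __name__ == "__main__":
    utr = "ACGTACGTTACGATACGTTACG"  # toy sequence, not real data
    snps = [1, 5, 9]
    print(shuffle_snps(utr, snps, random.Random(0)))
```

Repeating the shuffle many times and counting how often shuffled SNPs hit target sites would give a null distribution against which the observed pSNP counts can be compared.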

[Figure: human - DC]

possible elimination of SNPs affecting conserved targets: 22 to 35% in human, 53 to 67% in mouse

Chen & Rajewsky (2006) Nat. Genet. 38:1452-1456

depletion of SNPs in conserved miR target sites when compared to

(18)

Targets - Prioritization for lab validation

most interesting pSNPs are

pSNPs destroying conserved target sites

pSNPs creating target sites in anti-targets

to yield a phenotype, target and miR have to be expressed in the same tissue (at the same time)

co-expression plots for human and mouse

target genes: SymAtlas

miRs: Landgraf et al. (2007) Cell 129:1401-1414

two different kinds of plots

comparing miR and target

comparing miR host gene (if any) and target
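
The co-expression requirement can be sketched as a simple filter over per-tissue expression values. Everything here is hypothetical: the function name, the thresholds and the toy values are illustrative only and do not come from SymAtlas or from Landgraf et al.

```python
def coexpressed_tissues(mir_expr, target_expr, mir_min, target_min):
    """Return tissues where both the miR and its target pass a threshold.

    mir_expr, target_expr -- dicts mapping tissue name to expression level
    mir_min, target_min   -- minimum level to call each one expressed
    """
    shared = mir_expr.keys() & target_expr.keys()
    return sorted(
        tissue for tissue in shared
        if mir_expr[tissue] >= mir_min and target_expr[tissue] >= target_min
    )


if __name__ == "__main__":
    # Toy values, not real measurements.
    mir = {"brain": 120.0, "liver": 2.0, "heart": 35.0}
    target = {"brain": 800.0, "liver": 950.0, "muscle": 400.0}
    print(coexpressed_tissues(mir, target, mir_min=10.0, target_min=500.0))
```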


(19)

pSNPs - Co-expression plots

rs34542287 A/G [0.985/0.015] Destroyed Conserved target site
miR-9 vs. actin-binding LIM protein 1 ACCA[A]AGA

rs28399411 G/A [0.994/0.006] Destroyed Conserved target site
miR-32 vs. Axonal membrane protein GAP-43 TGTGC[A]AT

mature miR counts

[+] direct evidence

[–] gross tissue mapping

host gene expression

[+] perfect matching of tissues

[–] indirect evidence

(20)

Polymorphic miRs - Methods

pri-miRs (stem-loops) from miRBase

pDSPs altering miR sequence
  SNPs (de-)stabilizing interaction (seed, mature non-seed)

pDSPs altering miR concentration
  SNPs altering processing efficiency (anywhere in stem-loop)
  CNVs encompassing miR genes
    human: http://projects.tcag.ca/variation/
    mouse: She et al. (2008) Nat. Genet. 40:909-914
    rat: Guryev et al. (2008) Nat. Genet. 40:538-545
  eQTL (or allelic imbalance) corresponding to host genes (only human)
    Morley et al. (2004) Nature 430:743-747
    Cheung et al. (2005) Nature 437:1365-1369
    Ge et al. (2005) Genome Res. 15:1584-1591
    Stranger et al. (2005) PLoS Genet. 1:e78
    Pant et al. (2006) Genome Res. 16:331-339
    Dixon et al. (2007) Nat. Genet. 39:1202-1207
    Goring et al. (2007) Nat. Genet. 39:1208-1216
    Spielman et al. (2007) Nat. Genet. 39:226-231
    Stranger et al. (2007) Nat. Genet. 39:1217-1224

[Figure: base pairing between the miR seed and the 8nt target site in the targeted mRNA; pri-miR and host gene (?)]
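
The seed / mature non-seed / other split used for SNPs in pre-miRs can be sketched as a coordinate check. This is a simplified illustration with assumptions: the function name is hypothetical, coordinates are 1-based positions along the stem-loop, only one mature miR per stem-loop is considered, and the seed is taken as miR nucleotides 2-8.

```python
def classify_premir_snp(snp_pos, mature_start, mature_end):
    """Classify a SNP by its location within a pre-miR stem-loop.

    snp_pos      -- 1-based SNP position along the stem-loop
    mature_start -- 1-based position of mature miR nucleotide 1
    mature_end   -- 1-based position of the last mature miR nucleotide
    """
    seed_first = mature_start + 1  # miR nucleotide 2
    seed_last = mature_start + 7   # miR nucleotide 8
    if seed_first <= snp_pos <= seed_last:
        return "seed"
    if mature_start <= snp_pos <= mature_end:
        return "mature non-seed"
    return "other"


if __name__ == "__main__":
    # Toy coordinates: mature miR occupies stem-loop positions 15-36.
    for pos in (10, 15, 22, 30, 50):
        print(pos, classify_premir_snp(pos, mature_start=15, mature_end=36))
```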

(21)

Polymorphic miRs – Results

                               human    mouse
pre-miRs                       676      466
SNPs in pre-miRs
  affected miRs                136      71
  total                        184      89
  seed                         12       4
  mature non-seed              26       6
  other                        146      79
miRs in CNVs
  CNVs                         158      0
  affected miRs                256      0
miRs hosted in eQTL genes
  eQTL                         78       n.d.
  affected miRs                85       n.d.

(22)

Polymorphic miRs – Results

e.g., hsa-miR-125a

SNP in seed (+8, alleles G/T) blocks processing of pri-miR to pre-miR

Duan et al. (2007) Hum. Mol. Genet. 16:1124-1131

(23)

Silencing machinery – Methods

manually curated list of 52 gene products involved in RNA-mediated gene silencing

3 broad compartments
  1. miR biogenesis: 4 (+4)
  2. RISC/mRNP: 12 (+2)
  3. P-bodies: 27 (+3)

pDSPs altering machinery gene sequence
  SNPs (non-synonymous, stop/frameshift, splicing site)

pDSPs altering machinery gene product concentration
  CNVs encompassing machinery genes (human, mouse, rat)
  eQTL corresponding to machinery genes (human)

(24)

Silencing machinery – Results

                                      human    mouse
genes                                 52       51
SNPs in machinery genes
  affected genes                      49       35
  total                               237      127
  non-synonymous                      151      73
  stops / frameshifts                 45       2
  splicing sites                      42       52
machinery genes in CNVs
  CNVs                                17       0
  affected genes                      17       0
machinery genes identified as eQTL    21       n.d.

(25)

Conclusions

other features of Patrocles
  allelic imbalance plots (HapMap)
  reported associations between pSNPs (or SNPs in miR genes) and phenotypes
  Patrocles Finder for custom sequences

most pSNPs are likely false positives due to poor specificity of target site predictions

Patrocles still contains some interesting biology (e.g., purifying selection on non-conserved sites)

systematic validation of pSNPs becomes possible (e.g., AgoIP + estimation of allelic imbalance in RISC-bound mRNAs; HITS-CLIP)
