Predicting isoform transcripts: What does the comparison of known transcripts in human, mouse and dog tell us?


HAL Id: hal-02267357

https://hal.inria.fr/hal-02267357

Submitted on 19 Aug 2019

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.


Distributed under a Creative Commons Attribution 4.0 International License


Nicolas Guillaudeux, Catherine Belleannée, Samuel Blanquart, Jean-Stéphane Varré

To cite this version:

Nicolas Guillaudeux, Catherine Belleannée, Samuel Blanquart, Jean-Stéphane Varré. Predicting isoform transcripts: What does the comparison of known transcripts in human, mouse and dog tell us? JOBIM 2019 - Journées Ouvertes Biologie, Informatique et Mathématiques, Jul 2019, Nantes, France. 8, pp.1, 2019, 10.7490/f1000research.1117311.1. hal-02267357.


Nicolas Guillaudeux (1), Catherine Belleannée (1), Samuel Blanquart (1), Jean-Stéphane Varré (2)

(1) Dyliss team, Univ Rennes, Inria, CNRS, IRISA, Rennes F-35000, France.

(2) Univ. Lille, CNRS, Centrale Lille, UMR 9189 – CRIStAL, F-59000 Lille, France. [email protected]

To extend knowledge of the transcript isoforms expressed from a gene, we proposed a comparative genomics method that identifies orthologous exons shared by a pair of genes [1]. We predict transcript isoforms in human, mouse and a non-model organism, dog, and we identify 135 conserved genes that have common gene structures and common potential transcriptomes.


[1] S. Blanquart et al., “Assisted transcriptome reconstruction and splicing orthology”, BMC Genomics 17 (2016).

Phylogenetic interpretation of gene structures

Most analyzed genes show structure divergence

Chart legend (fig.6, duplicates): at least one site shared in human and dog; at least one site shared in mouse and dog; at least one site shared in human and mouse.

Chart annotations (fig.6, duplicates): likely lost in mouse; likely appeared in human and mouse ancestor.

Figure 6. Distribution of duplicate (left) and singleton (right) components in functional site graphs.

135 genes have all functional sites and transcripts conserved over the 3 species

255 genes have all functional sites conserved over the 3 species, but 120 of them express different transcripts.

Example:

• Frameshifts

• Alternative transcription

• False negative

Some genes have the same gene structure but different transcriptomes

Analyzing conservation over functional sites and transcripts

Figure 5. Transcript graph of CREM gene (ambiguous) (left). Analysis of 986 considered transcript graphs. 135 graphs reveal genes having common potential transcriptomes (right).

Figure 4. Functional site graph of CREM gene (left). Analysis of 1,663 considered functional site graphs. 255 graphs reveal genes having common gene structures (right).

Application: Predicting transcripts in human, mouse and dog

Figure 2. Estimated transcriptomes: known and predicted transcripts obtained using pairwise comparisons.

Modelling gene structures using comparative genomics to predict isoforms [1]

Our structure of a target gene (fig.1a)

• based on functional sites of known transcripts:

  • [ : start codon
  • ] : stop codon
  • < > : splice sites
  • a letter: (part of) a coding exon

• using functional sites of known transcripts in an orthologous source gene (fig.1b):

  • to reveal new functional sites or coding exons on the target gene (fig.1c)

  • to predict new transcripts of the target gene, using predicted functional sites (fig.1d)
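The notation above can be read mechanically. The following is a minimal sketch, not the authors' implementation: the function names are hypothetical, and the meaning given to "." (which the legend does not define) is an assumption.

```python
# Symbol meanings follow the poster's legend.
# The "." entry is NOT in the legend: treating it as a non-coding gap is an assumption.
SYMBOLS = {
    "[": "start codon",
    "]": "stop codon",
    "<": "splice site",
    ">": "splice site",
    ".": "non-coding gap (assumed)",
}


def tokenize_structure(structure):
    """Return (symbol, kind) pairs for a gene-structure or transcript string."""
    tokens = []
    for ch in structure:
        if ch.isspace():
            continue  # spaces are layout only
        if ch.isalpha():
            tokens.append((ch, "coding exon part"))
        elif ch in SYMBOLS:
            tokens.append((ch, SYMBOLS[ch]))
        else:
            raise ValueError(f"unexpected symbol: {ch!r}")
    return tokens


def exon_letters(structure):
    """Return the coding-exon letters of a string, in order of appearance."""
    return [s for s, kind in tokenize_structure(structure) if kind == "coding exon part"]


# Transcript string taken from Figure 1 (d).
transcript = "[A<>BC<>F<>G<>H<>KL<>S<>T]"
print(exon_letters(transcript))
```

Keeping the tokens as plain pairs makes it easy to compare the functional sites of two genes position by position once an alignment is available.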

Decision diagram (fig.1d): known transcript; expressible in ? Yes: predicted expressible transcript. No: not expressible transcript.

(a) [A<.>B[C<. .[E<.>F<.>G<.>H<I ].>K L<.[M N<.[ P>Q[R<.>S<.>T].>U ]

(b) [A<.>B[C<.[D<.[E<. .>G<.>H<I J].>K[L<. [N<.[O<.P>Q[R<.>S<.>T].>U V]

(c) [A<.>B[C<.[D<.[E<.>F<.>G<.>H<I J].>K[L<. M[N<.[O<.P>Q[R<.>S<.>T].>U V]

(d) [A<>BC<>F<>G<>H<>KL<>S<>T]

Figure 1. CREM gene: pairwise comparison between human (a) and mouse (b) reveals new functional sites or exons (red) on the human gene (c). This leads to the prediction of new expressible transcripts (d) in human.
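As a rough illustration of the comparison in Figure 1, the sketch below works on exon letters only. It is not the method of [1]: the function names are hypothetical, the strings are read from the figure as well as the extracted text allows, and the test "every exon letter of the transcript exists in the gene" is a deliberately simplified assumption that ignores splice sites, codons and reading frame.

```python
def exon_set(structure):
    """Set of coding-exon letters appearing in a structure or transcript string."""
    return {ch for ch in structure if ch.isalpha()}


def may_be_expressible(transcript, gene_structure):
    """Simplified assumption: a transcript is a candidate for expression
    if all of its exon letters are present in the gene structure."""
    return exon_set(transcript) <= exon_set(gene_structure)


# Strings transcribed from Figure 1: (a) known human structure,
# (c) human structure enriched with mouse information, (d) a transcript.
human_known = "[A<.>B[C<. .[E<.>F<.>G<.>H<I ].>K L<.[M N<.[ P>Q[R<.>S<.>T].>U ]"
human_enriched = "[A<.>B[C<.[D<.[E<.>F<.>G<.>H<I J].>K[L<. M[N<.[O<.P>Q[R<.>S<.>T].>U V]"
transcript_d = "[A<>BC<>F<>G<>H<>KL<>S<>T]"

# Exon letters present in (c) but not in (a).
newly_revealed = sorted(exon_set(human_enriched) - exon_set(human_known))
print(newly_revealed)
print(may_be_expressible(transcript_d, human_enriched))
```

A realistic check would also have to verify that the required splice sites and start and stop codons exist in the target gene and that the reading frame is preserved.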

Data used:

2,167 orthologous genes & 18,109 known transcripts:

• human and mouse: CCDS

• dog: ENSEMBL

Pairwise comparisons [1] and merging of results yield 6,861 new predicted transcripts (fig. 2):

• per-species counts shown in the figure: 2,112, 1,540 and 3,209

• increases shown in the figure: +15.5%, +24.5% and +50%
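The figures above can be cross-checked with a few lines of arithmetic. This is only a sanity check on the reported numbers; the overall ratio computed at the end is derived here and is not a figure stated on the poster, and the variable names are hypothetical.

```python
# Counts as reported on the poster (species assignment not used here).
per_species_predictions = [2112, 1540, 3209]
known_transcripts = 18109

total_predicted = sum(per_species_predictions)

# Derived value (not stated on the poster): predicted transcripts
# as a percentage of all known transcripts, rounded to one decimal.
overall_increase_pct = round(100 * total_predicted / known_transcripts, 1)

print(total_predicted)
print(overall_increase_pct)
```

The three per-species counts add up to the reported total of 6,861 predicted transcripts.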

Chart legend (fig.5, right): genes with all transcripts shared (only triplets); at least one triplet; at least one duplicate; at least one singleton.

Figure 3. Considered gene components: (a) triplet, (b) duplicate, (c) singleton. “Ambiguous” graph components (d) are not considered.

Panel labels (fig.3): graph components (a), (b), (c); “ambiguous” components (d); alternative transcription.

Chart annotation (fig.6): similar amount of sites specific in human and mouse: likely independent evolution.

Chart legend (fig.6, singletons): at least one functional site specific in human; at least one functional site specific in mouse; at least one functional site specific in dog.

Chart legend (fig.4, right): genes with all functional sites shared (only triplets): 255; at least one triplet; at least one duplicate; at least one singleton; nothing aligned (artefact).

We build two conservation graphs for each gene:

• conserved functional site graph (fig.4) & conserved transcript graph (fig.5)

• a graph component shows orthology relationships between species

• classification of graph components (3-species case):

• shared in the 3 species (fig.3a), in 2 species (fig.3b), specific to a species (fig.3c)

• only graphs without “ambiguous” components (fig.3d) are considered for analysis

Conservation of functional sites: each node is a functional site of a gene; each edge represents an orthology relationship between two functional sites.

Conservation of transcripts: each node is a transcript of a gene; each edge represents an orthology relationship between two transcripts.
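The component classification described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: nodes are modelled as hypothetical (species, identifier) pairs, and a component is treated as “ambiguous” when it contains more than one node from the same species, which is an assumption since the poster does not define the term.

```python
from collections import defaultdict


def connected_components(nodes, edges):
    """Group nodes into connected components of an undirected graph."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


def classify(component):
    """Label a component in the 3-species case."""
    species = [sp for sp, _ in component]
    if len(species) != len(set(species)):
        return "ambiguous"  # assumption: a species appears more than once
    return {1: "singleton", 2: "duplicate", 3: "triplet"}.get(len(species), "ambiguous")


# Hypothetical functional sites; each edge is an orthology relationship.
nodes = [("human", "s1"), ("mouse", "s1"), ("dog", "s1"),
         ("human", "s2"), ("mouse", "s2"), ("dog", "s3"),
         ("human", "s4"), ("human", "s5"), ("mouse", "s4")]
edges = [(("human", "s1"), ("mouse", "s1")), (("mouse", "s1"), ("dog", "s1")),
         (("human", "s2"), ("mouse", "s2")),
         (("human", "s4"), ("mouse", "s4")), (("human", "s5"), ("mouse", "s4"))]

labels = [classify(c) for c in connected_components(nodes, edges)]
graph_is_considered = "ambiguous" not in labels
print(labels, graph_is_considered)
```

The same two functions apply to both graph types, since functional site graphs and transcript graphs differ only in what a node stands for.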
