GC content shapes mRNA storage and decay in human cells


HAL Id: hal-02431783

https://hal.archives-ouvertes.fr/hal-02431783

Submitted on 25 Nov 2020

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.


GC content shapes mRNA storage and decay in human cells

Maïté Courel, Yves Clément, Clémentine Bossevain, Dominika Foretek, Olivia Vidal Cruchez, Zhou Yi, Marianne Bénard, Marie-Noëlle Benassy, Michel Kress, Caroline Vindry, et al.

To cite this version:

Maïté Courel, Yves Clément, Clémentine Bossevain, Dominika Foretek, Olivia Vidal Cruchez, et al. GC content shapes mRNA storage and decay in human cells. eLife, eLife Sciences Publication, 2019, 8, ⟨10.7554/eLife.49708⟩. ⟨hal-02431783⟩

*For correspondence: dominique.weil@upmc.fr

Competing interests: The authors declare that no competing interests exist.

Funding: See page 26

Received: 26 June 2019
Accepted: 18 December 2019
Published: 19 December 2019

Reviewing editor: Karsten Weis, ETH Zurich, Switzerland

Copyright Courel et al. This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

GC content shapes mRNA storage and decay in human cells

Maïté Courel¹, Yves Clément², Clémentine Bossevain¹, Dominika Foretek³, Olivia Vidal Cruchez⁴, Zhou Yi⁵, Marianne Bénard¹, Marie-Noëlle Benassy¹, Michel Kress¹, Caroline Vindry⁶, Michèle Ernoult-Lange¹, Christophe Antoniewski⁷, Antonin Morillon³, Patrick Brest⁴, Arnaud Hubstenberger⁵, Hugues Roest Crollius², Nancy Standart⁶, Dominique Weil¹*

¹ Sorbonne Université, CNRS, Institut de Biologie Paris Seine (IBPS), Laboratoire de Biologie du Développement, Paris, France
² Ecole Normale Supérieure, Institut de Biologie de l’ENS, IBENS, Paris, France
³ ncRNA, Epigenetic and Genome Fluidity, Institut Curie, PSL Research University, CNRS UMR 3244, Sorbonne Université, Paris, France
⁴ Université Côte d’Azur, CNRS, INSERM, IRCAN, FHU-OncoAge, Nice, France
⁵ Université Côte d’Azur, CNRS, INSERM, iBV, Nice, France
⁶ Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
⁷ Sorbonne Université, CNRS, Institut de Biologie Paris Seine (IBPS), ARTbio Bioinformatics Analysis Facility, Paris, France

Abstract

mRNA translation and decay often appear intimately linked, although the rules of this interplay are poorly understood. In this study, we combined our recent P-body transcriptome with transcriptomes obtained following silencing of broadly acting mRNA decay and repression factors, and with available CLIP and related data. This revealed the central role of GC content in mRNA fate, in terms of P-body localization, mRNA translation and mRNA stability: P-bodies contain mostly AU-rich mRNAs, which have a particular codon usage associated with a low protein yield; AU-rich and GC-rich transcripts tend to follow distinct decay pathways; and the targets of sequence-specific RBPs and miRNAs are also biased in terms of GC content. Altogether, these results suggest an integrated view of post-transcriptional control in human cells where most translation regulation is dedicated to inefficiently translated AU-rich mRNAs, whereas control at the level of 5’ decay applies to optimally translated GC-rich mRNAs.

Introduction

Translation, storage, localization and decay of mRNAs in the cytoplasm are closely coupled processes, which are governed by a large number of RNA-binding proteins (RBPs) (Hentze et al., 2018). These RBPs have to act in a coordinated manner to give rise to a proteome both coherent with cellular physiology and responsive to new cellular needs. The fate of mRNAs is also intimately linked with their localization in membrane-less organelles, such as P-bodies (PBs). We recently identified the transcriptome and proteome of PBs purified from human cells. Their analysis showed that human PBs are broadly involved in mRNA storage rather than decay (Hubstenberger et al., 2017; Standart and Weil, 2018), as also observed using fluorescent decay reporters (Horvathova et al., 2017). However, the mechanism underlying the large but specific targeting of mRNAs to PBs is still unknown, though it clearly results in the co-recruitment of particular RBPs (Hubstenberger et al., 2017).

In mammalian cells, the RNA helicase DDX6, known for its involvement in mRNA decay and translation repression, is a key factor in PB assembly (Minshall et al., 2009). Patients with neurodevelopmental delay caused by heterozygous DDX6 missense mutations were recently identified, and their skin fibroblasts show a PB defect (Balak et al., 2019). Human DDX6 interacts both with translational repressors and with the decapping enzyme DCP1/2 and its activators (Ayache et al., 2015; Bish et al., 2015). Its yeast homologue Dhh1 is a cofactor of DCP2, as well as a translational repressor (Coller and Parker, 2005). The RBP PAT1B has also been defined as an enhancer of decapping, as it interacts with DDX6, the LSM1-7 heptamer ring and the decapping complex in mammalian cells (Vindry et al., 2017), while in yeast Pat1p activates Dcp2 directly (Nissan et al., 2010) and its deletion results in deadenylated but capped intact mRNA (Bonnerot et al., 2000; Bouveret et al., 2000). DDX6 and PAT1B interact with the CCR4-NOT deadenylase complex and the DDX6-CNOT1 interaction is required for miRNA silencing (Vindry et al., 2017; Chen et al., 2014; Mathys et al., 2014; Ozgur et al., 2015). DDX6 also binds the RBP 4E-T, another key factor in PB assembly, which in turn interacts with the cap-binding factor eIF4E and inhibits translation initiation, including that of miRNA target mRNAs (Kamenska et al., 2016). Altogether, DDX6 and PAT1B have been proposed to link deadenylation/translational repression with decapping. Finally, the 5’ to 3’ exonuclease XRN1 decays RNAs following decapping by DCP1/2, a step triggered by deadenylation mediated by PAN2/3 and CCR4-NOT or by exosome activity (Łabno et al., 2016).

A number of RBPs also control mRNA fate in a sequence-specific manner, some of them localizing in PBs as well. For instance, the CPEB complex, best described in Xenopus oocytes (Minshall et al., 2007), binds the CPE motif in the 3’ untranslated region (UTR) of maternal transcripts through CPEB1, thus controlling their storage and their translational activation upon hormone stimulation (Standart and Minshall, 2008). Additional examples include the proteins which bind 3’UTR AU-rich elements (ARE), such as HuR and TTP, to control translation and decay, and play key roles in inflammation, apoptosis and cancer (Wells et al., 2017). Protein-binding motifs are generally not unique and rather defined as consensus sequence elements. In the case of RISC, binding specificity is given by a guide miRNA, which also hybridizes with some flexibility with complementary mRNA sequences. A variety of techniques have therefore been developed to identify the effective RNA targets of such factors, ranging from affinity purification (such as RIP or CLIP) to transcriptome and polysome profiling after RBP silencing, providing the groundwork to address systematic questions about post-transcriptional regulation.

In this study, we searched for broad determinants of mRNA storage and decay in unstressed human cell lines, using our transcriptome of purified PBs and several transcriptomic analyses performed after silencing of general translation and decay regulatory factors, including DDX6, PAT1B and XRN1. We also used datasets available from the literature, including a transcriptomic analysis after DDX6 silencing, a DDX6-CLIP experiment and various lists of RBP and miRNA targets. Their combined analysis revealed the central role of mRNA GC content which, by impacting codon usage, PB targeting and RBP binding, influences mRNA fate and contributes to the coordination between two opposite processes: decay and storage. Reporter mRNAs varying in their GC content confirmed that AU-rich mRNAs have a lower protein yield than GC-rich ones, that they preferentially localize to PBs, and that they have an enhanced capacity to form RNP granules in vitro.

Results

PBs mostly accumulate AU-rich mRNAs

We have previously shown that PBs store one third of the coding transcriptome in human epithelial HEK293 cells (Hubstenberger et al., 2017). Such a large transcript number led us to search for general distinctive sequence features that could be involved in PB targeting. We first analyzed transcript length, as it was reported to be key for mRNA accumulation in stress granules (Khong et al., 2017). When mRNAs were subdivided into six classes ranging from <1.5 kb to >10 kb, longer mRNAs appeared more enriched in PBs than shorter ones, with a moderate correlation between length and PB enrichment (Spearman r (rs) = 0.39, p<0.0001) (Figure 1A, Figure 1—figure supplement 1A,B). However, their increased length in PBs was less striking than previously observed for stress granule mRNAs (Khong et al., 2017) (Figure 1—figure supplement 1C).

[Figure 1, panels A–E (PBs, HEK293). A: PB enrichment (log2) by mRNA length (kb), rs = 0.39. B: PB enrichment (log2) by GC content (%), rs = −0.64. C: mRNA number by GC content (%) for all mRNAs, PB-in and PB-out. D: PB enrichment (log2) by GC content (%) of the 5’UTR (rs = −0.22), CDS (rs = −0.57) and 3’UTR (rs = −0.55). E: GC content (%) by mRNA length (kb) for PB-in and PB-out mRNAs.]

Figure 1. PB mRNAs are AU-rich and longer than average. (A) Long mRNAs are particularly enriched in PBs. Transcripts were subdivided into six classes depending on their length (from <1.5 kb to >10 kb). The boxplots represent the distribution of their respective enrichment in PBs. The boxes represent the 25–75 percentiles and the whiskers the 10–90 percentiles. rs, Spearman correlation coefficient. (B) AU-rich mRNAs are particularly enriched in PBs. Transcripts were subdivided into six classes depending on their GC content (from <40 to >60%) and analyzed as in (A). (C) PBs mostly contain the AU-rich fraction of the transcriptome. The human transcriptome was binned depending on its GC content (0.7% GC increments). The graph represents the number of PB-enriched (PB-in, p<0.05, n = 5200) and PB-excluded (PB-out, p<0.05, n = 4669) transcripts in each bin. The distribution of all transcripts is shown for comparison (n = 14443). The median GC value is indicated below for each group. (D) mRNA localization in PBs mostly depends on the GC content of their CDS and 3’UTR. The analysis was repeated as in (B) using the GC content of the 5’UTR, CDS or 3’UTR, as indicated. For 5’UTRs, the >60% class was subdivided into three classes to take into account their higher GC content compared to CDSs and 3’UTRs. The correlation coefficients −0.57 and −0.55 are not significantly different (p=0.17), while −0.22 and −0.55 are (p<0.0001). (E) GC content is lower in PB-enriched mRNAs than in PB-excluded ones independently of their length. The GC content distribution of PB-enriched (PB-in, p<0.05) and PB-excluded (PB-out, p<0.05) mRNAs was analyzed as in (B).

The online version of this article includes the following figure supplement(s) for figure 1:
Figure supplement 1. PB-enriched mRNAs tend to be long and AU-rich.

Most remarkably, mRNA accumulation in PBs was dependent on their global nucleotide composition, with a strong negative correlation between GC content and PB localization (rs = −0.64, p<0.0001). When transcripts were subdivided into six classes ranging from <40% to >60% GC, PB enrichment was predominant for those <45% GC (Figure 1B, Figure 1—figure supplement 1D,E). While reminiscent of the low GC content reported for stress granule mRNAs in HEK293 cells (Khong et al., 2017), our reanalysis of the published dataset indicated that stress granule localization correlated weakly with the gene GC content (rs = −0.12, p<0.0001) and almost not at all with the mRNA GC content (rs = −0.06, p<0.0001). Indeed, comparing the GC content distribution of the transcripts that are enriched in or excluded from PBs with that of all HEK293 cell transcripts revealed that mRNA storage in PBs is confined to the AU-rich fraction of the transcriptome (Figure 1C).
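The GC-content classes used above can be illustrated with a minimal sketch. The six class labels come from the text; the function names, the handling of values that fall exactly on a boundary, and the use of the median as a per-class summary are assumptions made for illustration, not the authors' pipeline.

```python
from statistics import median

# Class labels quoted in the text: <40, 40-45, 45-50, 50-55, 55-60, >60 (% GC).
GC_EDGES = (40.0, 45.0, 50.0, 55.0, 60.0)
GC_LABELS = ("<40", "40-45", "45-50", "50-55", "55-60", ">60")


def gc_content(seq: str) -> float:
    """Return the GC content of a nucleotide sequence, in percent."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T") + seq.count("U")
    if gc + at == 0:
        raise ValueError("sequence contains no A/C/G/T/U bases")
    return 100.0 * gc / (gc + at)


def gc_class(gc_percent: float) -> str:
    """Assign a GC percentage to one of the six classes.

    Assumption: a value exactly on a boundary goes to the upper class.
    """
    for edge, label in zip(GC_EDGES, GC_LABELS):
        if gc_percent < edge:
            return label
    return GC_LABELS[-1]


def median_enrichment_by_class(transcripts):
    """Summarize PB enrichment per GC class.

    transcripts: iterable of (sequence, PB enrichment in log2) pairs.
    Returns {class label: median log2 enrichment} for non-empty classes.
    """
    groups = {label: [] for label in GC_LABELS}
    for seq, log2_enrichment in transcripts:
        groups[gc_class(gc_content(seq))].append(log2_enrichment)
    return {label: median(vals) for label, vals in groups.items() if vals}
```

With real data, each pair would hold a full transcript sequence and its measured PB enrichment; the boxplots in Figure 1B show the full distribution per class rather than only the median.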

As these transcripts also correspond to AU-rich genes (Figure 1—figure supplement 1F), it raised the possibility that the impact of GC content on PB enrichment resulted indirectly from the genomic context of the genes. To address this issue, we looked at the link between PB enrichment and meiotic recombination, which can influence GC content through GC-biased gene conversion (Duret and Galtier, 2009). The correlation between PB enrichment and meiotic recombination was much weaker than between PB enrichment and mRNA GC content (rs = −0.16 vs −0.64, p<0.0001 for both, significantly different from each other, p<0.0001). Moreover, the latter was almost unchanged when controlling for meiotic recombination (rs = −0.65 vs −0.64, p<0.0001). Finally, it was still significant when controlling for intronic or flanking GC content (rs = −0.33 and −0.45 respectively, all p<0.0001), showing that mRNA base composition and PB enrichment are associated independently of meiotic recombination or the genomic context. We also computed partial correlations to verify that the correlation between PB enrichment and GC content was not secondary to the correlation that exists between GC content and expression level, or between GC content and gene conservation (Figure 1—figure supplement 1G).
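The idea of "controlling for" a third variable can be sketched with a first-order partial Spearman correlation. The text does not state which procedure was used, so the formula below (Pearson correlation on ranks, combined through the standard first-order partial correlation) is an assumption, and all function names are hypothetical.

```python
from math import sqrt


def ranks(values):
    """Return 1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def spearman(x, y):
    """Spearman rs, computed as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def partial_spearman(x, y, z):
    """Spearman correlation of x and y, controlling for z (first order)."""
    r_xy, r_xz, r_yz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

Here `x` could stand for PB enrichment, `y` for mRNA GC content and `z` for a candidate confounder such as the local recombination rate. If the partial coefficient stays close to the plain one, the association is unlikely to be explained by `z`.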

To refine the link between mRNA accumulation in PBs and their GC content, we analyzed separately the influence of their CDS and UTRs. Interestingly, mRNA accumulation in PBs correlated strongly with the GC content of both their CDS and 3’UTR (rs = −0.57 and −0.55, respectively, p<0.0001 for both), and weakly with that of their 5’UTR (rs = −0.22, p<0.0001) (Figure 1D, Figure 1—figure supplement 1E). Moreover, the lower GC content of PB-enriched mRNAs compared to PB-excluded ones was a feature independent of their length, since it was observed in all length ranges (Figure 1E, Figure 1—figure supplement 1B). Conversely, the longer length of PB mRNAs was a feature independent of their GC content (Figure 1—figure supplement 1E,H).

In conclusion, while PB mRNAs tend to be longer than average, their most striking feature is that they correspond to an AU-rich subset of the transcriptome.

GC bias in PBs impacts codon usage and protein yield

The strong GC bias in the CDS of PB mRNAs prompted us to compare the coding properties of PB-stored and PB-excluded mRNAs. Consistently, we found that the frequency of amino acids encoded by GC-rich codons (Ala, Gly, Pro) was lower in PB-stored than in PB-excluded mRNAs, while the frequency of those encoded by AU-rich codons (Lys, Asn) was higher (Figure 2A). The difference could be striking, as illustrated by Lys, whose median frequency in PB-excluded mRNAs was 32% lower than in PB-enriched mRNAs, thus ranging within the lower 17th centile of their distribution (Figure 2—figure supplement 1A). In addition to different amino acid usage, we observed dramatic variation in codon usage between the two mRNA subsets. For all amino acids encoded by synonymous codons, the relative codon usage in PBs versus out of PBs was systematically biased towards AU-rich codons (log2 of the ratio >0, Figure 2B). For example, among the six Leu codons, UUA was used 4-fold more frequently in PB-enriched than in PB-excluded mRNAs, whereas CUG was used 2-4-fold less frequently. This systematic trend also applied to Stop codons. Some additional codon bias independent of base composition (NNA/U or NNG/C) was also observed for 4 and 6-fold degenerate codons (Figure 2—figure supplement 1B,C). For instance, Leu was encoded twice as often by CUU as by CUA in PB-enriched mRNAs, whereas the use of both codons was low in PB-excluded mRNAs. Similarly, Gly was encoded more often by GGG than GGC in PB mRNAs, whereas the use of both codons was similar in PB-excluded mRNAs (Figure 2C).
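The comparison of relative codon usage in and out of PBs can be sketched as follows. This is an illustrative outline only: the two codon families listed, the function names, and the decision to skip codons that are absent from either set are assumptions, and the sequences in the usage example are toy inputs rather than data from the study.

```python
from collections import Counter
from math import log2

# Two synonymous codon families, written in the DNA alphabet as in Figure 2C.
SYNONYMOUS = {
    "Leu": ("TTA", "TTG", "CTT", "CTC", "CTA", "CTG"),
    "Gly": ("GGT", "GGC", "GGA", "GGG"),
}


def codon_counts(cds_list):
    """Count in-frame codons over a list of coding sequences."""
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper().replace("U", "T")
        usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            counts[cds[i:i + 3]] += 1
    return counts


def relative_usage(counts, amino_acid):
    """Fraction of each synonymous codon among all codons of one amino acid."""
    family = SYNONYMOUS[amino_acid]
    total = sum(counts[c] for c in family)
    return {c: counts[c] / total for c in family} if total else {}


def log2_usage_ratio(cds_in, cds_out, amino_acid):
    """log2(relative usage in PB-in / relative usage in PB-out) per codon.

    Codons with zero usage in either set are skipped (an assumption).
    """
    use_in = relative_usage(codon_counts(cds_in), amino_acid)
    use_out = relative_usage(codon_counts(cds_out), amino_acid)
    return {
        c: log2(use_in[c] / use_out[c])
        for c in SYNONYMOUS[amino_acid]
        if use_in.get(c) and use_out.get(c)
    }


# Toy usage: an AU-rich set favours TTA, a GC-rich set favours CTG.
ratios = log2_usage_ratio(["TTATTATTACTG"], ["TTACTGCTGCTG"], "Leu")
```

A positive value means the codon is relatively more used in PB-enriched mRNAs, which is the direction reported above for AU-rich codons.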

In humans, 22 out of the 29 synonymous codons that are less frequently used (normalized relative usage <1) end with an A or U, and were therefore overrepresented in PB mRNAs (Figure 2—figure supplement 1D). Considering for each amino acid the codon with the lowest usage (called low usage codon thereafter), 14 out of 18 are NNA or NNU, with the exception of Thr, Ser, Pro, Ala. We calculated the frequency of low usage codons for each CDS, and plotted it as a function of the GC content at the third position (GC3) to avoid any confounding effects of the amino acid bias. As expected, the frequency of low usage codons correlated strongly and negatively with GC3, with AU-rich CDS having a higher frequency of low usage codons than GC-rich CDS (Figure 2D). In line with their distinct GC content, PB mRNAs had a higher frequency of low usage codons than PB-excluded mRNAs. However, the correlation coefficient between frequency of low usage codons and GC3 was very similar for both mRNA subsets (rs = −0.88 for PB-enriched; −0.80 for PB-excluded mRNAs, p<0.0001 for both), meaning that their different frequency of low usage codons could be largely explained by their GC bias alone.

We previously reported that protein yield, defined as the ratio between protein and mRNA abundance in HEK293 cells, was 20-times lower for PB-enriched than PB-excluded mRNAs. This was not due to translational repression within PBs, as the proportion of a given mRNA in PBs hardly exceeded 15%, but rather to some intrinsic mRNA property (Hubstenberger et al., 2017). In this respect, the frequency of low usage codons correlated more with PB localization (rs = 0.59, p<0.0001) than with protein yield (rs = −0.21, p<0.0001, significantly different from 0.59, p<0.0001) (Figure 2—figure supplement 1E). Conversely, the CDS length correlated more with protein yield (rs = −0.43, p<0.0001) than with PB localization (rs = 0.26, p<0.0001, significantly different from −0.43, p<0.0001). Nevertheless, the length of the CDS and its GC content contributed independently to PB localization (Figure 2E). Finally, the absolute number of low usage codons per CDS, which combines the frequency of low usage codons with the CDS length, was a shared parameter of both protein yield (rs = −0.46, p<0.0001, Figure 2F) and PB localization (rs = 0.49, p<0.0001). Strikingly, CDS with more than 100 low usage codons were particularly enriched in PBs, while those under 100 were mostly excluded (Figure 2F). One of the mechanisms linking codon usage to translation yield could be the abundance of cognate tRNAs (Novoa and Ribas de Pouplana, 2012). However, codon usage in PB-excluded mRNAs was not more adapted to the abundance of amino-acylated tRNAs (Evans et al., 2017) than codon usage in PB-enriched mRNAs (Figure 2—figure supplement 2). In conclusion, the strong GC bias in PB mRNAs results in both a biased amino acid usage in encoded proteins and a biased codon usage. Furthermore, the high number of low usage codons in PB mRNAs is a likely determinant of their low protein yield.

[Figure 2, panels A–F. A: amino acid frequency in PB-in and PB-out mRNAs. B: codon usage, PB-in/PB-out usage (log2). C: relative usage of selected Leu and Gly codons. D: low usage codon frequency by GC3 (%), rs = −0.88 (PB-in) and −0.80 (PB-out). E: CDS length (kb) and GC content for PB-in and PB-out mRNAs. F: protein yield (log10) by number of low usage codons, rs = −0.46.]

Figure 2. Codon usage is strongly biased in PBs. (A) PB mRNAs and PB-excluded mRNAs encode proteins with different amino acid usage. The graph represents the frequency of each amino acid in the proteins encoded by mRNAs enriched or excluded from PBs, using the indicated PB enrichment thresholds. (B) Codon usage bias in and out of PBs follows their GC content. The relative codon usage for each amino acid was calculated in PB-enriched (PB-in) and PB-excluded (PB-out) mRNAs, using a PB enrichment threshold of +/- 1 (in log2). The graph represents the log2 of their ratio (PB-in/PB-out) and was ranked by decreasing values for each amino acid. The GC content of each codon is gray-coded below, using the scale indicated on the right. (C) The usage of some codons is biased independently of their GC content. Two examples are shown encoding Leucine (L) and Glycine (G). (D) The frequency of low usage codons strongly correlates with the GC content of the CDS, independently of their PB localization. The frequency of low usage codons was calculated for mRNAs excluded (PB-out) and enriched (PB-in) in PBs using a PB enrichment threshold of +/- 1 (in log2). It was expressed as a function of the CDS GC content at position 3 (GC3). Note that the slopes of the tendency curves are similar for PB-enriched and PB-excluded transcripts. The difference between the Spearman correlation coefficients (rs) is nevertheless statistically significant (p<0.0001). (E) PB mRNAs have longer CDS than PB-excluded mRNAs. The analysis was performed as in Figure 1E. (F) The number of low usage codons per CDS is a good determinant of both protein yield and PB localization. The protein yield was expressed as a function of the number of low usage codons for PB-enriched (PB-in) and PB-excluded (PB-out) mRNAs. rs, Spearman correlation coefficient.

The online version of this article includes the following figure supplement(s) for figure 2:
Figure supplement 1. Amino acid usage and codon usage biases in PBs.

The PB assembly factor DDX6 has opposite effects on mRNA stability and translation rate depending on their GC content

In human, the DDX6 RNA helicase is key for PB assembly (Minshall et al., 2009). It associates with a variety of proteins involved in mRNA translation repression and decapping (Ayache et al., 2015; Bish et al., 2015), suggesting that it plays a role in both processes. To investigate how DDX6 activity is affected by mRNA GC content, we conducted a polysome profiling experiment in HEK293 cells transfected with DDX6 or control β-globin siRNAs for 48 hr. In these conditions, DDX6 expression decreased by 90% compared to control cells (Figure 3—figure supplement 1A). The polysome profile was largely unaffected by DDX6 silencing, implying that DDX6 depletion did not grossly disturb global translation (Figure 3—figure supplement 1B). Polysomal RNA isolated from the sucrose gradient fractions (Figure 3—figure supplement 1B) and total RNA were used to generate libraries using random hexamers to allow for poly(A) tail-independent amplification. As expected, both total and polysomal DDX6 mRNA was markedly decreased (by 72%) following DDX6 silencing (Figure 3—figure supplement 1C–E; Supplementary file 1, sheet1). Since DDX6 is cytoplasmic (Ernoult-Lange et al., 2009) and has a role in mRNA decay, we assumed that changes in total mRNA accumulation generally reflected an increased stability of the transcripts, though we cannot exclude altered transcription levels for some of them. As polysomal accumulation can result from both regulated translation and a change in total RNA without altered translation, we then used the polysomal to total mRNA ratio as a proxy measurement of translation rate. Nevertheless, for a few transcripts, polysomal enrichment may reflect an elongation block rather than an increased rate of initiation. Analysis of the whole transcriptome showed a link between mRNA fate following DDX6 depletion and their GC content, but, intriguingly, the correlation was positive for changes in total RNA (rs = 0.45, p<0.0001; Figure 3—figure supplement 1F) and negative for changes in polysomal RNA (rs = −0.32, p<0.0001; Figure 3—figure supplement 1G). Therefore, DDX6 depletion affected different mRNA subsets in total and polysomal RNA.
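The polysomal-to-total ratio used as a proxy for translation rate can be written as a difference of log2 fold-changes. The sketch below assumes that each input is a normalized expression value for one transcript in one condition; the pseudocount and the function names are illustrative choices and do not come from the study.

```python
from math import log2


def log2_fc(treated: float, control: float, pseudocount: float = 1.0) -> float:
    """log2 fold-change of treated over control, stabilized by a pseudocount."""
    return log2((treated + pseudocount) / (control + pseudocount))


def translation_rate_change(poly_kd, poly_ctrl, total_kd, total_ctrl,
                            pseudocount: float = 1.0) -> float:
    """Change in the polysomal/total ratio after knockdown, in log2.

    Equivalent to log2FC(polysomal) - log2FC(total): a transcript whose
    polysomal signal rises only because its total level rises scores ~0.
    """
    return (log2_fc(poly_kd, poly_ctrl, pseudocount)
            - log2_fc(total_kd, total_ctrl, pseudocount))
```

For instance, a transcript that doubles in polysomes while its total level is unchanged scores about 1, whereas one that doubles in both scores about 0. This is why the ratio separates translational derepression from stabilization.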

The extent of mRNA stabilization steadily increased with the GC content and became predominant for transcripts with >50% GC (Figure 3A, left panel, Figure 3—figure supplement 2A). This analysis was repeated on an independent dataset available from the ENCODE project (ENCODE Project Consortium, 2012), obtained in a human erythroid cell line, K562, following induction of a stably transfected DDX6 shRNA, and using an oligo(dT)-primed library. Despite the differences in cell type, depletion procedure and sequencing methods, mRNA stabilization again preferentially concerned transcripts with high GC content (rs = 0.59, p<0.0001; Figure 3A, right panel, Figure 3—figure supplement 2A; Supplementary file 1, sheet2). In contrast, following DDX6 silencing in HEK293 cells, the translation rate predominantly increased for transcripts with less than 45% GC (rs = −0.53, p<0.0001; Figure 3B, Figure 3—figure supplement 2A). As a result, mRNAs with the most upregulated translation rate were the least stabilized, and conversely (Figure 3—figure supplement 2B).

[Figure 3, panels A–C. A: total mRNA fold-change (log2) by GC content (%) after siDDX6 in HEK293 cells (rs = 0.45) and shDDX6 in K562 cells (rs = 0.59). B: polysome/total fold-change (log2) by GC content (%) after siDDX6 in HEK293 cells (rs = −0.53). C: DDX6 CLIP enrichment (log2) by GC content (%) in K562 cells (rs = 0.41).]

Figure 3. DDX6 silencing has opposite effects on mRNA fate depending on their GC content. (A) mRNA stabilization after DDX6 silencing in HEK293 and K562 cells applies to GC-rich mRNAs. The fold-changes (FC) in mRNA accumulation were analyzed as in Figure 1B. (B) mRNA translation derepression after DDX6 silencing in HEK293 cells applies to AU-rich mRNAs. The fold-changes in translation rate (polysomal/total mRNA ratio) were analyzed as in (A). (C) GC-rich mRNAs are particularly enriched in the DDX6 CLIP experiment.

The online version of this article includes the following figure supplement(s) for figure 3:
Figure supplement 1. Polysome profiling following DDX6 silencing.
Figure supplement 2. Impact of DDX6 binding and mRNA length on DDX6 dependency.
Figure supplement 3. Impact of the GC content on DDX6-dependency.

To investigate how DDX6 activity was related to its binding to RNA, we used the CLIP dataset of K562 cells, also available from the ENCODE project. In both HEK293 and K562 cells, the mRNAs clipped to DDX6 were particularly stabilized after DDX6 knockdown, as compared to all mRNAs (Figure 3—figure supplement 2C; Supplementary file 1, sheet3), while they were not translationally derepressed in HEK293 cells (Figure 3—figure supplement 2D). In agreement, mRNAs with a high GC content were preferentially enriched in the DDX6 CLIP experiment (rs = 0.41, p<0.0001; Figure 3C, Figure 3—figure supplement 2A). Then, as we previously showed that DDX6 can oligomerize along repressed transcripts (Ernoult-Lange et al., 2012), we also considered mRNA length. While DDX6-dependent decay had a marginal preference for short transcripts (rs = −0.09, p<0.0001), as a combined effect of CDS and 3’UTR length (Figure 3—figure supplement 2E,F), DDX6-dependent translation repression was independent of the CDS length but higher on mRNAs with long 3’UTRs (rs = 0.16, p<0.0001; Figure 3—figure supplement 2E,G). Interestingly, the GC content of the CDS and the 3’UTR were similarly predictive of DDX6 sensitivity, whether for mRNA stability (rs = 0.42 and 0.40 for CDS and 3’UTR, respectively, p<0.0001 for both) or for translation repression (rs = −0.53 and −0.52, respectively, p<0.0001 for both), while the 5’UTR was less significant (rs = 0.18 and −0.15 for stability and translation repression, respectively, p<0.0001 for both; Figure 3—figure supplement 3A–C).

Altogether, we showed that DDX6 knockdown differentially affected mRNAs depending on the GC content of both their CDS and 3’UTR, with the most GC-rich mRNAs being preferentially regulated at the level of stability and the most AU-rich mRNAs at the level of translation.

DDX6/XRN1 and PAT1B decrease the stability of separate sub-classes of mRNAs with distinct GC content

DDX6 acts as an enhancer of decapping to stimulate mRNA decay, upstream of RNA degradation by the XRN1 5’ to 3’ exonuclease. To investigate whether XRN1 targets are similarly GC-rich, we performed XRN1 silencing experiments in two cell lines. HeLa cells were transfected with XRN1 siRNA (Figure 4—figure supplement 1A; Supplementary file 1, sheet4), while HCT116 cells stably transfected with an inducible XRN1 shRNA were induced with doxycycline (Figure 4—figure supplement 1B; Supplementary file 1, sheet5), both for 48 hr. In both cell lines, XRN1-dependent decay preferentially acted on mRNAs which were GC-rich (rs = 0.41 for HeLa and 0.49 for HCT116, p<0.0001 for both; Figure 4A, Figure 4—figure supplement 1A) and localized out of PBs (rs = −0.35, p<0.0001; Figure 4—figure supplement 1C), as observed for DDX6.

PAT1B is a well-characterized direct DDX6 partner known for its involvement in mRNA decay (Vindry et al., 2017; Braun et al., 2010; Ozgur et al., 2010; Vindry et al., 2019). As for DDX6, we assume that changes in steady-state mRNAs following PAT1B silencing generally reflect their increased stability (though, again, we cannot exclude some changes at the transcription level). However, using our previous PAT1B silencing experiment in HEK293 cells (Vindry et al., 2017), we surprisingly found a negative correlation between mRNA stabilization after PAT1B and after DDX6 silencing (rs = −0.31, p<0.0001; Figure 4—figure supplement 1D; Supplementary file 1, sheet6), suggesting that they largely target separate sets of mRNAs. Unexpectedly, the correlation was however positive with translational derepression after DDX6 silencing (rs = 0.45, p<0.0001; Figure 4—figure supplement 1E), indicating that PAT1B preferentially targets mRNAs that are translationally repressed by DDX6. Accordingly, these transcripts are prone to PB storage (rs = 0.49, p<0.0001; Figure 4—figure supplement 1F), as reported previously (Vindry et al., 2017). Indeed, in contrast to DDX6 and XRN1 decay targets, PAT1B targets tended to be AU-rich (rs = −0.50, p<0.0001; Figure 4B, Figure 4—figure supplement 1A). To gain insight into the mechanism of regulation by PAT1B, we analyzed the read coverage in the PAT1B silencing experiment (Figure 4C) and found it to be unchanged over the whole transcriptome. In contrast, following XRN1 silencing, the 5’ coverage was higher, confirming that such an analysis can reveal 5’ decay (Figure 4D). Of note, in control cells PAT1B target mRNAs had a higher 5’ coverage than average (Figure 4C), while XRN1 targets had a lower 5’ coverage than average (Figure 4D). These results suggest that mRNA accumulation in the absence of PAT1B does not result from their 5’ end protection.
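A read coverage profile of the kind discussed above can be approximated by binning read positions along the relative length of each transcript. The actual normalization is described in the Materials and methods and is not reproduced here; the bin count, the use of a single 0-based position per read, and the per-transcript normalization to a sum of 1 are all assumptions of this sketch.

```python
def coverage_profile(read_positions, transcript_length, n_bins=20):
    """Fraction of reads falling in each relative-position bin (5' to 3').

    read_positions: 0-based read coordinates measured from the 5' end.
    Returns a list of n_bins fractions that sum to 1 (or all zeros if no
    read falls inside the transcript).
    """
    if transcript_length <= 0 or n_bins <= 0:
        raise ValueError("transcript_length and n_bins must be positive")
    bins = [0] * n_bins
    for pos in read_positions:
        if 0 <= pos < transcript_length:
            bins[pos * n_bins // transcript_length] += 1
    total = sum(bins)
    return [b / total for b in bins] if total else [0.0] * n_bins


def mean_profile(profiles):
    """Average several per-transcript profiles bin by bin."""
    n = len(profiles)
    return [sum(values) / n for values in zip(*profiles)]
```

Comparing the mean profile of a target set between control and silenced cells would then show whether the 5’ bins gain coverage, as reported after XRN1 silencing but not after PAT1B silencing.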

In conclusion, DDX6 and PAT1B decrease the stability of distinct mRNA subsets, which strongly differ in their GC content. The results suggest that DDX6 is a cofactor of the XRN1 5’ to 3’ exonuclease, whereas PAT1B affects 3’ to 5’ degradation.

To obtain a global visualization of the results, we conducted a clustering analysis of the various datasets (Figure 4E). Note that to avoid clustering interdependent datasets, we included the changes in polysomal RNA after DDX6 silencing rather than in the polysomal/total RNA ratio. Altogether, the heatmap shows that GC-rich mRNAs are excluded from PBs and tend to be decayed by

[Figure 4, panels A–E. (A) Total FC (log2) by GC content (%) class for siXRN1 (HeLa, rs 0.41) and shXRN1 (HCT116, rs 0.49). (B) Same representation for siPAT1B (HEK293, rs -0.50). (C, D) Normalized reads versus position (%) for PAT1B targets or XRN1 targets and all mRNAs, in siCtrl and siPAT1B or siXRN1 cells. (E) Heatmap (raw Z-score) of GC-rich and AU-rich mRNAs, with columns GC (%), siDDX6 (total), siXRN1 (total), siPAT1B (total), P-bodies and siDDX6 (polys.), and the corresponding correlation matrix.]

Figure 4. XRN1 and PAT1B targets have distinct GC content. (A) mRNA stabilization after XRN1 silencing in HeLa and HCT116 cells applies to GC-rich mRNAs. The analysis was performed as in Figure 1B. The GC content distribution for all mRNAs is presented for comparison (in gray). (B) mRNA stabilization after PAT1B silencing in HEK293 cells applies to AU-rich mRNAs. The analysis was performed as in (A). (C) Read coverage of PAT1B targets (FC >0.7, n = 330, solid lines) and all mRNAs (n = 16000, dashed lines) in the siPAT1B dataset. The read coverage was analyzed in each duplicate experiment and normalized as described in Materials and methods. The average value in control cells (gray lines) and after PAT1B silencing (peach


a mechanism involving DDX6 and XRN1, while AU-rich mRNAs are recruited in PBs, undergo DDX6-dependent translation repression, and depend on PAT1B for their stability.

Specific mRNA decay factors and translation regulators target mRNAs with distinct GC content

Having shown that GC content is a distinctive feature of DDX6 and XRN1 versus PAT1B targets, we investigated the link between this global sequence determinant and a variety of sequence-specific post-transcriptional regulators for which relevant genome-wide datasets are available (Figure 5—figure supplement 1A).

On the mRNA decay side (group I lists), we considered the Nonsense Mediated Decay (NMD) pathway, taking as targets the mRNAs cleaved by SMG6 (Schmidt et al., 2015), and the m6A-associated decay pathways, using the targets of the YTHDF2 reader defined by CLIP (Wang et al., 2014; Yang et al., 2015). We also analyzed mRNAs with a 5’UTR-located G4 motif, which have been shown to be preferential substrates of murine XRN1 in vitro (Bashkirov et al., 1997). On the translation regulation side (group II lists), we analyzed the TOP mRNAs, whose translation is controlled by a TOP motif at the 5’ extremity (Thoreen et al., 2012), and targets of various PB proteins and/or DDX6 partners (Hubstenberger et al., 2017; Ayache et al., 2015): FXR1-2, FMR1, PUM1-2, IGF2BP1-3, the helicase MOV10, ATXN2, 4E-T, ARE-containing mRNAs and the targets of the two ARE-binding proteins HuR and TTP. We also included mRNAs with a CPE motif, since DDX6 is a component of the CPEB complex that binds CPEs (Minshall et al., 2007). Of note, among the group II factors, some are known to also affect mRNA half-life, as exemplified by the ARE-binding proteins (Wells et al., 2017). G4, ARE and CPE motifs have been defined in silico, while the targets of the various factors originate from RIP and CLIP approaches in human cells or mouse studies in the case of TOP mRNAs (see Materials and methods).

Intriguingly, compared to all mRNAs, group I list mRNAs were GC-rich, as well as TOP mRNAs and ATXN2 targets, whereas all other group II lists were AU-rich (Figure 5A). Furthermore, they shared common behavior in the various experiments. This is summarized in Figure 5B in a heatmap representing their median value in each dataset, while Figure 5—figure supplements 1 and 2 provide detailed analysis, as described below.

Group I list mRNAs tended to be dependent on DDX6 and XRN1 but not on PAT1B for stability (Figure 5—figure supplement 1B–D), with nevertheless some variation between cell lines, as only SMG6 targets were sensitive to XRN1 depletion in HeLa cells (Figure 5—figure supplement 1C, upper panel). They did not accumulate in PBs and their translation rate was independent of DDX6 (Figure 5—figure supplement 2A,B). These results were consistent with their high GC content and our global analysis above. However, surprisingly, within PB-excluded mRNAs, there was little or no additional effect of being a SMG6 target, a YTHDF2 target or containing a G4 motif, either for DDX6- or for XRN1-dependent decay (Figure 5—figure supplement 2C,D).

Group II list mRNAs, except TOP mRNAs, ATXN2 and 4E-T targets, had the exact mirror fate compared to group I lists: they were stabilized following PAT1B silencing (Figure 5—figure supplement 1D), as previously reported for ARE-containing mRNAs and the targets of the ARE-BPs HuR and TTP (Vindry et al., 2017), but not following DDX6 or XRN1 silencing (Figure 5—figure supplement 1B,C); they were enriched in PBs and translationally more active after DDX6 silencing (Figure 5—figure supplement 2A,B), which is consistent with the reported presence of most of these regulatory proteins in PBs (Hubstenberger et al., 2017; Franks and Lykke-Andersen, 2007).


lines) was plotted, with the bars representing the duplicate values. An expanded view of the dashed box is presented on the right panel. (D) Read coverage of XRN1 targets (FC >0.8, n = 199, solid lines) and all mRNAs (n = 13760, dashed lines) in the siXRN1 dataset. The data were analyzed as in (C). (E) Clustering analysis of mRNAs depending on their GC content, their differential expression after silencing DDX6, XRN1 or PAT1B, and their enrichment in PBs. Raw GC content and log2 transformed ratio of the other datasets were used for the clustering of both transcripts (lines) and datasets (columns). The values were color-coded as indicated on the right scale, and the Spearman correlation matrix is presented below (all p<10⁻⁴⁸). The heatmap highlights the distinct fate of GC-rich and AU-rich mRNAs.

The online version of this article includes the following figure supplement(s) for figure 4: Figure supplement 1. Transcriptome analysis following XRN1 and PAT1B silencing.

[Figure 5, panels A–D. (A) Boxplots of GC content (%) for the targets of group I (SMG6, YTHDF2, G4) and group II (TOP, ARE, HuR, TTP, FXR1, FXR2, FMR1, CPE, PUM1, PUM2, IGF2BP1, IGF2BP2, IGF2BP3, MOV10, 4E-T, ATXN2) factors, compared with all mRNAs. (B) Heatmap (min to max) of the group I and group II lists, with columns GC (%), siDDX6 (total), siXRN1 (total), siPAT1B (total), P-bodies and siDDX6 (poly/total). (C) Boxplots of the GC content of miRNA targets (%) for AGO1-4 and the 22 miRNAs, compared with all mRNAs. (D) Heatmap of the AGO and miRNA target lists, with the same columns as in (B).]

Figure 5. GC biases in the targets of various RNA decay factors, translation regulators and miRNAs. (A) GC content biases in the targets of various RBPs. The targets of the indicated factors were defined using CLIP experiments or motif analysis (see Materials and methods). The boxplots represent the distribution of the GC content of their gene. The distribution for all mRNAs is presented for comparison (in gray) and the red dashed line indicates its median value. (B) Heatmap representation of the different factors depending on the behavior of their mRNA targets in the different datasets. The


Among the three group II outsiders, ATXN2 targets and TOP mRNAs behaved like group I lists, except that they were not dependent on DDX6 for stability. ATXN2 is a major DDX6 partner, but one which is excluded from PBs (Ayache et al., 2015; Nonhoff et al., 2007), consistent with its targets also being excluded from PBs (Figure 5—figure supplement 2A). However, these mRNAs were only weakly or not stabilized following DDX6 or XRN1 silencing (Figure 5—figure supplement 1B,C), leaving unresolved the function of the ATXN2/DDX6 interaction in the cytosol. TOP mRNAs are special within group II lists, in that their translational control relies on a cap-adjacent motif rather than on 3’UTR binding. 4E-T is a major DDX6 partner required for PB assembly (Ayache et al., 2015; Kamenska et al., 2016). Its targets were markedly enriched in PBs (Figure 5—figure supplement 2A), though, intriguingly, poorly affected by PAT1B silencing in terms of stability (Figure 5—figure supplement 1D), or by DDX6 silencing in terms of translation (Figure 5—figure supplement 2B). This dissociation between PB localization and mRNA fate indicated that PB recruitment is not sufficient for an mRNA to have a PAT1B-dependent stability and DDX6-dependent translation repression. It also pointed to a particular role of 4E-T in PB targeting or scaffolding.

Our analysis is also informative on the link between DDX6-dependent decay and codon usage. Previous yeast studies have debated whether suboptimal codons could enhance DDX6 recruitment to trigger mRNA decay (Radhakrishnan et al., 2016) or not (Chan et al., 2018; He et al., 2018). We showed above that GC-rich mRNAs, which tend to be decayed by DDX6 and to bind DDX6 (Figure 3A,C), are enriched for optimal rather than suboptimal codons (Figure 2D). Furthermore, the correlation between polysomal retention (defined as the fraction of total mRNA present in polysomes) and DDX6-dependent decay was weak (rs = 0.10, p<0.0001; Figure 5—figure supplement 2E). Thus, in HEK293 cells, this mechanism seemed to account for a minor part of DDX6-dependent decay, if any, as also found in mouse stem cells (Freimer et al., 2018).

In conclusion, we observed that mRNA decay regulators preferentially target GC-rich mRNAs, which undergo DDX6- and XRN1-dependent decay, whereas most translation regulators preferentially target AU-rich mRNAs, which are subjected to storage in PBs and have a PAT1B-dependent stability.

Targets of the miRNA pathway have a biased GC content

The miRNA pathway, which leads to translation repression and mRNA decay, has been previously associated with DDX6 activity and PB localization (Bhattacharyya et al., 2006; Chu and Rana, 2006). To study this pathway, we used the list of AGO1-4 targets, as identified in CLIP experiments (Yang et al., 2015) (Figure 5—figure supplement 3A). In addition, we analyzed the experimentally documented targets of the 22 most abundant miRNAs in HEK293 cells (19 from Hafner et al., 2010, and three additional ones from our own quantitation, Figure 5—figure supplement 3B), as described in miRTarBase (Hsu et al., 2014). The mRNA targets of AGO proteins were AU-rich, as observed for most group II RBPs, and this was also true for the targets of most miRNAs when analyzed separately (Figure 5C). Overall, they also shared common behavior in the various silencing experiments and PB dataset, with nevertheless some differences. This is summarized in Figure 5D in a heatmap representing their median value in each dataset, while Figure 5—figure supplement 3 provides detailed analysis, as described below.

The AGO targets tended to accumulate in PBs (Figure 5—figure supplement 3C; note that the number of AGO4 targets was too small to reach statistical significance) and their translation rate was


lines were ordered by increasing GC content, and the columns as in Figure 4E. (C) GC content biases in the targets of various AGO proteins and miRNAs. The AGO targets (in yellow) were defined using CLIP experiments and the miRNA targets (in violet) using miRTarBase. The data are represented as in (A). (D) Heatmap representation of AGO and miRNAs depending on the behavior of their mRNA targets in the different datasets. The data were represented as in (B), using the same color code.

The online version of this article includes the following figure supplement(s) for figure 5: Figure supplement 1. Targets of group I and II regulators (part I).

Figure supplement 2. Targets of group I and II regulators (part II).

Figure supplement 3. Targets of the miRNA pathway.

Figure supplement 4. Targets of the group I and II regulators behave like mRNAs of similar GC content.

Figure supplement 5. miRNA targets behave like mRNAs of similar GC content.


DDX6-dependent (Figure 5—figure supplement 3D). In terms of stability, only AGO2 targets were marginally DDX6-dependent (Figure 5—figure supplement 3E), and the effects were not stronger when analyzing separately the mRNAs enriched or excluded from PBs (Figure 5—figure supplement 3F). In contrast, their stability was PAT1B-dependent (Figure 5—figure supplement 3G).

The targets of the 22 miRNAs had a behavior overall similar to the targets of AGO proteins, with accumulation in PBs, DDX6-dependent translation, and PAT1B rather than DDX6- or XRN1-dependent stability (Figure 5D). However, our analysis revealed some differences between miRNAs, particularly in terms of extent of PB storage (Figure 5—figure supplement 3H), which appeared associated with distinct GC content: at the two extremes, miR-21-5p targets were particularly AU-rich and strongly enriched in PBs, while the targets of miR-99b-5p, the most GC-rich in these sets, were not. In terms of translation, miR-18a-5p targets were not sensitive to DDX6 silencing, despite clear enrichment in PBs. In terms of stability, the targets of the less GC-rich miR-99b-5p were sensitive to DDX6 but not PAT1B silencing.

In conclusion, miRNA targets generally tend to be AU-rich, like the targets of most translation regulators, and accumulate in PBs. While their translation depends on DDX6, their stability is not markedly affected following DDX6 or XRN1 silencing, but is dependent on PAT1B.

The GC content of mRNAs shapes post-transcriptional regulation

As the global GC content appeared closely linked to mRNA fate, but also to RBP and miRNA binding, as well as to translation activity, our analyses then aimed at ranking the importance of these various features.

We first assessed the respective weight of the GC content and the binding capacity of particular RBPs. To this aim, we binned the whole transcriptome depending on its GC content (bin size of 500 transcripts). The median fold-changes of the bins in each RNAseq dataset were calculated and plotted as a function of their median GC content. Median values were similarly calculated for the various group I and II target lists and overlaid for comparison (Figure 5—figure supplement 4). Surprisingly, the fold changes of the targets of particular RBPs generally fell very close to the tendency plot based on GC content only. This was particularly true for DDX6- and XRN1-dependent decay, with only 4E-T targets being more stabilized after DDX6 silencing than expected from their GC content (Figure 5—figure supplement 4A,B). In terms of translation rate, only FXR1 targets were slightly more translated after DDX6 depletion than expected from their GC content (Figure 5—figure supplement 4C). FXR1 targets were also more dependent on PAT1B for stability (Figure 5—figure supplement 4D) and more enriched in PBs (Figure 5—figure supplement 4E). In the case of HuR, TTP, FXR1-2, FMR1, PUM2, IGF2BP1-3 and MOV10, there was some, but minimal, additional PAT1B-sensitivity and PB enrichment.

Similarly, the fate of the miRNA targets was mostly in the range expected from their GC content (Figure 5—figure supplement 5). Nevertheless, some miRNA-specific effects were observed. For instance, the targets of several miRNAs were more stabilized than expected after DDX6 depletion, including miR-99b-5p, 92a-3p, 16-5p, 18a-5p, 19a-3p, 19b-3p (Figure 5—figure supplement 5A), though this was not observed following XRN1 depletion (Figure 5—figure supplement 5B). Similarly, while the targets of miR-101-3p and miR-21-5p were both particularly enriched in PBs (Figure 5—figure supplement 5E), only miR-101-3p targets were particularly dependent on PAT1B for stability (Figure 5—figure supplement 5D). Interestingly, we noted that the median GC content of the miRNA targets correlated with the GC content of the miRNA itself (Figure 5—figure supplement 5F). Thus, despite their small size, the miRNA binding sites tend to have a GC content similar to that of their full-length host mRNA, which affects their fate in terms of PB localization and post-transcriptional control. Altogether, our analysis showed that, in steady-state conditions, mRNA GC content is a major parameter in terms of PB localization and regulation by DDX6, XRN1 and PAT1B, while the presence of binding sites for regulatory proteins makes subsidiary contributions.
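The binning procedure used for this comparison (transcriptome sorted by GC content, bins of 500 transcripts, median fold change per bin, target-list medians overlaid) can be sketched as follows. This is a minimal illustration under stated assumptions: the input is assumed to be a list of (GC %, log2 fold change) pairs, the last bin is simply allowed to be smaller, and `expected_fc` is a hypothetical helper that reads the tendency at the nearest bin. None of these details is specified in the text.

```python
from statistics import median


def median_by_gc_bin(records, bin_size=500):
    """Sort (gc_percent, log2_fc) pairs by GC content, cut them into
    consecutive bins of `bin_size` transcripts, and return one
    (median GC, median fold change) pair per bin."""
    if bin_size < 1:
        raise ValueError("bin_size must be positive")
    ordered = sorted(records, key=lambda rec: rec[0])
    trend = []
    for start in range(0, len(ordered), bin_size):
        chunk = ordered[start:start + bin_size]  # the last bin may be smaller
        trend.append((median(gc for gc, _ in chunk),
                      median(fc for _, fc in chunk)))
    return trend


def expected_fc(trend, gc_percent):
    """Hypothetical helper: fold change of the bin whose median GC is closest."""
    return min(trend, key=lambda point: abs(point[0] - gc_percent))[1]


def list_offset(trend, target_records):
    """Median fold change of a target list minus the value expected from its GC."""
    target_gc = median(gc for gc, _ in target_records)
    target_fc = median(fc for _, fc in target_records)
    return target_fc - expected_fc(trend, target_gc)
```

An offset near zero for a target list would correspond to the situation described above, where the targets of a given factor fall close to the tendency based on GC content only.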

Next, the strong correlation observed between GC content and PB localization raised the possibility that localization out of PBs was sufficient to determine mRNA sensitivity to DDX6 and XRN1 decay. While a tempting hypothesis, it could not explain all DDX6- and XRN1-dependent decay, since TOP mRNAs were strongly excluded from PBs (Figure 5—figure supplement 2A), but unaffected by DDX6 or XRN1 silencing (Figure 5—figure supplement 1B,C). To address this issue more generally, we considered the fate of the minor subset of AU-rich mRNAs that were excluded from PBs. Compared to other similarly AU-rich transcripts, these mRNAs were indeed more sensitive to


XRN1-dependent decay (Figure 6—figure supplement 1A). However, they were not more sensitive to DDX6-dependent decay (Figure 6—figure supplement 1B). This suggested that XRN1 preference for GC-rich mRNAs is at least in part related to their exclusion from PBs, whereas DDX6 has a true preference for GC-rich mRNAs.

Interestingly, these PB-excluded AU-rich mRNAs were strongly enriched in mRNAs encoding secreted proteins and proteins associated with membranous organelles, with GO categories related to mitochondria, intracellular organelles and extracellular matrix proteins representing up to 36% of the transcripts (Figure 6—figure supplement 1C). Thus, while mRNA localization in PBs is highly influenced by their GC content, it may also be outcompeted by retention on membranous organelles and plasma membrane.

Contribution of both the CDS and 3’UTR GC content to PB localization

The next major issue was to distinguish which of the CDS or 3’UTR is more important for PB localization, since they have very similar GC contents (rs = 0.72, p<0.0001).

As a first approach, we analyzed PB localization of long non-coding RNAs (lncRNAs) (Hubstenberger et al., 2017). The correlation between their GC content and PB accumulation was significant (rs = -0.20, p<0.0001), but much weaker than that observed for mRNAs (-0.64, Figure 1B) or 3’UTRs (-0.55, Figure 1D) (-0.20 and -0.55 are significantly different, p<0.0001). In fact, AU-rich lncRNAs poorly accumulated in PBs, while GC-rich lncRNAs were excluded (Figure 6—figure supplement 1D,E). This suggested that the coding capacity of mRNAs was important for PB localization. As a second approach, we directly analyzed the respective contribution of the GC content of CDS and 3’UTR to PB localization. On one side, we analyzed transcripts by groups of similar 3’UTR GC content. Their GC3 was systematically much lower in PB mRNAs than in PB-excluded mRNAs, with differences ranging between 9% and 13% GC (Figure 6A, Figure 6—figure supplement 1E). In a mirror analysis, we analyzed groups of transcripts with similar GC3. The importance of the 3’UTR GC content became visible only for GC3 higher than 50% GC (note that GC3 median value is 59% GC), with AU-rich 3’UTR allowing for their accumulation in PBs despite a GC-rich CDS (Figure 6B, Figure 6—figure supplement 1E). We concluded that both the CDS and the 3’UTR GC content are important for PB localization, with the CDS being the primary feature.
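For readers less familiar with these quantities, the sketch below shows one simple way to compute the GC content of a sequence and the GC3 of a CDS (the GC content at the third position of codons). The function names are hypothetical, and the handling of edge cases (empty sequences, CDS length not a multiple of three) is an assumption made for this illustration.

```python
def gc_fraction(seq):
    """Fraction of G or C bases in a DNA or RNA sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def gc3_fraction(cds):
    """GC content restricted to the third position of each codon of a CDS."""
    if not cds or len(cds) % 3 != 0:
        raise ValueError("CDS length must be a non-zero multiple of 3")
    return gc_fraction(cds[2::3])  # bases at 0-based positions 2, 5, 8, ...


# Toy CDS (hypothetical): codons ATG, GCC, AAA; third bases G, C, A.
toy_cds = "ATGGCCAAA"
print(gc_fraction(toy_cds), gc3_fraction(toy_cds))
```

Because the third codon position is largely free to vary among synonymous codons, GC3 can differ substantially from the overall GC content of the same CDS.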

We speculate that suboptimal translation of AU-rich CDS makes mRNAs optimal targets for translation regulation, since any control mechanism has to rely on a limiting step. Conversely, optimally translated transcripts would be better controlled at the level of stability. One prediction is that proteins produced in limiting amounts, such as those encoded by haplo-insufficiency genes, are more likely to be encoded by PB mRNAs. Genome-wide haplo-insufficiency prediction scores have been defined for human genes, using diverse genomic, evolutionary, and functional properties trained on known haplo-insufficient and haplo-sufficient genes (Huang et al., 2010; Steinberg et al., 2015). Using these scores, we found that haplo-insufficient mRNAs were indeed significantly enriched in PBs (Figure 6C).
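The legend of Figure 6C reports a two-tailed Mann-Whitney test for the comparison of score distributions between PB-enriched and PB-excluded mRNAs. The sketch below illustrates the principle of such a test in plain Python, assuming a normal approximation with tie correction and no continuity correction. These choices, like the function name, are assumptions made for this illustration and may differ from the software actually used.

```python
from math import erfc, sqrt


def mann_whitney_u(sample_a, sample_b):
    """Return (U for sample_a, two-sided p-value from the normal approximation)."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a == 0 or n_b == 0:
        raise ValueError("both samples must be non-empty")
    pooled = sorted(list(sample_a) + list(sample_b))
    n = n_a + n_b
    # Average rank of every distinct value, plus the tie-correction term.
    rank_of, tie_term, start = {}, 0.0, 0
    while start < n:
        end = start
        while end + 1 < n and pooled[end + 1] == pooled[start]:
            end += 1
        size = end - start + 1
        rank_of[pooled[start]] = (start + end) / 2 + 1
        tie_term += size ** 3 - size
        start = end + 1
    rank_sum_a = sum(rank_of[value] for value in sample_a)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    var_u = n_a * n_b / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:  # all values identical: no evidence of a shift
        return u_a, 1.0
    z = (u_a - mean_u) / sqrt(var_u)
    return u_a, erfc(abs(z) / sqrt(2))  # two-sided tail of the standard normal
```

The normal approximation is only appropriate for reasonably large samples, which is the case for the several thousand transcripts compared in Figure 6C. For small samples an exact test would be preferable.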

To add experimental support to the importance of GC content for PB assembly, we conducted two assays. First, we analyzed the localization of reporter transcripts that differ only by the GC content of their CDS. HEK293 cells stably expressing the PB marker GFP-LSM14A (Hubstenberger et al., 2017) were transfected with plasmids containing an AU-rich (36% GC) or GC-rich (58% GC) CDS that encodes the same Renilla luciferase (Rluc) protein. After 24 hr, cells were analyzed for luciferase activity and transcript localization. In agreement with our previous analyses, Rluc protein yield was considerably reduced (4.5-fold) using the AU-rich rather than the GC-rich version of the CDS, despite similar mRNA levels (Figure 6D). The localization of the Rluc transcripts was then analyzed by smiFISH using AU-rich or GC-rich specific probes (Figure 6—figure supplement 2A, Supplementary file 2) (Tsanov et al., 2016). PBs containing clusters of Rluc mRNA molecules were five times more frequent using the AU-rich than the GC-rich version of the CDS (Figure 6E,F). A similar result was obtained in HEK293 cells after PB immunostaining with DDX6 antibodies (Figure 6F, Figure 6—figure supplement 2B). Therefore, simply changing the GC content of this medium-size CDS (564 codons) was sufficient to modify mRNA localization in PBs.
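To illustrate how two CDSs can encode the same protein while differing widely in GC content, the sketch below recodes a peptide with synonymous codons chosen to minimize or maximize their G/C count. This is a deliberately naive scheme written for this illustration. It is not the design procedure of the Rluc reporters, which is not described in this section, and the peptide used is hypothetical.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codons enumerated in TCAG order map onto AMINO_ACIDS.
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def gc_count(codon):
    """Number of G or C bases in a codon."""
    return sum(base in "GC" for base in codon)


def recode(protein, gc_rich):
    """For each residue, pick the synonymous codon with the most (gc_rich=True)
    or the fewest (gc_rich=False) G/C bases."""
    pick = max if gc_rich else min
    cds = []
    for aa in protein:
        synonyms = [c for c, residue in CODON_TABLE.items() if residue == aa]
        if not synonyms:
            raise ValueError(f"unknown residue: {aa!r}")
        cds.append(pick(synonyms, key=gc_count))
    return "".join(cds)


def translate(cds):
    """Translate a CDS whose length is a multiple of 3."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def gc_percent(seq):
    """GC content of a sequence, in percent."""
    return 100 * sum(base in "GC" for base in seq) / len(seq)


peptide = "MKLSAGD"  # hypothetical peptide, for illustration only
au_rich, gc_rich = recode(peptide, False), recode(peptide, True)
print(translate(au_rich) == translate(gc_rich) == peptide)
print(round(gc_percent(au_rich)), round(gc_percent(gc_rich)))
```

Residues with a single codon (Met, Trp) are unchanged by recoding, which is one reason why the achievable GC range of a CDS depends on its amino acid composition.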

Second, we tested the capacity of AU-rich and GC-rich RNA to form granules independently of translation. To this aim, we set up a cell-free assay using HEK293 cells expressing GFP-LSM14A to monitor the formation of fluorescent PB-like granules and count them by flow cytometry, as previously performed for PBs (Hubstenberger et al., 2017). After lysis and elimination of preexisting PBs

[Figure 6, panels A–H. (A) GC3 (%) by 3’UTR GC content (%) class (<35, 35-40, 40-45, 45-50, 50-55, >55) for PB-in and PB-out mRNAs. (B) 3’UTR GC content (%) by GC3 (%) class (<40, 40-45, 45-50, 50-55, 55-60, 60-65, 65-70, >70) for PB-in and PB-out mRNAs. (C) Fraction of mRNAs as a function of haplo-insufficiency score for PB-in and PB-out mRNAs. (D) Rluc/Fluc (%) RNA and protein levels for the GC-rich and AU-rich reporters. (E) Images of AU-rich and GC-rich mRNA with LSM14A. (F) PBs with mRNA cluster (%) for the GC-rich and AU-rich reporters, with GFP-LSM14A or anti-DDX6 labeling (exp. 1 to 4). (G) Scheme of the cell-free assay (cell lysate, 500 g nucleus, 11000 g PBs, cell-free extract with soluble LSM14A, addition of DDX6, PB-like granules) and images of cells, pelleted PBs and PB-like granules. (H) PB-like count (%) as a function of added GC-rich or AU-rich RNA (0, 0.5, 1 µg).]

Figure 6. The GC content of the CDS and the 3’UTR both contribute to PB localization. (A) General importance of the CDS. Transcripts were subdivided into six classes depending on the GC content of their 3’UTR (from <35 to >55%). The boxplots represent the distribution of their CDS GC content at position 3 (GC3) in PB-enriched (PB-in) and PB-excluded (PB-out) mRNAs. (B) Importance of the 3’UTR for GC-rich CDSs. Transcripts were subdivided into eight classes depending on their GC3 (from <40 to >70%). The boxplots represent the distribution of their 3’UTR GC content in PB-


by centrifugation, addition of recombinant DDX6 triggered the formation of new granules on ice, in a dose-dependent manner (Figure 6—figure supplement 2C–E). These granules had a similar size to endogenous PBs (Figure 6G, Figure 6—figure supplement 2D). This reconstitution assay was surprisingly efficient, as granule formation required rather low concentrations of both the lysate components (about 100-fold lower than in cells, see Materials and methods) and recombinant DDX6 (0.17 µM versus 3.3 µM in cells; Ernoult-Lange et al., 2012). Next, the cell-free extract was briefly treated with micrococcal nuclease to decrease the amount of cellular RNA, and the assay was repeated with or without addition of either an AU-rich or a GC-rich 1700 nt-long synthetic RNA (Figure 6H, Figure 6—figure supplement 2F). The AU-rich RNA increased the number of PB-like granules in a dose-dependent manner, while GC-rich RNA prevented their formation. Therefore, in the complex lysate environment and at 0°C, uncapped non-polyadenylated AU-rich RNA specifically favors the condensation of granules that are DDX6-dependent and contain LSM14A, two proteins that play a major role in the assembly of cellular PBs.

We conclude from these experimental data and our previous analyses that both the CDS and the 3’UTR contribute to PB localization. Low GC content in the CDS likely acts, at least in part, through codon usage and low translation efficiency. In the 3’UTR low GC content could allow for the binding of RBPs with affinity for AU-rich motifs and/or influence RNA secondary structure.

Discussion

An integrated model of post-transcriptional regulation

Our combined analysis of the transcriptome of purified PBs together with transcriptomes following the silencing of broadly-acting storage and decay factors, including DDX6, XRN1 and PAT1B, provided a general landscape of post-transcriptional regulation in human cells, where mRNA GC content plays a central role. As schematized in Figure 7, GC-rich mRNAs are excluded from PBs and mostly controlled at the mRNA level by a mechanism involving the helicase DDX6 and the 5’ to 3’ exonuclease XRN1. In contrast, AU-rich mRNAs are enriched in PBs and rather controlled at the level of translation by a mechanism also involving DDX6, while their accumulation tends to depend on a mechanism involving the DDX6 partner PAT1B and most likely 3’ decay. Accordingly, NMD and m6A-associated mRNA decay pathways tend to target GC-rich mRNAs, while most sequence-specific translation regulators and miRNAs tend to target AU-rich mRNAs. The distinct fate of GC-rich and AU-rich mRNAs correlates with a contrasting protein yield resulting from both different codon usage and CDS length. Thus, 5’ mRNA decay appears to control preferentially mRNAs with optimal


enriched (PB-in) and PB-excluded (PB-out) mRNAs. (C) The transcripts of haplo-insufficiency genes are enriched in PBs. The haplo-insufficiency score is the probability that a gene is haplo-insufficient, as taken from the Huang et al. (2010) study. The analysis was performed for PB-enriched (PB-in, n = 4646, median score 0.26) and PB-excluded (PB-out, n = 4205, median score 0.17) mRNAs. The difference of distribution of haplo-insufficiency scores was statistically significant using a two-tailed Mann-Whitney test: p<0.0001. The results were similar using Steinberg et al. (2015) scores. (D) Protein yield is higher from a GC-rich than an AU-rich CDS. HEK293 cells were transfected with Rluc reporters differing by the GC content of their CDS, along with a control Fluc plasmid. After 24 hr, mRNA levels were measured by qPCR and protein levels by luciferase activity. The Rluc to Fluc ratio for the GC-rich reporter was set to 100 (n = 3). Error bars, SD. (E, F) Preferential localization of AU-rich transcripts in PBs. HEK293 cells expressing GFP-LSM14A were transfected with the AU-rich and GC-rich Rluc reporters and the localization of the Rluc transcripts (in red) was analyzed by smiFISH. Representative cells are shown in (E). Bar, 5 µm. Arrows indicate the PBs enlarged above. The experiment was performed in duplicate (exp. 1 and 2) and repeated in HEK293 cells where PBs were immunostained using DDX6 antibodies (exp. 3 and 4). The percentage of PBs containing clusters of Rluc transcripts in the four experiments is represented in (F). Exp. 1: 56/75 PBs from 21/27 cells; exp. 2: 87/75 PBs from 38/35 cells; exp. 3: 31/32 PBs from 15/19 cells; exp. 4: 72/83 PBs from 34/41 cells. (G) Assembly of PB-like granules in cell-free extracts from HEK293 cells expressing GFP-LSM14A. The scheme recapitulates the main steps of the assay. Fluorescence microscopy images show that PBs in cells, PBs after cell lysis, and reconstituted PB-like granules have similar size. Bar, 10 µm. (H) AU-rich RNA favors the formation of PB-like granules. PB-like granules were assembled in cell-free extracts in the presence of AU-rich or GC-rich RNA, and counted by flow cytometry. Their number in the absence of added RNA was set to 100 (n = 3 experiments in duplicate, using two independent cell-free extracts and RNA preparations). Error bars, SD.

The online version of this article includes the following figure supplement(s) for figure 6:

Figure supplement 1. Role of PB localization in XRN1 and DDX6 sensitivity and importance of the coding property for PB localization.

Figure supplement 2. The GC content of reporter RNAs is key for PB localization.


translation, which are mostly GC-rich, whereas translation regulation is mostly used to control mRNAs with limiting translational efficiency, which are AU-rich.

It should be stressed that this model only applies to post-transcriptional regulation pathways that involve PBs, XRN1, DDX6 and PAT1B. Moreover, while the analysis was consistent in proliferating cells of various origins, giving rise to a general model, it is possible that changes in cell physiology, for instance at particular developmental stages or during differentiation, rely on a different mechanism. In addition, our analysis focused on trends common to most transcripts, which does not preclude that particular mRNAs could be exceptions to the general model, being GC-rich and translationally controlled, or AU-rich and regulated by 5’ decay. In terms of translation yield and PB localization, this model is strongly supported by our experiments using AU-rich and GC-rich RNAs: AU-rich reporter mRNAs have a low protein yield compared to GC-rich ones, they preferentially localize in PBs in cells, and they enhance the formation of PB-like granules in a cell-free extract.

GC content and codon usage

While the redundancy of the genetic code should enable amino acids to be encoded by synonymous codons of different base composition, the wide GC content variation between PB-enriched and PB-excluded mRNAs has consequences on the amino acid composition of encoded proteins. It also strongly impacts the identity of the wobble base: in PB mRNAs, the increased frequency of A/U at position 3 of the codon mechanically results in an increased use of low usage codons. As CDS are also longer in PB mRNAs, it further increases the number of low usage codons per CDS in these mRNAs. Interestingly, we showed that the absolute number of low usage codons per CDS best correlates with low protein yield. Thus, these results provide a molecular mechanism for a previously unexplained feature of PB mRNAs, that is, their particularly low protein yield, which we reported was

[Figure 7 schematic. Recoverable labels: PBs, DDX6, PAT1B, XRN1, Decay, Translation, AT, GC.]

Figure 7. Schematic representation recapitulating the features of mRNA post-transcriptional regulation depending on their GC content.

The online version of this article includes the following figure supplement(s) for figure 7: Figure supplement 1. Distribution of the gene GC content in various eukaryotic genomes.
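The argument made under "GC content and codon usage" (A/U at the wobble position raises the use of low usage codons, and longer CDS raise their absolute number) can be illustrated with a minimal sketch. The set `LOW_USAGE_CODONS` and the example sequences are hypothetical placeholders chosen for illustration only; they are not the codon list or the reporters used in this study, and the function names are our own.

```python
# Hypothetical low usage codon set (illustration only, not the list used in the paper).
LOW_USAGE_CODONS = {"TTA", "CTA", "ATA", "GTA"}


def codons(cds):
    """Split a coding sequence into complete codons (RNA input is converted to DNA letters)."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing incomplete codon
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def gc3(cds):
    """Fraction of codons with G or C at position 3 (the wobble base)."""
    cs = codons(cds)
    if not cs:
        return 0.0
    return sum(c[2] in "GC" for c in cs) / len(cs)


def low_usage_count(cds, low=LOW_USAGE_CODONS):
    """Absolute number of low usage codons in a CDS."""
    return sum(c in low for c in codons(cds))


# Toy sequences (hypothetical): the same kind of peptide written with different wobble bases.
au_rich = "ATGTTAATAGTACTATTAATA"
gc_rich = "ATGCTGATCGTGCTGCTGATC"

for name, seq in [("AU-rich", au_rich), ("GC-rich", gc_rich), ("AU-rich, 3x longer", au_rich * 3)]:
    print(name, round(gc3(seq), 2), low_usage_count(seq))
```

In this toy case the longer AU-rich sequence has the same wobble-position composition as the short one but three times as many low usage codons, which is the quantity reported above as correlating best with low protein yield.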

