Identifying epigenetic modifications

NGS techniques and RNA-seq increased our understanding of the genome and of gene expression, respectively, but precise knowledge of epigenetic regulation was still missing. Although RNA-seq gives insight into the expression and regulation of ncRNAs, it provides no information on DNA methylation or chromatin modifications.

Several NGS-based techniques were therefore developed for epigenetic analyses.

The main methods already described for the detection of DNA methylation are based on bisulphite conversion (Figure 8) and on precipitation of 5mC (Figure 9). DNA methylation can also be detected with methylation-specific restriction enzymes, but this approach will not be elaborated here.124 Detection of PTMs, on the other hand, is done by chromatin immunoprecipitation with antibodies targeting specific PTMs, followed by sequencing of the captured DNA.125

Figure 8 In bisulphite sequencing, DNA is treated with sodium bisulphite. Methylated cytosines are protected from bisulphite, but unmethylated ones are deaminated to uracil. When the DNA is subsequently sequenced, 5mC is read as a cytosine, whereas an unmethylated cytosine is read as a thymine.126

In bisulphite-based methods, such as whole genome bisulphite sequencing (WGBS), DNA is first treated with sodium bisulphite. Unmethylated cytosines are thereby converted into uracil by deamination, while methylated ones are protected from conversion and remain unaltered. Upon sequencing, a uracil, which corresponds to an unmethylated cytosine, is read as thymine, whereas a methylated cytosine is still read as cytosine (Figure 8).126 WGBS has single-base resolution and gives an accurate overview of genome-wide methylation.127 However, it is an expensive technique, and the bisulphite treatment itself is difficult to optimise and complicates the subsequent analysis of the sequencing data because of DNA fragmentation and reduced sequence complexity (as cytosines turn de facto into thymines).124

To decrease the cost of bisulphite sequencing, reduced representation bisulphite sequencing (RRBS) was developed. Here, the DNA to be sequenced is enriched for CpGs by digesting it with a methylation-independent restriction enzyme whose recognition site contains a CpG. The enzyme cuts at these sites, and fragments of a specific size range are selected before the DNA is treated with bisulphite and sequenced. The result is a reproducible, though less comprehensive, genome-scale overview at single-base resolution, obtained at a strongly reduced cost.128
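The read-out principle of bisulphite sequencing described above (an unmethylated cytosine is read as thymine, a methylated cytosine stays a cytosine) can be illustrated with a minimal Python sketch. It is a toy model under strong assumptions that are not part of the original text: only the forward strand is considered, the read is already aligned to the reference without gaps, and conversion is perfectly efficient. The function names and the example sequence are hypothetical.

```python
def bisulphite_convert(sequence, methylated_positions):
    """Return the sequence as it would be read after bisulphite treatment.

    Assumption: conversion is complete, so every unmethylated C is
    deaminated to U and subsequently read as T.
    """
    converted = []
    for index, base in enumerate(sequence):
        if base == "C" and index not in methylated_positions:
            converted.append("T")  # unmethylated C -> U -> read as T
        else:
            converted.append(base)  # methylated C and all other bases are unchanged
    return "".join(converted)


def call_methylation(reference, read):
    """Infer the methylation state of each reference cytosine from an aligned read."""
    if len(reference) != len(read):
        raise ValueError("reference and read must have the same length")
    calls = {}
    for index, (ref_base, read_base) in enumerate(zip(reference, read)):
        if ref_base != "C":
            continue  # only cytosines in the reference are informative
        if read_base == "C":
            calls[index] = "methylated"
        elif read_base == "T":
            calls[index] = "unmethylated"
        else:
            calls[index] = "ambiguous"  # sequencing error or genetic variant
    return calls


# Hypothetical example: the cytosine at position 1 (in a CpG) is methylated.
reference = "ACGTCCGA"
read = bisulphite_convert(reference, methylated_positions={1})
calls = call_methylation(reference, read)
print(read)
print(calls)
```

The converted read contains fewer cytosines than the reference, which also illustrates the reduced sequence complexity mentioned above. Real analyses rely on dedicated bisulphite-aware aligners rather than a position-by-position comparison like this one.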

Other techniques developed to reduce the cost and drawbacks of bisulphite sequencing are based on enrichment of methylated DNA. Either a protein containing a methyl-binding domain (MBD) or an antibody targeting 5mC is used for enrichment (Figure 9).129 In a first step, the DNA is fragmented and the methylated fragments are captured. The unbound fragments are subsequently washed away and the captured ones are sequenced. Enrichment with MBD proteins is called MBD-seq, while immunoprecipitation with antibodies is called methylated DNA immunoprecipitation sequencing (MeDIP-seq) or DIP-seq.130 The major advantage of these methods is that they are considerably cheaper than bisulphite sequencing while still being genome-wide.
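Because only the captured fragments are sequenced, the read-out of an enrichment experiment is a read density per genomic region rather than a per-cytosine measurement. The Python sketch below shows one simple way such data could be summarised: reads are counted in fixed-size windows and compared with a control library. The use of an unenriched input control, the window size, the pseudocount and all names are assumptions made for illustration only; they are not taken from the original text or from any specific tool.

```python
import math
from collections import Counter


def window_counts(read_starts, window_size):
    """Count reads per fixed-size window, keyed by window index."""
    return Counter(position // window_size for position in read_starts)


def log2_enrichment(ip_starts, input_starts, window_size=100, pseudocount=1.0):
    """Library-size-normalised log2 ratio of enriched (IP) over input reads per window.

    Assumption: a pseudocount is added so that windows without reads
    in one of the libraries do not cause a division by zero.
    """
    if not ip_starts or not input_starts:
        raise ValueError("both libraries must contain at least one read")
    ip = window_counts(ip_starts, window_size)
    control = window_counts(input_starts, window_size)
    scores = {}
    for window in sorted(set(ip) | set(control)):
        ip_rate = (ip[window] + pseudocount) / len(ip_starts)
        input_rate = (control[window] + pseudocount) / len(input_starts)
        scores[window] = math.log2(ip_rate / input_rate)
    return scores


# Hypothetical read start positions on a single chromosome.
ip_reads = [10, 20, 30, 40, 150, 250]
input_reads = [15, 120, 160, 220, 260, 280]
scores = log2_enrichment(ip_reads, input_reads)
for window, score in scores.items():
    print(f"window {window}: log2 enrichment = {score:.2f}")
```

A positive score marks a window that is over-represented among the captured fragments. The scores are relative measures: they indicate where methylated DNA is enriched, not which fraction of the cytosines in a window is methylated.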

Figure 9 Enrichment-based sequencing of epigenetic modifications. (a,b,c) DNA is fragmented and specific proteins or antibodies are added to enrich for specific modifications, such as DNA methylation. (d) Unbound DNA is washed away and the DNA bound by the antibody or protein is retained. (e,f) A next-generation sequencing method is used to sequence the DNA.129

Due to the enrichment step, however, these methods are less quantitative, as no absolute methylation levels are obtained, and the results no longer have single-nucleotide resolution. For (Me)DIP-seq, the affinity of the antibodies is a further major drawback.131

The Infinium HumanMethylation450k BeadChip, developed by Illumina, is an alternative method for the detection of DNA methylation that is based on bisulphite conversion but does not involve sequencing (Figure 10). The bisulphite-converted DNA is hybridised to beads on a microarray that interrogates over 450,000 CpG sites.132 Two different assays are used on the 450k BeadChip, namely Infinium I and Infinium II. In the former, hybridisation is done against two types of beads: one with a cytosine matching a methylated CpG and another with a thymine corresponding to an unmethylated one. Both can be measured in the same colour channel. Only if the analysed DNA perfectly matches the bead does DNA extension occur and is a fluorescent signal observed (Figure 10(a)). In the Infinium II assay, on the other hand, a single bead with degenerate probes is used, and the methylation status is determined from the specific fluorescent signal, which requires measurements in two colours (Figure 10(b)). The older 27k BeadChip uses only Infinium I assays, whereas both are combined on the 450k BeadChip.132,133 Both assays are also used on the newer HumanMethylationEPIC BeadChip, which contains over 850,000 CpGs. In addition to beads targeting the same sites as the 450k BeadChip (only low-quality probes were removed), it queries many CpG sites located in enhancer regions and CpGs outside CpG islands.134
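For both assay types, the measurement for a CpG site boils down to two fluorescence intensities, one reporting the methylated and one the unmethylated state. A common way to summarise them, which is not described in the text above and is shown here only as a hedged illustration, is the beta value: the methylated intensity divided by the total intensity plus a small stabilising offset. The offset of 100 is the conventionally used constant; the probe names and intensities below are hypothetical.

```python
def beta_value(methylated_intensity, unmethylated_intensity, offset=100.0):
    """Estimate the methylation level of one CpG site on a scale from 0 to 1.

    Assumption: the offset stabilises the ratio when both intensities
    are low; 100 is the conventional default, not a value from the text.
    """
    if methylated_intensity < 0 or unmethylated_intensity < 0:
        raise ValueError("intensities must be non-negative")
    total = methylated_intensity + unmethylated_intensity + offset
    return methylated_intensity / total


# Hypothetical probe intensities as (methylated, unmethylated) pairs.
probes = {
    "probe_A": (900.0, 0.0),    # mostly methylated
    "probe_B": (450.0, 450.0),  # intermediate
    "probe_C": (0.0, 900.0),    # unmethylated
}
betas = {name: beta_value(m, u) for name, (m, u) in probes.items()}
for name, beta in betas.items():
    print(f"{name}: beta = {beta:.2f}")
```

In this sketch the two intensities are simply passed in as numbers. On the array they would come from two bead types in one colour channel for Infinium I and from two colour channels of a single bead type for Infinium II.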

As with the detection of DNA methylation, PTMs can be identified with antibodies, in a technique called chromatin immunoprecipitation sequencing (ChIP-seq, Figure 9). Chromatin fragments carrying a specific modification are isolated with an antibody and subsequently sequenced using an NGS technique. Depending on the specific antibody used, individual PTMs can be detected. Again, the specificity of the antibody is a critical aspect of ChIP-seq, and the chromatin precipitation step is also challenging.124,135

3 Sequencing the (epi)genome 28

Figure 10 Two assays used on the Infinium HumanMethylation BeadChip, Infinium I and II. (a) In the Infinium I assay, bisulphite-converted DNA is hybridised against two types of probes, an unmethylated (U) and a methylated (M) one. Only if the DNA perfectly matches the probe do single-base extension and emission of a fluorescent signal occur. (b) The Infinium II assay uses only one type of probe for methylated as well as unmethylated DNA. The methylation status is derived from the specific fluorescent signal emitted.124


4 Analysis of sequencing data

The output of sequencing technologies depends on the technique used, and the subsequent analyses largely depend on the biological question at hand. Existing sequencing techniques are still being improved and new methods are under development. Furthermore, as the cost of sequencing has decreased drastically over recent years, the amount of data to be analysed has increased massively. The bottleneck therefore no longer lies in the sequencing itself, but rather in the analysis of the resulting sequencing data. Many computational methods for various analyses are being developed; only those of importance for the further research in this thesis are discussed here.