DNA Methylation (MeDIPS)

How Bisulfite Sequencing Works

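Bisulfite sequencing is a technique for determining DNA methylation patterns. The major difference from regular sequencing experiments is that the DNA is first treated with bisulfite, which converts unmethylated cytosine residues to uracil but leaves 5-methylcytosine residues unaffected. Uracil is read as thymine after amplification and sequencing, so a cytosine that still appears as C in a read was methylated, while one that appears as T was not. By sequencing the converted DNA fragments and aligning them to a reference genome, it is possible to perform methylation calls.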
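
The conversion logic described above can be illustrated with a small Python sketch. It is a toy model, not part of any pipeline: the function names `bisulfite_convert` and `call_methylation` and the example sequence are invented for illustration, and the read is assumed to align to the reference base for base without errors.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Simulate how one DNA strand reads after bisulfite treatment.

    Unmethylated C becomes uracil, which is sequenced as T.
    Methylated C (5-methylcytosine) is left as C.
    """
    methylated = set(methylated_positions)
    converted = []
    for index, base in enumerate(sequence):
        if base == "C" and index not in methylated:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_methylation(reference, read):
    """Call methylation for every reference C covered by an aligned read.

    Returns {position: True} for methylated and {position: False}
    for unmethylated cytosines. Other mismatches are ignored.
    """
    calls = {}
    for index, (ref_base, read_base) in enumerate(zip(reference, read)):
        if ref_base != "C":
            continue
        if read_base == "C":
            calls[index] = True
        elif read_base == "T":
            calls[index] = False
    return calls


reference = "ACGTCCGA"  # hypothetical reference fragment
read = bisulfite_convert(reference, methylated_positions=[1])
print(read)                               # ACGTTTGA
print(call_methylation(reference, read))  # {1: True, 4: False, 5: False}
```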

Bisulfite Analysis Protocol

  • Perform quality control:
    • Pooling
    • Sequence adapter trimming
    • Contamination removal
  • Align reads with Bismark:
    • Supports the alignment of bisulfite-treated reads (whole-genome shotgun BS-Seq (WGSBS), reduced-representation BS-Seq (RRBS), or PBAT-Seq (Post-Bisulfite Adapter Tagging)) for the following conditions:
    • Sequence format either FastQ or FastA; single-end or paired-end reads; files uncompressed or gzip-compressed (ending in .gz); directional or non-directional BS-Seq libraries
      • Read conversion: convert C to T for reads
      • Genome conversion: convert C to T for genomes
      • Align converted reads to converted genomes
      • Identify uniquely mapped reads
Figure #2: A methylated cytosine.
  • MethylKit analysis
    • Annotation of DNA methylation information:
      • Basic stats about methylation data, such as coverage and percent methylation
      • Samples Correlation
      • Samples Clustering
      • PCA Analysis
      • Finding differentially methylated bases or regions.
      • Visualization of differential methylation events.
        • Horizontal bar plots show the number of hyper- and hypomethylation events per chromosome, as a percentage of the sites that meet the minimum coverage and differential methylation thresholds. By default, these are 10X coverage in all samples and a 25% change in methylation.
      • Annotation of differentially methylated bases or regions
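
The alignment steps in the protocol above (read conversion, genome conversion, alignment of converted reads, and selection of uniquely mapped reads) can be sketched in a few lines of Python. This is not Bismark's implementation: exact substring search stands in for a real aligner, only the original top strand is considered, and the names `c_to_t`, `find_all`, and `align_unique` as well as the sequences are invented for this illustration.

```python
def c_to_t(sequence):
    """In-silico C-to-T conversion, applied to both reads and genome."""
    return sequence.upper().replace("C", "T")


def find_all(text, pattern):
    """Return every start position of pattern in text (overlaps allowed)."""
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        start = text.find(pattern, start + 1)
    return positions


def align_unique(read, genome):
    """Align a converted read to the converted genome.

    Returns the 0-based start position if the read maps to exactly one
    place, otherwise None (unmapped or ambiguous).
    """
    hits = find_all(c_to_t(genome), c_to_t(read))
    return hits[0] if len(hits) == 1 else None


genome = "GGACGTTCGAAGGTCCA"             # hypothetical reference
reads = ["ACGTTTGA", "GG", "TTTTTTTT"]   # hypothetical bisulfite reads

for read in reads:
    print(read, align_unique(read, genome))
```

Here the first read maps uniquely at position 2, the second is discarded because it matches two places, and the third does not map at all. Once a read is placed, comparing it with the unconverted genome at that position gives the methylation calls.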

Methylation Analysis Report

MethylationStats.xls: Report of the basic stats about the methylation data, such as coverage and percent methylation.

Histogram of % CpG Methylation Plot: numbers on bars denote the percentage of locations contained in each bin. Typically, the percent methylation histogram should have two peaks, one at each end. In any given cell, any given base is either methylated or not. Therefore, looking at many cells should yield a pattern with many locations of high methylation and many locations of low methylation.
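
As a rough illustration of how such a histogram is derived, the sketch below computes percent methylation per site from C and T counts and reports the share of sites in each bin. The function names, the 10% bin width, and the counts are assumptions made for this example, not values taken from the report.

```python
def percent_methylation(num_c, num_t):
    """Percent methylation at one site: reads showing C over all reads."""
    return 100.0 * num_c / (num_c + num_t)


def histogram_percentages(values, bin_width=10, upper=100):
    """Return the percentage of values falling into each bin.

    A value equal to the upper limit (100%) is placed in the last bin.
    """
    n_bins = upper // bin_width
    counts = [0] * n_bins
    for value in values:
        index = min(int(value // bin_width), n_bins - 1)
        counts[index] += 1
    return [100.0 * count / len(values) for count in counts]


# Hypothetical (C count, T count) pairs for eight CpG sites.
sites = [(18, 2), (0, 20), (19, 1), (1, 19), (10, 10), (20, 0), (0, 15), (2, 18)]
values = [percent_methylation(c, t) for c, t in sites]

for bin_index, share in enumerate(histogram_percentages(values)):
    print(f"{bin_index * 10:3d}-{bin_index * 10 + 10:3d}%: {share:.1f}% of sites")
```

With these made-up counts most sites fall into the lowest and highest bins, which is the two-peak shape described above.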

Histogram of CpG Coverage Plot: the read coverage per base can be plotted in the same way; again, numbers on bars denote the percentage of locations contained in each bin. Experiments that suffer heavily from PCR duplication bias will have a secondary peak towards the right-hand side of the histogram.

CpG Base Pearson Cor. Plot: checks the correlation between high and low samples. Shows scatter plots for all pairs of samples. Numbers in the upper right corner denote pairwise Pearson's correlation scores. The histograms on the diagonal are % methylation histograms.

CpG Methylation Clustering Plot: clusters the samples based on the similarity of their methylation profiles, using hierarchical clustering with Pearson's correlation distance. In the PCA analysis, samples closer to each other in principal component space are similar in their methylation profiles.
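
The pairwise Pearson scores and the correlation distance that feeds the clustering can be sketched as follows. The sample names and methylation values are invented, and taking the distance as 1 minus the correlation is an assumption about how "correlation distance" is defined here; it is the common convention, but the report does not spell it out.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equally long lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_distance(x, y):
    """Distance used for clustering, assumed here to be 1 - r."""
    return 1.0 - pearson(x, y)


# Hypothetical percent methylation at the same five CpGs in each sample.
samples = {
    "control_1": [95.0, 5.0, 80.0, 10.0, 50.0],
    "control_2": [90.0, 10.0, 85.0, 5.0, 55.0],
    "treated_1": [40.0, 60.0, 30.0, 70.0, 50.0],
}

for name_a, name_b in combinations(samples, 2):
    r = pearson(samples[name_a], samples[name_b])
    d = correlation_distance(samples[name_a], samples[name_b])
    print(f"{name_a} vs {name_b}: r = {r:.3f}, distance = {d:.3f}")
```

A hierarchical clustering routine would then merge the pair of samples with the smallest distance first.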

Methylation_Per_Bases.xls: contains methylation information for regions/bases that are covered in all samples.

MethylationsCounts_Per_Regions.xls: summarizes methylation information over tiling windows rather than at base-pair resolution. The C and T counts from each covered cytosine are added up, giving a total C and T count for each tile (win.size=1000).
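
A minimal sketch of the tiling step, assuming non-overlapping 1000 bp windows and 1-based positions; the function name `tile_counts`, the tuple layout, and the counts are hypothetical and only illustrate the summation.

```python
from collections import defaultdict


def tile_counts(sites, win_size=1000):
    """Sum C and T counts of covered cytosines per tiling window.

    sites: iterable of (chrom, position, num_c, num_t), 1-based positions.
    Returns {(chrom, tile_start, tile_end): (total_c, total_t)}.
    """
    totals = defaultdict(lambda: [0, 0])
    for chrom, position, num_c, num_t in sites:
        start = ((position - 1) // win_size) * win_size + 1
        tile = (chrom, start, start + win_size - 1)
        totals[tile][0] += num_c
        totals[tile][1] += num_t
    return {tile: tuple(counts) for tile, counts in sorted(totals.items())}


# Hypothetical per-base counts.
sites = [
    ("chr1", 10, 8, 2),
    ("chr1", 950, 1, 9),
    ("chr1", 1001, 5, 5),
    ("chr2", 2500, 0, 10),
]

for tile, (total_c, total_t) in tile_counts(sites).items():
    print(tile, total_c, total_t)
```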

All_DiffMeth_report.xls: Report of all differential methylation calculations. Logistic regression based modeling and testing are used to calculate P-values, which are adjusted to Q-values using the SLIM method (Wang, Tuominen, and Tsai 2011).

DiffMeth25p_Report.xls: Report of the differentially methylated regions/bases with a q-value < 0.01 and a percent methylation difference larger than 25%.

Hypo_DiffMeth_Report.xls: bases/regions with lower methylation

Hyper_DiffMeth_Report.xls: bases/regions with higher methylation
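
The relationship between the full report and the 25% / hypo / hyper reports can be sketched as a simple filter. The record fields `qvalue` and `meth_diff`, the function name, and the example rows are hypothetical, and the sketch assumes the methylation difference is expressed in percentage points as treatment minus control, so that positive values mean higher methylation.

```python
def split_differential(results, q_cutoff=0.01, diff_cutoff=25.0):
    """Split results into hyper- and hypomethylated sets.

    Keeps rows with q-value below q_cutoff and an absolute methylation
    difference larger than diff_cutoff percentage points.
    """
    hyper, hypo = [], []
    for row in results:
        if row["qvalue"] >= q_cutoff:
            continue
        if row["meth_diff"] > diff_cutoff:
            hyper.append(row)
        elif row["meth_diff"] < -diff_cutoff:
            hypo.append(row)
    return hyper, hypo


# Hypothetical rows from a differential methylation table.
results = [
    {"chrom": "chr1", "pos": 1200, "qvalue": 0.001, "meth_diff": 42.0},
    {"chrom": "chr1", "pos": 5300, "qvalue": 0.200, "meth_diff": 60.0},
    {"chrom": "chr2", "pos": 880, "qvalue": 0.005, "meth_diff": -31.5},
    {"chrom": "chr2", "pos": 910, "qvalue": 0.002, "meth_diff": 12.0},
]

hyper, hypo = split_differential(results)
print("hyper:", [(r["chrom"], r["pos"]) for r in hyper])
print("hypo: ", [(r["chrom"], r["pos"]) for r in hypo])
```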

diffMethPerChr.xls: distribution of the hypo/hyper-methylated bases/regions per chromosome. The list shows the percentages of hypo/hyper-methylated bases over all the covered bases in a given chromosome.
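
The per-chromosome percentages can be reproduced in outline as below. This is a sketch with invented inputs: each list simply holds the chromosome name of one base, and the function name `diff_meth_per_chromosome` is hypothetical.

```python
from collections import Counter


def diff_meth_per_chromosome(covered_chroms, hyper_chroms, hypo_chroms):
    """Percent of covered bases per chromosome that are hyper/hypomethylated."""
    covered = Counter(covered_chroms)
    hyper = Counter(hyper_chroms)
    hypo = Counter(hypo_chroms)
    report = {}
    for chrom, n_covered in sorted(covered.items()):
        report[chrom] = {
            "percent_hyper": 100.0 * hyper[chrom] / n_covered,
            "percent_hypo": 100.0 * hypo[chrom] / n_covered,
        }
    return report


# Hypothetical input: chromosome of every covered base and of every
# differentially methylated base.
covered = ["chr1"] * 10 + ["chr2"] * 4
hyper = ["chr1", "chr1"]
hypo = ["chr2"]

for chrom, row in diff_meth_per_chromosome(covered, hyper, hypo).items():
    print(chrom, row)
```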

PromoterCounts.xls: summarizes methylation information over a set of defined promoters.

FeatureTargetStats.xls: a report of the percentage of intron/exon/promoter regions that overlap with differentially methylated bases.

GetAnnatationWithTSS.xls: annotation of differentially methylated regions with the distance to TSS and nearest gene name.
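
Finding the nearest TSS for a differentially methylated position amounts to a nearest-neighbor lookup in a sorted list. The sketch below assumes a single chromosome, a non-empty TSS list sorted by position, and ignores strand, so the sign of the distance only indicates whether the position lies before or after the TSS in genome coordinates. The gene names and coordinates are placeholders.

```python
import bisect


def nearest_tss(position, tss_list):
    """Return (distance, gene) for the TSS closest to position.

    tss_list: non-empty list of (tss_position, gene_name), sorted by position.
    distance = position - tss_position (negative if position lies before it).
    """
    tss_positions = [tss for tss, _ in tss_list]
    index = bisect.bisect_left(tss_positions, position)
    # Only the neighbors on either side of the insertion point can be nearest.
    candidates = tss_list[max(index - 1, 0):index + 1]
    tss_position, gene = min(candidates, key=lambda item: abs(position - item[0]))
    return position - tss_position, gene


# Placeholder annotation for one chromosome.
tss_list = [(1000, "GENE_A"), (5000, "GENE_B"), (9000, "GENE_C")]

for position in (500, 4200, 9500):
    print(position, nearest_tss(position, tss_list))
```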