Methylation

Bisulfite Analysis Protocol

Bisulfite conversion is a technique used to study DNA methylation by converting unmethylated cytosines to uracil while leaving 5-methylcytosine (5-mC) intact. Sequencing the converted DNA fragments and aligning them to a reference makes it possible to determine the methylation status of individual cytosines.
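The conversion principle can be illustrated with a small in silico sketch. It is a simplification that only models the forward strand and assumes complete conversion; the function names and the example sequence are hypothetical and not part of any pipeline described here.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Simulate bisulfite conversion followed by PCR on the forward strand.

    Unmethylated C is read as T (C -> U, then T after amplification);
    C at a methylated position (0-based index) stays C.
    """
    converted = []
    for index, base in enumerate(sequence.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_methylation(reference, read):
    """Return the 0-based positions where the reference has C and the
    converted read still shows C, i.e. cytosines called as methylated."""
    return {
        index
        for index, (ref_base, read_base) in enumerate(zip(reference.upper(), read.upper()))
        if ref_base == "C" and read_base == "C"
    }


reference = "ACGTCCGA"
read = bisulfite_convert(reference, methylated_positions={1})
print(read)                               # ACGTTTGA
print(call_methylation(reference, read))  # {1}
```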

Whole Genome Bisulfite Sequencing (WGBS) provides methylation coverage of the complete genome, detecting every CpG as well as less common non-CpG sites such as CNG.

  1. Step 1. Sequencing quality and trimming
  2. Step 2. Alignment to the genome
  3. Step 3. Remove PCR bias (deduplication)
  4. Step 4. CpG island extraction
  5. Step 5. Visualization & Clustering/PCA
  6. Step 6. Differential methylation regions analysis
  7. Step 7. Annotation for DMRs/DMCs
  8. Step 8. Results as tables, mapping bam files, and summary statistics
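
The deduplication step can be sketched as follows. This is a minimal illustration that assumes single-end reads represented as dictionaries with hypothetical `chrom`, `start`, and `strand` keys; production tools work on aligned BAM files and also take the mate position into account for paired-end data.

```python
def deduplicate(reads):
    """Keep the first read seen for each (chromosome, start, strand) key.

    Reads that map to the same position and strand are treated as PCR
    duplicates of each other; only one representative is retained.
    """
    seen = set()
    unique = []
    for read in reads:
        key = (read["chrom"], read["start"], read["strand"])
        if key not in seen:
            seen.add(key)
            unique.append(read)
    return unique


reads = [
    {"name": "r1", "chrom": "chr1", "start": 100, "strand": "+"},
    {"name": "r2", "chrom": "chr1", "start": 100, "strand": "+"},  # duplicate of r1
    {"name": "r3", "chrom": "chr1", "start": 100, "strand": "-"},
]
print([read["name"] for read in deduplicate(reads)])  # ['r1', 'r3']
```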

Reduced Representation Bisulfite Sequencing (RRBS) is commonly used to focus on the areas of the genome with high CpG content, at single-base resolution.

  1. Step 1. Sequencing quality and trimming
  2. Step 2. Align to the genome
  3. Step 3. Remove PCR bias (Coverage filtering)
  4. Step 4. CpG island extraction
  5. Step 5. Visualization & Clustering/PCA
  6. Step 6. Differential methylation regions analysis
  7. Step 7. Annotation for DMRs/DMCs
  8. Step 8. Results as tables, mapping bam files, and summary statistics
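
Coverage filtering, used here in place of deduplication, can be sketched as below. The thresholds (a minimum of 10 reads and an upper cut at the 99.9th percentile) are illustrative assumptions, not values stated in this protocol, and the `coverage` key is a hypothetical record layout.

```python
import math


def filter_by_coverage(sites, min_coverage=10, high_percentile=99.9):
    """Drop sites with too few reads or with suspiciously high coverage.

    Very high coverage is a common symptom of PCR bias, so sites above
    the given percentile (nearest-rank method) are discarded.
    """
    if not sites:
        return []
    coverages = sorted(site["coverage"] for site in sites)
    rank = max(1, math.ceil(high_percentile * len(coverages) / 100))
    upper_limit = coverages[rank - 1]
    return [
        site for site in sites
        if min_coverage <= site["coverage"] <= upper_limit
    ]


sites = [{"pos": i, "coverage": c} for i, c in enumerate([5, 12, 20, 30, 500])]
kept = filter_by_coverage(sites, min_coverage=10, high_percentile=80)
print([site["coverage"] for site in kept])  # [12, 20, 30]
```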

Targeted Bisulfite Sequencing (TBS) is a method that detects DNA methylation at base resolution in regions of interest.

Figure #1: Pie chart of differentially methylated regions

  1. Step 1. Sequencing quality and trimming
  2. Step 2. Align to the genome
  3. Step 3. Remove PCR bias (Coverage filtering)
  4. Step 4. CpG Islands extraction
  5. Step 5. Visualization & Clustering/PCA
  6. Step 6. Methylation segmentation
  7. Step 7. Differential methylation regions analysis
  8. Step 8. Annotation for DMRs/DMCs
  9. Step 9. Results as tables, mapping bam files, and summary statistics

Methylation analysis Report

MethylationStats.xls: Report of the basic statistics about the methylation data, such as coverage and percent methylation.

Histogram of % CpG Methylation Plot: numbers on bars denote what percentage of locations are contained in that bin. Typically, the percent methylation histogram should have two peaks, one at each end. In any given cell, any given base is either methylated or not. Therefore, looking at many cells should yield a similar pattern, with many locations showing high methylation and many showing low methylation.
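
As a rough sketch of how such a histogram is derived, percent methylation at each location is the number of C reads divided by the total of C and T reads, and the bar labels are the share of locations per bin. The bin width of 10 and the `(num_c, num_t)` input layout are assumptions made for illustration.

```python
def percent_methylation(num_c, num_t):
    """Percent of reads at a cytosine that still show C after conversion."""
    return 100.0 * num_c / (num_c + num_t)


def methylation_histogram(sites, bin_width=10):
    """Return, for each bin, the percentage of sites falling into it.

    `sites` is a list of (num_c, num_t) pairs; bins cover 0-100 percent
    and a value of exactly 100 is placed in the last bin.
    """
    n_bins = 100 // bin_width
    counts = [0] * n_bins
    for num_c, num_t in sites:
        pct = percent_methylation(num_c, num_t)
        index = min(int(pct // bin_width), n_bins - 1)
        counts[index] += 1
    total = len(sites)
    if total == 0:
        return [0.0] * n_bins
    return [100.0 * count / total for count in counts]


# Two sites near 0%, two near 100%, one intermediate: a bimodal pattern.
sites = [(0, 10), (1, 19), (20, 0), (19, 1), (5, 5)]
print(methylation_histogram(sites))
```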

Histogram of CpG Coverage Plot: the read coverage per base can be plotted in a similar way; again, numbers on bars denote what percentage of locations are contained in that bin. Experiments that suffer heavily from PCR duplication bias will have a secondary peak towards the right-hand side of the histogram.

CpG Base Pearson Cor. Plot: checks the correlation between high and low samples. It shows scatter plots for all sample pairs. Numbers in the upper right corner denote pairwise Pearson's correlation scores. The histograms on the diagonal are % methylation histograms.
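
The pairwise scores in this plot are ordinary Pearson correlations of percent methylation values over the bases shared by two samples. A minimal sketch, with hypothetical sample names and values, could look like this:

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length value lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(var_x * var_y)


def pairwise_correlations(samples):
    """Return {(sample_a, sample_b): r} for every unordered sample pair."""
    names = sorted(samples)
    return {
        (a, b): pearson(samples[a], samples[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Percent methylation at the same four bases in three hypothetical samples.
samples = {
    "ctrl1": [10.0, 80.0, 90.0, 20.0],
    "ctrl2": [12.0, 78.0, 88.0, 25.0],
    "treat1": [90.0, 20.0, 10.0, 80.0],
}
for pair, r in pairwise_correlations(samples).items():
    print(pair, round(r, 3))
```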

CpG Methylation Clustering Plot: clusters the samples based on the similarity of their methylation profiles, using hierarchical clustering with Pearson's correlation distance. Samples closer to each other in principal component space are similar in their methylation profiles.
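
Hierarchical clustering on correlation distance (1 minus Pearson's r) can be sketched as follows. Average linkage is an assumption made for this illustration, since the linkage method is not stated here, and the sample names and values are hypothetical.

```python
import math
from itertools import combinations


def correlation_distance(x, y):
    """1 - Pearson's r: 0 for identical profiles, 2 for opposite ones."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - covariance / math.sqrt(var_x * var_y)


def hierarchical_clustering(profiles):
    """Average-linkage agglomerative clustering.

    Returns the merges in order as (cluster_a, cluster_b, distance),
    where each cluster is a tuple of sample names.
    """
    clusters = [(name,) for name in sorted(profiles)]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            distance = sum(
                correlation_distance(profiles[i], profiles[j])
                for i in a for j in b
            ) / (len(a) * len(b))
            if best is None or distance < best[2]:
                best = (a, b, distance)
        a, b, distance = best
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
        merges.append((a, b, distance))
    return merges


profiles = {
    "ctrl1": [10.0, 80.0, 90.0, 20.0],
    "ctrl2": [12.0, 78.0, 88.0, 25.0],
    "treat1": [90.0, 20.0, 10.0, 80.0],
    "treat2": [85.0, 25.0, 15.0, 78.0],
}
for a, b, distance in hierarchical_clustering(profiles):
    print(a, b, round(distance, 3))
```

In this toy data the two control samples and the two treated samples merge first, and the two groups are joined last at a large distance.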

Methylation_Per_Bases.xls: contains methylation information for regions/bases that are covered in all samples.

MethylationsCounts_Per_Regions.xls: summarizes methylation information over tiling windows rather than at base-pair resolution. It adds up the C and T counts from each covered cytosine and returns a total C and T count for each tile (win.size=1000).
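
The tiling step can be sketched as below. The sketch assumes non-overlapping windows (step equal to the window size) and 1-based positions; the input tuple layout is hypothetical.

```python
from collections import defaultdict


def tile_counts(bases, win_size=1000):
    """Sum C and T counts of covered cytosines within fixed-size tiles.

    `bases` is an iterable of (chrom, position, num_c, num_t) with
    1-based positions. Returns {(chrom, tile_start, tile_end): [C, T]}.
    """
    tiles = defaultdict(lambda: [0, 0])
    for chrom, position, num_c, num_t in bases:
        start = ((position - 1) // win_size) * win_size + 1
        key = (chrom, start, start + win_size - 1)
        tiles[key][0] += num_c
        tiles[key][1] += num_t
    return dict(tiles)


bases = [
    ("chr1", 10, 5, 5),
    ("chr1", 900, 3, 7),
    ("chr1", 1001, 8, 2),
]
for tile, (total_c, total_t) in tile_counts(bases).items():
    print(tile, total_c, total_t)
# ('chr1', 1, 1000) 8 12
# ('chr1', 1001, 2000) 8 2
```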

All_DiffMeth_report.xls: Reports all differential methylation calculations. Logistic regression based modeling and testing is used to calculate P-values, which are adjusted to Q-values using the SLIM method (Wang, Tuominen, and Tsai 2011).
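
To give an intuition for the logistic regression test, the sketch below fits a model with a single group covariate at one base or region and compares it to the null model with a likelihood ratio test. With only a group term, the maximum likelihood fit reduces to the pooled methylation proportion per group, so no iterative fitting is needed. This is a simplified illustration that ignores overdispersion and additional covariates, and it does not implement the SLIM Q-value adjustment; the function names are hypothetical.

```python
import math


def _max_log_likelihood(num_c, num_t):
    """Binomial log-likelihood at p = C / (C + T), constant term omitted."""
    total = num_c + num_t
    value = 0.0
    if num_c > 0:
        value += num_c * math.log(num_c / total)
    if num_t > 0:
        value += num_t * math.log(num_t / total)
    return value


def logistic_lrt(control, treatment):
    """Likelihood ratio test for a group effect on methylation.

    `control` and `treatment` are lists of (num_c, num_t) per sample.
    Returns (methylation difference in percent, P-value), where the
    difference is treatment minus control.
    """
    c0 = sum(c for c, _ in control)
    t0 = sum(t for _, t in control)
    c1 = sum(c for c, _ in treatment)
    t1 = sum(t for _, t in treatment)
    full = _max_log_likelihood(c0, t0) + _max_log_likelihood(c1, t1)
    null = _max_log_likelihood(c0 + c1, t0 + t1)
    deviance = max(0.0, 2.0 * (full - null))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(deviance / 2.0))
    meth_diff = 100.0 * (c1 / (c1 + t1) - c0 / (c0 + t0))
    return meth_diff, p_value


diff, p = logistic_lrt(control=[(1, 19), (2, 18)], treatment=[(18, 2), (19, 1)])
print(round(diff, 1), p)
```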

DiffMeth25p_Report.xls: Report of the differentially methylated regions/bases based on q-value < 0.01 and percent methylation difference larger than 25%.

Hypo_DiffMeth_Report.xls: bases/regions with lower methylation

Hyper_DiffMeth_Report.xls: bases/regions with higher methylation
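
The thresholds and the hypo/hyper split used for these reports can be sketched as follows. The record keys `qvalue` and `meth_diff` are hypothetical, and the sign convention (positive difference means higher methylation in the treatment group) is an assumption.

```python
def split_differential(results, q_cutoff=0.01, diff_cutoff=25.0):
    """Select significant records and split them by direction of change.

    A record is kept when its q-value is below `q_cutoff` and its
    absolute percent methylation difference exceeds `diff_cutoff`.
    Returns (hyper, hypo) lists.
    """
    significant = [
        record for record in results
        if record["qvalue"] < q_cutoff and abs(record["meth_diff"]) > diff_cutoff
    ]
    hyper = [record for record in significant if record["meth_diff"] > 0]
    hypo = [record for record in significant if record["meth_diff"] < 0]
    return hyper, hypo


results = [
    {"id": "site1", "qvalue": 0.001, "meth_diff": 40.0},   # hyper
    {"id": "site2", "qvalue": 0.001, "meth_diff": -30.0},  # hypo
    {"id": "site3", "qvalue": 0.2, "meth_diff": 60.0},     # not significant
    {"id": "site4", "qvalue": 0.001, "meth_diff": 10.0},   # difference too small
]
hyper, hypo = split_differential(results)
print([r["id"] for r in hyper], [r["id"] for r in hypo])  # ['site1'] ['site2']
```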

diffMethPerChr.xls: the distribution of hypo/hyper-methylated bases/regions per chromosome. The list shows the percentages of hypo/hyper-methylated bases over all the covered bases in a given chromosome.
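
A per-chromosome summary of this kind can be computed as in the sketch below. The record keys and the cutoffs (q-value below 0.01, difference above 25%) are assumptions reused for illustration.

```python
from collections import Counter


def diff_meth_per_chr(results, q_cutoff=0.01, diff_cutoff=25.0):
    """Percent of covered bases per chromosome that are hyper/hypo-methylated."""
    covered = Counter()
    hyper = Counter()
    hypo = Counter()
    for record in results:
        chrom = record["chrom"]
        covered[chrom] += 1
        significant = (
            record["qvalue"] < q_cutoff
            and abs(record["meth_diff"]) > diff_cutoff
        )
        if significant and record["meth_diff"] > 0:
            hyper[chrom] += 1
        elif significant and record["meth_diff"] < 0:
            hypo[chrom] += 1
    return {
        chrom: {
            "percent_hyper": 100.0 * hyper[chrom] / total,
            "percent_hypo": 100.0 * hypo[chrom] / total,
        }
        for chrom, total in covered.items()
    }


results = [
    {"chrom": "chr1", "qvalue": 0.001, "meth_diff": 40.0},
    {"chrom": "chr1", "qvalue": 0.001, "meth_diff": -30.0},
    {"chrom": "chr1", "qvalue": 0.5, "meth_diff": 5.0},
    {"chrom": "chr1", "qvalue": 0.5, "meth_diff": -2.0},
    {"chrom": "chr2", "qvalue": 0.5, "meth_diff": 1.0},
]
print(diff_meth_per_chr(results))
```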

PromoterCounts.xls: summarize methylation information over a set of defined promoters.

FeatureTargetStats.xls: a report of the percentage of intron/exon/promoter regions that overlap with differentially methylated bases.

GetAnnatationWithTSS.xls: annotation of differentially methylated regions with the distance to TSS and nearest gene name.
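
Annotation with the nearest TSS can be sketched as a sorted lookup per chromosome. This is a simplified illustration: it measures distance from the region midpoint in genomic coordinates and ignores gene strand; the gene names and coordinates are made up.

```python
import bisect


def annotate_with_tss(regions, tss_list):
    """Attach the nearest TSS gene and distance to each region.

    `regions` is a list of (chrom, start, end); `tss_list` is a list of
    (chrom, position, gene). The distance is region midpoint minus TSS
    position, so a positive value means the region lies at a higher
    coordinate than the TSS. Regions on chromosomes without any TSS get
    (None, None).
    """
    by_chrom = {}
    for chrom, position, gene in tss_list:
        by_chrom.setdefault(chrom, []).append((position, gene))
    for sites in by_chrom.values():
        sites.sort()

    annotated = []
    for chrom, start, end in regions:
        sites = by_chrom.get(chrom, [])
        if not sites:
            annotated.append((chrom, start, end, None, None))
            continue
        midpoint = (start + end) // 2
        positions = [position for position, _ in sites]
        index = bisect.bisect_left(positions, midpoint)
        # Only the neighbours on either side of the midpoint can be nearest.
        candidates = sites[max(0, index - 1):index + 1]
        position, gene = min(candidates, key=lambda site: abs(site[0] - midpoint))
        annotated.append((chrom, start, end, gene, midpoint - position))
    return annotated


tss_list = [("chr1", 1000, "GENE_A"), ("chr1", 5000, "GENE_B")]
regions = [("chr1", 1200, 1400), ("chr1", 4000, 4200), ("chr2", 10, 20)]
for row in annotate_with_tss(regions, tss_list):
    print(row)
```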

Methylated DNA Immunoprecipitation

Methylated DNA Immunoprecipitation (MeDIP-Seq) is a large-scale (chromosome- or genome-wide) purification technique used to isolate methylated DNA fragments via an antibody raised against 5-methylcytosine (5mC). MeDIP-Seq requires a different type of software analysis than WGBS and RRBS, as the methodology differs significantly.

Figure #2: The MeDIPS process

  1. Step 1. Reads Alignment
  2. Step 2. Statistical Analysis
  3. Step 3. Genomic locations
  4. Step 4. CpG density
  5. Step 5. Read count per sample
  6. Step 6. Normalized read count per sample
  7. Step 7. Mean value per group
  8. Step 8. LogFC between groups
  9. Step 9. P-value and adjusted p-value
  10. Step 10. Results as tables, mapping bam files, and summary statistics
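
Steps 5 to 8 above (read counts, normalization, group means, and LogFC) can be sketched as follows. Counts-per-million scaling and a pseudocount of 1 are assumptions chosen for illustration, since the normalization method is not specified here; the P-value calculation of step 9 is not shown.

```python
import math


def counts_per_million(counts):
    """Scale raw window counts of one sample to counts per million reads."""
    total = sum(counts)
    return [1e6 * count / total for count in counts]


def group_means(samples):
    """Mean normalized count per genomic window across a group's samples."""
    normalized = [counts_per_million(sample) for sample in samples]
    return [sum(values) / len(values) for values in zip(*normalized)]


def log_fold_change(group_a, group_b, pseudocount=1.0):
    """log2 ratio of group B over group A for each genomic window.

    Each group is a list of samples and each sample is a list of raw
    read counts, one per window, in the same window order.
    """
    mean_a = group_means(group_a)
    mean_b = group_means(group_b)
    return [
        math.log2((b + pseudocount) / (a + pseudocount))
        for a, b in zip(mean_a, mean_b)
    ]


# Two windows, two samples per group (hypothetical counts).
group_a = [[10, 90], [20, 80]]
group_b = [[90, 10], [80, 20]]
print([round(value, 3) for value in log_fold_change(group_a, group_b)])
```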