Bisulfite sequencing is a technique for determining DNA methylation patterns. The major difference from regular sequencing experiments is that the DNA is first treated with bisulfite, which converts unmethylated cytosine residues to uracil but leaves 5-methylcytosine residues unaffected. By sequencing the converted DNA fragments and aligning them to a reference, it is possible to call the methylation state of each cytosine.
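The conversion logic described above can be illustrated with a minimal sketch. It assumes a single forward-strand read that is already aligned to the reference without gaps; the function names and the toy sequence are hypothetical and are not part of the pipeline.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite treatment followed by amplification.

    Unmethylated C is read as T (C -> U -> T after PCR);
    methylated C (5-methylcytosine) is still read as C.
    """
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def call_methylation(reference, read):
    """Return {position: is_methylated} for each reference C covered by the read."""
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference, read)):
        if ref_base != "C":
            continue
        if read_base == "C":
            calls[i] = True    # C survived conversion: methylated
        elif read_base == "T":
            calls[i] = False   # C was converted: unmethylated
        # any other base is treated as a mismatch and gives no call
    return calls


reference = "ACGTCCGA"                      # toy reference (assumption)
read = bisulfite_convert(reference, {1})    # only the C at index 1 is methylated
calls = call_methylation(reference, read)
print(read, calls)
```

Real aligners also have to handle the reverse strand (where the conversion appears as G to A) and sequencing errors, which this sketch leaves out.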
Bisulfite Analysis Protocol
MethylationStats.xls: Report of the basic statistics about the methylation data, such as coverage and percent methylation.
Histogram of % CpG Methylation Plot: numbers on bars denote what percentage of locations are contained in that bin. Typically, the percent methylation histogram should have two peaks, one at each end. In any given cell, any given base is either methylated or not. Therefore, looking at many cells should yield a similar pattern, with many locations showing high methylation and many showing low methylation.
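As a rough illustration of how the bar labels in such a histogram could be derived, the sketch below computes percent methylation from per-site C and T counts and reports the share of sites in each bin. The 10% bin width, the function names, and the toy counts are assumptions made for this example.

```python
def percent_methylation(num_c, num_t):
    """Percent methylation at one site: reads supporting C over all C + T reads."""
    return 100.0 * num_c / (num_c + num_t)


def histogram_percentages(values, bin_width=10.0, max_value=100.0):
    """Return, for each bin, the percentage of values that fall into it."""
    n_bins = int(max_value / bin_width)
    counts = [0] * n_bins
    for v in values:
        # a value equal to max_value is placed in the last bin
        idx = min(int(v // bin_width), n_bins - 1)
        counts[idx] += 1
    return [100.0 * c / len(values) for c in counts]


# hypothetical (C count, T count) pairs for five CpG sites
site_counts = [(0, 20), (1, 19), (19, 1), (20, 0), (10, 10)]
values = [percent_methylation(c, t) for c, t in site_counts]
bins = histogram_percentages(values)
print(bins)
```

With these toy counts most sites land in the first and last bins, which is the two-peak shape described above.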
Histogram of CpG Coverage Plot: the read coverage per base can be plotted in a similar way; again, numbers on bars denote what percentage of locations are contained in that bin. Experiments that suffer heavily from PCR duplication bias will have a secondary peak towards the right-hand side of the histogram.
CpG Base Pearson Cor. Plot: checks the correlation between the high and low samples. Shows scatter plots for all pairs of samples. Numbers in the upper right corner denote pairwise Pearson's correlation scores. The histograms on the diagonal are % methylation histograms.
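The pairwise scores shown in such a plot can be sketched as follows. This is a plain implementation of Pearson's correlation over percent methylation values at CpGs shared by all samples; the sample names and values are made up for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# hypothetical percent methylation at five CpGs covered in every sample
samples = {
    "high_1": [90, 80, 10, 5, 95],
    "high_2": [85, 82, 12, 8, 97],
    "low_1": [20, 15, 70, 60, 10],
}

# one correlation score per unordered pair of samples
pairwise = {
    (a, b): pearson(samples[a], samples[b])
    for a, b in combinations(samples, 2)
}
for pair, r in pairwise.items():
    print(pair, round(r, 3))
```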
CpG Methylation Clustering Plot: clusters the samples based on the similarity of their methylation profiles, using hierarchical clustering with Pearson's correlation distance. Samples closer to each other in principal component space are similar in their methylation profiles.
Methylation_Per_Bases.xls: contains methylation information for regions/bases that are covered in all samples.
MethylationsCounts_Per_Regions.xls: summarizes methylation information over tiling windows rather than at base-pair resolution. Adds up the C and T counts from each covered cytosine and returns a total C and T count for each tile (win.size=1000).
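The tiling step can be sketched as below. This example assumes non-overlapping windows (a step equal to the window size) and 1-based positions; neither is stated above, and the function name and input layout are hypothetical.

```python
from collections import defaultdict


def tile_counts(sites, win_size=1000):
    """Sum C and T counts per tile.

    sites: iterable of (chrom, pos, num_c, num_t) with 1-based positions.
    Returns {(chrom, tile_start, tile_end): (total_c, total_t)}.
    """
    tiles = defaultdict(lambda: [0, 0])
    for chrom, pos, num_c, num_t in sites:
        # first base of the window that contains pos (1-based)
        start = ((pos - 1) // win_size) * win_size + 1
        key = (chrom, start, start + win_size - 1)
        tiles[key][0] += num_c
        tiles[key][1] += num_t
    return {key: tuple(totals) for key, totals in sorted(tiles.items())}


# hypothetical per-cytosine counts
sites = [
    ("chr1", 10, 8, 2),
    ("chr1", 950, 5, 5),
    ("chr1", 1001, 1, 9),
    ("chr2", 1000, 3, 7),
]
tiled = tile_counts(sites)
for tile, totals in tiled.items():
    print(tile, totals)
```

Only tiles that contain at least one covered cytosine appear in the result.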
All_DiffMeth_report.xls: Report of all differential methylation calculations. Logistic regression based modeling and testing is used to calculate P-values. P-values are adjusted to Q-values using the SLIM method (Wang, Tuominen, and Tsai 2011).
DiffMeth25p_Report.xls: Report of the differentially methylated regions/bases based on q-value <0.01 and percent methylation difference larger than 25%.
Hypo_DiffMeth_Report.xls: bases/regions with lower methylation
Hyper_DiffMeth_Report.xls: bases/regions with higher methylation
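The filtering behind these three reports can be sketched as follows. The sketch assumes that the methylation difference is signed, with negative values meaning lower methylation (hypo) and positive values meaning higher methylation (hyper); the record layout and field names are hypothetical.

```python
def split_diff_meth(records, q_cutoff=0.01, diff_cutoff=25.0):
    """Split records into hypo- and hyper-methylated sets.

    A record passes if its q-value is below q_cutoff and its absolute
    percent methylation difference is larger than diff_cutoff.
    """
    significant = [r for r in records if r["qvalue"] < q_cutoff]
    hypo = [r for r in significant if r["meth_diff"] < -diff_cutoff]
    hyper = [r for r in significant if r["meth_diff"] > diff_cutoff]
    return hypo, hyper


# hypothetical differential methylation results
records = [
    {"chr": "chr1", "pos": 100, "qvalue": 0.001, "meth_diff": 40.0},
    {"chr": "chr1", "pos": 200, "qvalue": 0.001, "meth_diff": -30.0},
    {"chr": "chr2", "pos": 300, "qvalue": 0.2, "meth_diff": 50.0},    # q-value too high
    {"chr": "chr2", "pos": 400, "qvalue": 0.005, "meth_diff": 10.0},  # difference too small
]
hypo, hyper = split_diff_meth(records)
print(len(hypo), "hypo;", len(hyper), "hyper")
```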
diffMethPerChr.xls: distribution of hypo/hyper-methylated bases/regions per chromosome. The list shows percentages of hypo/hyper-methylated bases over all the covered bases in a given chromosome.
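One way the per-chromosome percentages could be computed is sketched below. The inputs are simply the chromosome name of each covered base and of each hypo/hyper-methylated base; the function name and toy data are assumptions.

```python
from collections import Counter


def diff_meth_per_chr(covered_chroms, hypo_chroms, hyper_chroms):
    """Percent of covered bases per chromosome that are hypo/hyper-methylated."""
    covered = Counter(covered_chroms)
    hypo = Counter(hypo_chroms)
    hyper = Counter(hyper_chroms)
    return {
        chrom: {
            "percent_hypo": 100.0 * hypo[chrom] / n_covered,
            "percent_hyper": 100.0 * hyper[chrom] / n_covered,
        }
        for chrom, n_covered in covered.items()
    }


# hypothetical data: 10 covered bases on chr1, 4 on chr2
covered = ["chr1"] * 10 + ["chr2"] * 4
per_chr = diff_meth_per_chr(
    covered,
    hypo_chroms=["chr1", "chr1"],
    hyper_chroms=["chr1", "chr2"],
)
print(per_chr)
```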
PromoterCounts.xls: summarize methylation information over a set of defined promoters.
FeatureTargetStats.xls: a report of the percentage of intron/exon/promoter regions that overlap with differentially methylated bases.
GetAnnatationWithTSS.xls: annotation of differentially methylated regions with the distance to TSS and nearest gene name.
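A nearest-TSS lookup of this kind can be sketched with a binary search over sorted TSS positions. This example handles a single chromosome and ignores gene strand, so the sign of the distance only indicates whether the position lies before or after the TSS in genomic coordinates; the gene names and coordinates are made up.

```python
from bisect import bisect_left


def nearest_tss(position, tss_sites):
    """Find the closest TSS to a position on one chromosome.

    tss_sites: list of (tss_position, gene_name), sorted by position.
    Returns (position - tss_position, gene_name).
    """
    positions = [p for p, _ in tss_sites]
    i = bisect_left(positions, position)
    # only the neighbours on either side of the insertion point can be nearest
    candidates = tss_sites[max(i - 1, 0): i + 1]
    tss, gene = min(candidates, key=lambda site: abs(position - site[0]))
    return position - tss, gene


# hypothetical TSS annotation for one chromosome
tss_sites = [(1000, "geneA"), (5000, "geneB"), (9000, "geneC")]
for pos in (500, 4200, 10000):
    print(pos, nearest_tss(pos, tss_sites))
```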