How Eukaryote De Novo Genome Assembly and Annotation Work
Genome assembly is the process of taking many individual, disconnected sequence reads (pieces of the DNA) that are produced
independently by a sequencer and putting them back together with an assembler to reconstruct the original genome.
Eukaryote de Novo Assembly
Step 1 - Assess quality of sequenced reads
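As a rough illustration of what a read quality assessment measures, the sketch below decodes FASTQ quality strings into Phred scores and reports the mean score per read. It assumes standard 4-line FASTQ records with Phred+33 encoding. The function names are hypothetical and are not part of any pipeline described here.

```python
def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_line]


def mean_quality_per_read(fastq_text):
    """Return the mean Phred score of each read in 4-line FASTQ text."""
    lines = fastq_text.strip().splitlines()
    means = []
    # Each record is: header, sequence, '+' separator, quality string.
    for i in range(0, len(lines), 4):
        scores = phred_scores(lines[i + 3])
        means.append(sum(scores) / len(scores))
    return means


example = "@read1\nACGT\n+\nIIII\n@read2\nACGT\n+\n!!II\n"
print(mean_quality_per_read(example))
```

Reads with a low mean score, or with scores that drop sharply toward the 3' end, are the usual candidates for trimming in the next step.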
Step 2 - Preprocess raw data
Adapter removal
Trimming
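The two preprocessing operations above can be sketched in a few lines. This is a simplified illustration, not the method of any particular trimming tool: it assumes an exact-match 3' adapter, Phred+33 quality encoding, and a hypothetical quality threshold of 20. Real trimmers also tolerate mismatches and partial adapter matches.

```python
def remove_adapter(seq, qual, adapter):
    """Cut the read at the first exact occurrence of the adapter, if any."""
    idx = seq.find(adapter)
    if idx != -1:
        return seq[:idx], qual[:idx]
    return seq, qual


def trim_low_quality_tail(seq, qual, min_q=20, offset=33):
    """Drop bases from the 3' end while their Phred score is below min_q."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


# Example adapter string used purely for illustration.
adapter = "AGATCGGAAGAGC"
seq, qual = remove_adapter("ACGTACGT" + adapter, "I" * 21, adapter)
print(seq, qual)
print(trim_low_quality_tail("ACGTACGT", "IIIIII##"))
```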
Step 3 - Genome assembly
Short-read assembly with de Bruijn graph assemblers
Long-read genome assembly
Long reads need error correction because of their high sequencing error rates
Long-read assembly needs more sequence coverage (~50x)
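To make the de Bruijn idea and the coverage figure concrete, here is a minimal sketch. It assumes error-free reads and a graph without branches or repeats, which real genomes do not satisfy, so production assemblers do far more work than this. The function names and the read counts in the coverage example are hypothetical.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, edges are distinct k-mers."""
    graph = defaultdict(list)
    seen = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if kmer not in seen:
                seen.add(kmer)
                graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


def walk_unbranched_path(graph, start):
    """Follow single outgoing edges from start and spell out the contig."""
    contig, node, visited = start, start, {start}
    while node in graph and len(graph[node]) == 1:
        node = graph[node][0]
        if node in visited:  # stop on a cycle instead of looping forever
            break
        visited.add(node)
        contig += node[-1]
    return contig


def coverage(num_reads, mean_read_length, genome_size):
    """Expected sequencing depth: total sequenced bases / genome size."""
    return num_reads * mean_read_length / genome_size


graph = de_bruijn_graph(["ACGT", "CGTA"], k=3)
print(walk_unbranched_path(graph, "AC"))
print(coverage(1_000_000, 15_000, 300_000_000))
```

The coverage function shows how a target such as ~50x translates into a sequencing budget: for a fixed genome size, the number of reads times their mean length has to reach 50 times the genome length.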
Step 4 - Assembly evaluation
Compare assemblies and pick the one with the highest percentage of complete genes
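The selection rule above can be sketched as follows, assuming each candidate assembly has already been scored with a count of complete genes out of a fixed reference gene set. The assembly names and counts are made-up placeholders, not results.

```python
def percent_complete(complete, total):
    """Percentage of reference genes found complete in an assembly."""
    return 100.0 * complete / total


def pick_best_assembly(results):
    """Return the assembly name with the highest percentage of complete genes.

    results maps an assembly name to a (complete_genes, total_genes) tuple.
    """
    return max(results, key=lambda name: percent_complete(*results[name]))


# Hypothetical gene-completeness counts for two candidate assemblies.
results = {"assembly_a": (900, 1000), "assembly_b": (950, 1000)}
print(pick_best_assembly(results))
```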
At USU’s high-performance computing and bioinformatics facility, we have developed a series of parallel, multicore, CPU-based,
open-source pipelines for large-scale -omics data analysis. These pipelines enable efficient analysis of multiple datasets
in parallel and in a short period of time.