Eukaryotes: De Novo Genome Assembly and Annotation

How Eukaryote De Novo Genome Assembly and Annotation Works

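Genome assembly is the process of taking many short, disconnected DNA fragments (sequencing reads) and putting them back together to reconstruct the original genome sequence. In a de novo assembly this is done without a reference genome: the assembler represents the overlaps between reads as a graph and traverses it to build contiguous sequences (contigs).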

Eukaryote De Novo Assembly

Step 1 - Assess quality of sequenced reads
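
As a minimal sketch of what a read quality check looks at, the snippet below decodes a FASTQ quality string into Phred scores and averages them. It assumes the common Phred+33 encoding; the function names are illustrative and not taken from any particular tool.

```python
def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_line]


def mean_quality(quality_line, offset=33):
    """Average Phred score of one read; 0.0 for an empty read."""
    scores = phred_scores(quality_line, offset)
    return sum(scores) / len(scores) if scores else 0.0


def error_probability(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)
```

A Phred score of 20 corresponds to a 1% chance that the base call is wrong, which is why 20 is often used as a rough cutoff.
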
Step 2 - Preprocess raw data
  • Adapter removal
  • Trimming
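
The two preprocessing operations above can be sketched as follows. This is a simplified illustration, assuming a 3' adapter that appears as an exact match and Phred+33 qualities; real trimming tools also handle partial and mismatched adapters. The function names and the `min_q` default are hypothetical.

```python
def remove_adapter(seq, qual, adapter):
    """Cut the read at the first exact occurrence of the adapter."""
    pos = seq.find(adapter)
    if pos == -1:
        return seq, qual  # adapter not present, read unchanged
    return seq[:pos], qual[:pos]


def trim_low_quality_tail(seq, qual, min_q=20, offset=33):
    """Drop bases from the 3' end while their Phred score is below min_q."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


# Usage: remove the adapter first, then trim the low-quality tail.
seq, qual = remove_adapter("ACGTACGTAGATCGG", "IIIIII##IIIIIII", "AGATCGG")
seq, qual = trim_low_quality_tail(seq, qual)
```
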
Step 3 - Genome assembly
  • Short-read assembly with de Bruijn graph assemblers
  • Long-read genome assembly
    • Long reads need error correction because of their high sequencing error rates
    • Long reads need more sequence coverage (~50x)
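
To illustrate the idea behind de Bruijn assemblers, here is a toy sketch: every k-mer in the reads becomes an edge between its (k-1)-mer prefix and suffix, and a contig is read off by following nodes that have exactly one successor. It assumes error-free reads and ignores reverse complements and coverage, which production assemblers must handle; the function names are illustrative.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it."""
    kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])
    graph = defaultdict(set)
    for kmer in kmers:
        graph[kmer[:-1]].add(kmer[1:])
    return graph


def extend_contig(graph, start):
    """Follow the path from start while each node has a single successor."""
    contig = start
    node = start
    seen = {start}
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if nxt in seen:  # stop on a cycle instead of looping forever
            break
        contig += nxt[-1]
        seen.add(nxt)
        node = nxt
    return contig


graph = build_de_bruijn(["ACGTT", "GTTCA", "TTCAG"], k=3)
contig = extend_contig(graph, "AC")
```

In this example the three overlapping reads collapse into a single path through the graph, so the contig spans all of them.
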
Step 4 - Assembly evaluation
  • Compare the candidate assemblies and pick the one with the highest percentage of complete genes
Step 5 - Scaffolding and gap filling
  • Remove repetitive contigs
  • Build scaffolds from the longest contigs
  • Close a large number of gaps
Step 6 - Compare assembly metrics
  • Total number of contigs in assembly
  • Largest contig
  • Total number of bases in assembly
  • Number of misassembled contigs
  • Total length of misassembled contigs
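
The first three metrics above can be computed directly from the contig sequences, as in this small sketch. The function name and dictionary keys are illustrative. The misassembly metrics are not computed here, since they typically require aligning the contigs to a reference genome.

```python
def assembly_metrics(contigs):
    """Summarise an assembly given its contigs as a list of sequence strings."""
    lengths = [len(c) for c in contigs]
    return {
        "num_contigs": len(lengths),
        "largest_contig": max(lengths, default=0),
        "total_bases": sum(lengths),
    }


metrics = assembly_metrics(["ACGT", "AC", "ACGTAC"])
```

Running the same function on each candidate assembly gives a like-for-like table for comparison.
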
Step 7 - Gene prediction
  • Predict novel genes
  • Final consensus gene models
Step 8 - Functional annotation
  • Identify differentially expressed genes
  • Annotate genes
  • GO terms
  • KEGG Pathways