bot4

Cards (13)

  • Genome annotation

    Identifying the locations of genes, their structures, and their functions in a genome
  • Genome annotation process
    1. Decode the sequence (sequence the genome)
    2. Identify gene locations
    3. Identify coding and non-coding regions
    4. Identify start and stop points of genes
    5. Identify functions of genes
  • Databases used for genome annotation
    • GenBank
    • EMBL
    • WormBase
    • FlyBase
  • Types of genome annotation
    • Structural annotation: Identification of genomic elements, tertiary/quaternary (3°/4°) protein structures, regulatory regions, coding regions, ORFs
    • Functional annotation: Adding biological information, gene expression, protein function
  • Structural annotation

    • Differentiate coding and non-coding regions
    • Identify start and stop codons
  • Structural annotation methods
    • Experimental data like expressed sequence tags (ESTs)
    • Bioinformatic analyses (ab initio)
  • Ab initio annotation
    Annotation methods that start with just the sequence to be annotated
  • Open reading frame (ORF)

    A portion of a genome that contains a sequence of bases that could potentially encode a protein, located between start and stop codons
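    The definition above can be illustrated with a minimal sketch that scans the three forward reading frames for stretches running from a start codon (ATG) to the first in-frame stop codon. The function name `find_orfs` and the `min_codons` threshold are assumptions made for this example; real gene finders also scan the reverse strand and use statistical models.

```python
# Toy ORF scanner: forward strand only, ATG as the only start codon.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=2):
    """Return sorted (start, end, orf) tuples; end is exclusive.

    min_codons counts every codon in the ORF, including start and stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the currently open ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, seq[start:i + 3]))
                start = None  # close this ORF and look for the next ATG
    return sorted(orfs)

if __name__ == "__main__":
    for start, end, orf in find_orfs("CCATGAAATTTTAGGG"):
        print(start, end, orf)
```

    In the example sequence, the only ORF lies in the third forward frame, so a scan of a single frame would miss it.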
  • Reading frames
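    A sequence is read one codon (nucleotide triplet) at a time; depending on where reading starts, each strand has three possible reading frames, giving six for double-stranded DNA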
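    As a rough sketch of what reading per codon means, the snippet below splits a sequence into triplets for each of the three forward frames and for the three frames of the reverse complement. The helper names (`codons`, `reverse_complement`, `six_frames`) and the "+1"/"-1" frame labels are assumptions made for this example.

```python
def codons(seq, frame=0):
    """Split seq into complete codons, starting at offset frame (0, 1 or 2)."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def six_frames(seq):
    """Map frame labels (+1..+3, -1..-3) to the codon list read in that frame."""
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = codons(seq, offset)
        frames[f"-{offset + 1}"] = codons(rc, offset)
    return frames

if __name__ == "__main__":
    for label, triplets in six_frames("ATGAAATAG").items():
        print(label, triplets)
```

    Shifting the start by one base changes every codon, which is why the same stretch of DNA can look like a gene in one frame and not in another.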
  • Useful ORFs
    • Identified by homology-based approaches: BLAST searches and sequence alignments against known genes, taking the degeneracy of codons into account
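    Why codon degeneracy matters for homology searches can be shown with a toy example: two DNA sequences that use different synonymous codons differ at the nucleotide level but translate to the same protein. This is only an illustration, not how BLAST works; the two sequences are made up, and `translate` and `percent_identity` are hypothetical helpers that assume the standard genetic code and equal-length, ungapped sequences.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(seq):
    """Translate DNA in frame 0; '*' marks a stop codon."""
    seq = seq.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))

def percent_identity(a, b):
    """Percentage of matching positions between two equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)

if __name__ == "__main__":
    dna_a = "ATGCTTTCAAGATAA"  # made-up example sequences
    dna_b = "ATGTTGAGCCGGTGA"
    print("DNA identity:    ", percent_identity(dna_a, dna_b))
    print("Protein identity:", percent_identity(translate(dna_a), translate(dna_b)))
```

    Comparing at the protein level can therefore reveal homology that a nucleotide comparison understates.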
  • Predicting gene functions
    Using loss-of-function approaches: gene knockdown (e.g. RNAi) and gene knockout (e.g. CRISPR)
  • Prokaryotic genes
    • Small genomes with high gene density, generally no introns, genes organized into operons (one transcript, many genes), and genes that correspond to continuous open reading frames (ORFs)
  • NCBI Prokaryotic Genome Annotation Pipeline (PGAP)
    Designed to annotate bacterial and archaeal genomes