9 DNA Sequencing and Bioinformatics

  • DNA sequencing is the determination of the exact sequence of nucleotide bases in a DNA molecule
  • Next-generation sequencing (NGS) is a high-throughput technology that sequences many DNA molecules in parallel
  • Direct sequencing includes:
    • Manual sequencing
    • Chemical (Maxam-Gilbert) sequencing
    • Dideoxy (Sanger) sequencing
    • Automated fluorescent sequencing
  • Chemical (Maxam-Gilbert) sequencing was developed in the late 1970s by Allan M. Maxam and Walter Gilbert
    • Requires a double- or single- stranded version of the DNA region to be sequenced, with one end radioactively labeled
    • End-labelled DNA fragments are subjected to random cleavage at adenine, cytosine, guanine, or thymine positions using specific chemical agents
  • Chemical sequencing procedure:
    • The template can be dsDNA or ssDNA
    • One end of one strand is labelled with 32P (phosphorus-32)
    • Alkaline phosphatase removes the terminal phosphate, which is then replaced with 32P
    • The DNA strand is cleaved at specific bases by chemical reactions
  • Dimethyl sulfate (DMS) cleaves at G only
  • DMS + formic acid cleaves at A + G
  • Hydrazine cleaves at C + T
  • Hydrazine + NaCl cleaves at C only
  • Chemical sequencing tubes:
    • First tube → DMS
    • Second tube → Formic acid
    • Third tube → Hydrazine
    • Fourth tube → Hydrazine + salt (NaCl)
    • The fragments in every tube are radioactively labeled
  • Chemical sequencing
    • To visualize the fragments, the gel is exposed to X-ray film for autoradiography
    • A series of dark bands shows the location of radiolabeled DNA molecules
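
The way the four lanes are read off the autoradiograph can be sketched in code. This is a minimal illustration, not a lab protocol: it assumes one lane per reaction (G, A+G, C+T, C), idealized bands, and positions counted from the labeled end. The function names `lanes_from_sequence` and `read_autoradiograph` are made up for this sketch.

```python
def lanes_from_sequence(seq):
    """Simulate which positions give a band in each of the four reaction lanes."""
    lanes = {"G": set(), "A+G": set(), "C+T": set(), "C": set()}
    for pos, base in enumerate(seq, start=1):
        if base == "G":
            lanes["G"].add(pos)
            lanes["A+G"].add(pos)
        elif base == "A":
            lanes["A+G"].add(pos)
        elif base == "C":
            lanes["C"].add(pos)
            lanes["C+T"].add(pos)
        elif base == "T":
            lanes["C+T"].add(pos)
        else:
            raise ValueError(f"unexpected base: {base!r}")
    return lanes


def read_autoradiograph(lanes, length):
    """Infer the base at each position from the combination of lanes with a band."""
    bases = []
    for pos in range(1, length + 1):
        if pos in lanes["G"] and pos in lanes["A+G"]:
            bases.append("G")   # band in both purine lanes
        elif pos in lanes["A+G"]:
            bases.append("A")   # band in A+G only
        elif pos in lanes["C"] and pos in lanes["C+T"]:
            bases.append("C")   # band in both pyrimidine lanes
        elif pos in lanes["C+T"]:
            bases.append("T")   # band in C+T only
        else:
            bases.append("N")   # no band: base cannot be called
    return "".join(bases)
```

For example, `read_autoradiograph(lanes_from_sequence("GATTACA"), 7)` recovers the original sequence, because A is identified by a band in the A+G lane with no band in the G lane, and T by a band in the C+T lane with no band in the C lane.
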
  • Dideoxy (Sanger) sequencing is an enzymatic method
    • Also known as the chain-termination method
    • Developed by Fred Sanger in 1977
    • Widely used for sequencing individual pieces of DNA
    • Gold standard for DNA sequencing
  • Dideoxy sequencing
    • Used to sequence individual pieces of DNA, such as bacterial plasmids or DNA copied in PCR
    • Gives high-quality sequence for relatively long stretches of DNA
    • Expensive and inefficient for larger-scale projects
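
The chain-termination idea can be sketched as a small simulation. It assumes the template is written 3'→5' so the new strand is its base-by-base complement read 5'→3', and that every possible terminated fragment is present. The names `sanger_reactions` and `read_sanger_gel` are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def sanger_reactions(template_3to5):
    """Return, for each ddNTP, the lengths of the fragments it terminates."""
    new_strand = "".join(COMPLEMENT[base] for base in template_3to5)
    reactions = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        # A fragment of this length ends in the ddNTP matching this base.
        reactions[base].append(length)
    return reactions


def read_sanger_gel(reactions):
    """Order all fragments by length and read off their terminal bases."""
    bands = [
        (length, base)
        for base, lengths in reactions.items()
        for length in lengths
    ]
    return "".join(base for _, base in sorted(bands))
```

Reading the bands from shortest to longest gives the newly synthesized strand, which is the complement of the template.
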
  • Automated fluorescent sequencing uses the same chemistry as manual sequencing, with double-stranded templates and cycle sequencing.
    • Two approaches to automated fluorescent sequencing:
    • Dye primer sequencing
    • Dye terminator sequencing
    • Goal: to label the fragments synthesized during the sequencing reaction according to their terminal ddNTP
  • Automated fluorescent sequencing color reactions:
    • ddATP → green dye
    • ddCTP → blue dye
    • ddGTP → black or yellow dye
    • ddTTP → red dye
  • Dye terminator sequencing is performed with one of the four fluorescent dyes attached to each of the ddNTPs instead of the primer.
    • Primer is unlabeled
    • Advantage: all four sequencing reactions are performed in the same tube (or well of a plate) instead of in four separate tubes
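
Because each ddNTP carries its own dye, base calling reduces to a color lookup. The sketch below uses the color assignments from these cards and assumes the detector output is already an ordered list of color names, which is a simplification. `DYE_TO_BASE` and `call_bases` are illustrative names.

```python
# Color-to-base assignments as listed in the cards above.
DYE_TO_BASE = {
    "green": "A",
    "blue": "C",
    "black": "G",
    "yellow": "G",
    "red": "T",
}


def call_bases(peak_colors):
    """Translate an ordered list of detected dye colors into a base sequence."""
    # Unknown colors are reported as N instead of raising an error.
    return "".join(DYE_TO_BASE.get(color.lower(), "N") for color in peak_colors)
```

For instance, peaks detected in the order green, red, black, blue would be called as `ATGC`.
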
  • Pyrosequencing is a next-generation sequencing method based on the sequencing-by-synthesis principle.
    • Relies on the detection of pyrophosphate release upon nucleotide incorporation
    • No gels, fluorescent dyes, or ddNTPs
  • In pyrosequencing, every nucleotide incorporation releases a pyrophosphate ion (PPi).
    • dNTPs are added one at a time
    • The released PPi reacts with the substrates in the mixture (ATP sulfurylase converts it to ATP, which luciferase uses to oxidize luciferin), yielding oxyluciferin and light.
    • The light signals that the dNTP just added is the next base in the sequence.
  • Apyrase degrades excess dNTPs into dNDPs, dNMPs, and phosphate
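
The one-dNTP-at-a-time logic can be sketched as a simulated pyrogram. The sketch assumes the light signal is exactly proportional to the number of bases incorporated per dispensation and that the template is written 3'→5'. The function name, the dispensation order, and the `max_cycles` limit are assumptions made for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def pyrogram(template_3to5, dispensation_order="ACGT", max_cycles=50):
    """Return (dNTP, signal) pairs; signal = bases incorporated in that dispensation."""
    to_synthesize = "".join(COMPLEMENT[base] for base in template_3to5)
    position = 0
    signals = []
    for _ in range(max_cycles):
        for nucleotide in dispensation_order:
            if position >= len(to_synthesize):
                return signals
            incorporated = 0
            # A homopolymer run is incorporated in one dispensation,
            # releasing one PPi (and one unit of light) per base.
            while (position < len(to_synthesize)
                   and to_synthesize[position] == nucleotide):
                incorporated += 1
                position += 1
            signals.append((nucleotide, incorporated))
    return signals
```

A signal of 0 means the dispensed dNTP was not the next base, and a signal of 2 or more indicates a homopolymer run.
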
  • Next generation sequencing is a powerful platform that has enabled the sequencing of thousands to millions of DNA molecules simultaneously.
    • Ultra-high throughput (massive parallel sequencing)
    • Instruments with the capacity to sequence larger volumes of samples
    • Makes sequencing much faster and less expensive.
  • Adapter ligation is required for amplification in next-generation sequencing
  • A read is the sequence of nucleotides obtained from one sequenced fragment
  • Single-end sequencing reads a fragment from only one end (one strand)
  • Paired-end sequencing reads a fragment from both ends (one read from each strand)
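
The difference between the two read types can be sketched on a single fragment. The sketch assumes error-free reads of fixed length and uses the common convention of reporting the second read as the reverse complement of the fragment. The function names are illustrative.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def single_end_read(fragment, read_length):
    """Read only from one end of the fragment."""
    return fragment[:read_length]


def paired_end_reads(fragment, read_length):
    """Read from both ends: read 1 from the forward strand, read 2 from the reverse strand."""
    read1 = fragment[:read_length]
    read2 = reverse_complement(fragment)[:read_length]
    return read1, read2
```

With the fragment `AACCGGTTAC` and a read length of 4, the single-end read is `AACC`, while the paired-end reads are `AACC` and `GTAA`.
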
  • Principle used by ROCHE 454: Pyrosequencing
  • Principle used by ILLUMINA: Sequencing by synthesis
  • Principle used by ABI SOLiD: Sequencing by ligation
  • Principle used by ION TORRENT SYSTEMS: Ion semiconductor sequencing
  • NGS technologies
    • All NGS platforms require a library obtained either by amplification or by ligation with custom adaptor sequences.
    • These adaptor sequences allow the library to hybridize to the sequencing chip and provide universal priming sites for the sequencing primers.
    • The adaptors thereby facilitate sequencing
  • NGS technologies
    • Each library fragment is amplified on a solid surface, such as beads or a flat silicon surface, to which it hybridizes.
    • This amplification creates clusters of DNA, each originating from a single library fragment, and each cluster acts as an individual sequencing reaction.
    • Each machine has its own unique cycling conditions.
  • NGS technologies
    • Each machine provides raw data at the end of the sequencing run.
    • This raw data is a collection of the DNA sequences generated at each cluster, and it can be analyzed further to provide more meaningful results
  • Pyrosequencing generates long read lengths
    • High reagent cost
    • High error rate in homopolymer runs of 6 or more bases
    • e.g., Roche 454
  • Roche 454 was the first NGS platform, introduced to the market in 2005
  • Sequencing by synthesis uses the step-by-step incorporation of fluorescently labeled, reversibly terminated nucleotides for DNA sequencing
    • e.g., Illumina
  • Sequencing by ligation does not use a DNA polymerase; instead, it relies on oligonucleotide ligation and detection.
    • Applied Biosystems: SOLiD (Sequencing by Oligonucleotide Ligation and Detection)
  • Ion semiconductor sequencing uses the release of hydrogen ions during the sequencing reaction to detect the sequence of a cluster.
    • Measures the release of H+
    • More cost effective and time efficient
    • e.g., Ion Torrent system
  • The Ion Torrent system's principle is like pyrosequencing, but it measures hydrogen ions instead of pyrophosphate
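
Turning per-flow signals back into bases can be sketched as follows. The sketch assumes the signal has already been normalized so that 1.0 corresponds to one incorporated base, and that nucleotides are flowed in a fixed repeating order. The flow order `TACG` in the example and the name `decode_flows` are assumptions for illustration only.

```python
def decode_flows(flow_order, signals):
    """Convert normalized per-flow signals into the synthesized base sequence."""
    bases = []
    for index, signal in enumerate(signals):
        nucleotide = flow_order[index % len(flow_order)]
        # Round to the nearest whole number of incorporated bases;
        # negative noise is treated as zero incorporations.
        count = max(round(signal), 0)
        bases.append(nucleotide * count)
    return "".join(bases)
```

For example, `decode_flows("TACG", [1.02, 0.05, 2.1, 0.9, 0.0, 1.1])` gives `TCCGA`. Rounding becomes less reliable as the signal grows, which is one way to see why long homopolymer runs are hard to call with this kind of chemistry.
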
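  • The raw output of NGS is in the BCL (binary base call) file format, as produced by Illumina instruments
    • The raw data must be separated by sample (demultiplexed) because the machines sequence many fragments in massively parallel runs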
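
Separating pooled reads by sample, often called demultiplexing, can be sketched as below. This is a simplified illustration that works on plain read strings with the barcode at the start of each read. It does not parse BCL files, and the barcodes and sample names are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample, barcode_length):
    """Group reads by the sample whose barcode they start with."""
    by_sample = defaultdict(list)
    for read in reads:
        barcode = read[:barcode_length]
        insert = read[barcode_length:]
        # Reads with an unknown barcode are kept in a separate bin.
        sample = barcode_to_sample.get(barcode, "undetermined")
        by_sample[sample].append(insert)
    return dict(by_sample)
```

In practice this step is done by the instrument vendor's software. The sketch only shows the grouping idea.
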
  • Bioinformatics is an interdisciplinary research area at the interface between computer science and biological science
    • Involves technology that uses computers for the storage, retrieval, manipulation, and distribution of information related to biological macromolecules such as DNA, RNA, and proteins
  • A database is a computerized archive used to store and organize data in such a way that information can be retrieved easily via a variety of search criteria
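
Retrieval by search criteria can be illustrated with a tiny in-memory database built with Python's standard `sqlite3` module. The table layout, accession numbers, and records are invented for this sketch and do not correspond to any real biological database.

```python
import sqlite3


def build_archive(records):
    """Create an in-memory archive of (accession, organism, molecule, sequence) records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sequences ("
        "accession TEXT PRIMARY KEY, organism TEXT, molecule TEXT, sequence TEXT)"
    )
    conn.executemany("INSERT INTO sequences VALUES (?, ?, ?, ?)", records)
    conn.commit()
    return conn


def search(conn, organism=None, molecule=None):
    """Return accessions matching whichever search criteria are given."""
    query = "SELECT accession FROM sequences WHERE 1=1"
    params = []
    if organism is not None:
        query += " AND organism = ?"
        params.append(organism)
    if molecule is not None:
        query += " AND molecule = ?"
        params.append(molecule)
    query += " ORDER BY accession"
    return [row[0] for row in conn.execute(query, params)]
```

The same archive can be queried by organism, by molecule type, or by both, which is the point of organizing the data so that it can be retrieved via a variety of search criteria.
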