Manipulating Genomes


  • What is DNA sequencing?
    Identifying the base sequence of a DNA fragment
  • How have sequencing methods changed over time?
    • Sequencing used to be a manual process, but it is now automated
    • Entire genomes can now be read
    • DNA sequencing allows for the nucleotide base sequence of an organism's genetic material to be identified and recorded
  • Advances in technology have enabled the development of high-throughput sequencing methods, which allow scientists to rapidly sequence the genomes of organisms
    • The use of a method called capillary electrophoresis enables the chain termination method to be carried out in a high-throughput way
    • The newest high-throughput methods do not involve electrophoresis and are known as next-generation sequencing methods, e.g. nanopore sequencing and pyrosequencing
    • In the 1970s the chain termination method of sequencing was developed by Frederick Sanger and his colleagues
    • The chain termination method is also known as Sanger sequencing
  • The chain termination method of DNA sequencing uses modified nucleotides called dideoxynucleotides
    • Dideoxynucleotides can pair with nucleotides on the template strand during DNA replication
    • They will pair with nucleotides that have a complementary base
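    • When DNA polymerase adds a dideoxynucleotide to the developing strand, no further nucleotides can be joined on (a dideoxynucleotide lacks the 3' hydroxyl group needed to bond to the next nucleotide), so replication of that chain stops; hence this method of sequencing is referred to as the chain termination method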
  • Once the dideoxynucleotide is added to the developing strand DNA polymerase stops the replication of the developing DNA strand to produce a shortened DNA chain
    A) single
    B) primer
    C) polymerase
    D) dideoxynucleotide
    E) stop
  • Chain termination method (1):
    • 4 test tubes are prepared, each containing the DNA to be sequenced (in the form of a single-stranded template), DNA polymerase, DNA primers, free nucleotides A, C, T, and G, and 1 of the 4 types of dideoxynucleotide: A*, C*, T*, or G*
    • Test tubes incubated at a temperature that allows DNA polymerase to function
    • Primer anneals to the start of the single-stranded template, producing a short section of double-stranded DNA at the start of the sequence
    • DNA polymerase attaches to the double-stranded section and begins DNA replication using the free nucleotides in the test tube
  • Chain termination method (2):
    • At any time, DNA polymerase can insert one of the dideoxynucleotides by chance which results in the termination of DNA replication
    • Because each test tube only contains 1 type of dideoxynucleotide, it is possible to know what the terminal nucleotide of each fragment is (i.e. if the test tube contains A*, then researchers will know that the final nucleotide of every chain in that test tube is A)
    • Because the point at which the dideoxynucleotide is inserted varies with every strand, complementary DNA chains of varying lengths are produced
  • Chain termination method (3):
    • The new, complementary, DNA chains are separated from template DNA
    • Resulting single-stranded DNA chains are separated according to length using gel electrophoresis
    • Gel will have 4 wells, one each for A*, C*, T*, and G*
    • A fragment with only 1 nucleotide will travel all the way to the bottom of the gel, and every band above this on the gel represents the addition of 1 more base.
    • This allows the base sequence to be built up one base at a time
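The three chain termination steps above can be sketched in a few lines of Python. This is only an illustrative model, not a real laboratory protocol: the template `TACGGATC` is a made-up example, the function names are hypothetical, strand direction (5'/3') is ignored, and fragment lengths are counted from the first nucleotide added after the primer.

```python
# Simplified model of the chain termination method (assumptions noted above).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def developing_strand(template):
    """Bases of the new strand, in the order DNA polymerase adds them."""
    return "".join(COMPLEMENT[base] for base in template)

def terminated_fragments(template):
    """For each tube (A*, C*, G*, T*), list the chain lengths it can contain."""
    tubes = {"A": [], "C": [], "G": [], "T": []}
    for position, base in enumerate(developing_strand(template), start=1):
        # A chain stopped by base* at this position is `position` bases long.
        tubes[base].append(position)
    return tubes

def read_gel(tubes):
    """Read the bands from shortest to longest to rebuild the sequence."""
    bands = [(length, base) for base, lengths in tubes.items() for length in lengths]
    return "".join(base for _, base in sorted(bands))

tubes = terminated_fragments("TACGGATC")
print(tubes)            # {'A': [1, 7], 'C': [4, 5], 'G': [3, 8], 'T': [2, 6]}
print(read_gel(tubes))  # ATGCCTAG
```

Each tube holds only chains ending in its own dideoxynucleotide, so sorting every band by length is enough to recover the sequence one base at a time.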
  • Gel electrophoresis - a method to separate fragments of DNA by length by applying a voltage across a gel matrix. DNA fragments are negatively charged, so move through the gel towards the positive electrode. Smaller fragments travel faster through the gel, so will travel further in a given amount of time.
  • High-throughput sequencing (1):
    • Each type of dideoxynucleotide is labelled using a specific fluorescent dye
    • Dideoxynucleotides with adenine base (ddNA) labelled green
    • Dideoxynucleotides with thymine base (ddNT) labelled red
    • Dideoxynucleotides with cytosine base (ddNC) labelled blue
    • Dideoxynucleotides with guanine base (ddNG) labelled yellow
    • Single-stranded DNA chains separated according to length using capillary electrophoresis
    • This has a very high resolution - capable of separating chains of DNA that vary by only 1 nucleotide in length
  • High-throughput sequencing (2):
    • A laser beam is used to illuminate all of the dideoxynucleotides, and a detector then reads the colour and position of each fluorescent signal
    • The detector feeds the information into a computer where it is stored or printed out for analysis
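How the computer might turn detector readings into a base sequence can be sketched as below. The colour-to-base mapping is the one given in these notes (real sequencing kits may use other dyes), and the `signals` list and the `call_bases` function are hypothetical examples rather than the output or software of a real machine.

```python
# Dye colours as listed in the notes above; an assumption, not a standard.
DYE_TO_BASE = {"green": "A", "red": "T", "blue": "C", "yellow": "G"}

def call_bases(signals):
    """signals: (chain_length, colour) pairs recorded by the detector."""
    # Shorter chains move faster, so they are read first.
    ordered = sorted(signals)
    return "".join(DYE_TO_BASE[colour] for _, colour in ordered)

# Hypothetical readings, deliberately given out of order.
signals = [(3, "yellow"), (1, "green"), (4, "blue"), (2, "red")]
print(call_bases(signals))  # ATGC
```

Because each colour identifies the terminal base, all four dideoxynucleotides can be run together in one capillary instead of four separate wells.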
  • Note that because capillary electrophoresis is essentially still Sanger sequencing, it cannot be referred to as next-generation sequencing
    • Increase in speed enabled by high-throughput sequencing allows scientists to sequence and analyse genomes of many organisms
    • Scientists can determine the function of sections of DNA by 'knocking out' genes to see how this affects an organism
    • Genes can be rewritten to alter their function, and then inserted into cells using genetic engineering techniques; this means that scientists can design new molecules with huge potential for drug production (synthetic biology)
    • Genome sequence data can also provide information about evolutionary relationships
  • Next-generation sequencing
    • Any method of DNA sequencing that has replaced the Sanger method is referred to as next-generation sequencing (NGS)
    • Thousands to millions of DNA molecules can be sequenced at the same time
    • NGS methods can be one thousand times faster than older methods of sequencing
    • The reduction in time required for sequencing means that costs are also greatly reduced
    • NGS methods cost roughly 0.1% of the cost of chain-termination methods
  • Nanopore sequencing is a newer method that is still being refined by scientists
    • This method of sequencing is extremely rapid and, because some devices are portable, allows sequence data to be obtained outside the lab and used for a range of applications
  • Examiners may ask you which DNA strand the base sequence has been obtained for. In Sanger sequencing methods, it is the base sequence of the developing/test strand that is being identified, not the template strand that was initially provided.
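The link between the two strands can be shown with a short sketch. The read `ATGCCTAG` is a made-up example and the function names are hypothetical. `complement` gives the template base opposite each base of the read; `reverse_complement` gives the same template strand written in the conventional 5' to 3' direction.

```python
# Translation table that swaps each base for its complementary base.
PAIRING = str.maketrans("ACGT", "TGCA")

def complement(read):
    """Template bases lined up opposite the bases of the read."""
    return read.translate(PAIRING)

def reverse_complement(read):
    """The template strand written 5' to 3' (the strands are antiparallel)."""
    return read[::-1].translate(PAIRING)

read = "ATGCCTAG"  # sequence of the developing strand, 5' to 3'
print(complement(read))          # TACGGATC
print(reverse_complement(read))  # CTAGGCAT
```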
  • Give some benefits of genome-wide comparisons.
    • Comparing between species allows us to determine evolutionary relationships
    • Comparing between individuals of the same species allows us to tailor medical treatment to the individual
  • How can DNA sequencing be used in synthetic biology?
    Knowing the sequence of a gene allows us to predict the sequence of amino acids that will make up the polypeptide it produces. This in turn allows for development of synthetic biology.
    • A genome is all of the genetic material within an organism, including all of its genes
    • Advances in technology have allowed scientists to sequence the genes within an organism's genome
    • Sequencing projects have read the genomes of a wide range of organisms from flatworms to humans
    • Genome-wide comparisons can be made between individuals and between species
    • The genetic code can be used to predict the amino acid sequence within a protein
    • Once scientists know the amino acid sequence they can predict how the new protein will fold into its tertiary structure
    • This information can be used for a range of applications, such as in synthetic biology
    • Bioinformatics is a field of biology that involves the storage, retrieval, and analysis of data from biological studies
    • These studies may generate data on DNA sequences, RNA sequences, and protein sequences, as well as on the relationship between genotype and phenotype
    • High-power computers are required to create databases
    • The large databases contain information about an organism's gene sequences and amino acid/protein sequences
    • Once a genome is sequenced, bioinformatics allows scientists to make comparisons with the genomes of other organisms using the many databases available
    • This can help to find the degree of similarity between organisms which then gives an indication of how closely related the organisms are
    • This can be useful for scientists looking for organisms that could be used in experiments as a model organism for humans
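The point above about using the genetic code to predict an amino acid sequence can be illustrated with a short sketch. The table below is only a small subset of the standard genetic code (written as DNA codons of the coding strand), and the gene sequence and the `translate` function are hypothetical examples.

```python
# Small subset of the standard genetic code; a full table has 64 codons.
CODON_TABLE = {
    "ATG": "Met", "GCC": "Ala", "TTT": "Phe", "GGA": "Gly",
    "AAA": "Lys", "CTG": "Leu",
    "TAA": "Stop", "TAG": "Stop", "TGA": "Stop",
}

def translate(coding_dna):
    """Read the sequence three bases at a time until a stop codon."""
    amino_acids = []
    for i in range(0, len(coding_dna) - 2, 3):
        codon = coding_dna[i:i + 3]
        residue = CODON_TABLE.get(codon, "???")  # ??? = codon not in subset
        if residue == "Stop":
            break
        amino_acids.append(residue)
    return amino_acids

print(translate("ATGGCCTTTGGATAA"))  # ['Met', 'Ala', 'Phe', 'Gly']
```

Predicting how the resulting chain folds into its tertiary structure is a much harder problem and is not attempted here.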
  • The nematode worm Caenorhabditis elegans is an animal that has been used as a model organism for studying the genetics of organ development, neurone development and cell death. It was the first multicellular organism to have its genome fully sequenced. Because it has few cells (fewer than 1000) and is transparent, it has been a useful model organism
  • Bioinformatics has contributed to the study of genetic variation, evolutionary relationships, genotype-phenotype relationships, and epidemiology
    • The genetic variation within a species can be investigated
    • Many individuals of the same species have their genomes sequenced and compared
    • A species that has a high level of genetic variation will exhibit a large number of differences in base sequences between individuals
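One simple way to put a number on genetic variation is to count base differences between every pair of individuals and take the mean. This is a minimal sketch: the sequences are invented, they are assumed to be already aligned and of equal length, and the function names are hypothetical.

```python
from itertools import combinations

def count_differences(seq_a, seq_b):
    """Number of positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))

def mean_pairwise_differences(sequences):
    """Average number of differences over every pair of individuals."""
    pairs = list(combinations(sequences, 2))
    return sum(count_differences(a, b) for a, b in pairs) / len(pairs)

# Invented data: three individuals from each of two species.
low_variation = ["ATGCCTAG", "ATGCCTAG", "ATGCCTAA"]
high_variation = ["ATGCCTAG", "TTGCATAG", "ATGACTCG"]

print(mean_pairwise_differences(low_variation))
print(mean_pairwise_differences(high_variation))
```

The second species gives the larger value, matching the idea that high genetic variation shows up as many base differences between individuals. The same count can be applied between species when comparing evolutionary relationships.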
  • The evolutionary relationships between species can be investigated by comparing the genomes of different species
    • Species with a small number of differences between their genomes are likely to share a more recent common ancestor than species with a large number of differences
    • The protein cytochrome c is involved in respiration, and so is found in a large number of species (including plants, animals, and unicellular organisms). For this reason it is especially useful for making comparisons between different species
    • Genome sequencing can aid the understanding of gene function and interaction
    • Genotype-phenotype relationships are explored by "knocking out" different genes (stopping their expression) and observing the effect it has on the phenotype of an organism
    • When an organism's genome sequence is known, scientists can target specific base sequences to knock out
    • Epidemiologists study the spread of infectious disease within populations
    • The genomes of pathogens can be sequenced and analysed to aid research and disease control
    • Highly infectious strains can be identified
    • E.g. the Delta variant of SARS-CoV-2 (a well-known coronavirus)
    • The ability of a pathogen to infect multiple species can be investigated
    • E.g. Ebola virus can infect non-human primates as well as humans
    • The most appropriate control measures can be implemented based on the data provided
    • Potential antigens for use in vaccine production can be identified
  • Genome comparison in action: The Human Genome Project
    • A genome project works by collecting DNA samples from many individuals of a species. These DNA samples are then sequenced and compared to create a reference genome
    • More than one individual is used to create the reference genome as one organism may have anomalies/mutations in its DNA sequence that are atypical of the species
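The reason for using many individuals can be shown with a toy consensus calculation, in which the most common base at each position is kept. This is a heavily simplified sketch: real reference genomes are assembled with far more sophisticated methods, and the sample sequences and the `consensus` function are invented for illustration.

```python
from collections import Counter

def consensus(sequences):
    """Most common base at each position of equal-length, aligned sequences."""
    # zip(*sequences) groups the bases found at the same position.
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*sequences)
    )

# Invented samples: individuals 3 and 4 each carry one atypical base.
samples = ["ATGCCTAG", "ATGCCTAG", "ATGACTAG", "ATGCCTTG", "ATGCCTAG"]
print(consensus(samples))  # ATGCCTAG
```

The atypical bases are outvoted by the other individuals, so they do not end up in the reference sequence.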
  • Applications of the Human Genome Project
    • The information generated from the HGP has been used to tackle human health issues with the end goal of finding cures for diseases
    • Scientists have noticed a correlation between changes in specific genes and the likelihood of developing certain inherited diseases
    • For example, several genes within the human genome have been linked to increased risk of certain cancers
    • There have also been specific genes linked to the development of Alzheimer's disease
  • Proteome: The full range of proteins produced by the genome.
  • Determining the proteome of humans is difficult as large amounts of non-coding DNA are present in human genomes
    • It can be very hard to distinguish these non-coding sections of DNA from the coding DNA
    • The presence of regulatory genes and the process of alternative splicing in human genomes also affects gene expression and the synthesis of proteins
    • The proteome is larger than the genome (there are more different proteins than there are genes) due to:
    • Alternative splicing
    • Post-translational modification of proteins (often takes place in the Golgi apparatus)
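Why alternative splicing makes the proteome larger than the genome can be seen by counting the mRNA versions one gene could give. The sketch below uses a deliberately simple model in which the first and last exons are always kept and any internal exon may be skipped; this is an assumption for illustration, since real splicing is regulated and not every combination occurs. The exon names and the `splice_variants` function are hypothetical.

```python
from itertools import combinations

def splice_variants(exons):
    """All exon combinations that keep the first and last exon in order."""
    first, *internal, last = exons
    variants = []
    for number_kept in range(len(internal) + 1):
        # combinations() preserves the original order of the exons.
        for kept in combinations(internal, number_kept):
            variants.append([first, *kept, last])
    return variants

for variant in splice_variants(["E1", "E2", "E3", "E4"]):
    print("-".join(variant))
# E1-E4
# E1-E2-E4
# E1-E3-E4
# E1-E2-E3-E4
```

In this model a single gene with four exons could give four different mRNAs, and therefore up to four different polypeptides.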
  • Alternative splicing allows for a single gene to produce multiple proteins
    • Synthetic biology is a recent area of research that aims to create new biological parts, devices, and systems, or to redesign systems that already exist in nature
    • It goes beyond genetic engineering, as it involves large alterations to an organism's genome. This new genome can cause a cell to operate in a novel way, not seen before
    • The assembly of the new genome can be done using existing DNA sequences or using entirely new sequences
    • These new sequences can be designed and written (using special computer programs) so that they produce specific proteins