Bioinformatics

Created by

sop

Cards (25)

Bioinformatics - field that uses computers to store and analyze molecular biological information.
Bioinformatics is the marriage of biology and informatics.
Bioinformatics is about finding and interpreting biological data online.
Bioinformatics is an interdisciplinary field that combines several disciplines:
Computer Science
Statistics
Mathematics
Biology
Information Technology
3 Principal Components
Creation of databases - allows storage and management of large biological data sets
Development of algorithms and statistics
Use of these tools for the analysis and interpretation of various types of biological data
Branches of Bioinformatics
Transcriptomics - study of the RNA molecules (transcripts) of living organisms
Microbiomics - study of the genomes of bacteria, viruses, fungi, or parasites
Metabolomics - study of metabolites and the chemical processes involving them
Genomics - study of the complete DNA (genome) of an organism
Proteomics - study of the sequence, 3D structure, and other properties of proteins
Bioinformatics Applications
Retrieving DNA sequences from databases
Computing nucleotide compositions
Identifying restriction sites
Designing polymerase chain reaction (PCR) primers
Identifying open reading frames (ORFs)
Predicting elements of DNA/RNA secondary structure
Finding repeats
Computing the optimal alignment between two or more DNA sequences
Finding polymorphic sites
Assembling sequence fragments
Creation and visualization of 3D structure models
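A few of the applications listed above (computing nucleotide composition, identifying restriction sites, identifying ORFs) can be sketched in plain Python. This is a minimal illustration, not how any particular tool does it: the function names, the example sequence, and the choices to scan only the three forward frames and to require an ATG start are assumptions made for this example.

```python
from collections import Counter

STOP_CODONS = {"TAA", "TAG", "TGA"}

def nucleotide_composition(seq):
    """Count each base and return (counts, GC fraction)."""
    seq = seq.upper()
    counts = Counter(seq)
    gc = (counts["G"] + counts["C"]) / len(seq) if seq else 0.0
    return dict(counts), gc

def find_sites(seq, site):
    """Return 0-based start positions of every occurrence of a recognition site."""
    seq, site = seq.upper(), site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)  # allow overlapping hits
    return positions

def find_orfs(seq, min_codons=2):
    """Scan the three forward frames for ATG...stop stretches.

    Returns (start, end) pairs, 0-based, end exclusive, stop codon included.
    min_codons counts the codons before the stop (ATG included).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] == "ATG":
                for j in range(i + 3, len(seq) - 2, 3):
                    if seq[j:j + 3] in STOP_CODONS:
                        if (j - i) // 3 >= min_codons:
                            orfs.append((i, j + 3))
                        i = j  # continue scanning after this stop codon
                        break
            i += 3
    return sorted(orfs)

# Made-up example sequence containing two EcoRI sites (GAATTC) and one short ORF
example = "GAATTCATGAAATGAGAATTC"
counts, gc = nucleotide_composition(example)
sites = find_sites(example, "GAATTC")
orfs = find_orfs(example)
```

Real ORF finders also scan the reverse complement and usually apply a much larger minimum length.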
In silico - virtual experimentation, done on a computer instead of in a real laboratory
Ex: primer design
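As a small in silico illustration of primer design, the sketch below checks two properties commonly looked at for a candidate primer: GC fraction and an approximate melting temperature. It uses the Wallace rule, Tm = 2(A+T) + 4(G+C), which is only a rough estimate for short oligonucleotides. The function names and the example primer are assumptions made for this example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def gc_fraction(primer):
    """Fraction of bases in the primer that are G or C."""
    p = primer.upper()
    return (p.count("G") + p.count("C")) / len(p)

def wallace_tm(primer):
    """Approximate melting temperature in degrees C using the Wallace rule."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc

# Made-up 14-base candidate primer
primer = "ATGCGTACGTTAGC"
tm = wallace_tm(primer)
gc = gc_fraction(primer)
rc = reverse_complement(primer)
```

A reverse primer is written as the reverse complement of the target strand, which is why `reverse_complement` is included.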
Earliest DNA Sequence and Protein Databases
International Nucleotide Sequence Database Collaboration (INSDC)
GenBank (from NCBI)
EMBL (European Molecular Biology Lab) from EBI
DDBJ (DNA DataBank of Japan)
Worldwide Protein Data Bank (wwPDB)
PDBj (Japan)
PDBe (Europe)
RCSB PDB (USA)
Ensembl - an automatic annotation database that determines the exon and intron boundaries of eukaryotic genes.
True 
(T/F) GenBank can provide the nucleotide and protein sequences of organisms
Data Inclusions in GenBank:
Number of base pairs
Accession Number
Organism
Sources
Authors
Nucleotide or Protein Sequence
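The fields listed above appear as labeled lines in a GenBank flat-file record (LOCUS, ACCESSION, SOURCE/ORGANISM, AUTHORS, ORIGIN). The sketch below pulls them out of a tiny record using only the standard library. The record itself is made up for illustration (it is not a real GenBank entry), and `parse_record` is a hypothetical helper that handles only these few fields. For real records, a dedicated parser is the safer choice.

```python
import re

# Made-up, heavily shortened record in GenBank flat-file layout
RECORD = """\
LOCUS       TOY0001                   24 bp    DNA     linear   SYN 01-JAN-2000
ACCESSION   TOY0001
SOURCE      synthetic construct
  ORGANISM  synthetic construct
REFERENCE   1
  AUTHORS   Doe,J.
ORIGIN
        1 atggcgtacg ttagcgaatt ctga
//
"""

def parse_record(text):
    """Extract length, accession, source, organism, authors, and sequence."""
    info = {}
    seq_parts = []
    in_origin = False
    for line in text.splitlines():
        if line.startswith("//"):  # end-of-record marker
            break
        if in_origin:
            # Sequence lines carry position numbers and spaces; keep letters only
            seq_parts.append(re.sub(r"[^A-Za-z]", "", line))
            continue
        key, _, value = line.strip().partition(" ")
        value = value.strip()
        if key == "LOCUS":
            match = re.search(r"(\d+) bp", value)
            info["length_bp"] = int(match.group(1)) if match else None
        elif key == "ACCESSION":
            info["accession"] = value
        elif key == "SOURCE":
            info["source"] = value
        elif key == "ORGANISM":
            info["organism"] = value
        elif key == "AUTHORS":
            info["authors"] = value
        elif key == "ORIGIN":
            in_origin = True
    info["sequence"] = "".join(seq_parts).upper()
    return info

info = parse_record(RECORD)
```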
Features of GenBank
Pick Primers - for designing of primers
Run BLAST - to identify query sequences
Find in This Sequence
PDB is the main database of 3D structures of proteins and nucleic acids, and is used in 3D structure prediction
Sequence Alignment - a way of arranging sequences of DNA, RNA, or protein to identify regions of similarity
Query Sequence - unknown sequence
Reference Sequence - known sequence
Importance of regions of similarity:
To understand functional, structural, or evolutionary relationships between the sequences
It may also help identify dissimilar regions of the DNA sequence, which are useful for designing primers
Types of Sequence Alignment
Pairwise - compare 2 sequences
Multiple - compare 3 or more sequences
Types of Pairwise Alignment
Global Alignment - matching the residues (bases or amino acids) of two sequences across their entire length
Local Alignment - matching the regions of two sequences that are most similar to each other
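The difference between the two types can be seen in how the alignment score is computed. The sketch below uses the classic dynamic-programming approaches: Needleman-Wunsch for global alignment and Smith-Waterman for local alignment. The scoring values (match +1, mismatch -1, gap -1) and the function names are assumptions chosen for this example. Real tools use substitution matrices and more elaborate gap penalties.

```python
def global_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch: best score aligning a and b end to end."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Leading gaps are penalized because the whole length must be aligned
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    return score[-1][-1]

def local_score(a, b, match=1, mismatch=-1, gap=-1):
    """Smith-Waterman: best score over any pair of subsequences."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Scores never drop below zero, so a poor region can be abandoned
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            best = max(best, score[i][j])
    return best

# Made-up sequences that share only the core "GATTACA"
same = global_score("GATTACA", "GATTACA")
one_gap = global_score("ACGT", "AGT")
core = local_score("TTTGATTACATTT", "CCGATTACACC")
```

The local score picks out the shared core and ignores the unrelated flanks, whereas a global alignment of the same pair would be penalized for them.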
False 
(T/F) In EMBOSS water, dissimilar bases are indicated by an asterisk
MUSCLE – Multiple Sequence Comparison by Log Expectation
MAFFT – Multiple Alignment using Fast Fourier Transform
True 
(T/F) In Clustal Omega, residues are colored and similarities are designated with asterisks
False 
(T/F) In designing primers, you have to look for a region with more similarities (asterisks)
True 
(T/F) MUSCLE uses dashes for gaps and asterisks for similarities
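The asterisk convention in multiple sequence alignment output can be reproduced with a short sketch: a column gets an asterisk only when every sequence has the same residue there and none has a gap (dash). Columns without an asterisk are the dissimilar positions that the primer-design card refers to. The function names and the three aligned sequences are made up for this example, and the other conservation symbols that Clustal uses for protein alignments (":" and ".") are not handled.

```python
def conservation_line(aligned):
    """Return a string with '*' under fully conserved columns and ' ' elsewhere."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    marks = []
    for column in zip(*aligned):
        if "-" not in column and len(set(column)) == 1:
            marks.append("*")
        else:
            marks.append(" ")
    return "".join(marks)

def variable_positions(aligned):
    """Return 0-based indices of columns that are not fully conserved."""
    line = conservation_line(aligned)
    return [i for i, mark in enumerate(line) if mark != "*"]

# Made-up aligned sequences; '-' marks a gap
alignment = ["ATGCCGTA", "ATGACGTA", "ATG-CGTA"]
line = conservation_line(alignment)
variable = variable_positions(alignment)
```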