bot3

Cards (63)

  • Bioinformatics
    The use of computers to store, retrieve, analyse or predict the composition or structure of biomolecules
  • Bioinformatics is the application of computational techniques and information technology to the organisation and management of biological data
  • Classical bioinformatics deals primarily with sequence analysis
  • Aims of bioinformatics
    • Development of databases containing all biological information
    • Development of better tools for data designing, annotation and mining
    • Design and development of drugs by using simulation software
    • Design and development of software tools for protein structure prediction, function annotation and docking analysis
    • Creation and development of software to improve tools for analysing sequences for their function and similarity with other sequences
  • Applications of bioinformatics
    • Gene therapy
    • Drug designing
    • Antibiotic resistance
    • Crop development
    • Medical biotechnology
    • Drought resistance
    • Evolutionary studies
    • Forensic analysis
    • Veterinary science
    • Weather analysis
    • Waste cleanup
  • Biological database
    A collection of biological data arranged in computer-readable form that speeds up search and retrieval and is convenient to use
  • Biological data are complex, exception-ridden, vast and incomplete
  • A good database must have updated information
  • Importance of biological database
    • Retrieve a range of information such as biological sequences, structures, binding sites, metabolic interactions, molecular action, functional relationships, protein families, motifs and homologs
  • Types of biological database
    • Nucleotide sequence database
    • Protein sequence database
    • Structure database
    • Domain and motif database
    • Gene expression database
    • Metabolic pathway database
  • Primary database
    Contains only sequence or structural information
  • Secondary database
    Derived from the analysis or treatment of primary data
  • Secondary databases are very important for inferring protein function
  • GenBank
    One of the fastest-growing repositories of known nucleotide sequences; it has a flat-file structure readable by both humans and computers
  • GenBank contains information such as accession numbers and gene names, phylogenetic classification and references to published literature
  • GenBank has been developed and maintained at the NCBI, Bethesda, MD, USA, as part of the International Nucleotide Sequence Database Collaboration (INSDC)
  • GenBank is an open-access sequence database
  • GenBank coordinates with individual laboratories and other sequence databases such as EMBL and DDBJ
  • GenBank is an annotated collection of all nucleotide sequences that are available to the public
  • The nucleotide database was divided into three databases at NCBI: CoreNucleotide, Expressed Sequence Tag (EST) and Genome Survey Sequence (GSS)
  • The CoreNucleotide database holds most of the nucleotide sequences in use; it includes all nucleotide records that are not in the EST and GSS databases
  • Sequences can be submitted to GenBank using the BankIt, Sequin and tbl2asn tools
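The flat-file structure mentioned in the GenBank cards can be illustrated with a minimal sketch. It assumes the usual layout in which keywords occupy the first 12 columns, the sequence follows an ORIGIN line, and `//` ends the record. The record itself is made up for illustration, and `parse_flat_file` is a hypothetical helper, not an NCBI tool.

```python
# Hypothetical, made-up record used only to illustrate the flat-file layout.
RECORD = """\
LOCUS       EXAMPLE01                 24 bp    DNA     linear   SYN 01-JAN-2000
DEFINITION  Made-up sequence for illustration only.
ACCESSION   EXAMPLE01
ORIGIN
        1 atggccattg taatgggccg ctga
//
"""


def parse_flat_file(text):
    """Collect keyword fields and the sequence from one flat-file record."""
    fields = {}
    sequence_parts = []
    in_origin = False
    key = None
    for line in text.splitlines():
        if line.startswith("//"):  # end-of-record marker
            break
        if in_origin:
            # Sequence lines carry a position number and space-separated blocks.
            sequence_parts.append("".join(ch for ch in line if ch.isalpha()))
        elif line.startswith("ORIGIN"):
            in_origin = True
        elif line[:12].strip():
            # A keyword in the first 12 columns starts a new field.
            key = line[:12].strip()
            fields[key] = line[12:].strip()
        elif key is not None:
            # Indented lines continue the previous field.
            fields[key] += " " + line.strip()
    fields["sequence"] = "".join(sequence_parts).upper()
    return fields


record = parse_flat_file(RECORD)
print(record["ACCESSION"], len(record["sequence"]))
```

Real records contain many more fields (for example FEATURES and REFERENCE), so an established parser is the safer choice for actual work.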
  • EMBL (European Molecular Biology Laboratory)
    A comprehensive database of DNA and RNA sequences collected from scientific literature and patent offices, and submitted directly by researchers
  • EMBL has been prepared in collaboration with GenBank (USA) and the DNA Database of Japan (DDBJ)
  • EMBL was established in 1980 and is maintained by the EBI (European Bioinformatics Institute)
  • Swiss-Prot
    A curated protein sequence database that offers a high level of integration with other databases and has a very low level of redundancy
  • Swiss-Prot strives to provide protein sequences with a high level of annotation (for instance, the description of protein function, domain structure and post-translational modifications)
  • Swiss-Prot was established in 1986 and has been maintained collaboratively, since 1987, by the Department of Medical Biochemistry of the University of Geneva and the EMBL Data Library
  • TrEMBL is a computer-annotated supplement of Swiss-Prot that contains all translations of EMBL nucleotide sequence entries that are not yet integrated in Swiss-Prot
  • Currently Swiss-Prot has 0.5 million and TrEMBL has 7.6 million sequences
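The idea of translating nucleotide entries into protein sequences, as TrEMBL does, can be sketched in a few lines. This is a minimal illustration using the standard genetic code, not the actual TrEMBL pipeline. The function name `translate` and the input sequence are assumptions made for the example.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding sequence in frame 1, stopping at the first stop codon."""
    dna = dna.upper().replace("U", "T")  # accept RNA input as well
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")  # X marks an unrecognised codon
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)


# Made-up coding sequence for illustration only.
print(translate("ATGGCCATTGTAATGGGCCGCTGA"))
```

A real annotation pipeline would also handle reading frames, alternative genetic codes and the coding-region coordinates given in the nucleotide entry.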
  • Protein Information Resource (PIR)
    An integrated public bioinformatics resource to support genomic and proteomic research and scientific studies
  • PIR offers a wide variety of resources, such as PIRSF, iProClass and iProLINK, mainly aimed at supporting the propagation and consistency of protein annotations
  • Protein sequence motif
    A set of conserved amino acid residues that are important for protein function and are located within a certain distance from one another
  • PROSITE database
    Consists of documentation entries describing protein domains, families and functional sites as well as associated patterns and profiles to identify them
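PROSITE patterns are written in a compact syntax that maps closely onto regular expressions. The sketch below converts the common elements of that syntax (`x`, `[..]`, `{..}`, repeat counts, `<` and `>` anchors) into a Python regex. The function name `prosite_to_regex` and the test sequences are assumptions made for the example. The pattern shown is the well-known N-glycosylation site pattern `N-{P}-[ST]-{P}`.

```python
import re


def prosite_to_regex(pattern):
    """Convert a simple PROSITE pattern into a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        anchored_start = element.startswith("<")  # pattern must start the sequence
        anchored_end = element.endswith(">")      # pattern must end the sequence
        element = element.strip("<>")
        # Split an optional repeat count such as (3) or (2,4) from the residue part.
        core, low, high = re.fullmatch(
            r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element
        ).groups()
        if core == "x":
            token = "."                        # any residue
        elif core.startswith("{"):
            token = "[^" + core[1:-1] + "]"    # any residue except those listed
        else:
            token = core                       # a single residue or a [..] set
        if low:
            token += "{" + low + ("," + high if high else "") + "}"
        parts.append(("^" if anchored_start else "") + token + ("$" if anchored_end else ""))
    return "".join(parts)


regex = prosite_to_regex("N-{P}-[ST]-{P}")
print(regex)
# Made-up sequences for illustration only.
print(bool(re.search(regex, "AANGSA")), bool(re.search(regex, "AANPSA")))
```

Profiles, the other kind of PROSITE descriptor, are weight matrices and cannot be expressed as a regular expression this way.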
  • PRINTS
    A database for protein fingerprints. A fingerprint is a group of conserved motifs used to characterise a protein family
  • Protein domain
    An independently folded, structurally compact unit that forms a steady three-dimensional structure and shows a certain level of evolutionary conservation
  • ProDom
    A protein domain database automatically generated from the Swiss-Prot and TrEMBL sequence databases
  • SMART
    A highly reliable and sensitive tool for domain identification
  • COG
    A database and a convenient tool for motif and domain identification
  • PDB (Protein Data Bank)
    The main primary database for 3D structures of biological macromolecules determined by X-ray crystallography and NMR