A sequence property approach to searching protein databases

Author(s): Hobohm U, Sander C

Abstract

Currently available sequence alignment programs are generally unable to detect functional and structural homologs in the twilight zone of sequence similarity, i.e. when sequence identity falls below about 25%. Here we attempt to detect such weak similarities using an approach based on a notion of protein sequence similarity radically different from that used in sequence alignment. The approach, protein property search (PropSearch), defines protein sequence dissimilarity (or distance) as a weighted sum of differences in compositional properties such as singlet and doublet amino acid composition, molecular weight, and isoelectric point. With PropSearch, either a single sequence can be used as a database query, or multiple sequences can be merged into an "average" sequence reflecting the average composition of a protein family. First, we show that members of structural protein families have a low mutual PropSearch distance when the weights are optimized to discriminate maximally between structural families. Second, we demonstrate the results of database searches using the PropSearch method. Such searches are very rapid when scanning a preprocessed database and do not require alignments. In cases in which conventional alignment tools fail to detect similarities, PropSearch can be used to generate hypotheses about possible structural or functional relationships between a new sequence and sequences in the database.
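
The distance described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that only singlet and doublet composition are used (molecular weight and isoelectric point are omitted), that the weights are uniform rather than optimized, and that a family "average" is the element-wise mean of the member vectors. All function names are hypothetical.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DOUBLETS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def property_vector(seq):
    """Return 20 singlet and 400 doublet frequencies as one flat list."""
    seq = seq.upper()
    singlets = Counter(seq)
    doublets = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n1 = max(len(seq), 1)       # avoid division by zero for empty input
    n2 = max(len(seq) - 1, 1)
    return ([singlets[a] / n1 for a in AMINO_ACIDS]
            + [doublets[d] / n2 for d in DOUBLETS])


def average_vector(seqs):
    """Assumed family 'average': element-wise mean of the member vectors."""
    vectors = [property_vector(s) for s in seqs]
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def prop_distance(u, v, weights=None):
    """Weighted sum of absolute property differences (uniform by default)."""
    if weights is None:
        weights = [1.0] * len(u)
    return sum(w * abs(a - b) for w, a, b in zip(weights, u, v))


def search(query_vec, database):
    """Rank a preprocessed database (name -> vector) by distance to the query."""
    return sorted((prop_distance(query_vec, vec), name)
                  for name, vec in database.items())
```

Because the database vectors are computed once in advance, each query needs only one pass of simple arithmetic per entry and no alignment, which is consistent with the speed claim in the abstract. In practice the weights would be fitted to separate known structural families, as the abstract describes.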
