Abstract: A detailed comparison of protein domains that belong to families and superfamilies shows that structure is better conserved than sequence during evolutionary divergence. Sequence alignments guided by structural features permit a better sampling of protein sequence space and the effective construction of libraries for fold recognition. Such alignments are useful evolutionary models for defining structure-function relationships in protein superfamilies. The PASS2 database, maintained by the authors, presents alignments of proteins that are related at the superfamily level and characterised by low sequence similarity. The number of new superfamilies increased to 47% compared with the previous PASS2 version, which shows the crucial importance of updating the PASS2 database. In the current release, the authors align protein superfamilies using a structural alignment protocol. They also introduce two alignment assessment methods, one based on the average structural deviation of domains and the other on the extent of conserved secondary structure. In addition, they integrate new structural and sequence features at the superfamily level into the database: conserved and unconserved blocks in proteins, the spatial distribution of sequences obtained by principal component analysis, and a statistical view of each superfamily. The authors suggest that highly structurally deviant superfamily members could be removed as outliers, so that such extremely distant relationships do not obscure the alignment. They report a nearly automated, updated version of the superfamily alignment database, consisting of 1776 superfamilies and 9536 protein domains, that is in direct correspondence with the SCOP (1.73) database.
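The outlier suggestion in the abstract can be illustrated with a minimal sketch. It is not the PASS2 protocol: the function names, the domain labels, the pairwise RMSD values, and the 3.0 Å cutoff are all hypothetical, chosen only to show how members with a high average structural deviation from the rest of a superfamily might be flagged.

```python
from statistics import mean


def mean_deviation_per_member(rmsd):
    """Average each member's RMSD to every other member of the superfamily.

    `rmsd` is a symmetric square matrix (list of lists) with at least two members.
    """
    n = len(rmsd)
    return [mean(rmsd[i][j] for j in range(n) if j != i) for i in range(n)]


def flag_outliers(names, rmsd, cutoff=3.0):
    """Return the names whose mean RMSD to the other members exceeds `cutoff`.

    The cutoff (in angstroms) is an illustrative assumption, not a PASS2 value.
    """
    means = mean_deviation_per_member(rmsd)
    return [name for name, m in zip(names, means) if m > cutoff]


# Hypothetical four-member superfamily with invented pairwise RMSD values.
names = ["d1", "d2", "d3", "d4"]
rmsd = [
    [0.0, 1.2, 1.5, 4.8],
    [1.2, 0.0, 1.1, 5.1],
    [1.5, 1.1, 0.0, 4.6],
    [4.8, 5.1, 4.6, 0.0],
]

print(flag_outliers(names, rmsd))  # prints ['d4']
```

In this toy input, the fourth domain deviates from every other member by more than 4 Å on average, so it is the only one flagged. A real assessment would likely also weigh the extent of conserved secondary structure, which this sketch ignores.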
Publication Year: 2011
Publication Date: 2011-10-01
Language: en
Type: article
Indexed In: crossref
Access and Citation
Cited By Count: 7