Reconstruction of protein backbones from the BriX collection of canonical protein fragments.

Title: Reconstruction of protein backbones from the BriX collection of canonical protein fragments.
Publication Type: Journal Article
Year of Publication: 2008
Authors: Baeten L, Reumers J, Tur V, Stricher F, Lenaerts T, Serrano L, Rousseau F, Schymkowitz J
Journal: PLoS Comput Biol
Volume: 4
Issue: 5
Pagination: e1000083
Date Published: 2008 May
ISSN: 1553-7358
Keywords: Amino Acid Sequence; Computer Simulation; Databases, Protein; Models, Chemical; Models, Molecular; Molecular Sequence Data; Peptide Fragments; Protein Conformation; Proteins; Sequence Analysis, Protein
Abstract

As modeling of changes in backbone conformation still lacks a computationally efficient solution, we developed a discretisation of the conformational states accessible to the protein backbone, similar to the successful rotamer approach for side chains. The BriX fragment database, consisting of fragments from 4 to 14 residues long, was realized through identification of recurrent backbone fragments from a non-redundant set of high-resolution protein structures. BriX contains an alphabet of more than 1,000 frequently observed conformations per peptide length for 6 different variation levels. Analysis of the performance of BriX revealed an average structural coverage of protein structures of more than 99% within a root mean square distance (RMSD) of 1 Å. Globally, we are able to reconstruct protein structures with an average accuracy of 0.48 Å RMSD. As expected, regular structures are well covered, but, interestingly, many loop regions that appear irregular at first glance are also found to form a recurrent structural motif, albeit with a lower frequency of occurrence than regular secondary structures. Larger loop regions could be completely reconstructed from smaller recurrent elements between 4 and 8 residues long. Finally, we observed that a significant number of short sequences tend to display strong structural ambiguity between alpha helix and extended conformations. When the sequence length increases, this so-called sequence plasticity is no longer observed, illustrating the context dependency of polypeptide structures.
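To make the coverage figure in the abstract concrete, the following is a minimal sketch of how coverage within an RMSD threshold could be computed. It is not the authors' implementation: the function names (`rmsd`, `windows`, `coverage`) are hypothetical, a backbone is reduced to one (x, y, z) point per residue, and fragments are assumed to be already superposed, whereas a real pipeline would first apply an optimal rigid-body superposition before measuring RMSD.

```python
import math


def rmsd(frag_a, frag_b):
    """RMSD between two equal-length lists of (x, y, z) points.

    Assumption: the fragments are already superposed; no fitting is done here.
    """
    if len(frag_a) != len(frag_b) or not frag_a:
        raise ValueError("fragments must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(frag_a, frag_b)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(frag_a))


def windows(backbone, length):
    """All overlapping windows of `length` consecutive residues."""
    return [backbone[i:i + length] for i in range(len(backbone) - length + 1)]


def coverage(backbone, library, length, threshold=1.0):
    """Fraction of residues inside at least one window that matches a
    library fragment within `threshold` (in the same units as the points)."""
    if not backbone:
        return 0.0
    covered = [False] * len(backbone)
    for start, window in enumerate(windows(backbone, length)):
        if any(rmsd(window, ref) <= threshold for ref in library):
            for i in range(start, start + length):
                covered[i] = True
    return sum(covered) / len(backbone)
```

With a hypothetical library holding fragments of a single length (the abstract reports lengths from 4 to 14), `coverage(backbone, library, length=4)` returns the fraction of residues that fall in at least one matched window, which is the kind of quantity the 99% figure summarizes.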

DOI: 10.1371/journal.pcbi.1000083
Alternate Journal: PLoS Comput. Biol.
PubMed ID: 18483555