The Open Structural Biology Journal

2009, 3 : 126-132
Published online 2009 October 8. DOI: 10.2174/1874199100903010126
Publisher ID: TOSBJ-3-126

Intrinsic Relationship of Amino Acid Composition/Occurrence with Topological Parameters and Protein Folding Rates

M. Michael Gromiha
Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST), AIST Tokyo Waterfront Bio-IT Research Building, 2-42 Aomi, Koto-ku, Tokyo 135-0064, Japan

ABSTRACT

Understanding the relationship between amino acid sequences and protein folding rates is an important task in computational and molecular biology. Topological parameters, namely contact order, long-range order and total contact distance, have been shown to relate well with protein folding rates. In this work, we have systematically analyzed the relationship between amino acid composition/occurrence and protein folding rates, along with topological parameters derived from protein three-dimensional structures. We found that classifying proteins by structural class and folding type (two-state and three-state proteins) could explain the relationship very well. The amino acid composition showed good correlation with protein folding rates for two-state proteins, whereas the correlation is high with amino acid occurrence for three-state proteins. The composition of the polar amino acids Asn, Gln and Ser correlated directly with protein folding rates, and a reverse trend was observed between the occurrence of the hydrophobic amino acids Ile and Gly and protein folding rates. The amino acid occurrence showed a positive correlation with folding rates in two-state proteins and a negative correlation in three-state proteins, which reveals that a larger number of amino acids in three-state proteins slows down the folding process. The analysis of slow and fast folding proteins showed that slow folding proteins have an appreciable number of residues that form multiple contacts with other residues. Further, we have grouped amino acids by their chemical properties, analyzed the relationship of these groups with protein folding rates, and set up multiple regression equations for predicting protein folding rates.
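
The distinction between amino acid composition and amino acid occurrence can be illustrated with a minimal Python sketch. It assumes that occurrence means the raw count of each residue type in a sequence and that composition means that count expressed as a percentage of the sequence length; the abstract does not spell out these definitions. The function names and the toy sequence are hypothetical and are not taken from the paper.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def occurrence(sequence):
    """Raw count of each standard amino acid (assumed meaning of 'occurrence')."""
    counts = Counter(sequence.upper())
    return {aa: counts.get(aa, 0) for aa in AMINO_ACIDS}


def composition(sequence):
    """Percentage of each standard amino acid (assumed meaning of 'composition')."""
    occ = occurrence(sequence)
    total = sum(occ.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: 100.0 * n / total for aa, n in occ.items()}


# Hypothetical toy sequence, used only for illustration.
toy_sequence = "MKVIGNSQGI"
toy_occurrence = occurrence(toy_sequence)
toy_composition = composition(toy_sequence)
```

Under these assumptions, occurrence grows with chain length while composition does not, which is one way to read the finding that occurrence correlates negatively with folding rates in three-state proteins.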