Pafig (Prediction of amyloid fibril-forming segments), a method based on support vector machines, exploits 41 physicochemical properties to identify the specific regions associated with fibrillar aggregates.
It is well established that amyloid fibrillar aggregates of proteins or polypeptides are associated with many human diseases. Recent studies suggest that specific short regions in the proteins are responsible for this aggregation. Therefore, prediction of such sequence stretches is critical for understanding diseases related to protein aggregation and for finding potential therapeutic targets.
In this study, we proposed a method named Pafig (Prediction of amyloid fibril-forming segments) based on support vector machines, which exploited 41 physicochemical properties to identify the specific regions associated with fibrillar aggregates. In a 10-fold cross-validation test on the Hexpepset dataset, Pafig performed well, with an overall accuracy (Q2) of about 81% and a Matthews correlation coefficient (MCC) of about 0.63. A further test on an additional dataset (AmylHex) indicated that Pafig outperformed other methods. Moreover, 64,000,000 hexpeptides were predicted by Pafig. As a result, about 5.08% of these segments were found to possess higher aggregation propensity, with a Reliability Index (RI) ≥ 7.
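The two evaluation measures quoted above can be illustrated with a minimal Python sketch using the standard definitions of overall accuracy and the Matthews correlation coefficient. The function names and the confusion-matrix counts in the usage note are hypothetical and are not taken from the Pafig evaluation. The last helper only checks that 64,000,000 equals 20^6, which is consistent with enumerating every hexpeptide over the 20 standard amino acids (an assumption about how that set was built).

```python
import math

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def q2(tp, tn, fp, fn):
    """Overall two-state accuracy: correct predictions / all predictions."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Conventional fallback when any marginal total is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


def hexpeptide_count():
    """Number of distinct length-6 peptides over the 20 standard residues."""
    return len(AMINO_ACIDS) ** 6
```

With hypothetical counts of 40 true positives, 40 true negatives, 10 false positives and 10 false negatives, `q2` gives 0.8 and `mcc` gives 0.6. These values are of the same order as the figures reported above, but they are illustrative only.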
Prediction of amyloid fibril-forming segments based on a support vector machine. BMC Bioinformatics. 2009 Jan 30;10 Suppl 1:S45.