Parepro (Prediction of amino acid replacement probability)
is a support vector machine (SVM) based method for identifying which non-synonymous single base changes have a deleterious effect on protein function.
The method consists of three components: the residue difference (RD), the status of the mutation position (SM), and the mutation sequence environment (ME). Although Parepro's predictions are based on sequence information, they do not depend on the existence of many homologous sequences. We trained and tested the method on the HumVar dataset, which consists of 21185 single point mutations in 3587 proteins; 61% of the mutations are disease-related. Parepro reaches a Matthews correlation coefficient (MCC) above 0.51 and an overall accuracy of 77%. It outperforms other web-available predictors that use structural or evolutionary information.
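For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how MCC and overall accuracy are usually computed from a binary confusion matrix. The function names and the example counts are hypothetical and chosen only for illustration; they are not taken from the Parepro evaluation.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0.0 when any marginal total is zero.
    return numerator / denominator if denominator else 0.0

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical counts (NOT Parepro results): "positive" = disease-related.
example = dict(tp=80, tn=70, fp=30, fn=20)
example_mcc = mcc(**example)
example_acc = accuracy(**example)
print(f"MCC = {example_mcc:.3f}, accuracy = {example_acc:.2f}")
```

MCC ranges from -1 to 1 and takes all four cells of the confusion matrix into account, which makes it more informative than accuracy alone on a dataset such as HumVar where the two classes are not equally represented.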
Predicting the phenotypic effects of non-synonymous single nucleotide polymorphisms based on support vector machines. BMC Bioinformatics. 2007 Nov 16;8(1):450.
Jian Tian, Ningfeng Wu, Jun Guo, Xuexia Guo, Juhua Zhang, Yunliu Fan