European Journal of Medicinal Chemistry 2009-07-01

Comparative chemometric modeling of cytochrome 3A4 inhibitory activity of structurally diverse compounds using stepwise MLR, FA-MLR, PLS, GFA, G/PLS and ANN techniques.

Kunal Roy, Partha Pratim Roy

Index: Eur. J. Med. Chem. 44, 2913-22 (2009)


Abstract

Twenty-eight structurally diverse cytochrome 3A4 (CYP3A4) inhibitors have been subjected to quantitative structure-activity relationship (QSAR) studies. The analyses were performed with electronic, spatial, topological, and thermodynamic descriptors calculated using Cerius2 version 10 software. The statistical tools used were linear methods [multiple linear regression with factor analysis as a preprocessing step (FA-MLR), stepwise MLR, partial least squares (PLS), genetic function algorithm (GFA), genetic PLS (G/PLS)] and a non-linear method [artificial neural network (ANN)]. All five linear modeling methods indicate the importance of the n-octanol/water partition coefficient (logP) along with different topological and electronic parameters. The best model obtained from the training set (stepwise regression), selected on the basis of the highest external predictive R² value and the lowest RMSEP value, also showed good internal predictive power. The other models (FA-MLR, PLS, GFA and G/PLS) also showed statistically significant internal and external validation characteristics. The best model according to r_m² for the test set [as defined by P.P. Roy, K. Roy, QSAR Comb. Sci. 27 (2008) 302-313], obtained from ANN, showed a good r² value (determination coefficient between observed and predicted values) for the test set compounds, which was superior to those of the other statistical models except the stepwise regression derived model. However, based upon the r_m² value (test set), which penalizes a model for large differences between observed and predicted values, the stepwise MLR model was found to be inferior to the other methods except PLS. Considering the r_m² value for the whole set, the G/PLS derived model appears to be the best predictive model for this data set.
For choosing the best predictive model from among comparable models, r_m² for the whole set, calculated from leave-one-out predicted values for the training set and model-derived predicted values for the test set compounds, is suggested to be a good criterion.
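The validation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the commonly cited form r_m² = r²(1 − √|r² − r0²|), where r0² is the determination coefficient for a regression through the origin of observed on predicted values; the axis convention, the absolute value guard, and all function names are assumptions made here for illustration.

```python
import math


def r_squared(obs, pred):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_pred = sum(pred) / n
    sxy = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(obs, pred))
    sxx = sum((o - mean_obs) ** 2 for o in obs)
    syy = sum((p - mean_pred) ** 2 for p in pred)
    return sxy * sxy / (sxx * syy)


def r0_squared(obs, pred):
    """Determination coefficient for the origin-constrained line obs = k * pred.

    Assumed convention: observed values are regressed on predicted values.
    """
    k = sum(o * p for o, p in zip(obs, pred)) / sum(p * p for p in pred)
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - k * p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def rm_squared(obs, pred):
    """r_m^2: r^2 penalized by its distance from r0^2 (abs() is a guard)."""
    r2 = r_squared(obs, pred)
    r02 = r0_squared(obs, pred)
    return r2 * (1.0 - math.sqrt(abs(r2 - r02)))


def rmsep(obs, pred):
    """Root mean square error of prediction."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def r2_pred(test_obs, test_pred, train_mean):
    """External predictive R^2, referenced to the training set mean activity."""
    ss_res = sum((o - p) ** 2 for o, p in zip(test_obs, test_pred))
    ss_tot = sum((o - train_mean) ** 2 for o in test_obs)
    return 1.0 - ss_res / ss_tot


def rm_squared_overall(train_obs, train_loo_pred, test_obs, test_pred):
    """r_m^2 for the whole set: leave-one-out predictions for the training
    set pooled with model-derived predictions for the test set."""
    obs = list(train_obs) + list(test_obs)
    pred = list(train_loo_pred) + list(test_pred)
    return rm_squared(obs, pred)
```

A set of predictions that is perfectly correlated with the observations but offset by a constant has r² = 1 and still receives a reduced r_m², which is the penalty for large observed/predicted differences that the abstract describes.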

Related Compounds

Name/CAS No.
Piroxicam, CAS: 36322-90-4
Erythromycin, CAS: 114-07-8
Clotrimazole, CAS: 23593-75-1
Sulfamethizole, CAS: 144-82-1
Triadimefon, CAS: 43121-43-3
propiconazole, CAS: 60207-90-1
metyrapone, CAS: 54-36-4
Ketoconazole, CAS: 65277-42-1
methylimidazole, CAS: 616-47-7
2-Methylimidazole, CAS: 693-98-1