Abstract
In this work, the aqueous solubilities of 145 drug-like compounds were predicted from theoretically derived molecular descriptors. The descriptors selected by stepwise multiple subset selection were the 1st-order solvation connectivity index, average span R, overall hydrogen bond basicity, and percent of hydrophilic surface area. These descriptors encode molecular features that affect the dispersion, hydrophobic, and steric interactions between solute and solvent molecules. To develop quantitative structure-activity relationship (QSAR) models, multiple linear regression, least-squares support vector machine, and artificial neural network (ANN) methods were applied with the selected descriptors as inputs. The statistical parameters of these models showed that the ANN model was superior to the other two. The standard error (SE), average error (AE), and average absolute error (AAE) for the ANN model are SE = 0.714, AE = -0.178, and AAE = 0.546; for the internal test set they are SE = 0.830, AE = -0.056, and AAE = 0.630; and for the external test set they are SE = 0.762, AE = -0.431, and AAE = 0.626. Moreover, a leave-many-out cross-validation test was used to further investigate the predictive power and robustness of the model; it gave R²_L10O = 0.816 and S_PRESS = 0.32 for the ANN model, supporting its reliability.
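As an illustration of how the three reported error statistics might be computed, the following Python sketch assumes that residuals are defined as predicted minus observed values and that SE uses an n - 1 denominator. The abstract states neither convention, and the function name `error_summary` and the sample values are hypothetical, not data from the study.

```python
import math


def error_summary(observed, predicted):
    """Return (SE, AE, AAE) for paired observed and predicted values.

    Assumed conventions (not stated in the abstract):
      residual = predicted - observed
      SE  = sqrt(sum(residual^2) / (n - 1))
      AE  = mean(residual)        (signed, so it reveals systematic bias)
      AAE = mean(|residual|)
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    residuals = [p - o for o, p in zip(observed, predicted)]
    n = len(residuals)
    ae = sum(residuals) / n
    aae = sum(abs(r) for r in residuals) / n
    se = math.sqrt(sum(r * r for r in residuals) / (n - 1))
    return se, ae, aae


# Made-up solubility values, for illustration only.
observed = [-2.0, -3.5, -1.0, -4.0]
predicted = [-2.5, -3.0, -1.5, -5.0]
se, ae, aae = error_summary(observed, predicted)
print(f"SE = {se:.3f}, AE = {ae:.3f}, AAE = {aae:.3f}")
```

Under these conventions a negative AE, as reported for all three data sets, would indicate that the model underestimates solubility on average, while AAE measures the typical size of the error regardless of sign.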