Abstract
In this work, quantitative structure-retention relationship (QSRR) approaches were applied to model and predict the retention indices of 282 amino acids (AAs) and carboxylic acids (CAs). The descriptors used to encode the structural features of the molecules in the data set were calculated with the Dragon software. A genetic algorithm (GA) and stepwise multiple linear regression (MLR) were used to select the most relevant descriptors. Support vector machine (SVM), artificial neural network (ANN) and multiple linear regression methods were then used to construct nonlinear and linear QSRR models. The results showed that the nonlinear models performed considerably better than the linear ones. The GA-ANN model gave average absolute relative errors (AARE) of 0.054, 0.059 and 0.100 for the training, internal test and external test sets, respectively. Tenfold cross-validation of the GA-ANN model gave Q² = 0.943, which supports the reliability of this model.
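The abstract reports AARE and a cross-validated Q² without stating their formulas. The sketch below assumes the commonly used definitions, AARE as the mean of |predicted − observed| / |observed| and Q² as 1 − PRESS / SS_tot, which may differ in detail from those used in the paper. The function names and the retention-index values are hypothetical and serve only as illustration.

```python
def aare(observed, predicted):
    """Average absolute relative error (assumed definition):
    mean of |predicted - observed| / |observed|."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - o) / abs(o) for o, p in zip(observed, predicted))
    return total / len(observed)


def q_squared(observed, cv_predicted):
    """Cross-validated Q^2 (assumed definition): 1 - PRESS / SS_tot.
    cv_predicted holds the prediction made for each compound while it
    was in the held-out fold."""
    if not observed or len(observed) != len(cv_predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # PRESS: sum of squared errors of the held-out predictions
    press = sum((o - p) ** 2 for o, p in zip(observed, cv_predicted))
    # SS_tot: total sum of squares around the mean observed value
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - press / ss_tot


# Hypothetical retention indices, not data from the study
observed = [1000.0, 1200.0, 1500.0, 2000.0]
predicted = [1050.0, 1140.0, 1500.0, 2100.0]

print(f"AARE = {aare(observed, predicted):.4f}")
print(f"Q^2  = {q_squared(observed, predicted):.4f}")
```

Under these definitions, a lower AARE and a Q² closer to 1 both indicate better predictive performance, which is how the reported values for the GA-ANN model are read.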