Please use this identifier to cite or link to this item: http://hdl.handle.net/1860/2803

File: Li_Fang.pdf (3.24 MB, Adobe PDF)
Title: Development of a genetic algorithm-correlation analysis (GA/CA) program for classification of chemical compounds using mass spectral data
Authors: Li, Fang
Keywords: Chemistry; Mass spectrometry; Chemistry, Analytic
Issue Date: 11-Jul-2008
Abstract: This project develops GA/CA (genetic algorithm/correlation analysis), a semi-automatic computer program for classifying chemical compounds from their mass spectra. The program uses a genetic algorithm as the optimization method and correlation analysis as the evaluation method. To perform a classification, GA/CA searches the mass spectra of known compounds for the group of mass peaks that best discriminates the substructure of interest, and then applies the search results to unknowns for prediction. The program can classify using mass spectra, neutral loss spectra, and parent loss spectra, and it supports data preprocessing techniques such as intensity exponent scaling and thresholding. GA/CA was applied successfully in two tests using library spectra: classification of lower aromatic compounds and of chlorine-containing compounds. The chromosomes developed by the program showed 100% prediction accuracy for the test compounds in both classification experiments. In the classification of carbamates, the best chromosomes were obtained from the neutral loss spectra and showed a prediction accuracy of 93% on the test set. The prediction accuracy increased when the individual results obtained from mass spectra, neutral loss spectra, and parent loss spectra were combined. GA/CA was also used to identify the metabolites of the carbamate methyl thiophanate from LC-MS/MS data, with chromosomes developed from spectra collected in the laboratory. The program identified three of the known metabolites correctly and one incorrectly. It also identified another possible metabolite that was not identified in a previous metabolic study.
The GA/CA program needs to be rewritten as a fully automatic program so that it can handle a larger number of spectra and run for a large number of generations.
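
The search described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation, and every name and parameter in it (`preprocess`, `fitness`, `evolve`, the exponent, the threshold, the population settings) is a hypothetical choice made for illustration. It assumes that a chromosome is a small set of m/z channel indices, that the evaluation is the Pearson correlation between the summed intensity at those channels and a 0/1 class label, and that thresholding is applied before exponent scaling. The abstract does not specify any of these details.

```python
import random
from statistics import mean


def preprocess(spectrum, exponent=0.5, threshold=0.01):
    """Normalize to the base peak, zero peaks below the threshold,
    then apply intensity exponent scaling (order is an assumption)."""
    top = max(spectrum) or 1.0
    rel = [x / top for x in spectrum]
    return [x ** exponent if x >= threshold else 0.0 for x in rel]


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for constant input."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def fitness(chromosome, spectra, labels):
    """Assumed evaluation: correlate summed peak intensity with the label."""
    scores = [sum(s[i] for i in chromosome) for s in spectra]
    return pearson(scores, labels)


def evolve(spectra, labels, n_peaks=3, pop_size=30, generations=40,
           mutation_rate=0.2, seed=0):
    """Toy genetic algorithm: truncation selection, one-point crossover,
    single-gene mutation. Requires n_peaks >= 2."""
    rng = random.Random(seed)
    n_channels = len(spectra[0])

    def score(c):
        return fitness(c, spectra, labels)

    pop = [tuple(rng.sample(range(n_channels), n_peaks))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        parents = pop[: pop_size // 2]          # keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_peaks)     # one-point crossover
            child = list(a[:cut] + b[cut:])     # may repeat a channel
            if rng.random() < mutation_rate:
                child[rng.randrange(n_peaks)] = rng.randrange(n_channels)
            children.append(tuple(child))
        pop = parents + children
    return max(pop, key=score)


def make_toy_data(seed=1, n=20, n_channels=12):
    """Synthetic spectra: class 1 has strong peaks at channels 3 and 7."""
    rng = random.Random(seed)
    spectra, labels = [], []
    for k in range(n):
        label = k % 2
        s = [rng.uniform(0, 10) for _ in range(n_channels)]
        s[0] = 100.0                            # shared base peak
        if label:
            s[3] = rng.uniform(60, 90)
            s[7] = rng.uniform(60, 90)
        spectra.append(preprocess(s))
        labels.append(label)
    return spectra, labels


if __name__ == "__main__":
    spectra, labels = make_toy_data()
    best = evolve(spectra, labels)
    print("best chromosome:", best)
    print("fitness:", round(fitness(best, spectra, labels), 3))
```

The data here are synthetic, so the run only shows the mechanics: the best chromosome should pick up at least one of the two planted channels. Applying a chromosome to unknowns, as the abstract describes, would additionally require a decision rule on the score, which this sketch leaves out.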
URI: http://hdl.handle.net/1860/2803
Appears in Collections:Drexel Theses and Dissertations

Items in iDEA are protected by copyright, with all rights reserved, unless otherwise indicated.

 
