Feature selection by genetic algorithms for mass spectral classifiers. Yoshida, H., Leardi, R., Funatsu, K., & Varmuza, K. Anal Chim Acta, 446(1-2):483–492, Elsevier, 2001.
Mass spectral classifiers for 15 substructures have been computed that give discrete present/absent answers. For the development of classifiers, linear discriminant analysis (LDA) and discriminant partial least squares (DPLS) have been used. The low resolution mass spectra were transformed into a set of 400 spectral features. Because each spectrum is described with so many features, some features may not be necessary, and others may contribute only noise. Therefore, the effect of feature selection has been investigated. The methods used were selection by Fisher ratios and selection by a genetic algorithm (GA). The first method is univariate, the second is multivariate; advantages and disadvantages of both are discussed. On average, feature selection did not significantly change the classification performance compared with results that have been obtained with all features. However, it was possible to reduce the number of features considerably without a loss of classification performance. For a few substructures GA together with LDA resulted in much better classifiers than DPLS with all features. The features selected for classifications of a benzyl substructure and for the presence of chlorine have been interpreted in terms of mass spectrometric fragmentation rules.
@Article{yoshida01feature,
  author    = {Yoshida, H and Leardi, R and Funatsu, K and Varmuza, K},
  title     = {Feature selection by genetic algorithms for mass spectral classifiers},
  journal   = {Anal Chim Acta},
  year      = {2001},
  volume    = {446},
  number    = {1-2},
  pages     = {483--492},
  abstract  = {Mass spectral classifiers for 15 substructures have been computed that give discrete present/absent answers. For the development of classifiers, linear discriminant analysis (LDA) and discriminant partial least squares (DPLS) have been used. The low resolution mass spectra were transformed into a set of 400 spectral features. Because each spectrum is described with so many features, some features may not be necessary, and others may contribute only noise. Therefore, the effect of feature selection has been investigated. The methods used were selection by Fisher ratios and selection by a genetic algorithm (GA). The first method is univariate, the second is multivariate; advantages and disadvantages of both are discussed. On average, feature selection did not significantly change the classification performance compared with results that have been obtained with all features. However, it was possible to reduce the number of features considerably without a loss of classification performance. For a few substructures GA together with LDA resulted in much better classifiers than DPLS with all features. The features selected for classifications of a benzyl substructure and for the presence of chlorine have been interpreted in terms of mass spectrometric fragmentation rules.},
  doi       = {10.1016/S0003-2670(01)00910-2},
  keywords  = {compound classes; compound class prediction; machine learning; ML; direct prediction},
  owner     = {Sebastian},
  publisher = {Elsevier},
  timestamp = {2018.07.01},
}
