Skip to content

Dimensionality reduction tools for multivariate analysis

License

Notifications You must be signed in to change notification settings

akasharora123/Chemometrics

Repository files navigation

Chemometrics

Multivariate analysis for spectroscopy data

Overview

Chemometric analysis is the applicaiton of multivariate statistics methods to chemical characterization data to get insights about the quality/adulterants present or to classify different varieties of chemicals. Chemicals are commonly characterized by spectroscopy techniques such as mid-infrared or Fourier-transform infrared spectroscopy, which represents a series of intensities measured at different wavelengths. Typically, the intensities at different wavelengths are highly correlated, and the number of samples collected (data points) are less than the number of wavelengths (variables). Because of these two problems, the dimensionality reductions methods (PCA and LDA) are primarily used to project the data into independent-latent-variables space. Visualizing the data on 2D or 3D latent-variable space help identifying important features. For predictive modeling, regression based on dimensionality-reduction, such as PCR-regression and PLS-regression are often used.

Case Studies

  • Coffee variants: Infrared spectrum of two types of coffee, Arabica and Robusta, are measured by Downey et al. (J. Agric. Food. Chem. 1997, 45, 4357-4361). Using LDA, the two variants of coffee are differentiated along the LDA vector direction.
  • Geographic origin of olive oils: PCA and LDA are applied on the measured infrared spectrum of olive oils which helps clusterinng them in latent-vector space, and thereby helps identifying the country of origin for different varieties. Data source: Tapp et al., J. Agric. Food Chem., Vol. 51, No. 21, 2003.

Prerequisites

  • Python
  • NumPy
  • Pandas
  • Seaborn
  • MatplotLib
  • Scikit-learn

License

License

About

Dimensionality reduction tools for multivariate analysis

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published