Probabilistic Programming for Spectroscopic Data Analysis

This repository contains Jupyter notebooks for probabilistic modeling of vibrational spectroscopic datasets. All models are implemented with the Python probabilistic programming library PyMC3. To use the software, first set up your system by creating a virtual environment containing the required Python packages.

System preparation

1. Linux System Setup (Ubuntu 18.04)

General Requirements

First make sure Python 3, pip and graphviz are installed on the system:

>> sudo apt install python3 python3-dev python-pip graphviz

Install virtualenv

To keep the system Python installation untouched, install virtualenv (https://virtualenv.pypa.io/en/stable):

>> pip install virtualenv

Once installed, create a new virtual environment by running the command:

>> virtualenv -p python3 PPSDA

You can activate the virtual environment by running:

>> source PPSDA/bin/activate

To deactivate the virtual environment:

>> deactivate

Install required Python packages

Activate and cd into the created virtual environment. Install the following Python packages:

>> pip install --upgrade numpy scipy scikit-learn matplotlib pandas jupyter seaborn pymc3 arviz graphviz
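
As a quick sanity check after installation, the following sketch reports which of the required packages can be found in the active environment. The helper name `check_packages` is hypothetical and not part of this repository. Note that scikit-learn is imported as `sklearn`, and `jupyter_core` is used here as a stand-in for the jupyter installation.

```python
import importlib.util

# Import names of the packages installed above.
# scikit-learn is imported as "sklearn"; "jupyter_core" is installed with jupyter.
REQUIRED = ["numpy", "scipy", "sklearn", "matplotlib", "pandas",
            "jupyter_core", "seaborn", "pymc3", "arviz", "graphviz"]


def check_packages(names):
    """Return a dict mapping each top-level module name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_packages(REQUIRED).items():
        print(f"{name:14s} {'ok' if found else 'MISSING'}")
```

Any package reported as MISSING can be installed again with pip inside the activated environment.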

Run Jupyter Notebook

To run and experiment with the models, start a Jupyter Notebook server and open the .ipynb files containing the models inside the PPSDA/code/ directories:

>> jupyter notebook

2. Windows System Setup (Windows 7/10)

Install Miniconda

On Windows, installing Miniconda first is recommended (https://docs.conda.io/en/latest/miniconda.html). After installing Miniconda, open a shell and create a new environment:

>> conda create --name PPSDA

Enter the new environment by running:

>> conda activate PPSDA

To exit the new environment enter:

>> conda deactivate

Install required Python packages

Activate the new environment and install the following Python packages:

>> conda install numpy scipy scikit-learn matplotlib pandas jupyter seaborn pymc3

Install the arviz and graphviz libraries:

>> conda install -c conda-forge arviz python-graphviz

Run Jupyter Notebook

To run and experiment with the models, start a Jupyter Notebook server and open the .ipynb files containing the models inside the PPSDA/code/ directories:

>> jupyter notebook

Datasets

Coffees

The coffees dataset contains 56 FTIR samples of two coffee species, Arabica (29) and Robusta (27). The spectra were truncated to 800-2000 cm^-1. The dataset was obtained from: https://csr.quadram.ac.uk/example-datasets-for-download/
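
Truncating spectra to a wavenumber window, such as the 800-2000 cm^-1 range mentioned above, can be sketched as follows. This is a minimal illustration on synthetic data, not the preprocessing code used in this repository; the function name `truncate_spectra` and the array layout (one spectrum per row) are assumptions.

```python
import numpy as np


def truncate_spectra(wavenumbers, spectra, low, high):
    """Keep only the columns whose wavenumber lies in [low, high] (cm^-1).

    wavenumbers: 1-D array of length n_points
    spectra:     2-D array of shape (n_samples, n_points), one spectrum per row
    """
    wavenumbers = np.asarray(wavenumbers)
    spectra = np.asarray(spectra)
    mask = (wavenumbers >= low) & (wavenumbers <= high)
    return wavenumbers[mask], spectra[:, mask]


# Synthetic example: 3 random "spectra" sampled every 100 cm^-1 from 500 to 2500
wn = np.arange(500, 2501, 100)
X = np.random.default_rng(0).random((3, wn.size))
wn_cut, X_cut = truncate_spectra(wn, X, 800, 2000)
print(wn_cut.min(), wn_cut.max(), X_cut.shape)
```

The same boolean mask is applied to the wavenumber axis and to every spectrum, so the two arrays stay aligned after truncation.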

Juices

The juices dataset contains 983 FTIR samples originating from two classes of fresh fruit juices, non-strawberry (632) and strawberry (351). The spectra were truncated to 899-1802 cm^-1. The dataset was obtained from: https://csr.quadram.ac.uk/example-datasets-for-download/

Olive Oils

The olive oils dataset contains 120 FTIR samples originating from Spain (50), Italy (34), Greece (20) and Portugal (16), corresponding to four different classes. The spectra were truncated to 799-1897 cm^-1. The dataset was obtained from: https://csr.quadram.ac.uk/example-datasets-for-download/

Wines

The wines dataset contains 44 FTIR samples originating from wines produced from the same grape (100% Cabernet Sauvignon), but harvested in different geographical areas, Chile (15), Australia (12), South Africa (11) and Argentina (6). The dataset was obtained from: https://www.models.life.ku.dk/Wine_GCMS_FTIR

Tablets

The tablets dataset contains a collection of near-infrared (NIR) and Raman spectra obtained from 4 different types of pharmaceutical tablets with varying amounts of active substance. NIR spectra: 310 samples of type A (70), B (80), C (80) and D (80). Raman spectra: 120 samples of type A (30), B (27), C (33) and D (30). The dataset was obtained from: https://www.models.life.ku.dk/Tablets

Beers

The beers dataset contains NIR and Raman spectra of Rochefort 8 (class 1) and Rochefort 10 (class 2) beers. NIR spectra: 44 samples of class 1 (28) and class 2 (16). Raman spectra: 45 samples of class 1 (29) and class 2 (16).
