ohw20-proj-pyxpcm

OceanHackWeek 2020 project on pyXpcm

Installation

conda env create -f ohw20-proj-pyxpcm.yml
conda activate ohw20-proj-pyxpcm
pip install pyxpcm

Introduction

Use the package pyXpcm to identify ocean regions.

pyXpcm is a Python package that implements the Profile Classification Model (PCM) for ocean data, a statistical procedure that classifies ocean vertical profiles into a finite set of “clusters” based on a Gaussian Mixture Model. Depending on the dataset, such clusters can show space/time coherence that can be used in many different ways to study the ocean.

It consists of performing unsupervised classification (clustering) on vertical profiles of one or more ocean variables.

Each level of the vertical axis of each ocean variable is considered a feature. One vertical profile, with all of its ocean variables, is considered a sample.
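As a rough illustration of the feature/sample layout described above, the sketch below builds a sample-by-feature array from made-up data with NumPy. The array names, sizes, and values are hypothetical and are not taken from pyXpcm, which handles this step internally.

```python
import numpy as np

# Hypothetical toy dataset: 5 profiles, each sampled on 4 depth levels.
n_profiles, n_levels = 5, 4
rng = np.random.default_rng(0)
temperature = rng.normal(15.0, 2.0, size=(n_profiles, n_levels))
salinity = rng.normal(35.0, 0.3, size=(n_profiles, n_levels))

# Each (variable, level) pair is one feature, so joining the two variables
# along the level axis gives one row (sample) per profile.
X = np.concatenate([temperature, salinity], axis=1)

print(X.shape)  # rows are samples (profiles), columns are features
```

With two variables on four levels, each profile is described by eight features.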

The problem

Building on Rosso et al., 2020 and previous work (Maze et al., 2017 and Jones et al., 2019), I'd like to construct a toolbox that applies the Profile Classification Model used in that paper and others to identify ocean mixing in a generalized context. Previous papers have focused on two regions, and their code is possibly hard-coded to take in the data and run the unsupervised machine learning. I'd like to generalize the framework around the algorithms in order to look at ocean (and possibly atmospheric) mixing anywhere in the world.

Practical Steps

1. Reproduce the 'getting started/user guide' of pyXpcm, using the example data provided with the package.
2. Try out the classification ourselves, from the start. For that, we need to:
   a. Fetch Argo data for the North Atlantic using argopy (choosing a snapshot in time of the data).
   b. 'Clean' the data (remove unnecessary variables).
   c. Reduce the (vertical) dimensionality by performing a PCA (using scikit-learn?).
   d. Prepare the PCM model by choosing the features (e.g. temperature and salinity) and the number of classes/clusters k.
   e. Use the pre-processed Argo data as input for the model.
   f. Check whether the clustering changes when different time snapshots are used (e.g. January vs. July).
3. Try the classification with different data sources (maybe glider data, or satellite SST?).
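The dimensionality reduction and clustering steps (PCA, then a Gaussian Mixture Model with k classes) can be sketched with scikit-learn on synthetic profiles. This is only a minimal illustration under stated assumptions: the two synthetic temperature regimes, the number of PCA components, and k = 2 are all made up for the example, and the code does not use pyXpcm or real Argo data.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
n_levels = 50
depth = np.linspace(0.0, 1000.0, n_levels)  # hypothetical depth axis (m)

# Two synthetic temperature regimes standing in for real profiles.
warm = 20.0 * np.exp(-depth / 300.0) + rng.normal(0.0, 0.3, size=(100, n_levels))
cold = 8.0 * np.exp(-depth / 600.0) + rng.normal(0.0, 0.3, size=(100, n_levels))
X = np.vstack([warm, cold])  # shape: (samples, features)

# Standardize each depth level, then reduce the vertical dimension with PCA.
X_scaled = StandardScaler().fit_transform(X)
X_reduced = PCA(n_components=3).fit_transform(X_scaled)

# Fit a Gaussian Mixture Model with k = 2 classes and label each profile.
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
labels = gmm.fit_predict(X_reduced)

# Posterior probability of each class for each profile.
posteriors = gmm.predict_proba(X_reduced)
```

To compare time snapshots (e.g. January vs. July), the same steps could be run on each subset and the resulting labels compared, keeping in mind that class numbering is arbitrary between separate fits.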

Participants

Carolina Camargo, Nora Loose, Daniela Munguia, Chandana Mudeppa, Shikha Singh, Joseph Gum
