
nilsleh/oceanbench

 
 


OceanBench: Sea Surface Height Edition

About | Tutorials | Quickstart | Installation

License: MIT, CC BY 4.0

About

OceanBench is a unifying framework that provides standardized processing steps that comply with domain-expert standards. Its flexible, pedagogical abstraction serves two purposes:

  1. It provides plug-and-play data and pre-configured pipelines so that ML researchers can benchmark their models against ML and domain-related baselines.
  2. It provides a transparent, configurable framework that researchers can customize and extend for their own tasks.

The core functionality is lightweight. We keep the code base simple and focus on how the user can combine each piece. We adopt a strict functional style because it makes sequential transformations easier to maintain and combine.
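As a rough illustration of that functional style (not OceanBench's actual API), sequential transformations can be written as small pure functions and chained with a generic helper. The names `compose`, `subset_time`, and `remove_mean`, and the toy record layout, are hypothetical and chosen only for this sketch.

```python
from functools import partial, reduce

# Hypothetical stand-in for an xarray dataset: {"time": [...], "ssh": [...]}


def compose(*steps):
    """Chain one-argument steps left to right into a single transformation."""
    return lambda data: reduce(lambda acc, step: step(acc), steps, data)


def subset_time(data, start, stop):
    """Keep samples with start <= time < stop; returns a new record."""
    keep = [i for i, t in enumerate(data["time"]) if start <= t < stop]
    return {
        "time": [data["time"][i] for i in keep],
        "ssh": [data["ssh"][i] for i in keep],
    }


def remove_mean(data):
    """Subtract the mean of the 'ssh' values; returns a new record."""
    mean = sum(data["ssh"]) / len(data["ssh"])
    return {**data, "ssh": [value - mean for value in data["ssh"]]}


# Each step takes a record and returns a new one, so steps combine freely.
pipeline = compose(partial(subset_time, start=1, stop=4), remove_mean)

raw = {"time": [0, 1, 2, 3, 4], "ssh": [0.5, 1.0, 2.0, 3.0, 9.0]}
result = pipeline(raw)
```

Because no step mutates its input, `raw` is left untouched and the same steps can be reordered or reused in another pipeline.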

There are five features we would like to highlight about OceanBench:

  1. Data availability and version control with DVC.
  2. An agnostic suite of geoprocessing tools for xarray datasets, aggregated from different sources.
  3. Hydra integration to pipe sequential transformations.
  4. xrpatcher: a flexible multi-dimensional array generator for xarray datasets that is compatible with common deep learning (DL) frameworks.
  5. A JupyterBook that offers library tutorials and demonstrates use cases.

The following sections highlight these components in more detail.
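To make the idea of config-driven sequential transformations more concrete, here is a minimal sketch that mimics the `_target_` convention used by Hydra's `instantiate`, written with the standard library only so that it runs without Hydra installed. It is not OceanBench's configuration; the helpers `locate`, `build_step`, and `run_pipeline` are hypothetical, and standard-library functions stand in for geoprocessing steps.

```python
from functools import partial
from importlib import import_module


def locate(dotted_path):
    """Import 'package.module.name' and return the named attribute."""
    module_name, _, attr = dotted_path.rpartition(".")
    return getattr(import_module(module_name), attr)


def build_step(cfg):
    """Turn {'_target_': path, **kwargs} into a one-argument callable."""
    kwargs = {key: value for key, value in cfg.items() if key != "_target_"}
    return partial(locate(cfg["_target_"]), **kwargs)


def run_pipeline(data, step_cfgs):
    """Apply each configured step in order, feeding each output to the next."""
    for cfg in step_cfgs:
        data = build_step(cfg)(data)
    return data


# Hypothetical config: rank values from largest to smallest.
config = [
    {"_target_": "builtins.sorted", "reverse": True},
    {"_target_": "builtins.enumerate", "start": 1},
    {"_target_": "builtins.dict"},
]

result = run_pipeline([0.7, 1.5, 0.2], config)
```

The appeal of this pattern is that the order and parameters of the steps live in data rather than code, so a pipeline can be changed without editing the functions themselves.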

Tutorials

We have a full JupyterBook that showcases how OceanBench can be used in practice. It contains quickstart tutorials as well as more detailed tutorials that highlight some of the intricacies of OceanBench. Some highlighted tutorials are listed in the next section.

Quickstart

Data Registry

We have an open data registry located at the oceanbench-data-registry GitHub repository. There you can find more metadata about the available datasets, along with instructions for downloading them yourself.

Ocean-Tools

Our utility functions form the backbone of the preprocessing, the postprocessing, and some of the plotting. We piece them together to create recipes, pipelines, and tasks. This repo is located at jejjohnson/ocn-tools. See the docs (TODO) for more information.

Machine Learning Datasets

We have a set of tasks related to sea surface height interpolation, and they come readily integrated into a digestible, ML-ready format. We use our custom xrpatcher package to pipe xarray data structures to PyTorch datasets/dataloaders. ML researchers who want to get started quickly can look at our Task-to-Patcher demo. For more information about the datasets, see the oceanbench-data-registry.
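The sliding-window idea behind patching a multi-dimensional dataset can be sketched as follows. This is not the xrpatcher API; `patch_slices` and `PatchIndex` are hypothetical names, and the sketch only computes index slices per dimension rather than touching real data.

```python
from itertools import product


def patch_slices(dim_sizes, patch_sizes, strides):
    """Return one {dim: slice} mapping per patch, last dimension varying fastest."""
    per_dim = []
    for dim, size in dim_sizes.items():
        patch, stride = patch_sizes[dim], strides[dim]
        # Only keep windows that fit entirely inside the dimension.
        starts = range(0, size - patch + 1, stride)
        per_dim.append([(dim, slice(s, s + patch)) for s in starts])
    return [dict(combo) for combo in product(*per_dim)]


class PatchIndex:
    """Map-style container exposing __len__ and __getitem__,
    the two methods a PyTorch map-style dataset needs."""

    def __init__(self, dim_sizes, patch_sizes, strides):
        self.slices = patch_slices(dim_sizes, patch_sizes, strides)

    def __len__(self):
        return len(self.slices)

    def __getitem__(self, index):
        return self.slices[index]


patches = PatchIndex(
    dim_sizes={"time": 10, "lat": 8},
    patch_sizes={"time": 5, "lat": 4},
    strides={"time": 5, "lat": 2},
)
```

With an xarray dataset `ds` whose dimensions match `dim_sizes`, a call such as `ds.isel(**patches[i])` would select the i-th patch, which could then be converted to an array for a dataloader.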

LeaderBoard

OceanBench can be used to generate the leaderboard for our different interpolation challenges. To generate the leaderboards for the different tasks from the available data, see our LeaderBoard demo.

Machine Learning Example

Currently, the most successful algorithm for the SSH challenges is a bi-level optimization algorithm (4DVarNet). For a reproducible end-to-end example of how a state-of-the-art (SOTA) method is used in conjunction with OceanBench, see our End-to-End demo.

Installation

conda (RECOMMENDED)

We use conda/mamba as our package manager. To install from the provided environment files, run the following commands.

git clone https://github.com/jejjohnson/oceanbench.git
cd oceanbench
mamba env create -f environments/linux.yaml

Jupyter

If you want to add the oceanbench conda environment as a Jupyter kernel, you need to pass the ESMFMKFILE environment variable to the kernel:

conda activate oceanbench
mamba install ipykernel -y 
python -m ipykernel install --user --name=oceanbench --env ESMFMKFILE "$ESMFMKFILE"

pip

We can also install it directly from the GitHub repository via pip.

pip install "git+https://github.com/jejjohnson/oceanbench.git"

Note: There are some known dependency issues related to pyinterp and xesmf. You may need to manually install some of the dependencies before installing oceanbench via pip. See the pyinterp and xesmf packages for more information.

poetry

For developers who want all of the dependencies via pip, we can use poetry to install the package.

git clone https://github.com/jejjohnson/oceanbench.git
cd oceanbench
conda create -n oceanbench python=3.10 poetry
conda activate oceanbench
poetry install

Acknowledgements

We would like to acknowledge the Ocean-Data-Challenge Group for all of their work in providing open-source data and a tutorial on metrics for SSH interpolation.
