model_seg_generalist

How to train

This model can be trained on multiple datasets. The aggregation script described below expects the datasets, already in nnU-Net format, to be located in a single directory (the one containing nnUNet_raw, nnUNet_preprocessed and nnUNet_results). The indices of the datasets to use for training are passed as arguments to the script.

Any of the following datasets can be used:

TEM dataset
SEM dataset
BF dataset (default)
BF dataset (wakehealth, human)
BF dataset (VCU, rabbit)

First, run the aggregation script. It accepts the following arguments:

--nnunet_dir: The path to the directory containing the datasets.
--dataset_ids: A list of indices of the datasets to be used for training. If no indices are provided, all datasets in the directory will be used.
--name: (Optional) The name you want to assign to the aggregated dataset. Defaults to 'Dataset444_AGG'.
--description: (Optional) A description for the aggregated dataset. Defaults to 'Aggregated dataset from all source domains'.
--k: (Optional) The number of folds for cross-validation. Defaults to 5.

Here is an example command to run the script:

python nnunet_scripts/aggregate_data.py --nnunet_dir path_to_directory_containing_dsets --dataset_ids <dataset_index_1> <dataset_index_2> ... <dataset_index_n> --name MyAggregatedDataset --description "Aggregated dataset for my experiment" --k 5

Replace <dataset_index_1> <dataset_index_2> ... <dataset_index_n> with the indices of the datasets you want to use for training.
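The arguments listed above map naturally onto an argparse interface. The following is a minimal sketch of such a parser, not the actual contents of aggregate_data.py: the helper name build_parser, the integer type for the indices, and the use of an empty list to mean "all datasets" are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options of aggregate_data.py."""
    parser = argparse.ArgumentParser(description="Aggregate nnU-Net datasets.")
    parser.add_argument("--nnunet_dir", required=True,
                        help="Directory containing the datasets.")
    # Assumption: indices are integers; an empty list stands for "use all datasets".
    parser.add_argument("--dataset_ids", nargs="*", type=int, default=[],
                        help="Indices of the datasets to use for training.")
    parser.add_argument("--name", default="Dataset444_AGG",
                        help="Name of the aggregated dataset.")
    parser.add_argument("--description",
                        default="Aggregated dataset from all source domains",
                        help="Description of the aggregated dataset.")
    parser.add_argument("--k", type=int, default=5,
                        help="Number of folds for cross-validation.")
    return parser


# Parse an explicit argument list instead of sys.argv, for demonstration.
args = build_parser().parse_args(
    ["--nnunet_dir", "path_to_directory_containing_dsets", "--dataset_ids", "2", "3"]
)
print(args.dataset_ids, args.name, args.k)
```

Omitting --name, --description and --k falls back to the defaults documented above.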

This creates a new nnU-Net dataset. We can then run the initial setup, move the manual split into the preprocessed folder, and start training:

source ./nnunet_scripts/setup_nnunet.sh NNUNET_DIR
./nnunet_scripts/train_nnunet.sh 444 AGG <GPU_ID> <FOLD_1> <FOLD_2> ... <FOLD_k>
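The manual split mentioned above is a k-fold cross-validation split over the training cases. nnU-Net v2 reads such a split from a splits_final.json file that holds one {"train": [...], "val": [...]} entry per fold. The sketch below builds a split of that shape using only the standard library. The function name make_kfold_splits, the fixed seed, the round-robin assignment, and the case identifiers are assumptions, and may differ from what the repository's scripts actually do.

```python
import json
import random


def make_kfold_splits(case_ids, k=5, seed=0):
    """Return k folds, each a dict with disjoint 'train' and 'val' case lists."""
    shuffled = sorted(case_ids)
    random.Random(seed).shuffle(shuffled)
    # Deal cases round-robin so that validation fold sizes differ by at most one.
    val_sets = [shuffled[i::k] for i in range(k)]
    splits = []
    for val in val_sets:
        val_lookup = set(val)
        train = [c for c in shuffled if c not in val_lookup]
        splits.append({"train": sorted(train), "val": sorted(val)})
    return splits


# Hypothetical case identifiers, for illustration only.
cases = [f"case_{i:03d}" for i in range(12)]
splits = make_kfold_splits(cases, k=5)
print(json.dumps(splits[0], indent=2))
```

Every case appears in exactly one validation fold, so the k trained models together cover the whole dataset.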

To speed up cross-validation, you can run several instances of the training script at once, each handling a different fold. This is particularly useful on a machine with multiple GPUs, where each instance can be given its own <GPU_ID>. For example, to train 5 folds in parallel:

./nnunet_scripts/train_nnunet.sh 444 AGG <GPU_ID> 0 &
./nnunet_scripts/train_nnunet.sh 444 AGG <GPU_ID> 1 & 
./nnunet_scripts/train_nnunet.sh 444 AGG <GPU_ID> 2 & 
./nnunet_scripts/train_nnunet.sh 444 AGG <GPU_ID> 3 & 
./nnunet_scripts/train_nnunet.sh 444 AGG <GPU_ID> 4 &
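The same launch pattern can be scripted. The sketch below builds one command per fold and cycles through a list of GPU IDs. The argument order follows the train_nnunet.sh calls shown above, while the helper names and the round-robin GPU assignment are assumptions.

```python
import subprocess


def build_fold_commands(dataset_id, dataset_name, gpu_ids, folds,
                        script="./nnunet_scripts/train_nnunet.sh"):
    """Build one training command per fold, cycling through the available GPUs."""
    commands = []
    for i, fold in enumerate(folds):
        gpu = gpu_ids[i % len(gpu_ids)]
        commands.append([script, str(dataset_id), dataset_name, str(gpu), str(fold)])
    return commands


def launch_all(commands):
    """Start every command as a background process, then wait for all of them."""
    processes = [subprocess.Popen(cmd) for cmd in commands]
    return [p.wait() for p in processes]


commands = build_fold_commands(444, "AGG", gpu_ids=[0, 1], folds=range(5))
for cmd in commands:
    print(" ".join(cmd))
# Call launch_all(commands) from the repository root to actually start training.
```

Unlike a bare `&`, launch_all blocks until every fold has finished and returns the exit codes, which makes failures easier to spot.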

Setting Up Conda Environment

To set up the environment and run the scripts, follow these steps:

Create a new conda environment:

conda create --name generalist_seg

Activate the environment:

conda activate generalist_seg

Install PyTorch, torchvision, and torchaudio. For NeuroPoly lab members using the GPU servers, use the following command:

conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia

For others, please refer to the PyTorch installation guide at https://pytorch.org/get-started/locally/ to get the appropriate command for your system.

Update the environment with the remaining dependencies:

conda env update --file environment.yaml
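Once the environment is set up, a quick check can confirm that PyTorch is importable and whether a CUDA device is visible. This is only a sanity-check sketch; the helper name describe_torch_install is illustrative and not part of the repository.

```python
import importlib.util


def describe_torch_install():
    """Report whether PyTorch is installed and whether it can see a CUDA GPU."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed in this environment."
    import torch
    if torch.cuda.is_available():
        return f"PyTorch {torch.__version__} with CUDA ({torch.cuda.device_count()} GPU(s))."
    return f"PyTorch {torch.__version__}, CPU only."


print(describe_torch_install())
```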

Inference

After training the model, you can perform inference using the following command:

python nnunet_scripts/run_inference.py --path-dataset ${nnUNet_raw}/Dataset<FORMATTED_DATASET_ID>_<DATASET_NAME>/imagesTs --path-out <WHERE/TO/SAVE/RESULTS> --path-model ${nnUNet_results}/Dataset<FORMATTED_DATASET_ID>_<DATASET_NAME>/nnUNetTrainer__nnUNetPlans__2d/ --use-gpu --use-best-checkpoint

The --use-best-checkpoint flag is optional. With it, inference uses the best checkpoints; without it, the latest ones. Based on empirical results, using the flag is recommended.

Note: <FORMATTED_DATASET_ID> is the dataset ID zero-padded to three digits, so 1 becomes 001 and 23 becomes 023.
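The zero-padding rule and the two dataset-dependent paths of the inference command can be expressed in a few lines. The helper names below are hypothetical; the folder layout is taken directly from the command above.

```python
from pathlib import PurePosixPath


def dataset_folder(dataset_id: int, dataset_name: str) -> str:
    """Zero-pad the ID to three digits, e.g. 1 -> Dataset001_<name>."""
    return f"Dataset{dataset_id:03d}_{dataset_name}"


def inference_paths(nnunet_raw, nnunet_results, dataset_id, dataset_name):
    """Return the (--path-dataset, --path-model) values for the inference command."""
    folder = dataset_folder(dataset_id, dataset_name)
    images = PurePosixPath(nnunet_raw) / folder / "imagesTs"
    model = PurePosixPath(nnunet_results) / folder / "nnUNetTrainer__nnUNetPlans__2d"
    return str(images), str(model)


print(dataset_folder(1, "EXAMPLE"))
print(dataset_folder(23, "EXAMPLE"))
print(inference_paths("/data/nnUNet_raw", "/data/nnUNet_results", 444, "AGG"))
```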

Replicating Experiments

To replicate the inference experiments, execute the following script:

source ./nnunet_scripts/inference_and_evaluation.sh ${NNUNET_DIR} <DATASET_1> <DATASET_2> <DATASET_3> ... <DATASET_N>

For instance, to run the script with specific datasets, use the command below:

source ./nnunet_scripts/inference_and_evaluation.sh ${NNUNET_DIR} Dataset002_SEM Dataset003_TEM Dataset004_BF_RAT Dataset005_wakehealth Dataset006_BF_VCU Dataset444_AGG

In addition to the individual inference and evaluation script, there is an "ensemble_inference_and_evaluation.sh" script. It runs ensemble inference using all the models listed and then evaluates the resulting ensemble. Its arguments are the same as above, except that the <DATASET_1> ... <DATASET_N> arguments name the models being ensembled.

To use the ensemble script, execute the following command:

source ./nnunet_scripts/ensemble_inference_and_evaluation.sh ${NNUNET_DIR} <DATASET_1> <DATASET_2> <DATASET_3> ... <DATASET_N>
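How the script combines the models is not described here. One common ensembling approach, shown below purely as an illustration and not as the script's actual method, is to average the per-pixel class probabilities of all models and keep the most likely class. The function name and the toy probabilities are made up for the example.

```python
def ensemble_argmax(prob_maps):
    """Average class probabilities across models, then pick the most likely class.

    prob_maps: list (one entry per model) of lists (one entry per pixel)
    of per-class probabilities.
    """
    n_models = len(prob_maps)
    n_pixels = len(prob_maps[0])
    labels = []
    for p in range(n_pixels):
        n_classes = len(prob_maps[0][p])
        mean = [sum(m[p][c] for m in prob_maps) / n_models for c in range(n_classes)]
        labels.append(max(range(n_classes), key=mean.__getitem__))
    return labels


# Toy example: two models, two pixels, three classes.
model_a = [[0.7, 0.2, 0.1], [0.1, 0.5, 0.4]]
model_b = [[0.6, 0.3, 0.1], [0.1, 0.2, 0.7]]
print(ensemble_argmax([model_a, model_b]))  # prints [0, 2]
```

For the second pixel the two models disagree (class 1 versus class 2); averaging lets the more confident prediction win.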

To replicate the out-of-distribution (OOD) experiments, use the following script:

source ./nnunet_scripts/ood_results.sh <PATH_TO_OOD_DATASET> ${RESULTS_DIR}/nnUNet_results <DATASET_1> <DATASET_2> <DATASET_3> ... <DATASET_N>

Ensure the OOD dataset adheres to the following structure prior to executing the script:

├── <MODALITY 1>
│   ├── some species
│   │   ├── image_0000.png
│   │   └── (optional) labels
│   │       └── image.png
│   ...
│   └── another species
│       ├── image_0000.png
│       └── (optional) labels
│           └── image.png
└── <MODALITY N>
    ...

For instance:

├── BF
│   └── cat
│       └── CAT_0000.png
├── SEM
│   ├── dog
│   │   └── DOG_0000.png
│   └── human
│       ├── AGG_203_0000.png
│       └── labels
│           └── AGG_203.png
└── TEM
    └── macaque
        ├── labels
        │   ├── MACAQUE_000_0000.png
        │   ├── MACAQUE_001_0000.png
        │   ├── MACAQUE_002_0000.png
        │   ├── MACAQUE_003_0000.png
        │   ├── MACAQUE_004_0000.png
        │   ...
        ├── MACAQUE_000_0000.png
        ├── MACAQUE_001_0000.png
        ├── MACAQUE_002_0000.png
        ├── MACAQUE_003_0000.png
        ├── MACAQUE_004_0000.png
        ...
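Before running the OOD script, it can help to check that a dataset matches this layout. The sketch below walks a <modality>/<species> tree and pairs each image with its optional label. It assumes images are named <case>_0000.png and labels <case>.png, as in the generic structure and the SEM/human example; the function name list_ood_cases is hypothetical and not part of the repository.

```python
from pathlib import Path
import tempfile


def list_ood_cases(root):
    """Yield (modality, species, image_path, label_path_or_None) for each image.

    Assumes images are named <case>_0000.png and optional labels live in
    <species>/labels/<case>.png.
    """
    root = Path(root)
    for modality_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for species_dir in sorted(p for p in modality_dir.iterdir() if p.is_dir()):
            # glob() is not recursive here, so files inside labels/ are skipped.
            for image in sorted(species_dir.glob("*_0000.png")):
                case = image.name[: -len("_0000.png")]
                label = species_dir / "labels" / f"{case}.png"
                yield (modality_dir.name, species_dir.name, image,
                       label if label.is_file() else None)


# Build a tiny throwaway tree to demonstrate the traversal.
with tempfile.TemporaryDirectory() as tmp:
    human = Path(tmp) / "SEM" / "human"
    (human / "labels").mkdir(parents=True)
    (human / "AGG_203_0000.png").touch()
    (human / "labels" / "AGG_203.png").touch()
    cat = Path(tmp) / "BF" / "cat"
    cat.mkdir(parents=True)
    (cat / "CAT_0000.png").touch()
    for modality, species, image, label in list_ood_cases(tmp):
        print(modality, species, image.name, label.name if label else "no label")
```

Images without a matching label are still listed, which mirrors the fact that the labels folder is optional.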

Authors

Armand Collin
Arthur Boschet
