
Sample Tasks

Two sample tasks for the classification and segmentation pipelines. This document walks through the steps in Training Steps, with specific examples for each task. Before trying to train these models, you should have followed the steps to set up an environment and AzureML.

Sample classification task: Glaucoma Detection on OCT volumes

This example is based on the paper A feature agnostic approach for glaucoma detection in OCT volumes.

Downloading and preparing the dataset

  1. The dataset is available here [1].

  2. After downloading and extracting the zip file, run the create_glaucoma_dataset_csv.py script on the extracted folder:

    python create_glaucoma_dataset_csv.py /path/to/extracted/folder

    This converts the dataset to CSV form and creates a file dataset.csv.

  3. Upload this folder (with the images and dataset.csv) to Azure Blob Storage. For details on creating a storage account, see Setting up AzureML.
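One way to script this upload is sketched below with the azure-storage-blob Python package; azcopy or Azure Storage Explorer work just as well. The connection string, container name and dataset folder name are placeholders for whatever you configured in Setting up AzureML, not values defined by this document.

```python
from pathlib import Path

from azure.storage.blob import ContainerClient

# All of these are placeholders -- substitute the values from your own storage setup.
CONNECTION_STRING = "<connection string of your storage account>"
CONTAINER_NAME = "datasets"
LOCAL_FOLDER = Path("/path/to/extracted/folder")
DATASET_NAME = "name_of_your_dataset_on_azure"

container = ContainerClient.from_connection_string(CONNECTION_STRING,
                                                   container_name=CONTAINER_NAME)

# Upload every file in the prepared folder (images plus dataset.csv),
# preserving the relative layout underneath a folder named after the dataset.
for path in LOCAL_FOLDER.rglob("*"):
    if path.is_file():
        blob_name = f"{DATASET_NAME}/{path.relative_to(LOCAL_FOLDER).as_posix()}"
        with path.open("rb") as data:
            container.upload_blob(name=blob_name, data=data, overwrite=True)
```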

Setting up training

  1. Set up a directory outside of InnerEye to hold your configs, as in Setting Up Training. After this step, you should have a folder InnerEyeLocal beside InnerEye with the files settings.yml and ML/runner.py.

Creating the classification model configuration

The full configuration for the Glaucoma model is at InnerEye/ML/configs/classification/GlaucomaPublic. All that needs to change is the dataset. We will do this by subclassing GlaucomaPublic in a new config stored in InnerEyeLocal/ML.

  1. Create the folder configs/classification under InnerEyeLocal/ML.
  2. In that folder, create a config file called GlaucomaPublicExt.py that extends the GlaucomaPublic class, like this:

```python
from InnerEye.ML.configs.classification.GlaucomaPublic import GlaucomaPublic


class GlaucomaPublicExt(GlaucomaPublic):
    def __init__(self) -> None:
        super().__init__(azure_dataset_id="name_of_your_dataset_on_azure")
```

  3. In settings.yml, set model_configs_namespace to InnerEyeLocal.ML.configs so this config is found by the runner. Set extra_code_directory to InnerEyeLocal.
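For reference, after this change the two relevant entries in settings.yml would look roughly like the minimal excerpt below; every other key in your settings.yml stays as it was configured during setup, and the exact layout of the file may differ in your version.

```yaml
# Minimal excerpt of settings.yml -- only the two keys discussed above are shown.
model_configs_namespace: InnerEyeLocal.ML.configs
extra_code_directory: InnerEyeLocal
```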

Start Training

Run the following to start a job on AzureML

python InnerEyeLocal/ML/runner.py --azureml=True --model=GlaucomaPublicExt --train=True

See Model Training for details on training outputs, resuming training, testing models and model ensembles.

Sample segmentation task: Segmentation of Lung CT

This example is based on the Lung CT Segmentation Challenge 2017 [2].

Downloading and preparing the dataset

  1. The dataset [3][4] can be downloaded here.
  2. The next step is to convert the dataset from DICOM-RT to NIFTI. Before doing so, place the downloaded dataset in another parent folder, which we will call datasets. This file structure is expected by the conversion tool.
  3. Use the InnerEye-CreateDataset tool to create a NIFTI dataset from the downloaded (DICOM) files. After installing the tool, run
    InnerEye.CreateDataset.Runner.exe dataset --datasetRootDirectory=<path to the 'datasets' folder> --niftiDatasetDirectory=<output folder name for converted dataset> --dicomDatasetDirectory=<name of downloaded folder inside 'datasets'> --geoNorm 1;1;3
    Now, you should have another folder under datasets with the converted NIFTI files. The --geoNorm argument tells the tool to normalize the voxel sizes during conversion (a quick way to verify this is sketched after this list).
  4. Upload this folder (with the images and dataset.csv) to Azure Blob Storage. For details on creating a storage account, see Setting up AzureML.
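Before uploading, it can be worth opening one of the converted volumes and confirming the voxel spacing matches the --geoNorm argument. The sketch below uses the SimpleITK Python package (an assumption, not part of the InnerEye setup); the file path is a placeholder, and it assumes --geoNorm 1;1;3 corresponds to a 1 x 1 x 3 mm voxel size.

```python
import SimpleITK as sitk

# Placeholder path: point this at any image inside the converted NIFTI dataset folder.
image = sitk.ReadImage("/path/to/datasets/<converted_dataset_folder>/<some_image>.nii.gz")

# Spacing is reported as (x, y, z) in millimetres; with --geoNorm 1;1;3 the
# expectation is a voxel size of roughly (1.0, 1.0, 3.0).
print("Voxel spacing:", image.GetSpacing())
print("Volume size:  ", image.GetSize())
```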

Setting up training

  1. Set up a directory outside of InnerEye to hold your configs, as in Setting Up Training. After this step, you should have a folder InnerEyeLocal beside InnerEye with the files settings.yml and ML/runner.py.

Creating the segmentation model configuration

The full configuration for the Lung model is at InnerEye/ML/configs/segmentation/Lung. All that needs to change is the dataset. We will do this by subclassing Lung in a new config stored in InnerEyeLocal/ML.

  1. Create the folder configs/segmentation under InnerEyeLocal/ML.
  2. In that folder, create a config file called LungExt.py that extends the Lung class, like this:

```python
from InnerEye.ML.configs.segmentation.Lung import Lung


class LungExt(Lung):
    def __init__(self) -> None:
        super().__init__(azure_dataset_id="name_of_your_dataset_on_azure")
```

  3. As before, in settings.yml, set model_configs_namespace to InnerEyeLocal.ML.configs so this config is found by the runner, and set extra_code_directory to InnerEyeLocal.

Start Training

Run the following to start a job on AzureML

python InnerEyeLocal/ML/runner.py --azureml=True --model=LungExt --train=True

See Model Training for details on training outputs, resuming training, testing models and model ensembles.

References

[1] Ishikawa, Hiroshi. (2018). OCT volumes for glaucoma detection (Version 1.0.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.1481223

[2] Yang, J., Veeraraghavan, H., Armato, S. G., Farahani, K., Kirby, J. S., Kalpathy-Cramer, J., van Elmpt, W., Dekker, A., Han, X., Feng, X., Aljabar, P., Oliveira, B., van der Heyden, B., Zamdborg, L., Lam, D., Gooding, M. and Sharp, G. C. (2018). Autosegmentation for thoracic radiation treatment planning: A grand challenge at AAPM 2017. Med. Phys. doi:10.1002/mp.13141

[3] Yang, Jinzhong; Sharp, Greg; Veeraraghavan, Harini; van Elmpt, Wouter; Dekker, Andre; Lustberg, Tim; Gooding, Mark. (2017). Data from Lung CT Segmentation Challenge. The Cancer Imaging Archive. https://doi.org/10.7937/K9/TCIA.2017.3r3fvz08

[4] Clark K, Vendt B, Smith K, Freymann J, Kirby J, Koppel P, Moore S, Phillips S, Maffitt D, Pringle M, Tarbox L, Prior F. The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository. Journal of Digital Imaging, Volume 26, Number 6, December 2013, pp 1045-1057.