This document contains two sample tasks, one for the classification pipeline and one for the segmentation pipeline. It walks through the steps in Training Steps, with specific examples for each task. Before trying to train these models, you should have followed the steps to set up an environment and AzureML.
This example is based on the paper A feature agnostic approach for glaucoma detection in OCT volumes.
- After downloading and extracting the zip file containing the dataset [1], run the `create_glaucoma_dataset_csv.py` script on the extracted folder:

  ```shell
  python create_glaucoma_dataset_csv.py /path/to/extracted/folder
  ```

  This will convert the dataset to CSV form and create a file `dataset.csv`.
- Upload this folder (with the images and `dataset.csv`) to Azure Blob Storage. For details on creating a storage account, see Setting up AzureML.
- Set up a directory outside of InnerEye to hold your configs, as in Setting Up Training. After this step, you should have a folder `InnerEyeLocal` beside `InnerEye` with the files `settings.yml` and `ML/runner.py`.
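The CSV-creation step above can be sketched in plain Python. This is an illustration only: the column names (`subject`, `filePath`, `label`) and the `.npy` file extension are assumptions, not the exact schema produced by the repository's script.

```python
import csv
from pathlib import Path


def build_dataset_csv(image_root: str, label_of: dict) -> Path:
    """Write a dataset.csv listing every image file under image_root.

    label_of maps a subject folder name to its class label (0/1).
    Column names and file extension are illustrative assumptions.
    """
    root = Path(image_root)
    out = root / "dataset.csv"
    with out.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["subject", "filePath", "label"])
        # One row per image, grouped by subject folder.
        for subject_dir in sorted(root.iterdir()):
            if not subject_dir.is_dir():
                continue
            for image in sorted(subject_dir.glob("*.npy")):
                writer.writerow([subject_dir.name,
                                 str(image.relative_to(root)),
                                 label_of.get(subject_dir.name, 0)])
    return out
```

The real script may derive labels and paths differently; the point is that the uploaded folder must contain both the images and a `dataset.csv` describing them.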
The full configuration for the Glaucoma model is at `InnerEye/ML/configs/classification/GlaucomaPublic`. All that needs to be done is to change the dataset. We will do this by subclassing `GlaucomaPublic` in a new config stored in `InnerEyeLocal/ML`:

- Create a folder `configs/classification` under `InnerEyeLocal/ML`.
- Create a config file called `GlaucomaPublicExt.py` there, containing a class that extends `GlaucomaPublic`:

  ```python
  from InnerEye.ML.configs.classification.GlaucomaPublic import GlaucomaPublic


  class GlaucomaPublicExt(GlaucomaPublic):
      def __init__(self) -> None:
          super().__init__(azure_dataset_id="name_of_your_dataset_on_azure")
  ```
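The runner locates this config class by name inside the namespace given by `model_configs_namespace`. As a rough sketch of that kind of lookup (illustration only; the actual InnerEye implementation may differ):

```python
import importlib
import pkgutil


def find_config_class(namespace: str, class_name: str):
    """Search every module under a namespace package for a class.

    Sketch of namespace-based config discovery: import the package,
    walk its submodules, and return the first matching attribute.
    """
    package = importlib.import_module(namespace)
    for module_info in pkgutil.walk_packages(package.__path__,
                                             prefix=namespace + "."):
        module = importlib.import_module(module_info.name)
        if hasattr(module, class_name):
            return getattr(module, class_name)
    raise ValueError(f"No class named {class_name} under {namespace}")
```

With `model_configs_namespace` set to `InnerEyeLocal.ML.configs`, a lookup of this kind would resolve `--model=GlaucomaPublicExt` to the class defined above.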
- In `settings.yml`, set `model_configs_namespace` to `InnerEyeLocal.ML.configs` so that this config is found by the runner. Set `extra_code_directory` to `InnerEyeLocal`.
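Taken together, the relevant fragment of `settings.yml` would look roughly like this (only the two fields discussed above; keep the rest of the file unchanged):

```yaml
model_configs_namespace: InnerEyeLocal.ML.configs
extra_code_directory: InnerEyeLocal
```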
Run the following to start a job on AzureML:

```shell
python InnerEyeLocal/ML/runner.py --azureml=True --model=GlaucomaPublicExt --train=True
```
See Model Training for details on training outputs, resuming training, testing models and model ensembles.
This example is based on the Lung CT Segmentation Challenge 2017 [2].
- The dataset [3][4] can be downloaded here.
- The next step is to convert the dataset from DICOM-RT to NIFTI. Before this, place the downloaded dataset inside another parent folder, which we will call `datasets`. This file structure is expected by the conversion tool.
- Use the InnerEye-CreateDataset tool to create a NIFTI dataset from the downloaded (DICOM) files. After installing the tool, run:

  ```shell
  InnerEye.CreateDataset.Runner.exe dataset --datasetRootDirectory=<path to the 'datasets' folder> --niftiDatasetDirectory=<output folder name for converted dataset> --dicomDatasetDirectory=<name of downloaded folder inside 'datasets'> --geoNorm 1;1;3
  ```

  Now, you should have another folder under `datasets` with the converted NIFTI files. The `--geoNorm` flag tells the tool to normalize the voxel sizes during conversion.
- Upload this folder (with the images and `dataset.csv`) to Azure Blob Storage. For details on creating a storage account, see Setting up AzureML.
- Set up a directory outside of InnerEye to hold your configs, as in Setting Up Training. After this step, you should have a folder `InnerEyeLocal` beside `InnerEye` with the files `settings.yml` and `ML/runner.py`.
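The `--geoNorm 1;1;3` argument used above resamples every volume to 1 x 1 x 3 mm voxels. A minimal sketch of what that means for the image grid (the conversion tool's actual interpolation is more involved):

```python
def resampled_shape(shape, spacing_mm, target_mm=(1.0, 1.0, 3.0)):
    """Grid size after resampling a volume to the target voxel spacing.

    The physical extent (shape * spacing) is preserved, so each axis
    is rescaled by old_spacing / new_spacing. Illustration only; the
    InnerEye-CreateDataset tool performs the real resampling.
    """
    return tuple(round(n * old / new)
                 for n, old, new in zip(shape, spacing_mm, target_mm))


# A 512 x 512 x 120 CT scan with 0.98 x 0.98 x 2.5 mm voxels becomes:
print(resampled_shape((512, 512, 120), (0.98, 0.98, 2.5)))  # → (502, 502, 100)
```

Normalizing voxel sizes this way gives all training volumes a consistent geometry, which the segmentation model configuration expects.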
The full configuration for the Lung model is at `InnerEye/ML/configs/segmentation/Lung`. All that needs to be done is to change the dataset. We will do this by subclassing `Lung` in a new config stored in `InnerEyeLocal/ML`:

- Create a folder `configs/segmentation` under `InnerEyeLocal/ML`.
- Create a config file called `LungExt.py` there, containing a class that extends `Lung`:

  ```python
  from InnerEye.ML.configs.segmentation.Lung import Lung


  class LungExt(Lung):
      def __init__(self) -> None:
          super().__init__(azure_dataset_id="name_of_your_dataset_on_azure")
  ```
- In `settings.yml`, set `model_configs_namespace` to `InnerEyeLocal.ML.configs` so that this config is found by the runner. Set `extra_code_directory` to `InnerEyeLocal`.
Run the following to start a job on AzureML:

```shell
python InnerEyeLocal/ML/runner.py --azureml=True --model=LungExt --train=True
```
See Model Training for details on training outputs, resuming training, testing models and model ensembles.
[1] Ishikawa, Hiroshi. (2018). OCT volumes for glaucoma detection (Version 1.0.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.1481223
[2] Yang, J., Veeraraghavan, H., Armato, S. G., Farahani, K., Kirby, J. S., Kalpathy-Kramer, J., van Elmpt, W., Dekker, A., Han, X., Feng, X., Aljabar, P., Oliveira, B., van der Heyden, B., Zamdborg, L., Lam, D., Gooding, M. and Sharp, G. C. (2018), Autosegmentation for thoracic radiation treatment planning: A grand challenge at AAPM 2017. Med. Phys. doi:10.1002/mp.13141
[3] Yang, Jinzhong; Sharp, Greg; Veeraraghavan, Harini; van Elmpt, Wouter; Dekker, Andre; Lustberg, Tim; Gooding, Mark. (2017). Data from Lung CT Segmentation Challenge. The Cancer Imaging Archive. https://doi.org/10.7937/K9/TCIA.2017.3r3fvz08
[4] Clark K, Vendt B, Smith K, Freymann J, Kirby J, Koppel P, Moore S, Phillips S, Maffitt D, Pringle M, Tarbox L, Prior F. The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository, Journal of Digital Imaging, Volume 26, Number 6, December, 2013, pp 1045-1057. (paper)