
Custom dataset, data loader #86

Open
MancaZerovnikMekuc opened this issue Nov 22, 2019 · 6 comments

@MancaZerovnikMekuc

Hi,

I have a custom 3D dataset. I have spent a lot of time trying to run the preprocessing script on the LIDC data, but I still have issues: it looks like my characteristic.csv file is not what it should be. Can somebody provide a description of how the output of the script is formatted?

I want to run Mask R-CNN on my custom 3D dataset, which consists of volumes with voxel-wise annotations.

Can somebody describe how the data should be formatted for the example data_loader?
"Example Data Loader for the LIDC data set. This dataloader expects preprocessed data in .npy or .npz files per patient and a pandas dataframe in the same directory containing the meta-info e.g. file paths, labels, foregound slice-ids."

From this alone I do not know how to format the data. Has anybody successfully run the model on their own volumetric data with voxel-wise annotations and can share the dataloader, or some specification of the data formatting expected by the existing dataloader?
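
For reference, the quoted docstring suggests a per-patient layout roughly like the following sketch (the directory, file names, and dataframe columns pid, path, class_target, and fg_slices are assumptions for illustration, not the toolkit's exact schema):

```python
import os
import numpy as np
import pandas as pd

os.makedirs('pp_dir', exist_ok=True)

# One image array and one RoI/segmentation array per patient (channels, x, y, z).
img = np.random.rand(1, 128, 128, 64).astype(np.float32)
seg = np.zeros((128, 128, 64), dtype=np.uint8)   # voxel-wise RoI ids, 0 = background
seg[40:60, 40:60, 20:30] = 1                     # a single RoI marked with id 1

np.save(os.path.join('pp_dir', 'patient_0_img.npy'), img)
np.save(os.path.join('pp_dir', 'patient_0_rois.npy'), seg)

# Meta-info dataframe in the same directory: one row per patient.
info_df = pd.DataFrame({
    'pid': ['patient_0'],
    'path': [os.path.join('pp_dir', 'patient_0_img.npy')],
    'class_target': [[0]],               # one class label per RoI in this volume
    'fg_slices': [list(range(20, 30))],  # z-slices containing foreground
})
info_df.to_pickle(os.path.join('pp_dir', 'info_df.pickle'))
```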

@pfjaeger
Member

You could generate the toy data and run some trainings on it. It will show you how the data is structured and read by the data loader.

@MancaZerovnikMekuc
Author

MancaZerovnikMekuc commented Nov 25, 2019

Thank you, I have done that. However, I have multiple instances in one image. How should that kind of data be structured? I would also like to include patching. What should the variable "class_target" in the meta_info_dict created by preprocessing.py contain, and in what form, for your data_loader.py?
Also, I have 3D data, not 2D as in the toy example.

@lspinheiro

The toy data seems to only handle a segmentation example. Is there any documentation about how to generate bounding box labels?

@thanhpt55

@MancaZerovnikMekuc could you show me how the images and labels are read when training begins? If possible, could you give me an example of the data structure?
Thank you

@Gregor1337
Collaborator

@MancaZerovnikMekuc

Thank you, I have done that. However, I have multiple instances in one image. How should that kind of data be structured? I would also like to include patching. What should the variable "class_target" in the meta_info_dict created by preprocessing.py contain, and in what form, for your data_loader.py?
Also, I have 3D data, not 2D as in the toy example.

  • During batch generation (in the dataloader scripts), "class_target" holds the RoI-wise class labels, i.e., one class label per RoI, structured as a list of lists per batch.
    That is, the final batch dictionary returned by generate_train_batch in your BatchGenerator should hold an entry "class_target" that looks, e.g., like this: [ [0,1], [2,0], [1] ]. In that example there are 3 classes: batch element one has two RoIs, the first of class 0, the second of class 1; batch element two also has two RoIs, the first of class 2, the second of class 0; the third batch element has a single RoI of class 1 (see the sketch after this list).
    The id of a RoI equals its position within its batch element's list, shifted by 1 to match the pixel values in the segmentation ground truth, since 0 is reserved for background (all pixels that belong to the RoI at position 0 need to be marked with value 1 in the segmentation).

  • To include patching, I'd encourage you to follow the PatientIterator example in lidc_exp->dataloader.py. We only offer inclusive patching for the PatientIterator, i.e., during validation and testing, but not training. During training we sample patches instead of covering the whole image with patches. Apart from data loading you do not need to concern yourself with patching; it is already implemented in the framework (in predictor.py).

  • The differences between 2D and 3D are marginal; you may look into lidc_exp as a guideline.
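
A minimal sketch of a batch dictionary matching the [ [0,1], [2,0], [1] ] example above; the "data"/"seg" keys, array shapes, and RoI placements are illustrative assumptions, not the repository's actual BatchGenerator code:

```python
import numpy as np

batch_size, n_channels, x, y, z = 3, 1, 64, 64, 32
data = np.random.rand(batch_size, n_channels, x, y, z).astype(np.float32)
seg = np.zeros((batch_size, 1, x, y, z), dtype=np.uint8)

# Batch element 0: two RoIs -> pixel values 1 and 2 in the segmentation,
# class labels 0 and 1 in "class_target".
seg[0, 0, 5:15, 5:15, 5:10] = 1
seg[0, 0, 30:40, 30:40, 10:15] = 2
# Batch element 1: two RoIs of classes 2 and 0.
seg[1, 0, 10:20, 10:20, 5:10] = 1
seg[1, 0, 35:45, 35:45, 15:20] = 2
# Batch element 2: a single RoI of class 1.
seg[2, 0, 20:30, 20:30, 10:15] = 1

batch = {
    'data': data,
    'seg': seg,
    'class_target': [[0, 1], [2, 0], [1]],  # one label per RoI, per batch element
}
```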

@themantalope

themantalope commented May 31, 2021

@Gregor1337 - very helpful. It would be nice to have a documentation file with this information somewhere in the repo. It might also be useful to see how to structure this in a toy experiment that generates multiple RoIs for a single training example.
