
Update RandomLabelsToImage #253

Closed
fepegar opened this issue Aug 5, 2020 · 19 comments · Fixed by #264
Labels
documentation Improvements or additions to documentation

Comments

@fepegar
Owner

fepegar commented Aug 5, 2020

@GFabien @romainVala

Now that I've included the ICBM template, do you think it's a good idea to simplify the example in RandomLabelsToImage a bit? Also, Colin27 2008 is very large.

For example:

from torchio import RandomLabelsToImage
from torchio.datasets import ICBM2009CNonlinearSymmetryc

icbm = ICBM2009CNonlinearSymmetryc(load_4d_tissues=False)

# Default Gaussian parameters
transform = RandomLabelsToImage(
    pv_label_keys=['gm', 'wm', 'csf'],
)
transformed = transform(icbm)
transformed['image'].save('/tmp/lab2im.nii')

# Specify Gaussian parameters
gaussian_parameters = {
    'gm': dict(mean=2, std=0.1),
    'wm': dict(mean=4, std=0.1),
    'csf': dict(mean=6, std=0.1),
}
transform = RandomLabelsToImage(
    pv_label_keys=['gm', 'wm', 'csf'],
    gaussian_parameters=gaussian_parameters,
)
transformed = transform(icbm)
transformed['image'].save('/tmp/lab2im.nii')

[attached image: lab2im]

I can also add an option in the ICBM class to create a label map from the PV images.

@GFabien
Contributor

GFabien commented Aug 5, 2020

Yes, I think it would be more illustrative: there are a lot of classes in Colin27 2008, and here you can play with both PV and discretized images.

Talking about the refactoring of RandomLabelsToImage, what would you like to be changed? I thought you wanted to remove the pv_label_keys attribute and keep only the label_key attribute, but I'm not sure anymore...

@fepegar fepegar changed the title Simplify example in RandomLabelsToImage Update RandomLabelsToImage Aug 5, 2020
@fepegar
Owner Author

fepegar commented Aug 5, 2020

I haven't thought about what would need to be changed, but rather about what it should support. How hard would it be to support passing the PV maps as a 4D image? (All other transforms support 4D now.) I thought this was your initial plan.

@GFabien
Contributor

GFabien commented Aug 5, 2020

Yes, it was indeed the initial plan, but now I like having all my labels as individual attributes (it makes things very modular).

The question is more: "should it only support 4D images as a single sample attribute, or should it also support distributed 4D images (i.e., having the different channels as different attributes)?" Even if I like the handling of distributed 4D images, I feel that for now it is the only place in the code where we do something like this, and it can make the code a bit awkward and hard to maintain. If this mechanism could be useful somewhere else, then we should probably find a way to refactor the aggregation of the different channels to make it transparent to the transform. If this is the only place where it's useful, then we should just get rid of it.

Supporting PV maps as a 4D image is really easy: you just have to take the tensor directly instead of stacking channels.

@fepegar
Owner Author

fepegar commented Aug 5, 2020

I'm not sure it's useful to add functionalities to store multiple 3D images in Image, where it can just be 4D.

Supporting PV maps as a 4D image is really easy: you just have to take the tensor directly instead of stacking channels.

Great.

As I don't think there are many people using this transform, I suppose we can leave it as it is for now.

@GFabien
Contributor

GFabien commented Aug 5, 2020

I'm not sure it's useful to add functionalities to store multiple 3D images in Image, where it can just be 4D.

To be clear, I was talking about the workaround used so far of having multiple 3D Image instances in one Subject.

As I don't think there are many people using this transform, I suppose we can leave it as it is for now.

Ok, tell me when you think we should change it.

@fepegar
Owner Author

fepegar commented Aug 5, 2020

To be clear, I was talking about the workaround used so far of having multiple 3D Image instances in one Subject.

Oh ok, I misunderstood.

@fepegar fepegar added the documentation Improvements or additions to documentation label Aug 5, 2020
@romainVala
Contributor

Yes, it was indeed the initial plan, but now I like having all my labels as individual attributes (it makes things very modular).

Can you clarify what modularity it provides?

for now it is the only place in the code where we do something like this

Which lines of the code?

@GFabien
Contributor

GFabien commented Aug 6, 2020

Can you clarify what modularity it provides?

If you have all your label maps as separate files, you can very easily pick which label maps you use for every experiment you run. If you don't need them all, you can save memory by not loading them all.

Which lines of the code?

It's not one precise line of code; it's the fact of handling the different channels as different attributes of the sample. It is the only transform where this is done, because transforms used to handle only 3D images, whereas this one was designed for 4D images but without support for images directly stored as 4D. Now that 4D images are supported, we may wonder whether the original implementation still makes sense, or whether we should only support 4D images as single attributes. There are pros and cons to removing the handling of different channels as different attributes:

  • pros: it will make the implementation simpler and easier to maintain
  • cons: we will lose a bit of modularity (+ in our specific case we will have to change all our files but that's not a big issue)

@fepegar
Owner Author

fepegar commented Aug 6, 2020

In theory, you can load just part of a saved array using e.g. nib.load(path).dataobj[..., (0, 2, 3)] (assuming channels are last in the saved NIfTI). We could also devise some technique to merge 3D volumes into 4D. For example, Image could take not just a path or tensor but a list of them.

@GFabien
Contributor

GFabien commented Aug 6, 2020

In theory, you can load just part of a saved array using e.g. nib.load(path).dataobj[..., (0, 2, 3)] (assuming channels are last in the saved NIfTI).

This is interesting; it would be nice if torchio enabled this, but there is no hurry.

We could also devise some technique to merge 3D volumes into 4D. For example, Image could take not just a path or tensor but a list of them.

That would be the best solution, and it would make it transparent to the transform. If this seems good to you, I'll make a PR to refactor RandomLabelsToImage to handle only a single label map (be it a PV label map or not) and enable Image to take a list of paths.

What I'm not sure about, however, is whether I should make a clear distinction between PV maps and label maps. Indeed, you could find out whether you're facing a label map by rounding the tensor and comparing it to itself, but with datasets such as Colin27 2008 you can be surprised. Furthermore, label maps can now be given as one-hot encoded label maps, so you would take the argmax to get a usual label map. Maybe the discretize attribute is enough to make this distinction clear.

@fepegar
Owner Author

fepegar commented Aug 6, 2020

That would be the best solution, and it would make it transparent to the transform. If this seems good to you, I'll make a PR to refactor RandomLabelsToImage to handle only a single label map (be it a PV label map or not) and enable Image to take a list of paths.

Sounds good, thanks! I think an if isinstance(path, list) and a torch.cat(tensors) in Image.load() would do. Maybe also assert that all affines are the same. This could be extended to Image.save() (taking one path per channel).
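A rough sketch of what that extension could look like. The helper name, the (1, X, Y, Z) per-channel convention, and the fake data are assumptions for illustration, not the actual torchio API:

```python
import numpy as np
import torch

def merge_channels(tensors, affines):
    """Concatenate per-channel (1, X, Y, Z) tensors into one (C, X, Y, Z)
    tensor, after checking that all channels share the same affine."""
    reference = affines[0]
    for affine in affines[1:]:
        assert np.allclose(affine, reference), 'Affines do not match'
    return torch.cat(tensors, dim=0), reference

# Hypothetical usage with three fake 3D label maps
tensors = [torch.rand(1, 4, 4, 4) for _ in range(3)]
affines = [np.eye(4) for _ in range(3)]
merged, affine = merge_channels(tensors, affines)
print(merged.shape)  # torch.Size([3, 4, 4, 4])
```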

What I'm not sure about, however, is whether I should make a clear distinction between PV maps and label maps. Indeed, you could find out whether you're facing a label map by rounding the tensor and comparing it to itself, but with datasets such as Colin27 2008 you can be surprised. Furthermore, label maps can now be given as one-hot encoded label maps, so you would take the argmax to get a usual label map. Maybe the discretize attribute is enough to make this distinction clear.

I guess you're right. At the end of the day, they are just soft or hard assignments, in 4D volumes, that can sometimes be summarized in a 3D volume. However, Colin27 2008 is an outlier: their volume is clearly wrong (I emailed the maintainer, no reply).

@romainVala
Contributor

romainVala commented Aug 6, 2020

I think keeping the different possible input choices (list of 3D, 4D, mixed list of 4D/3D) is convenient for the user, because in practice we do have this variability. (If not, the user will have to add an extra step to convert their files; less convenient, but not a big deal...) But let's deal for now with only the two cases: list of 3D, or 4D.

I wonder if the bit of modularity you lose with 4D could not be solved by mapping both inputs to the same internal structure. For instance, in the subject creation of ICBM2009CNonlinearSymmetryc:

if load_4d_tissues:

In the 4D case we have only the key "tissue" (containing a true 4D tensor), but if we add a way to automatically transform this key into multiple 3D keys, for 4D images we could have 3 keys (each containing a 3D tensor):

tissue_0
tissue_1
tissue_2

Or, if an extra argument is given to LabelMap, for instance 3D_key_name={'gm', 'wm', 'csf'}, this would allow having those keys directly in the sample (although the input file would stay 4D; and for ICBM2009CNonlinearSymmetryc it would avoid the extra step of saving the separate 3D volumes to disk... which I find very 'ugly'...).

Maybe we should let the user choose (with an extra argument) whether to keep the sample key as 4D or to transform it into multiple 3D keys.

PS: in the time it took to write this, I missed the last 2 comments.

@romainVala
Contributor

That would be the best solution, and it would make it transparent to the transform. If this seems good to you, I'll make a PR to refactor RandomLabelsToImage to handle only a single label map (be it a PV label map or not) and enable Image to take a list of paths.

To be sure I follow: if you do so, you'll lose the bit of modularity you were referring to?

@GFabien
Contributor

GFabien commented Aug 6, 2020

To be sure I follow: if you do so, you'll lose the bit of modularity you were referring to?

No, it's just that you choose the label maps you are using at loading time (when you create your Image) rather than when setting up your transform.

@romainVala
Contributor

Ok, thanks. Maybe I misunderstood, and my comment #253 (comment) is not necessary...
@GFabien how do you specify

gaussian_parameters = {

for 4D inputs?

@GFabien
Contributor

GFabien commented Aug 6, 2020

Instead of having

gaussian_parameters = {
    'GM': {...}, 'WM': {...}
}

You will have

gaussian_parameters = {
    0: {...}, 1: {...}
}

So you lose a bit in readability, but you can always use a mapping at some point to find that 0 refers to GM, and so on.

@fepegar
Owner Author

fepegar commented Aug 6, 2020

I don't think you need a dict for that. I'd say a sequence for the means and a sequence for the stds, as in https://pytorch.org/docs/stable/torchvision/transforms.html#torchvision.transforms.Normalize
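For illustration, the two styles are easy to convert between. The label order and parameter values below are taken from the example at the top of the issue; the sequence-based API itself is only a suggestion at this point:

```python
# Name-keyed parameters (the current dict style)
gaussian_parameters = {
    'gm': dict(mean=2, std=0.1),
    'wm': dict(mean=4, std=0.1),
    'csf': dict(mean=6, std=0.1),
}

# Flattened into parallel sequences, as in torchvision's Normalize.
# The chosen channel order must match the channels of the 4D image.
label_order = ['gm', 'wm', 'csf']
means = [gaussian_parameters[k]['mean'] for k in label_order]
stds = [gaussian_parameters[k]['std'] for k in label_order]
print(means, stds)  # [2, 4, 6] [0.1, 0.1, 0.1]
```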

@GFabien
Contributor

GFabien commented Aug 6, 2020

Why not? However, it means that you have to specify every mean and std (even when you specify that a tissue will not be used, because then it will be taken directly from the original image).

@romainVala
Contributor

romainVala commented Aug 6, 2020

So you lose a bit in readability, but you can always use a mapping at some point to find that 0 refers to GM, and so on.

You also lose a bit of flexibility: for 4D PV maps with a different channel order (which will happen if you compare different software...), you will need a different transform, whereas if you do the mapping (index, key_name) at loading time, then the same transform can be used (defined for gm, wm, etc.). An alternative is to allow a swap_label_order argument at loading time, to allow a unique internal order independent of the one in the files on disk.

For cases with a large number of labels there can also be missing labels; this is another story, but easier to deal with by name than by index...
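The load-time remapping described above could be sketched as follows. The helper name, the label names, and the stored channel order are all hypothetical:

```python
import torch

def reorder_channels(tensor, file_order, internal_order):
    """Permute the channel axis of a (C, X, Y, Z) tensor so that channels
    stored in file_order come out in the desired internal_order."""
    permutation = [file_order.index(name) for name in internal_order]
    return tensor[permutation]

# A fake 4D PV map whose file stores channels as wm, gm, csf
tensor = torch.rand(3, 4, 4, 4)
reordered = reorder_channels(tensor, ['wm', 'gm', 'csf'], ['gm', 'wm', 'csf'])
assert torch.equal(reordered[0], tensor[1])  # 'gm' now comes first
```

With this in place, a single transform defined in terms of one canonical order would work regardless of how each tool writes its channels.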

@fepegar fepegar linked a pull request Aug 12, 2020 that will close this issue