
data augmentation #68

Closed
laurenmoos opened this issue Dec 8, 2020 · 5 comments

Comments

@laurenmoos

Support for data augmentation pipelines, ideally pipelines that support higher-level optimization such as policy search or hyperparameter optimization (HPO).

Data augmentation would have its own abstraction but be bound to a pipeline abstraction. Pipelines would then be passed to collate for dual augmentation of batches, as required by many self-supervision tasks. Collation could also be associated with descriptive methods specifying how much divergence is introduced (via randomness) for each of the two "paths".

This would allow researchers to use strategies such as curriculum/active learning, with the augmentations diverging as the consistency loss decreases (for example).
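A rough sketch of the proposed split between a pipeline abstraction and a dual-view collate, with per-path divergence controlled by a strength parameter. All names here (`AugmentationPipeline`, `DualViewCollate`, `jitter`) are hypothetical illustrations, not lightly APIs, and the "augmentation" is a toy numeric jitter rather than an image transform:

```python
import random

class AugmentationPipeline:
    """Hypothetical pipeline abstraction: a sequence of augmentations,
    each a callable taking (sample, strength)."""
    def __init__(self, augmentations, strength=1.0):
        self.augmentations = augmentations
        self.strength = strength  # how much randomness/divergence this path introduces

    def __call__(self, sample):
        for aug in self.augmentations:
            sample = aug(sample, self.strength)
        return sample

class DualViewCollate:
    """Hypothetical collate: produces two independently augmented 'paths'
    per batch, each with its own tunable divergence."""
    def __init__(self, pipeline_a, pipeline_b):
        self.pipeline_a = pipeline_a
        self.pipeline_b = pipeline_b

    def __call__(self, batch):
        views_a = [self.pipeline_a(s) for s in batch]
        views_b = [self.pipeline_b(s) for s in batch]
        return views_a, views_b

# Toy augmentation: additive jitter scaled by the pipeline's strength.
def jitter(x, strength):
    return x + random.uniform(-strength, strength)

collate = DualViewCollate(
    AugmentationPipeline([jitter], strength=0.1),  # mildly diverging path
    AugmentationPipeline([jitter], strength=1.0),  # strongly diverging path
)
views_a, views_b = collate([0.0, 1.0, 2.0])
```

A curriculum schedule could then simply decay `pipeline.strength` as a function of the consistency loss between epochs.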

@philippmwirth
Contributor

Doesn't the lightly.data.collate.BaseCollate class offer an interface for such pipelines? The pipeline need only be implemented as a torchvision.transforms.Compose.
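The point that "the pipeline need only be a Compose" is just the callable-of-callables pattern. A minimal dependency-free stand-in (this `Compose` mimics the shape of `torchvision.transforms.Compose` but is not the real class) shows why arbitrary callables slot in:

```python
class Compose:
    """Minimal stand-in for torchvision.transforms.Compose:
    applies a list of callables in order."""
    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, x):
        for t in self.transforms:
            x = t(x)
        return x

# Any callables work, not just torchvision transforms:
pipeline = Compose([lambda x: x * 2, lambda x: x + 1])
result = pipeline(3)  # (3 * 2) + 1 = 7
```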

@laurenmoos
Author

The last project I worked on was for video, so we had to use non-torch compose pipelines.

If the overall goal of the platform is using self-supervision to improve sample efficiency for downstream labeled tasks, it would make sense for the platform to be very robust in terms of ingesting other sample-efficiency-increasing techniques. Thinking as an end user, I would almost certainly couple experiments in terms of augmentation strategies + self-supervision, so having (even a very thin) abstraction over torchvision transforms allows for "literate experimental design". My end goal isn't strictly using self-supervision; it is preparing data / network initializations in such a way that I require less data when the good ol' vanilla CNNs come out.

A concrete example: my previous company was using medical images to produce a probability of {z medical condition}. I wanted to use self-supervision because we had 10x the amount of unlabeled data as labeled, and our domain involved a very time-expensive labeling process. There were some clear things that needed to be regularized - in our case it was video, so both temporal and spatial regularization - presumably upstream of self-supervision, which was in turn upstream of supervised learning. I might use another open-source framework for augmentation, but having my experiments in lightly reflect the upstream augmentation processes would have been amazing - I could then ask questions like: do I _really_ need a Gaussian blur applied randomly to imitate different microscopy focal lengths, or should it be part of the "CollateFunction" (i.e. part of the self-supervision tasks and not pre-processing)?

Perhaps it is sufficient to do this with torchvision Compose, but there is some kind of non-trivial and, more importantly, dynamic relationship between those pipelines and collate...

@laurenmoos
Author

I think I am thinking less in terms of framework interoperability and more in terms of OOP/end-case usability: how do we express the relationship between prior augmentation pipelines (however they're implemented - and you're right, they don't need to be implemented by lightly) and how those same augmentations are used on batches in contrastive learning?

@busycalibrating
Contributor

+1 on this idea. I think it's a good idea to have very simple high-level interfaces that allow users to implement what they want without being forced to conform to something more restrictive (e.g. torchvision). This is pretty much the paradigm of how PyTorch does their nn.Module, allowing for custom implementations without too much constraint.
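The nn.Module paradigm mentioned here boils down to: a fixed `__call__` contract plus one method the user overrides. A hypothetical sketch of what that could look like for augmentations (`BaseAugmentation` and `Negate` are invented names, not lightly or PyTorch APIs):

```python
class BaseAugmentation:
    """Hypothetical minimal interface in the spirit of torch.nn.Module:
    subclasses override apply(); __call__ stays fixed, so collate code
    can treat every augmentation uniformly as a callable."""
    def __call__(self, sample):
        return self.apply(sample)

    def apply(self, sample):
        raise NotImplementedError

# A user-defined augmentation needs nothing beyond apply():
class Negate(BaseAugmentation):
    def apply(self, sample):
        return -sample

out = Negate()(5)  # -5
```

The design choice is the same as PyTorch's: the library owns the calling convention, the user owns the computation, and nothing forces a particular backend (torchvision or otherwise).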

@IgorSusmelj
Contributor

That's a very interesting idea. Augmentations are key for contrastive learning. I would like to further summarize the requirements and the concept of such an augmentation pipeline before any development.

Here are a few thoughts from me:

  • Augmentations should be flexible depending on the problem I want to solve
  • Augmentations can be grouped:
    • spatial (cropping, resizing, ...)
    • texture (blur)
    • color
    • temporal (e.g. in videos, frames before/after the current frame)
  • Selecting the right augmentation strength is very tricky (especially in self-supervised/unsupervised learning)
    • The user can set the values (as it is now)
    • Heuristics/RL can be used to find good parameters
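The grouping and strength-search ideas above could be sketched as follows. Everything here is a toy illustration: the "augmentations" are stand-in arithmetic functions, and the search heuristic is plain random search rather than RL (names like `AUGMENTATION_GROUPS` and `random_search` are invented for this sketch):

```python
import random

# Hypothetical grouped augmentations, each parameterized by a strength in [0, 1].
AUGMENTATION_GROUPS = {
    "spatial": lambda x, s: x * (1 - 0.5 * s),  # toy stand-in for crop/resize
    "texture": lambda x, s: x + 0.1 * s,        # toy stand-in for blur
}

def apply_groups(x, strengths):
    """Apply every group's augmentation with its own strength."""
    for name, aug in AUGMENTATION_GROUPS.items():
        x = aug(x, strengths[name])
    return x

def random_search(score_fn, trials=20, seed=0):
    """Toy heuristic: random search over per-group strengths,
    keeping the best-scoring configuration."""
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(trials):
        strengths = {name: rng.random() for name in AUGMENTATION_GROUPS}
        score = score_fn(strengths)
        if score > best_score:
            best, best_score = strengths, score
    return best

# Example objective: prefer a moderate total strength (a stand-in for a
# real validation-metric-driven objective).
best = random_search(lambda s: -abs(sum(s.values()) - 1.0))
```

Swapping `random_search` for a bandit or RL policy would cover the "heuristics/RL to find good parameters" case without changing the augmentation interface.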

Projects
Status: Done

5 participants