Implement GPU Wavelet transform #25

Open
frankong opened this issue Jul 11, 2019 · 8 comments
Labels: enhancement, help wanted

Comments

@frankong
Member

Is your feature request related to a problem? Please describe.
There is no wavelet transform on the GPU. Currently, SigPy moves the array to the CPU and uses PyWavelets to perform the wavelet transform, which is the main bottleneck for compressed sensing MRI reconstructions.

Describe the solution you'd like
A GPU wavelet transform can leverage the GPU multi-channel convolution operations already wrapped in SigPy. Low-pass and high-pass filtering can be done in one operation using the output channels, and strides can be used to incorporate the subsampling.
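As a rough illustration of that idea, here is a minimal CPU sketch using only NumPy and PyWavelets (no SigPy API assumed): the low-pass and high-pass decomposition filters act as two output channels, and the stride-2 slice stands in for the convolution stride. The plain zero padding here will not exactly match PyWavelets' default boundary mode; on the GPU, the two `np.convolve` calls would become a single multi-channel strided convolution (e.g. the cuDNN-backed convolution SigPy already wraps).

```python
import numpy as np
import pywt

def analysis_1d(x, wave_name='db4'):
    """Single-level 1D wavelet analysis written as a two-channel, stride-2 convolution."""
    w = pywt.Wavelet(wave_name)
    filters = [np.asarray(w.dec_lo), np.asarray(w.dec_hi)]  # low-pass, high-pass channels
    x = np.pad(x, len(w.dec_lo) - 1)                         # simple zero padding at the boundary
    # Each filter produces one output channel; the [::2] slice plays the role
    # of the convolution stride (dyadic subsampling).
    return np.stack([np.convolve(x, f, mode='valid')[::2] for f in filters])

coeffs = analysis_1d(np.random.randn(64))  # shape (2, n): approximation + detail
```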

Describe alternatives you've considered
Implement dedicated GPU kernels for wavelet transforms, but these would be less optimized than cuDNN convolutions.

frankong added the enhancement label Jul 11, 2019
@jtamir
Member

jtamir commented Jul 11, 2019

here is a possible approach: https://lernapparat.de/2d-wavelet-transform-pytorch/

@sidward
Collaborator

sidward commented Nov 14, 2019

This is addressed by PR #32!

@frankong
Member Author

Thanks, Sid!

@marcobattiston1988

Hi everyone,

I see this feature was initially added and then removed from the master branch. Would it be possible to know why?

I would be really interested in getting this working. I have been working on some compressed sensing type of recon with SigPy, keeping an eye on recon time: the wavelet transform is definitely the bottleneck of the pipeline. Is there any plan to allow the wavelet transform to run on GPU?

thanks a lot for this amazing tool!
Marco

@sidward
Collaborator

sidward commented May 25, 2022

Hi,

I think @frankong was worried about the correctness of my implementation. I would like to revisit it, but do not have time right now. That being said, if you are able to resurrect the code and clarify that it's in "beta", I'll be happy to re-include it.

sidward reopened this May 25, 2022
sidward added the help wanted label May 27, 2022
@marcobattiston1988

Hi Sid,

I have been looking a bit into the solution proposed here.

I ran some quick tests on 2D data, comparing the forward transform from this implementation with the one in PyWavelets. Qualitatively, the results look reasonably similar, although the absolute values of the coefficients are a bit different. The output also has a different size; the zero padding is probably applied in a slightly different way, since the tiled coefficients appear a bit shifted when I look at them. Another thing I noticed: looking at the residual error of a forward+backward transform, this new implementation shows a slightly higher error compared to the forward+backward of PyWavelets (still, the effect on the input image could not be appreciated). These were just qualitative tests; I am happy to go into more detail with the comparison if needed. Do you remember by any chance what the issue was when you first incorporated this? I can try to focus on that.
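For reference, the round-trip check described above can be scripted directly against PyWavelets; the GPU implementation under test would be measured the same way where indicated (`gpu_fwt`/`gpu_iwt` below are hypothetical placeholders, not an existing SigPy API):

```python
import numpy as np
import pywt

x = np.random.randn(128, 128)

# PyWavelets reference: forward + inverse transform, then residual error.
coeffs = pywt.wavedec2(x, 'db4', mode='zero', level=3)
x_rec = pywt.waverec2(coeffs, 'db4', mode='zero')
print('pywt round-trip max error:', np.max(np.abs(x - x_rec)))

# Hypothetical GPU path, measured identically:
# y = gpu_fwt(x)
# print('GPU round-trip max error:', np.max(np.abs(x - gpu_iwt(y))))
```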

Regarding performance, I had a hard time getting decent computation times.
Initially, this solution was way slower than the PyWavelets implementation, although I could see some activity on the GPU. Using the NVIDIA profiler (Nsight), I could see that the GPU workload became much more intermittent right after the first iteration of GradientMethod. I tried to move as many operations as possible into the initialization of the Wavelet linear operator, but that didn't really help. The key move was to downgrade my CUDA version from 11.4 to 11.2 (I am running this on Windows 10), and all of a sudden I got a factor-of-10 reduction in recon time (and I can see that the GPU is working consistently).
Now recon times are more comparable with the PyWavelets implementation. On 3D data, this is now more than 50% faster, while it is roughly 50% slower than the original implementation on 2D data... I am definitely not an expert in code optimization (even less so on GPU), but I believe there is some room for improvement...

Looking forward to your feedback
(and thanks a lot for the effort you put into SigPy!!!)

Marco

@sidward
Collaborator

sidward commented Jul 8, 2022

Hi Marco,

Thanks much for the tests! I will personally be swamped for a few months, but if you're willing to take the lead, I'd be happy to review any pull requests on this.

The following is what I envision would need to be done before the GPU version can be implemented:

  1. (Not GPU specific) Generally, modify the wavelet calls and linops to take in the modes argument used in pywt (details: https://pywavelets.readthedocs.io/en/latest/regression/modes.html).
  2. (GPU specific) Given the same chosen mode, require that the CPU output and GPU output differ by less than a certain amount of error (say, 1e-4% to 1e-6%, off the top of my head). This will be really helpful for unit tests (see the sketch after this list)!
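A minimal sketch of such a unit test, assuming a hypothetical `gpu_wavedec` callable for the GPU implementation (PyWavelets' `wavedecn` and `coeffs_to_array` provide the CPU reference for a given mode):

```python
import numpy as np
import pywt

def check_gpu_matches_cpu(gpu_wavedec, mode='periodization', wave='db4', tol=1e-6):
    """Compare a (hypothetical) GPU wavelet transform against pywt for one boundary mode."""
    x = np.random.default_rng(0).standard_normal((64, 64)).astype(np.float32)
    cpu, _ = pywt.coeffs_to_array(pywt.wavedecn(x, wave, mode=mode, level=2))
    gpu = np.asarray(gpu_wavedec(x, wave, mode=mode, level=2))  # copy device -> host
    rel_err = np.max(np.abs(gpu - cpu)) / np.max(np.abs(cpu))
    assert rel_err < tol, f'relative error {rel_err:.3e} exceeds {tol:.0e}'
```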

On the performance difference, I think we can revisit it after (2) is done! Once that's set as a baseline, we can look at optimizations.

Thanks much for all you've reported so far! Please let me know your thoughts on the above, and whether you're interested in pursuing this.

@marcobattiston1988

Hi Sid,
Yes, that sounds like a fair plan. I have had a quick look at (1) and (2), and neither should be too hard to achieve.
Not sure how much time I can dedicate to this in the coming weeks, but August definitely looks less busy. I will keep you posted.

Thanks
