
Fix errors on CUDA device #83

Merged
merged 1 commit into KindXiaoming:master on May 6, 2024
Conversation

@X3NNY (Contributor) commented May 5, 2024

Set parameters on the correct device.
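A hypothetical sketch of the kind of fix this PR makes (the class and parameter names here are illustrative, not the repo's actual code): parameters must be created on the layer's target device, otherwise tensors constructed with PyTorch's CPU default end up on a different device than the inputs.

```python
import torch
import torch.nn as nn

class ToyLayer(nn.Module):
    """Illustrative layer showing device-aware parameter creation."""

    def __init__(self, in_dim, out_dim, device="cpu"):
        super().__init__()
        # Before a fix like this, a tensor created without a device
        # argument stays on CPU even when device="cuda", and the
        # forward pass then fails with a device-mismatch error.
        self.weight = nn.Parameter(torch.randn(out_dim, in_dim, device=device))

    def forward(self, x):
        return x @ self.weight.T

layer = ToyLayer(4, 2, device="cpu")  # pass "cuda" on a GPU machine
out = layer(torch.randn(3, 4))
print(out.shape)  # torch.Size([3, 2])
```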

@KindXiaoming KindXiaoming merged commit 9c00ccc into KindXiaoming:master May 6, 2024
@AlessandroFlati (Contributor)
I made the exact same changes to make CUDA work. Nonetheless, it seems slightly less performant on "science" tasks, probably because of the time taken to move memory to the CPU for some critical operations (the ones you found needed a .to("cpu")).
@KindXiaoming, would you mind explaining why that is necessary? I'm working hard on this repo because I believe in its potential, but I haven't had time to look at all the low-level details.

@KindXiaoming (Owner) commented May 6, 2024

Hi AlessandroFlati, do you mean why some operations need to be performed on the CPU? In some cases: (1) I want to use sklearn (I actually never checked whether sklearn supports GPU, so I just went with CPU). For example, sklearn linear regression is called when you want to extract symbolic equations (which is needed if you want interpretability for science). (2) The GPU version of torch.lstsq seems to me more unstable than the CPU version; stable in the sense that the GPU may sometimes return NaN while the CPU doesn't. For example, torch.lstsq is used in curve2coef and hence in initialize_grid_from_another_model (which is needed if you want to do grid extension to aim for accuracy for science).
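The CPU round-trip pattern described above could be sketched like this (the function and variable names are hypothetical; modern PyTorch exposes least squares as torch.linalg.lstsq):

```python
import torch

def lstsq_on_cpu(A, B):
    """Solve the least-squares problem on CPU, where the solver tends
    to be more numerically stable, then move the solution back to the
    tensors' original device. Illustrative sketch only."""
    device = A.device
    solution = torch.linalg.lstsq(A.to("cpu"), B.to("cpu")).solution
    return solution.to(device)

A = torch.randn(10, 3)
B = torch.randn(10, 2)
X = lstsq_on_cpu(A, B)
print(X.shape)  # torch.Size([3, 2])
```

The extra host/device transfer is the likely source of the slowdown on "science" tasks mentioned above: each .to("cpu") forces a synchronization and a memory copy.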

@AlessandroFlati (Contributor)

Then I think I can help you with that. I'll open a fork and try to prepare a PR demonstrating that those two things can be performed on CUDA, with dedicated examples. Maybe I should open another issue for that, but NaNs are also easily propagated if one sets the last activation to exp for large numbers in the dataset (~1e6 is enough), and I was wondering why. I'll keep an eye on that too.

@mw66 commented May 6, 2024

@KindXiaoming

I think this fix deserves a new release (right now the version is v0.0.2).
