RuntimeError when using --modified-device #345

Open
drivenbyentropy opened this issue Jun 2, 2023 · 0 comments
Hi,

When running bonito with a custom-trained modified base model and specifying the --modified-device option, it fails at runtime.
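
For reference, the invocation was along these lines (paths are placeholders, and the flag names are as I recall them for my bonito version, so adjust as needed):

```sh
# Placeholder paths; the custom Remora model directory and the
# reference are specific to my setup.
bonito basecaller [email protected] /data/pod5/ \
    --reference ref.mmi \
    --modified-base-model /models/my_remora_model/ \
    --modified-device cuda:0 \
    > basecalls.bam
```

The run then fails with the following error: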

> reading pod5
> outputting aligned bam
> loading model [email protected]
> loading modified base model
> loaded modified base model to call (alt to T): T=XXXX
> loading reference
> calling:   0%|                                      | 1/5420253 [00:14<22525:53:46, 14.96s/ reads]/opt/bonito/lib/python3.9/site-packages/remora/data_chunks.py:515: UserWarning: FALLBACK path has been taken inside: runCudaFusionGroup. This is an indication that codegen Failed for some reason.
To debug try disable codegen fallback path via setting the env variable `export PYTORCH_NVFUSER_DISABLE=fallback`
 (Triggered internally at ../third_party/nvfuser/csrc/manager.cpp:335.)
  model.forward(
Exception in thread Thread-6:
Traceback (most recent call last):
  File "/usr/lib/python3.9/threading.py", line 954, in _bootstrap_inner
    self.run()
  File "/opt/bonito/lib/python3.9/site-packages/bonito/multiprocessing.py", line 261, in run
    for i, (k, v) in enumerate(self.iterator):
  File "/opt/bonito/lib/python3.9/site-packages/bonito/cli/basecaller.py", line 137, in <genexpr>
    results = ((k, call_mods(mods_model, k, v)) for k, v in results)
  File "/opt/bonito/lib/python3.9/site-packages/bonito/mod_util.py", line 91, in call_mods
    call_read_mods(
  File "/opt/bonito/lib/python3.9/site-packages/remora/inference.py", line 84, in call_read_mods
    nn_out, labels, pos = read.run_model(model)
  File "/opt/bonito/lib/python3.9/site-packages/remora/data_chunks.py", line 515, in run_model
    model.forward(
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
RuntimeError: shape '[2, 0, 1]' is invalid for input of size 1474560
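
As the UserWarning above suggests, the nvFuser fallback path can be disabled so that the underlying codegen failure is raised directly; this is only the debugging step named in the message, not a fix:

```sh
# From the UserWarning above: disable the nvFuser fallback path so the
# original codegen error surfaces instead of the fallback warning.
export PYTORCH_NVFUSER_DISABLE=fallback
# then rerun the same bonito basecaller command
```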

Omitting the --modified-device parameter does work, but basecalling is then very slow (~40 reads/s).

Is there anything I am missing in order to move the modified base prediction from the CPU to the GPU?

Thank you in advance.
