BRITS imputation test fails on cuda device mismatch #10

MaciejSkrabski · 2022-08-02T12:36:48Z

Hi,
the imputation tests fail when I run them at commit 6dcc894 on the dev branch.

Environment: py3.9_cuda11.3_cudnn8.2.0_0

$ python -m pytest tests/test_imputation.py

./tests/test_imputation.py::TestBRITS::test_parameters Failed with Error: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
  File ".../unittest/case.py", line 59, in testPartExecutor
    yield
  File ".../unittest/case.py", line 588, in run
    self._callSetUp()
  File ".../unittest/case.py", line 547, in _callSetUp
    self.setUp()
  File ".../PyPOTS/pypots/tests/test_imputation.py", line 98, in setUp
    self.brits.fit(self.train_X, self.val_X)
  File "/PyPOTS/pypots/imputation/brits.py", line 504, in fit
    self._train_model(training_loader, val_loader, val_X_intact, val_X_indicating_mask)
  File "/PyPOTS/pypots/imputation/base.py", line 154, in _train_model
    if np.equal(self.best_loss, float("inf")):
  File ".../lib/python3.9/site-packages/torch/_tensor.py", line 732, in __array__
    return self.numpy()
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
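
The traceback shows `np.equal(self.best_loss, float("inf"))` receiving a tensor: NumPy calls `Tensor.__array__`, which calls `self.numpy()`, and that raises for a tensor on `cuda:0`. One way to avoid the implicit conversion is to turn the loss into a Python float before comparing. The sketch below is not the PyPOTS code and not necessarily what the eventual fix does; `is_unset` is a hypothetical helper name, and it assumes `best_loss` is either a Python float or a 0-dimensional tensor.

```python
import math

import torch


def is_unset(best_loss):
    """Return True if best_loss still holds its initial value of +inf.

    Hypothetical helper: accepts a Python float or a 0-d tensor on any device.
    """
    if isinstance(best_loss, torch.Tensor):
        # .item() copies the scalar to host memory, so no NumPy conversion
        # of a CUDA tensor is ever attempted.
        best_loss = best_loss.item()
    return math.isinf(best_loss) and best_loss > 0


# Falls back to CPU so the sketch also runs on machines without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
initial = float("inf")
loss = torch.tensor(0.25, device=device)
```

With this shape of check, `is_unset(initial)` is true before training starts and `is_unset(loss)` is false once a finite loss tensor has been stored, regardless of which device the tensor lives on.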


MaciejSkrabski · 2022-08-02T12:58:54Z

Similar issue with GRUD:

ERROR: test_classify (tests.test_classification.TestGRUD)
----------------------------------------------------------------------
Traceback (most recent call last):
  File ".../PyPOTS/pypots/tests/test_classification.py", line 64, in setUp
    self.grud.fit(self.train_X, self.train_y, self.val_X, self.val_y)
  File ".../PyPOTS/pypots/classification/grud.py", line 151, in fit
    training_set = DatasetForGRUD(train_X, train_y)
  File ".../PyPOTS/pypots/data/dataset_for_grud.py", line 35, in __init__
    self.X_filledLOCF = self.locf.locf_torch(X)
  File ".../PyPOTS/pypots/imputation/locf.py", line 89, in locf_torch
    idx = torch.where(~mask, torch.arange(n_features, device=mask.device), 0)
RuntimeError: Expected condition, x and y to be on the same device, but condition is on cuda:0 and x and y are on cpu and cpu respectively
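
The error says the `condition` passed to `torch.where` is on `cuda:0` while `x` and `y` are on the CPU. A minimal sketch of the usual remedy is to create every operand, including the fill value `0`, on the same device as the mask. `locf_last_axis` is a hypothetical, simplified stand-in for `locf_torch`; it assumes time runs along the last axis and that missing values are encoded as NaN.

```python
import torch


def locf_last_axis(X: torch.Tensor) -> torch.Tensor:
    """Forward-fill NaNs along the last axis (last observation carried forward).

    Hypothetical sketch: leading NaNs have no earlier observation and stay NaN.
    """
    mask = torch.isnan(X)  # True where a value is missing
    n = X.shape[-1]
    # Build both branches of torch.where on the mask's device and with one dtype.
    positions = torch.arange(n, device=mask.device)
    zero = torch.zeros((), dtype=positions.dtype, device=mask.device)
    idx = torch.where(~mask, positions, zero)
    # Running maximum gives, for each step, the index of the latest observed step.
    idx = torch.cummax(idx, dim=-1).values
    return torch.gather(X, -1, idx)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
X = torch.tensor([[1.0, float("nan"), float("nan"), 4.0, float("nan")]], device=device)
filled = locf_last_axis(X)
```

Deriving the device from the input tensor rather than hard-coding it keeps the function usable on both CPU and GPU without extra arguments.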

MaciejSkrabski · 2022-08-02T13:37:44Z

Similar issue with CRLI:

./tests/test_clustering.py::TestCRLI::test_parameters Failed with Error: Training got interrupted. Model was not get trained. Please try fit() again.
  File ".../PyPOTS/pypots/clustering/crli.py", line 350, in _train_model
    results = self.model.forward(inputs, training_object='discriminator')
  File ".../PyPOTS/pypots/clustering/crli.py", line 237, in forward
    inputs = self.cluster(inputs, training_object)
  File ".../PyPOTS/pypots/clustering/crli.py", line 215, in cluster
    imputation, imputed_X, generator_fb_hidden_states = self.generator(inputs)
  File ".../python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File ".../PyPOTS/pypots/clustering/crli.py", line 102, in forward
    f_outputs, f_final_hidden_state = self.f_rnn(inputs)
  File ".../python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File ".../PyPOTS/pypots/clustering/crli.py", line 78, in forward
    estimation = self.output_layer(hidden_state)
  File ".../python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File ".../python3.9/site-packages/torch/nn/modules/linear.py", line 103, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat1 in method wrapper_addmm)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File ".../python3.9/unittest/case.py", line 59, in testPartExecutor
    yield
  File ".../python3.9/unittest/case.py", line 588, in run
    self._callSetUp()
  File ".../python3.9/unittest/case.py", line 547, in _callSetUp
    self.setUp()
  File ".../PyPOTS/pypots/tests/test_clustering.py", line 25, in setUp
    self.crli.fit(self.train_X)
  File ".../PyPOTS/pypots/clustering/crli.py", line 298, in fit
    self._train_model(training_loader)
  File ".../PyPOTS/pypots/clustering/crli.py", line 383, in _train_model
    raise RuntimeError('Training got interrupted. Model was not get trained. Please try fit() again.')
RuntimeError: Training got interrupted. Model was not get trained. Please try fit() again.
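
The underlying error here is the first `RuntimeError`: `F.linear` received an input and a weight that live on different devices. The traceback does not say which side was left on the CPU, so the following is only a generic sketch of keeping a module and its inputs together, plus a small debugging aid. `devices_in_use` is a hypothetical helper, and the `nn.Linear` layer merely stands in for the real model.

```python
import torch
from torch import nn


def devices_in_use(module: nn.Module, *tensors: torch.Tensor) -> set:
    """Collect every device used by a module's parameters, its buffers, and the given tensors."""
    devices = {p.device for p in module.parameters()}
    devices |= {b.device for b in module.buffers()}
    devices |= {t.device for t in tensors}
    return devices


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# nn.Module.to moves registered parameters and buffers in place and returns the module.
model = nn.Linear(4, 2).to(device)
# Tensor.to returns a new tensor, so the result has to be reassigned.
batch = torch.randn(8, 4).to(device)

found = devices_in_use(model, batch)
out = model(batch)
```

If `found` ever contains more than one device before a forward pass, something was not moved. One possible cause, which may or may not apply to CRLI, is a submodule kept in a plain Python list instead of `nn.ModuleList`, because such submodules are not registered and `.to(device)` does not reach them.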

MaciejSkrabski · 2022-08-02T13:38:42Z

And with VaDER:

./tests/test_clustering.py::TestVaDER::test_parameters Failed with Error: Expected condition, x and y to be on the same device, but condition is on cuda:0 and x and y are on cpu and cpu respectively
  File ".../python3.9/unittest/case.py", line 59, in testPartExecutor
    yield
  File ".../python3.9/unittest/case.py", line 588, in run
    self._callSetUp()
  File ".../python3.9/unittest/case.py", line 547, in _callSetUp
    self.setUp()
  File ".../PyPOTS/pypots/tests/test_clustering.py", line 56, in setUp
    self.vader.fit(self.train_X)
  File ".../PyPOTS/pypots/clustering/vader.py", line 323, in fit
    training_set = DatasetForGRUD(train_X)
  File ".../PyPOTS/pypots/data/dataset_for_grud.py", line 35, in __init__
    self.X_filledLOCF = self.locf.locf_torch(X)
  File ".../PyPOTS/pypots/imputation/locf.py", line 89, in locf_torch
    idx = torch.where(~mask, torch.arange(n_features), 0)
RuntimeError: Expected condition, x and y to be on the same device, but condition is on cuda:0 and x and y are on cpu and cpu respectively

WenjieDu · 2022-08-23T07:04:02Z

Solved by MaciejSkrabski in PR #11, which has been merged, so I am closing this issue.

MaciejSkrabski mentioned this issue Aug 2, 2022

fix: brits imputation test device mismatch #11

Merged

WenjieDu closed this as completed Aug 23, 2022
