Train the DL-based reconstruction with UFLoss #6

Aristot1e · 2021-11-04T08:24:54Z

Traceback (most recent call last):
File "../train_ufloss.py", line 803, in <module>
main(args)
File "../train_ufloss.py", line 562, in main
model_re.load_state_dict(
File "/home/img/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1223, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Model:
size mismatch for memory_bank: copying a param with shape torch.Size([256, 1]) from checkpoint, the shape in current model is torch.Size([256, 2457600]).

This is the error I get when running launch_training_MoDL_traditional_UFLoss_256_demo.sh. The shape of memory_bank in the checkpoint does not match the shape in the current model. Why is that? I can't resolve it.
The other problem is in train_ufloss.py, at lines 193-194:
if args.loss_normalized == False:
output = output * std + mean
target = target * std + mean
Neither std nor mean is defined. What should I do?

KeWang0622 · 2021-11-04T23:04:48Z

Please set args.loss_normalized = True and try again. I will fix this issue. Thanks!

Aristot1e · 2021-11-05T01:32:15Z

> Please set args.loss_normalized = True and try again, and I will solve this issue, Thanks!

Namespace(accelerations=[10, 15], batch_size=1, checkpoint=None, circular_pad=True, data_parallel=False, data_path='/home/img/Desktop/lff/Dataset/pre-processed/multicoil', device='cuda', device_num='0', drop_prob=0.0, efficient_ufloss=False, exp_dir='/home/img/Desktop/lff/Dataset/summary/train-3D_MELD_4steps_MoDLflag0_shared_CGsteps_6date_20210929_ufloss0_ufloss_weight_10_dimension_256_debug', fix_step_size=True, ge_mask=None, kernel_size=3, ### loss_normalized='True', loss_type=2, loss_uflossdir='/data/train_ufloss/train_UFLoss_feature_256_features_date_202104283_temperature_1_lr1e-5/checkpoints/ckpt200.pth', lr=0.0002, lr_gamma=0.5, lr_step_size=20, meld_cp=False, meld_flag=False, modl_flag=True, modl_lamda=0.05, num_cg_steps=6, num_emaps=1, num_epochs=2000, num_features=256, num_grad_steps=4, num_resblocks=2, patch_size=64, report_interval=10, resume=False, sample_rate=1.0, seed=42, share_weights=True, slwin_init=True, ufloss3d=False, ufloss_weight=10.0, uflossfreq=8, weight_decay=0.0)
Using parameters:
Temperature: 1.0
2
Traceback (most recent call last):
File "../train_ufloss.py", line 803, in <module>
main(args)
File "../train_ufloss.py", line 562, in main
model_re.load_state_dict(
File "/home/img/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Model:
size mismatch for memory_bank: copying a param with shape torch.Size([256, 1]) from checkpoint, the shape in current model is torch.Size([256, 2457600]).

loss_normalized is now set to True, but that does not help.
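
A side note on the Namespace output above: it prints loss_normalized='True', a string rather than a bool. A check like `args.loss_normalized == False` is then false for both 'True' and 'False', so the un-normalization branch would never run. The sketch below shows one common way to parse such a flag into a real bool. The flag declaration and the `str2bool` helper are hypothetical; the actual argument definition in train_ufloss.py may differ.

```python
import argparse


def str2bool(value):
    """Convert common textual spellings of a boolean into a real bool."""
    if isinstance(value, bool):
        return value
    text = value.strip().lower()
    if text in {"true", "1", "yes", "y"}:
        return True
    if text in {"false", "0", "no", "n"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    parser = argparse.ArgumentParser()
    # Hypothetical declaration; the real script may define this flag differently.
    parser.add_argument(
        "--loss-normalized", dest="loss_normalized", type=str2bool, default=True
    )
    return parser


# A string never equals the bool False, whatever it spells.
as_string = "False"
print(as_string == False)  # False

# With the converter, the comparison behaves as intended.
args = build_parser().parse_args(["--loss-normalized", "False"])
print(args.loss_normalized == False)  # True
```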

KeWang0622 · 2021-11-05T01:39:01Z

I see. How did you train the UFLoss? The error is not about the normalization; it is about the checkpoint loading. How many patches did you use to train the UFLoss feature mapping network?
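
To see which entries disagree before calling load_state_dict, the shapes stored in the checkpoint can be compared with the shapes the model expects. The helper below is a minimal sketch on plain dictionaries of shapes; `find_shape_mismatches` is a hypothetical name, not a function from this repository. With PyTorch, such dictionaries could be built with something like `{k: tuple(v.shape) for k, v in state_dict.items()}`, assuming the checkpoint file holds a state dict directly or under a known key.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for entries whose shapes differ."""
    mismatches = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        # Only report names present in both; missing keys are a separate problem.
        if model_shape is not None and tuple(ckpt_shape) != tuple(model_shape):
            mismatches[name] = (tuple(ckpt_shape), tuple(model_shape))
    return mismatches


# Shapes copied from the error message in this thread.
checkpoint_shapes = {"memory_bank": (256, 1)}
model_shapes = {"memory_bank": (256, 2457600)}

for name, (ckpt, model) in find_shape_mismatches(checkpoint_shapes, model_shapes).items():
    print(f"{name}: checkpoint {ckpt} vs model {model}")
```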

Aristot1e · 2021-11-05T03:06:41Z

> I see, how did you train the UFLoss? the error is not about the normalization. It's about the checkpoint loading, how many patches did you use to train the UFLoss feature mapping network

I trained the UFLoss using launch_training_patch_learning.sh.
The total patch_extraction count should be 15568.

Aristot1e · 2021-11-11T02:00:40Z

> I see, how did you train the UFLoss? the error is not about the normalization. It's about the checkpoint loading, how many patches did you use to train the UFLoss feature mapping network

The total number of patches I used is 311360. The multicoil knee dataset I downloaded has 973 .h5 files. After data_preprocessing.py that becomes 15568, and after patch_extraction.py it becomes 311360. But the error says the current model has torch.Size([256, 2457600]). I don't know why it is so large.
Another question: when training the UFLoss, the loss is still high (11.3+) after 200 epochs. How can I reduce it?
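
The counts quoted above can be checked with a few lines of arithmetic. This is only a consistency check on the numbers reported in this thread; the variable names are illustrative and it does not explain where 2457600 comes from.

```python
num_files = 973           # .h5 files in the downloaded dataset
num_preprocessed = 15568  # count after data_preprocessing.py
num_patches = 311360      # count after patch_extraction.py
bank_columns = 2457600    # second dimension of memory_bank in the current model

print(num_preprocessed / num_files)    # 16.0 per file
print(num_patches / num_preprocessed)  # 20.0 patches per preprocessed item

# The model's memory bank is neither equal to nor a whole multiple of the patch count.
print(bank_columns == num_patches)      # False
print(bank_columns % num_patches == 0)  # False
```

Since the two sizes do not line up, the value the model uses for the memory-bank length is probably set somewhere other than the extracted patch count, which is worth checking in the training script.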

Aristot1e · 2021-11-14T14:36:11Z

I've run into a new problem. After the message "Successfully loaded UFLoss model (Traditional)", the following error appeared.

  Traceback (most recent call last):
    File "/home/img/anaconda3/lib/python3.8/site-packages/sigpy/linop.py", line 95, in apply
      output = self._apply(input)
    File "/home/img/anaconda3/lib/python3.8/site-packages/sigpy/linop.py", line 1330, in _apply
      return block.array_to_blocks(input, self.blk_shape,
    File "/home/img/anaconda3/lib/python3.8/site-packages/sigpy/block.py", line 103, in array_to_blocks
      raise ValueError('Only support ndim=1, 2, or 3, got {}'.format(ndim))
  ValueError: Only support ndim=1, 2, or 3, got 4
  
  The above exception was the direct cause of the following exception:
  
  Traceback (most recent call last):
    File "/home/img/anaconda3/lib/python3.8/site-packages/sigpy/linop.py", line 95, in apply
      output = self._apply(input)
    File "/home/img/anaconda3/lib/python3.8/site-packages/sigpy/linop.py", line 362, in _apply
      output = linop(output)
    File "/home/img/anaconda3/lib/python3.8/site-packages/sigpy/linop.py", line 122, in __call__
      return self.__mul__(input)
    File "/home/img/anaconda3/lib/python3.8/site-packages/sigpy/linop.py", line 131, in __mul__
      return self.apply(input)
    File "/home/img/anaconda3/lib/python3.8/site-packages/sigpy/linop.py", line 98, in apply
      raise RuntimeError('Exceptions from {}.'.format(self)) from e
  RuntimeError: Exceptions from <[1, 1, 73, 40, 1, 2, 60, 60]x[1, 2, 640, 372]> ArrayToBlocks Linop>.
  
  The above exception was the direct cause of the following exception:
  
  Traceback (most recent call last):
    File "../train_ufloss.py", line 785, in <module>
      main(args)
    File "../train_ufloss.py", line 568, in main
      train_loss, train_l2, train_ufloss, train_time = train_epoch(args, epoch, model, train_loader, optimizer, writer, model_ufloss)
    File "../train_ufloss.py", line 273, in train_epoch
      ) = compute_metrics(args, model, data, model_ufloss)
    File "../train_ufloss.py", line 223, in compute_metrics
      output_patch = Fa2b(output_roll)
    File "/home/img/anaconda3/lib/python3.8/site-packages/sigpy/pytorch.py", line 118, in forward
      return to_pytorch(linop(from_pytorch(
    File "/home/img/anaconda3/lib/python3.8/site-packages/sigpy/linop.py", line 122, in __call__
      return self.__mul__(input)
    File "/home/img/anaconda3/lib/python3.8/site-packages/sigpy/linop.py", line 131, in __mul__
      return self.apply(input)
    File "/home/img/anaconda3/lib/python3.8/site-packages/sigpy/linop.py", line 98, in apply
      raise RuntimeError('Exceptions from {}.'.format(self)) from e
  RuntimeError: Exceptions from <[2920, 2, 60, 60]x[1, 2, 640, 372]> Reshape * ArrayToBlocks Linop>.

The error comes from compute_metrics in train_ufloss.py, lines 204 to 228. I don't understand this code. Could you help explain it? Thank you very much.

    arraytoblock = sp.linop.ArrayToBlocks(
        ishape=list(
            (
                output_roll.shape[0],
                2,
                output_roll.shape[2],
                output_roll.shape[3],
            )
        ),
        blk_shape=list((output_roll.shape[0], 2, 60, 60)),
        blk_strides=list((1, 1, n_featuresq, n_featuresq)),
    )

    reshape = sp.linop.Reshape(
        ishape=arraytoblock.oshape,
        oshape=(arraytoblock.oshape[2] * arraytoblock.oshape[3], 2, 60, 60),
    )

    Fa2b = sp.to_pytorch_function(reshape * arraytoblock).apply
    output_patch = Fa2b(output_roll)
    target_patch = Fa2b(target_roll)

    output_features = model_ufloss(output_patch)
    target_features = model_ufloss(target_patch)
    ufloss = nn.MSELoss()(output_features[0], target_features[0])
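
The shapes in the traceback show what this code is meant to do: cut a [1, 2, 640, 372] image into overlapping 60x60 patches and stack them as a batch for the UFLoss network. The sketch below reproduces the block counts with plain arithmetic. The stride of 8 is an assumption for n_featuresq, chosen because it reproduces the 73 x 40 grid in the error message (it also matches uflossfreq=8 in the Namespace output).

```python
def num_blocks(length, block, stride):
    """Number of blocks of size `block` that fit along `length` with step `stride`."""
    return (length - block + stride) // stride


height, width = 640, 372  # spatial size from the traceback
block = 60                # patch size used in blk_shape
stride = 8                # assumed value of n_featuresq

rows = num_blocks(height, block, stride)
cols = num_blocks(width, block, stride)

print(rows, cols)   # 73 40
print(rows * cols)  # 2920 patches, each of shape (2, 60, 60)
```

So the Reshape step turns the [1, 1, 73, 40, 1, 2, 60, 60] block array into [2920, 2, 60, 60], one row per patch.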

Aristot1e · 2021-11-15T09:04:17Z

    File "/home/img/anaconda3/lib/python3.8/site-packages/sigpy/linop.py", line 1330, in _apply
      return block.array_to_blocks(input, self.blk_shape,
    File "/home/img/anaconda3/lib/python3.8/site-packages/sigpy/block.py", line 103, in array_to_blocks
      raise ValueError('Only support ndim=1, 2, or 3, got {}'.format(ndim))
  ValueError: Only support ndim=1, 2, or 3, got 4

In sigpy.block.array_to_blocks, the number of block dimensions must be at most 3. From the source code: (blk_shape (tuple): block shape of length ndim, with ndim={1, 2, 3}.) The blk_shape you pass has 4 dimensions, which leads to this problem. Which dimension should be dropped, or is there another fix? I have tried my best to resolve it, but without success. Could you give me some advice?
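
One way to sidestep the ndim limit is to extract the patches over the two spatial axes only and keep batch and channel as ordinary leading axes. The sketch below does this with NumPy's sliding_window_view (NumPy 1.20 or newer). It is not the repository's method and not sigpy's implementation; `extract_patches` is a hypothetical helper that only mirrors the output layout (num_patches, channels, 60, 60), and the stride of 8 is again an assumption for n_featuresq. A small input is used to keep the example light.

```python
import numpy as np


def extract_patches(x, patch=60, stride=8):
    """Cut a (batch, channels, H, W) array into (num_patches, channels, patch, patch)."""
    # All windows over the two spatial axes:
    # shape (B, C, H - patch + 1, W - patch + 1, patch, patch)
    windows = np.lib.stride_tricks.sliding_window_view(x, (patch, patch), axis=(2, 3))
    # Keep every `stride`-th window along each spatial axis.
    windows = windows[:, :, ::stride, ::stride]
    b, c, n_h, n_w, _, _ = windows.shape
    # Move channels next to the patch axes, then flatten (batch, rows, cols).
    return windows.transpose(0, 2, 3, 1, 4, 5).reshape(b * n_h * n_w, c, patch, patch)


x = np.arange(1 * 2 * 76 * 68, dtype=np.float32).reshape(1, 2, 76, 68)
patches = extract_patches(x)
print(patches.shape)  # (6, 2, 60, 60)
```

For training, the patches must stay inside the autograd graph, so a tensor-based equivalent would be needed; torch.Tensor.unfold provides a similar windowing operation on tensors.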

KeWang0622 · 2021-11-15T22:19:22Z

Hi, I believe it is a sigpy version mismatch!
Maybe we can schedule a quick chat to address these issues? I will update the repo accordingly.
I apologize for the bugs. The code is at an early development stage, and thanks for your feedback.
What would be the best way to contact you?
Ke

Aristot1e · 2021-11-16T12:53:04Z

Thank you for replying. We can get in touch by email or GitHub. My email is [email protected]. You can email me anytime.
