mednextv1_train_DDP can not work #24

lianjiejie · 2024-06-28T03:22:21Z

Hi, I am trying to use mednextv1_train_DDP with the following command:
mednextv1_train_DDP 3d_fullres nnUNetTrainerV2_MedNeXt_M_kernel3 500 1 -p nnUNetPlansv2.1_trgSp_1x1x1
but it does not work. It does not seem to pick up the nnUNetTrainerV2_DDP class for this command.

###############################################
I am running the following nnUNet: 3d_fullres
My trainer class is: <class 'nnunet_mednext.training.network_training.MedNeXt.nnUNetTrainerV2_MedNeXt.nnUNetTrainerV2_MedNeXt_M_kernel3'>
For that I will be using the following configuration:
num_classes: 48
modalities: {0: 'CT'}
use_mask_for_norm OrderedDict([(0, False)])
keep_only_largest_region None
min_region_size_per_class None
min_size_per_class None
normalization_schemes OrderedDict([(0, 'CT')])
stages...

stage: 0
{'batch_size': 2, 'num_pool_per_axis': [4, 5, 5], 'patch_size': [128, 128, 128], 'median_patient_size_in_voxels': array([196, 204, 204]), 'current_spacing': array([1.66107814, 1.66107814, 1.66107814]), 'original_spacing': array([1., 1., 1.]), 'do_dummy_2D_data_aug': False, 'pool_op_kernel_sizes': [[2, 2, 2], [2, 2, 2], [2, 2, 2], [2, 2, 2], [1, 2, 2]], 'conv_kernel_sizes': [[3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3]]}

stage: 1
{'batch_size': 2, 'num_pool_per_axis': [4, 5, 5], 'patch_size': [128, 128, 128], 'median_patient_size_in_voxels': array([325, 339, 339]), 'current_spacing': array([1., 1., 1.]), 'original_spacing': array([1., 1., 1.]), 'do_dummy_2D_data_aug': False, 'pool_op_kernel_sizes': [[2, 2, 2], [2, 2, 2], [2, 2, 2], [2, 2, 2], [1, 2, 2]], 'conv_kernel_sizes': [[3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3]]}

I am using stage 1 from these plans
I am using batch dice + CE loss

I am using data from this folder: /data/lianzejie/mednext_data/preprocessed/Task500_bone_seg/nnUNetData_plans_v2.1_trgSp_1x1x1
###############################################
Traceback (most recent call last):
File "/data/lianzejie/envs/mednext/bin/mednextv1_train_DDP", line 33, in <module>
sys.exit(load_entry_point('mednextv1', 'console_scripts', 'mednextv1_train_DDP')())
File "/home/lianzejie/mednext/nnunet_mednext/run/run_training_DDP.py", line 149, in main
trainer = trainer_class(plans_file, fold, local_rank=args.local_rank, output_folder=output_folder_name,
File "/home/lianzejie/mednext/nnunet_mednext/training/network_training/MedNeXt/nnUNetTrainerV2_MedNeXt.py", line 25, in __init__
super().__init__(*args, **kwargs)
TypeError: __init__() got an unexpected keyword argument 'local_rank'
(mednext) lianzejie@aa-SYS-4029GP-TRT:~$
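
The traceback suggests that the selected trainer forwards all of its arguments to a parent `__init__` that does not accept `local_rank`, while the DDP entry point always passes it. Below is a minimal sketch of that failure mode in isolation. The class names `BaseTrainer`, `DDPTrainer`, and `MedNeXtLikeTrainer`, and the file name `plans.pkl`, are hypothetical stand-ins and not the actual nnUNet or MedNeXt classes; the real class hierarchy is assumed, not confirmed.

```python
class BaseTrainer:
    """Hypothetical stand-in for a trainer base class without DDP support."""

    def __init__(self, plans_file, fold, output_folder=None):
        self.plans_file = plans_file
        self.fold = fold
        self.output_folder = output_folder


class DDPTrainer(BaseTrainer):
    """Hypothetical stand-in for a DDP-aware base class that accepts local_rank."""

    def __init__(self, plans_file, fold, local_rank, output_folder=None):
        super().__init__(plans_file, fold, output_folder=output_folder)
        self.local_rank = local_rank


class MedNeXtLikeTrainer(BaseTrainer):
    """Mirrors the pattern in the traceback: forwards everything to the parent."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)


def build(trainer_class):
    # Same call shape as the line shown from run_training_DDP.py:
    # local_rank is always passed as a keyword argument.
    return trainer_class("plans.pkl", 1, local_rank=0, output_folder="out")


try:
    build(MedNeXtLikeTrainer)
    error_message = None
except TypeError as exc:
    # The parent __init__ has no local_rank parameter, so the call fails.
    error_message = str(exc)

# A class whose __init__ accepts local_rank is constructed without error.
ddp_trainer = build(DDPTrainer)
print(error_message)
```

Under this assumption, the DDP script would only work with a trainer class whose `__init__` chain accepts `local_rank`.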

saikat-roy · 2024-07-09T10:50:25Z

I'm sorry, but I have not used nnUNet v1's DDP in my work and unfortunately can't support issues with it. However, I recommend that you adapt the architecture for nnUNetv2 (currently the main version of nnUNet) and use its DDP, which is much easier to use.

saikat-roy closed this as completed Jul 9, 2024
