mednextv1_train_DDP can not work #24

Closed
lianjiejie opened this issue Jun 28, 2024 · 1 comment

Comments

@lianjiejie

Hi, I am trying to use mednextv1_train_DDP with the following command:
mednextv1_train_DDP 3d_fullres nnUNetTrainerV2_MedNeXt_M_kernel3 500 1 -p nnUNetPlansv2.1_trgSp_1x1x1
but it does not work. The command does not correctly resolve to the nnUNetTrainerV2_DDP class.

###############################################
I am running the following nnUNet: 3d_fullres
My trainer class is: <class 'nnunet_mednext.training.network_training.MedNeXt.nnUNetTrainerV2_MedNeXt.nnUNetTrainerV2_MedNeXt_M_kernel3'>
For that I will be using the following configuration:
num_classes: 48
modalities: {0: 'CT'}
use_mask_for_norm OrderedDict([(0, False)])
keep_only_largest_region None
min_region_size_per_class None
min_size_per_class None
normalization_schemes OrderedDict([(0, 'CT')])
stages...

stage: 0
{'batch_size': 2, 'num_pool_per_axis': [4, 5, 5], 'patch_size': [128, 128, 128], 'median_patient_size_in_voxels': array([196, 204, 204]), 'current_spacing': array([1.66107814, 1.66107814, 1.66107814]), 'original_spacing': array([1., 1., 1.]), 'do_dummy_2D_data_aug': False, 'pool_op_kernel_sizes': [[2, 2, 2], [2, 2, 2], [2, 2, 2], [2, 2, 2], [1, 2, 2]], 'conv_kernel_sizes': [[3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3]]}

stage: 1
{'batch_size': 2, 'num_pool_per_axis': [4, 5, 5], 'patch_size': [128, 128, 128], 'median_patient_size_in_voxels': array([325, 339, 339]), 'current_spacing': array([1., 1., 1.]), 'original_spacing': array([1., 1., 1.]), 'do_dummy_2D_data_aug': False, 'pool_op_kernel_sizes': [[2, 2, 2], [2, 2, 2], [2, 2, 2], [2, 2, 2], [1, 2, 2]], 'conv_kernel_sizes': [[3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3]]}

I am using stage 1 from these plans
I am using batch dice + CE loss

I am using data from this folder: /data/lianzejie/mednext_data/preprocessed/Task500_bone_seg/nnUNetData_plans_v2.1_trgSp_1x1x1
###############################################
Traceback (most recent call last):
  File "/data/lianzejie/envs/mednext/bin/mednextv1_train_DDP", line 33, in <module>
    sys.exit(load_entry_point('mednextv1', 'console_scripts', 'mednextv1_train_DDP')())
  File "/home/lianzejie/mednext/nnunet_mednext/run/run_training_DDP.py", line 149, in main
    trainer = trainer_class(plans_file, fold, local_rank=args.local_rank, output_folder=output_folder_name,
  File "/home/lianzejie/mednext/nnunet_mednext/training/network_training/MedNeXt/nnUNetTrainerV2_MedNeXt.py", line 25, in __init__
    super().__init__(*args, **kwargs)
TypeError: __init__() got an unexpected keyword argument 'local_rank'
(mednext) lianzejie@aa-SYS-4029GP-TRT:~$
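
For readers hitting the same traceback, the mechanism can be reproduced in isolation. The sketch below is only an illustration under assumptions: BaseTrainer, BaseTrainerDDP, and MedNeXtTrainer are hypothetical stand-ins, not the real nnunet_mednext classes, and their signatures are guessed from the traceback. It shows that a trainer forwarding **kwargs to a non-DDP parent fails as soon as the DDP run script passes local_rank, while a parent that declares local_rank accepts the same call.

```python
# Hypothetical stand-ins for illustration only; not the real nnunet_mednext classes.
class BaseTrainer:
    """Plays the role of a non-DDP base trainer (no local_rank parameter)."""

    def __init__(self, plans_file, fold, output_folder=None):
        self.plans_file = plans_file
        self.fold = fold
        self.output_folder = output_folder


class BaseTrainerDDP(BaseTrainer):
    """Plays the role of a DDP base trainer that declares local_rank."""

    def __init__(self, plans_file, fold, local_rank, output_folder=None):
        super().__init__(plans_file, fold, output_folder=output_folder)
        self.local_rank = local_rank


class MedNeXtTrainer(BaseTrainer):
    """Forwards every argument to its parent, as in the traceback above."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)


def try_build(trainer_class):
    """Call the class the way a DDP run script would; return 'ok' or the error text."""
    try:
        trainer_class("plans.pkl", 1, local_rank=0, output_folder="out")
        return "ok"
    except TypeError as exc:
        return str(exc)


result_plain = try_build(MedNeXtTrainer)  # parent rejects local_rank
result_ddp = try_build(BaseTrainerDDP)    # parent accepts local_rank
print(result_plain)
print(result_ddp)
```

If the real trainer hierarchy matches this picture, the DDP entry point would need a MedNeXt trainer whose parent accepts local_rank. Whether such a class exists in this repository is not confirmed here.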

@saikat-roy
Member

I'm sorry, but I have not used nnUNet v1's DDP in my work, so unfortunately I can't support issues with it. However, I recommend that you adapt the architecture for nnUNetv2 (currently the main version of nnUNet) and use its DDP, which is much easier to use.
