HelloWord Test error #257

sahand68 opened this issue Sep 29, 2020


After activating the InnerEye environment, I tried to run python InnerEye/ML/ --model=HelloWorld and got this error.
P.S. I am following the documentation step by step.

(InnerEye) sahand@sahand-System-Product-Name:~/InnerEye-DeepLearning$ export PYTHONPATH=`pwd`
(InnerEye) sahand@sahand-System-Product-Name:~/InnerEye-DeepLearning$ python InnerEye/ML/ --model=HelloWorld
Setting up logging to stdout.
Setting logging level to 20
2020-09-29T16:33:30Z INFO     rpdb is handling traps. To debug: identify the main process, then as root: kill -TRAP <process_id>; nc 4444
2020-09-29T16:33:30Z INFO     Found class HelloWorld in file /home/sahand/InnerEye-DeepLearning/InnerEye/ML/configs/segmentation/
2020-09-29T16:33:30Z INFO     Creating the default output folder structure.
2020-09-29T16:33:30Z INFO     Running outside of AzureML.
2020-09-29T16:33:30Z INFO     All results will be written to a subfolder of the project root folder.
2020-09-29T16:33:30Z INFO     Run outputs folder: /home/sahand/InnerEye-DeepLearning/outputs/2020-09-29T163330Z_HelloWorld
2020-09-29T16:33:30Z INFO     Logs folder: /home/sahand/InnerEye-DeepLearning/outputs/2020-09-29T163330Z_HelloWorld/logs
2020-09-29T16:33:30Z INFO     Creating the adjusted output folder structure.
2020-09-29T16:33:30Z INFO     Running outside of AzureML.
2020-09-29T16:33:30Z INFO     All results will be written to a subfolder of the project root folder.
2020-09-29T16:33:30Z INFO     Run outputs folder: /home/sahand/InnerEye-DeepLearning/outputs/2020-09-29T163330Z_HelloWorld
2020-09-29T16:33:30Z INFO     Logs folder: /home/sahand/InnerEye-DeepLearning/outputs/2020-09-29T163330Z_HelloWorld/logs
2020-09-29T16:33:30Z INFO     extra_code_directory is unset
Setting logging level to 20
Setting up logging with level 20 to file /home/sahand/InnerEye-DeepLearning/outputs/2020-09-29T163330Z_HelloWorld/logs/stdout.txt
2020-09-29T16:33:33Z INFO     Setting multiprocessing start method to 'forkserver'
2020-09-29T16:33:33Z INFO     Model training will use the local dataset provided in /home/sahand/InnerEye-DeepLearning/Tests/ML/test_data
2020-09-29T16:33:33Z INFO     
	__center_size_param_value: None
	__dataset_data_frame_param_value: None
	__inference_stride_size_param_value: None
	__largest_connected_component_foreground_classes_param_value: None
	__min_l_rate_param_value: 0
	__model_category_param_value: ModelCategory.Segmentation
	__model_name_param_value: HelloWorld
	__overrides_param_value: None
	__use_gpu_param_value: False
	_architecture_param_value: UNet3D
	_class_weights_param_value: [0.02, 0.49, 0.49]
	_colours_param_value: [(130, 183, 14), (238, 127, 26)]
	_comparison_blob_storage_paths_param_value: None
	_crop_size_param_value: (64, 64, 64)
	_datasets_for_inference: None
	_datasets_for_training: None
	_feature_channels_param_value: [4]
	_file_system_config_param_value: <DeepLearningFileSystemConfig DeepLearningFileSystemConfig00010>
	_fill_holes_param_value: [True, True]
	_ground_truth_ids_display_names_param_value: ['region', 'region_1']
	_ground_truth_ids_param_value: ['region', 'region_1']
	_image_channels_param_value: ['channel1', 'channel2']
	_instance__params : {}
	_l_rate_multi_step_milestones_param_value: None
	_level_param_value: 50
	_local_dataset_param_value: /home/sahand/InnerEye-DeepLearning/Tests/ML/test_data
	_mask_id_param_value: mask
	_multiprocessing_start_method_param_value: MultiprocessingStartMethod.forkserver
	_name_param_value : HelloWorld00008
	_norm_method_param_value: PhotometricNormalizationMethod.CtWindow
	_num_dataload_workers_param_value: 0
	_num_epochs_param_value: 2
	_param_watchers   : {}
	_save_start_epoch_param_value: 1
	_save_step_epochs_param_value: 1
	_slice_exclusion_rules_param_value: []
	_start_epoch_param_value: 0
	_summed_probability_rules_param_value: []
	_tail_param_value : None
	_test_crop_size_param_value: (64, 64, 64)
	_test_diff_epochs_param_value: 1
	_test_start_epoch_param_value: 2
	_test_step_epochs_param_value: 1
	_train_batch_size_param_value: 2
	_use_mixed_precision_param_value: True
	_window_param_value: 200
	initialized       : True
	param             : <param.parameterized.Parameters object at 0x7f3a5dc57b00>

2020-09-29T16:33:33Z INFO     
2020-09-29T16:33:33Z INFO     **** STARTING: Model training **********************************************************************
2020-09-29T16:33:33Z INFO     
2020-09-29T16:33:33Z INFO     Train: 3, Test: 1, and Val: 2. Total subjects: 6
2020-09-29T16:33:33Z INFO     Model Training: Random seed set to: 42
2020-09-29T16:33:33Z INFO     Starting to read and parse the datasets.
2020-09-29T16:33:33Z INFO     Processing dataset (name=None)
2020-09-29T16:33:33Z INFO     Processing dataset (name=None)
2020-09-29T16:33:33Z INFO     Creating the data loader for the training set.
2020-09-29T16:33:33Z INFO     Creating the data loader for the validation set.
2020-09-29T16:33:33Z INFO     Finished creating the data loaders.
2020-09-29T16:33:33Z INFO     Models are saved at /home/sahand/InnerEye-DeepLearning/outputs/2020-09-29T163330Z_HelloWorld/checkpoints
2020-09-29T16:33:33Z INFO     Writing model summary to: logs/model_summaries/model_log001.txt
Attempted to log scalar metric LoggingColumns.NumTrainableParameters:
2020-09-29T16:33:33Z INFO     Making no adjustments to the model because no GPU was found.
2020-09-29T16:33:33Z INFO     Starting training
2020-09-29T16:33:33Z INFO     Starting epoch 1
2020-09-29T16:33:33Z ERROR    Model training/testing failed. Exception: Exception thrown in SimpleITK ReadImage: /tmp/SimpleITK/Code/IO/src/sitkImageReaderBase.cxx:107:
sitk::ERROR: Unable to determine ImageIO reader for "/home/sahand/InnerEye-DeepLearning/Tests/ML/test_data/train_and_test_data/id1_channel1.nii.gz"
Traceback (most recent call last):
  File "InnerEye/ML/", line 364, in run_in_situ
  File "/home/sahand/InnerEye-DeepLearning/InnerEye/ML/", line 308, in run
    model_train(self.model_config, run_recovery)
  File "/home/sahand/InnerEye-DeepLearning/InnerEye/ML/", line 150, in model_train
    train_epoch_results = train_or_validate_epoch(training_steps)
  File "/home/sahand/InnerEye-DeepLearning/InnerEye/ML/", line 267, in train_or_validate_epoch
    for batch_index, sample in enumerate(train_val_params.data_loader):
  File "/home/sahand/anaconda3/envs/InnerEye/lib/python3.7/site-packages/torch/utils/data/", line 363, in __next__
    data = self._next_data()
  File "/home/sahand/anaconda3/envs/InnerEye/lib/python3.7/site-packages/torch/utils/data/", line 403, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/home/sahand/anaconda3/envs/InnerEye/lib/python3.7/site-packages/torch/utils/data/_utils/", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/sahand/anaconda3/envs/InnerEye/lib/python3.7/site-packages/torch/utils/data/_utils/", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/sahand/InnerEye-DeepLearning/InnerEye/ML/dataset/", line 38, in __getitem__
  File "/home/sahand/InnerEye-DeepLearning/InnerEye/ML/dataset/", line 289, in get_samples_at_index
    samples = [io_util.load_images_from_dataset_source(dataset_source=ds)]  # type: ignore
  File "/home/sahand/InnerEye-DeepLearning/InnerEye/ML/utils/", line 357, in load_images_from_dataset_source
    images = [load_nifti_image(channel, ImageDataType.IMAGE.value) for channel in dataset_source.image_channels]
  File "/home/sahand/InnerEye-DeepLearning/InnerEye/ML/utils/", line 357, in <listcomp>
    images = [load_nifti_image(channel, ImageDataType.IMAGE.value) for channel in dataset_source.image_channels]
  File "/home/sahand/InnerEye-DeepLearning/InnerEye/ML/utils/", line 182, in load_nifti_image
    img, header = read_image_as_array_with_header(path)
  File "/home/sahand/InnerEye-DeepLearning/InnerEye/ML/utils/", line 146, in read_image_as_array_with_header
    image: sitk.Image = sitk.ReadImage(str(file_path))
  File "/home/sahand/anaconda3/envs/InnerEye/lib/python3.7/site-packages/SimpleITK/", line 8876, in ReadImage
    return _SimpleITK.ReadImage(*args)
RuntimeError: Exception thrown in SimpleITK ReadImage: /tmp/SimpleITK/Code/IO/src/sitkImageReaderBase.cxx:107:
sitk::ERROR: Unable to determine ImageIO reader for "/home/sahand/InnerEye-DeepLearning/Tests/ML/test_data/train_and_test_data/id1_channel1.nii.gz"

Hi @sahand68 I am trying to reproduce your issue. Would you please describe your environment (OS version, Anaconda version, python version, etc)?

sahand68 commented Sep 29, 2020

@MaherJendoubi Thanks for the fast reply.
OS version: Ubuntu 20.04

@sahand68 Thanks for the detailed environment. I am still not able to reproduce it.
I suspect a file was corrupted during the git clone operation.
Please cd /InnerEye-DeepLearning/Tests/ML/test_data and run ls -altrh so we can at least check the file sizes.
Then paste the result here so I can compare it with mine.
After that, run conda deactivate, then run conda env remove --name InnerEye.
I suggest deleting the InnerEye-DeepLearning folder with the rm -r command and then retrying the steps.
If the problem persists, please share the detailed steps so we can investigate the issue.
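
Besides comparing sizes by eye, the test images can be checked programmatically. The sketch below is only an illustration and not part of InnerEye: it assumes that a valid .nii.gz file must begin with the two gzip magic bytes, and the helper names looks_like_gzip and report_suspect_files are hypothetical.

```python
from pathlib import Path

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def looks_like_gzip(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def report_suspect_files(folder, pattern="*.nii.gz"):
    """Return (path, size_in_bytes) pairs for matching files that are not gzip streams."""
    suspects = []
    for path in sorted(Path(folder).rglob(pattern)):
        if path.is_file() and not looks_like_gzip(path):
            suspects.append((path, path.stat().st_size))
    return suspects


if __name__ == "__main__":
    # Path taken from the log above, relative to the repository root.
    for path, size in report_suspect_files("Tests/ML/test_data"):
        print(f"{path}: {size} bytes, does not look like a gzip file")
```

Any file reported here could not be decompressed by an image reader, which would be consistent with the "Unable to determine ImageIO reader" error.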

sahand68 commented Sep 29, 2020

There you go:
(base) sahand@sahand-System-Product-Name:~/InnerEye-DeepLearning/Tests/ML/test_data$ ls -altrh
total 340K
-rw-rw-r-- 1 sahand sahand 20K Sep 28 20:11 042_slice_001.png
-rw-rw-r-- 1 sahand sahand 5.7K Sep 28 20:11 042_slice_001_contour.png
-rw-rw-r-- 1 sahand sahand 9.1K Sep 28 20:11 Val_outliers.txt
-rw-rw-r-- 1 sahand sahand 7.5K Sep 28 20:11 Val_agg_splits.csv
-rw-rw-r-- 1 sahand sahand 7.6K Sep 28 20:11 Test_agg_splits.csv
-rw-rw-r-- 1 sahand sahand 121 Sep 28 20:11 ResultsByMode.csv
-rw-rw-r-- 1 sahand sahand 794 Sep 28 20:11 ResultsByModeAndStructure.csv
-rw-rw-r-- 1 sahand sahand 11K Sep 28 20:11 prefix042_class1_slice_001.png
-rw-rw-r-- 1 sahand sahand 131 Sep 28 20:11 posterior_bladder.nii.gz
drwxrwxr-x 5 sahand sahand 4.0K Sep 28 20:11 plot_cross_validation
-rw-rw-r-- 1 sahand sahand 432 Sep 28 20:11 metrics_aggregates.csv
-rw-rw-r-- 1 sahand sahand 16K Sep 28 20:11 MetricsAcrossAllRuns.csv
-rw-rw-r-- 1 sahand sahand 5.6K Sep 28 20:11 image_scaled_and_contour.png
-rw-rw-r-- 1 sahand sahand 9.0K Sep 28 20:11 image_and_multiple_contours.png
-rw-rw-r-- 1 sahand sahand 5.6K Sep 28 20:11 image_and_contour.png
drwxrwxr-x 3 sahand sahand 4.0K Sep 28 20:11 hdf5_data
drwxrwxr-x 2 sahand sahand 4.0K Sep 28 20:11 full_header_csv
-rw-rw-r-- 1 sahand sahand 30K Sep 28 20:11 dice_per_epoch_3classes.png
-rw-rw-r-- 1 sahand sahand 92K Sep 28 20:11 dice_per_epoch_15classes.png
-rw-rw-r-- 1 sahand sahand 1.2K Sep 28 20:11 dataset_with_full_header.csv
-rw-rw-r-- 1 sahand sahand 1.6K Sep 28 20:11 dataset.csv
drwxrwxr-x 2 sahand sahand 4.0K Sep 28 20:11 classification_data_sub_fold_cv
drwxrwxr-x 3 sahand sahand 4.0K Sep 28 20:11 classification_data_generated_random
drwxrwxr-x 2 sahand sahand 4.0K Sep 28 20:11 classification_data_2d
drwxrwxr-x 2 sahand sahand 4.0K Sep 28 20:11 classification_data
drwxrwxr-x 2 sahand sahand 4.0K Sep 28 20:11 checkpoints
drwxrwxr-x 3 sahand sahand 4.0K Sep 28 20:11 train_and_test_data
-rw-rw-r-- 1 sahand sahand 127 Sep 28 20:11 test_img.nii.gz
-rw-rw-r-- 1 sahand sahand 129 Sep 28 20:11 test_good.nii.gz
-rw-rw-r-- 1 sahand sahand 94 Sep 28 20:11 test_dataset_parameters.csv
-rw-rw-r-- 1 sahand sahand 131 Sep 28 20:11 smoothed_posterior_bladder.nii.gz
drwxrwxr-x 2 sahand sahand 4.0K Sep 28 20:11 sequence_data_for_classification
-rw-rw-r-- 1 sahand sahand 128 Sep 28 20:11 scale_and_unscale_image.nii.gz
-rw-rw-r-- 1 sahand sahand 11K Sep 28 20:11 prefix042_class2_slice_002.png
drwxrwxr-x 11 sahand sahand 4.0K Sep 28 20:11 ..
drwxrwxr-x 12 sahand sahand 4.0K Sep 28 20:11 .

MaherJendoubi commented Sep 29, 2020

Here is mine :


I just removed the whole repo with rm -rf, and the environment as well.
I cloned the repo again and I still get the same error.

(InnerEye) sahand@sahand-System-Product-Name:~/InnerEye-DeepLearning/Tests/ML/test_data$ ls -altrh
total 340K
-rw-rw-r-- 1 sahand sahand 9.1K Sep 29 13:09 Val_outliers.txt
-rw-rw-r-- 1 sahand sahand 7.5K Sep 29 13:09 Val_agg_splits.csv
-rw-rw-r-- 1 sahand sahand 7.6K Sep 29 13:09 Test_agg_splits.csv
-rw-rw-r-- 1 sahand sahand 121 Sep 29 13:09 ResultsByMode.csv
-rw-rw-r-- 1 sahand sahand 794 Sep 29 13:09 ResultsByModeAndStructure.csv
-rw-rw-r-- 1 sahand sahand 432 Sep 29 13:09 metrics_aggregates.csv
-rw-rw-r-- 1 sahand sahand 16K Sep 29 13:09 MetricsAcrossAllRuns.csv
-rw-rw-r-- 1 sahand sahand 5.6K Sep 29 13:09 image_scaled_and_contour.png
-rw-rw-r-- 1 sahand sahand 9.0K Sep 29 13:09 image_and_multiple_contours.png
-rw-rw-r-- 1 sahand sahand 5.6K Sep 29 13:09 image_and_contour.png
drwxrwxr-x 3 sahand sahand 4.0K Sep 29 13:09 hdf5_data
drwxrwxr-x 2 sahand sahand 4.0K Sep 29 13:09 full_header_csv
-rw-rw-r-- 1 sahand sahand 30K Sep 29 13:09 dice_per_epoch_3classes.png
-rw-rw-r-- 1 sahand sahand 92K Sep 29 13:09 dice_per_epoch_15classes.png
-rw-rw-r-- 1 sahand sahand 1.2K Sep 29 13:09 dataset_with_full_header.csv
-rw-rw-r-- 1 sahand sahand 1.6K Sep 29 13:09 dataset.csv
drwxrwxr-x 2 sahand sahand 4.0K Sep 29 13:09 classification_data_sub_fold_cv
drwxrwxr-x 3 sahand sahand 4.0K Sep 29 13:09 classification_data_generated_random
drwxrwxr-x 2 sahand sahand 4.0K Sep 29 13:09 classification_data_2d
drwxrwxr-x 2 sahand sahand 4.0K Sep 29 13:09 classification_data
drwxrwxr-x 2 sahand sahand 4.0K Sep 29 13:09 checkpoints
-rw-rw-r-- 1 sahand sahand 20K Sep 29 13:09 042_slice_001.png
-rw-rw-r-- 1 sahand sahand 5.7K Sep 29 13:09 042_slice_001_contour.png
drwxrwxr-x 3 sahand sahand 4.0K Sep 29 13:09 train_and_test_data
-rw-rw-r-- 1 sahand sahand 127 Sep 29 13:09 test_img.nii.gz
-rw-rw-r-- 1 sahand sahand 129 Sep 29 13:09 test_good.nii.gz
-rw-rw-r-- 1 sahand sahand 94 Sep 29 13:09 test_dataset_parameters.csv
-rw-rw-r-- 1 sahand sahand 131 Sep 29 13:09 smoothed_posterior_bladder.nii.gz
drwxrwxr-x 2 sahand sahand 4.0K Sep 29 13:09 sequence_data_for_classification
-rw-rw-r-- 1 sahand sahand 128 Sep 29 13:09 scale_and_unscale_image.nii.gz
-rw-rw-r-- 1 sahand sahand 11K Sep 29 13:09 prefix042_class2_slice_002.png
-rw-rw-r-- 1 sahand sahand 11K Sep 29 13:09 prefix042_class1_slice_001.png
-rw-rw-r-- 1 sahand sahand 131 Sep 29 13:09 posterior_bladder.nii.gz
drwxrwxr-x 5 sahand sahand 4.0K Sep 29 13:09 plot_cross_validation
drwxrwxr-x 11 sahand sahand 4.0K Sep 29 13:09 ..

the error stack:
How about this command?


sahand68 commented Sep 29, 2020

(base) sahand@sahand-System-Product-Name:~/InnerEye-DeepLearning/Tests/ML/test_data/train_and_test_data$ ls -altrh
total 60K
-rw-rw-r-- 1 sahand sahand 1.2K Sep 29 13:09 scalar_prediction_target_metrics.csv
-rw-rw-r-- 1 sahand sahand 1006 Sep 29 13:09 scalar_epoch_metrics.csv
-rw-rw-r-- 1 sahand sahand 152 Sep 29 13:09 metrics.csv
-rw-rw-r-- 1 sahand sahand 405 Sep 29 13:09 metrics_aggregates.csv
-rw-rw-r-- 1 sahand sahand 130 Sep 29 13:09 id2_region.nii.gz
-rw-rw-r-- 1 sahand sahand 128 Sep 29 13:09 id2_mask.nii.gz
-rw-rw-r-- 1 sahand sahand 131 Sep 29 13:09 id2_channel2.nii.gz
-rw-rw-r-- 1 sahand sahand 131 Sep 29 13:09 id2_channel1.nii.gz
-rw-rw-r-- 1 sahand sahand 130 Sep 29 13:09 id1_region.nii.gz
-rw-rw-r-- 1 sahand sahand 128 Sep 29 13:09 id1_mask.nii.gz
-rw-rw-r-- 1 sahand sahand 131 Sep 29 13:09 id1_channel2.nii.gz
-rw-rw-r-- 1 sahand sahand 131 Sep 29 13:09 id1_channel1.nii.gz
drwxrwxr-x 2 sahand sahand 4.0K Sep 29 13:09 checkpoints
drwxrwxr-x 12 sahand sahand 4.0K Sep 29 13:09 ..
drwxrwxr-x 3 sahand sahand 4.0K Sep 29 13:09 .

OK. Did you run git lfs install before cloning the git repo?
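
For context: when a repository that stores files in Git LFS is cloned without LFS set up, each tracked file is checked out as a small text pointer instead of the real content. The .nii.gz sizes of 127 to 131 bytes in the listings above are consistent with that. A minimal sketch for spotting such pointers follows; it assumes the standard pointer format, whose first line is "version https://git-lfs.github.com/spec/v1", and the function names is_lfs_pointer and find_lfs_pointers are hypothetical, not part of InnerEye or Git.

```python
from pathlib import Path

# First line of a Git LFS pointer file (standard pointer format, spec v1).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file begins with the Git LFS pointer header."""
    with open(path, "rb") as handle:
        return handle.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def find_lfs_pointers(folder):
    """Return all files under folder that are still unresolved LFS pointers."""
    return [
        path
        for path in sorted(Path(folder).rglob("*"))
        if path.is_file() and is_lfs_pointer(path)
    ]


if __name__ == "__main__":
    # Path taken from the log above, relative to the repository root.
    for path in find_lfs_pointers("Tests/ML/test_data"):
        print(f"{path} is a Git LFS pointer, not the real file")
```

If this reports any files, running git lfs install and then git lfs pull inside the existing clone would normally fetch the real content without needing to clone again.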

Hey! You got it, man! Thank you so much. Somehow I had missed that, my bad!

@sahand68 You're welcome!

ant0nsc commented Sep 30, 2020

Thanks a lot @MaherJendoubi for helping @sahand68 resolve this :-) Closing this issue now.

ant0nsc closed this as completed Sep 30, 2020
