This repository has been archived by the owner on Mar 17, 2021. It is now read-only.

Transfer learning using Niftynet pretrained weights. #321

Open
AkhilaPerumalla123 opened this issue Feb 13, 2019 · 1 comment

@AkhilaPerumalla123

I want to use a NiftyNet pretrained segmentation model to segment custom data. I downloaded the pretrained weights and changed the model_dir path to point to the downloaded model.
However, when I run it, I get the error below.
```
Caused by op 'save/Assign_17', defined at:
  File "net_segment.py", line 8, in <module>
    sys.exit(main())
  File "/home/NiftyNet/niftynet/__init__.py", line 142, in main
    app_driver.run(app_driver.app)
  File "/home/NiftyNet/niftynet/engine/application_driver.py", line 197, in run
    SESS_STARTED.send(application, iter_msg=None)
  File "/usr/local/lib/python3.5/dist-packages/blinker/base.py", line 267, in send
    for receiver in self.receivers_for(sender)]
  File "/usr/local/lib/python3.5/dist-packages/blinker/base.py", line 267, in <listcomp>
    for receiver in self.receivers_for(sender)]
  File "/home/NiftyNet/niftynet/engine/handler_model.py", line 109, in restore_model
    var_list=to_restore, save_relative_paths=True)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1102, in __init__
    self.build()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1114, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1151, in _build
    build_save=build_save, build_restore=build_restore)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 795, in _build_internal
    restore_sequentially, reshape)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 428, in _AddRestoreOps
    assign_ops.append(saveable.restore(saveable_tensors, shapes))
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 119, in restore
    self.op.get_shape().is_fully_defined())
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/state_ops.py", line 221, in assign
    validate_shape=validate_shape)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_state_ops.py", line 61, in assign
    use_locking=use_locking, name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint.
Original error: Assign requires shapes of both tensors to match. lhs shape= [3,3,61,256] rhs shape= [3,3,3,61,9]
  [[node save/Assign_17 (defined at /home/NiftyNet/niftynet/engine/handler_model.py:109) = Assign[T=DT_FLOAT, _class=["loc:@DenseVNet/conv/conv_/w"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](DenseVNet/conv/conv_/w, save/RestoreV2/_35)
```
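When a restore fails like this, comparing the shape the graph expects against the shape stored in the checkpoint usually pinpoints the problem. A minimal sketch, using the two shapes copied from the error above (`compare_shapes` is a hypothetical helper written for illustration, not part of NiftyNet or TensorFlow):

```python
# Compare the shape a graph variable expects (lhs) with the shape
# stored in a checkpoint (rhs). Shapes are copied from the error log;
# compare_shapes is a hypothetical helper, not library code.

def compare_shapes(expected, checkpoint):
    """Return a human-readable verdict on whether a restore can succeed."""
    if list(expected) == list(checkpoint):
        return "match"
    if len(expected) != len(checkpoint):
        return ("rank mismatch: graph is %dD, checkpoint is %dD"
                % (len(expected), len(checkpoint)))
    return "same rank but different dimensions"

# lhs is the freshly built graph, rhs is the checkpoint from the zoo.
graph_shape = [3, 3, 61, 256]   # 2D conv kernel: (kh, kw, in, out)
ckpt_shape = [3, 3, 3, 61, 9]   # 3D conv kernel: (kd, kh, kw, in, out)

print(compare_shapes(graph_shape, ckpt_shape))
# rank mismatch: graph is 4D, checkpoint is 5D
```

The rank mismatch (4D vs 5D) is the key signal: the two graphs do not just disagree on channel counts, they were built for different spatial dimensionalities.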

Below is my config file:

```
[promise12]
path_to_search = /home/Container_data/Nifti/Images_nii
filename_contains = nii
spatial_window_size = (64, 64, 1)
interp_order = 3
axcodes = (A, R, S)

[label]
path_to_search = /home/Container_data/Nifti/Annotations_colored_nii
filename_contains = nii
spatial_window_size = (64, 64, 1)
interp_order = 0
axcodes = (A, R, S)

############################## system configuration sections
[SYSTEM]
cuda_devices = ""
num_threads = 2
num_gpus = 4
model_dir = ./dense_vnet_abdominal_ct

[NETWORK]
name = dense_vnet
activation_function = prelu
batch_size = 1

# volume level preprocessing
volume_padding_size = 0

# histogram normalisation
histogram_ref_file = standardisation_models.txt
norm_type = percentile
cutoff = (0.01, 0.99)
normalisation = True
whitening = True
normalise_foreground_only = True
foreground_type = otsu_plus
multimod_foreground_type = and
window_sampling = resize

queue_length = 8

[TRAINING]
sample_per_volume = 4
#rotation_angle = (-10.0, 10.0)
#scaling_percentage = (-10.0, 10.0)
#random_flipping_axes = 1
lr = 0.0002
loss_type = Dice
starting_iter = -1
save_every_n = 1250
max_iter = 25000
max_checkpoints = 20

############################ custom configuration sections
[SEGMENTATION]
image = promise12
label = label
output_prob = False
num_classes = 256
label_normalisation = True
min_numb_labels = 2
min_sampling_ratio = 0.0001
```
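One likely trigger in the config above is `spatial_window_size = (64, 64, 1)`: assuming NiftyNet infers the spatial rank of the window by dropping trailing singleton dimensions, this window builds a 2D network, while the `dense_vnet_abdominal_ct` zoo weights come from a 3D network. A small sketch of that inference (`infer_spatial_rank` is a hypothetical helper written to illustrate the idea, not NiftyNet's actual function):

```python
# Infer the spatial rank implied by a window size by dropping trailing
# singleton dimensions -- illustrating why (64, 64, 1) would yield a
# 2D network. Hypothetical helper, not NiftyNet code.

def infer_spatial_rank(window_size):
    dims = list(window_size)
    while dims and dims[-1] == 1:
        dims.pop()
    return len(dims)

print(infer_spatial_rank((64, 64, 1)))   # 2 -> 2D convolutions
print(infer_spatial_rank((64, 64, 64)))  # 3 -> 3D convolutions
```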

@ericspod
Collaborator

This appears to be the same issue as mentioned in #37: the network is being built for data with a different dimensionality than the data the checkpoint was trained on. If you train on a 2D dataset, the weights of some convolutions are 4D (i.e. shape = [3,3,61,256]), whereas those loaded from the zoo are 5D (i.e. shape = [3,3,3,61,9]).
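The rank difference follows directly from how convolution kernels are laid out: a 2D kernel has four axes, (kh, kw, in_channels, out_channels), while a 3D kernel has five, (kd, kh, kw, in_channels, out_channels). A minimal sketch reproducing the two shapes from the error (the function names are illustrative, not library APIs):

```python
# Weight-tensor layouts for 2D vs 3D convolutions, matching the two
# shapes in the error message. Illustrative helpers, not library code.

def conv2d_kernel_shape(kh, kw, in_ch, out_ch):
    return (kh, kw, in_ch, out_ch)

def conv3d_kernel_shape(kd, kh, kw, in_ch, out_ch):
    return (kd, kh, kw, in_ch, out_ch)

print(conv2d_kernel_shape(3, 3, 61, 256))   # (3, 3, 61, 256): 4D, the 2D graph
print(conv3d_kernel_shape(3, 3, 3, 61, 9))  # (3, 3, 3, 61, 9): 5D, the zoo checkpoint
```

Since no reshape can reconcile a 4-axis tensor with a 5-axis one, the restore fails; the graph has to be built with the same spatial dimensionality as the checkpoint.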

@ericspod ericspod self-assigned this May 24, 2019