This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

train cityscapes using COCO pretrained model problem? #259

Open
ranjiewwen opened this issue Dec 10, 2018 · 10 comments
Labels: help wanted (Extra attention is needed), question (Further information is requested)

@ranjiewwen
❓ Questions and Help

  • Thanks for the code; I am training a new dataset (cityscapes) for instance segmentation.
  • First I trained cityscapes from scratch and the loss converged, but the box AP and seg AP I get are not high (see below); the Mask R-CNN paper reports much higher numbers, and I don't know what details I overlooked.
2018-12-07 18:58:13,471 maskrcnn_benchmark.inference INFO: OrderedDict([('bbox', OrderedDict([('AP', 0.266143220179594), ('AP50', 0.4705279119903588), ('AP75', 0.2664711486678874), ('APs', 0.0742186384761436), ('APm', 0.26418817964465885), ('APl', 0.4618351991771723)])), ('segm', OrderedDict([('AP', 0.2169857479304357), ('AP50', 0.4159623962610022), ('AP75', 0.17807455425402843), ('APs', 0.029122872145021395), ('APm', 0.174442224182182), ('APl', 0.42977448859947454)]))])
  • Experiment setup on a single GTX 1080 Ti:
--config-file "../configs/cityscapes/e2e_mask_rcnn_R_50_FPN_1x_cocostyle.yaml" SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR 0.00125 SOLVER.MAX_ITER 200000 SOLVER.STEPS "(160000, 180000)" TEST.IMS_PER_BATCH 1
  • Second question: using COCO pre-training to train cityscapes.
  • When I load the pretrained COCO model I hit a problem: the number of classes changes from 81 to 9, so the final FC (predictor) parameters should be ignored,
  • but the following code in maskrcnn-benchmark/maskrcnn_benchmark/utils/model_serialization.py is a problem, because model_state_dict[key] = loaded_state_dict[key_old] overwrites the original values:
def load_state_dict(model, loaded_state_dict):
    model_state_dict = model.state_dict()
    # if the state_dict comes from a model that was wrapped in a
    # DataParallel or DistributedDataParallel during serialization,
    # remove the "module" prefix before performing the matching
    loaded_state_dict = strip_prefix_if_present(loaded_state_dict, prefix="module.")
    # copies every matched key: model_state_dict[key] = loaded_state_dict[key_old]
    align_and_update_state_dicts(model_state_dict, loaded_state_dict)

    # use strict loading
    model.load_state_dict(model_state_dict)
  • I use the following code instead:
def load_state_dict(model, loaded_state_dict):
    model_state_dict = model.state_dict()
    # if the state_dict comes from a model that was wrapped in a
    # DataParallel or DistributedDataParallel during serialization,
    # remove the "module" prefix before performing the matching
    loaded_state_dict = strip_prefix_if_present(loaded_state_dict, prefix="module.")

    # finetune: keep only the entries whose name and shape match the current
    # model, so the 81-class COCO predictor weights are dropped
    loaded_state_dict = {
        k: v for k, v in loaded_state_dict.items()
        if k in model_state_dict and model_state_dict[k].size() == v.size()
    }
    model_state_dict.update(loaded_state_dict)
    # use strict loading
    model.load_state_dict(model_state_dict)
  • But then maskrcnn_benchmark/utils/checkpoint.py raises an error. I don't know why it should call self.optimizer.load_state_dict and self.scheduler.load_state_dict; the optimizer state contains a 'momentum_buffer' parameter, and I don't understand why that is loaded. Can you explain? And how can I use the COCO pretrained model to finetune on cityscapes? Thanks!
def load(self, f=None):
    if self.has_checkpoint():
        # override argument with existing checkpoint
        f = self.get_checkpoint_file()
    if not f:
        # no checkpoint could be found
        self.logger.info("No checkpoint found. Initializing model from scratch")
        return {}
    self.logger.info("Loading checkpoint from {}".format(f))
    checkpoint = self._load_file(f)
    self._load_model(checkpoint)
    if "optimizer" in checkpoint and self.optimizer:
        self.logger.info("Loading optimizer from {}".format(f))
        self.optimizer.load_state_dict(checkpoint.pop("optimizer"))
    if "scheduler" in checkpoint and self.scheduler:
        self.logger.info("Loading scheduler from {}".format(f))
        self.scheduler.load_state_dict(checkpoint.pop("scheduler"))

    # return any further checkpoint data
    return checkpoint
@fmassa
Contributor

fmassa commented Dec 10, 2018

Hi,

I believe the best results for cityscapes are obtained by starting from a model pre-trained on COCO and then doing some model surgery, so that the classes common to COCO and cityscapes are kept.
See this file for more information.
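As an illustration only (my sketch, not the linked file): the surgery amounts to loading the COCO checkpoint, deleting the class-dependent predictor weights, and saving the rest. The paths are placeholders; the parameter-name substrings are the ones this repo uses for its box and mask predictors.

import torch

# load the COCO checkpoint; the weights may live under a "model" key
checkpoint = torch.load("e2e_mask_rcnn_R_50_FPN_1x.pth", map_location="cpu")
state_dict = checkpoint.get("model", checkpoint)

# these predictor weights have shapes tied to the 81 COCO classes
class_dependent = ("cls_score", "bbox_pred", "mask_fcn_logits")
for key in list(state_dict.keys()):
    if any(name in key for name in class_dependent):
        del state_dict[key]

torch.save({"model": state_dict}, "coco_trimmed.pth")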

About your second question, I'm sorry but I couldn't understand the problem you are facing. Can you give a bit more context?

@fmassa added the "question" (Further information is requested) label Dec 10, 2018
@ranjiewwen
Author

ranjiewwen commented Dec 20, 2018

Thank you @fmassa for the reply. I have gotten some help from #15, but I haven't reproduced the cityscapes instance segmentation result yet. I hope someone can share a cityscapes model so I can compare the differences.

  • The second question is simple: "I want to finetune cityscapes from pretrained COCO Detectron models with a different number of classes".
  • Because the number of classes is different, I modified the load_state_dict function, but the code also loads the optimizer and scheduler parameters, which conflict with the new model, so I block (comment out) that loading; a sketch of an alternative is shown after the table below.
  • The following are my results:
time        setting                  segAP (val)  bbox mAP
paper       fine                     0.315        -
paper       fine + coco              0.365        -
2018-12-06  single GPU, fine         0.217        0.266
2018-12-11  multi GPU, fine          0.238        0.278
2018-12-08  single GPU, fine + coco  0.285        0.331
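The sketch mentioned above: instead of editing checkpoint.py, strip the optimizer and scheduler state from the checkpoint file up front, so Checkpointer.load() only restores the model weights (file names here are placeholders):

import torch

checkpoint = torch.load("model_coco.pth", map_location="cpu")
checkpoint.pop("optimizer", None)   # holds the per-parameter 'momentum_buffer' tensors
checkpoint.pop("scheduler", None)
checkpoint.pop("iteration", None)   # optional: restart the LR schedule from zero
torch.save(checkpoint, "model_coco_weights_only.pth")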

@fmassa
Contributor

fmassa commented Dec 20, 2018

I haven't trained models on cityscapes myself, so I might not be the best person to help you with that. Maybe @henrywang1 knows a bit better, as he's the one who originally added support for cityscapes.

@fmassa added the "help wanted" (Extra attention is needed) label Dec 20, 2018
@henrywang1
Contributor

Hi @ranjiewwen,
I only tried end-to-end training on cityscapes.
I followed the steps described in the paper, and the resulting AP[val] is about 0.316.

We train with image scale (shorter side) randomly sampled from [800, 1024], which reduces overfitting; inference is on a single scale of 1024 pixels.

I didn't submit the code because I thought everyone might have their own transformation.
You can refer to the changes below:

In transform.py, add this class

class RandomResize(object):
    def __init__(self, min_size, max_size):
        self.min_size = min_size
        self.max_size = max_size

    def get_size(self, image_size):
        # sample the new shorter side (the height, for landscape cityscapes
        # images) uniformly from [min_size, max_size], and scale the width
        # to preserve the aspect ratio
        w, h = image_size
        rand = random.randint(self.min_size, self.max_size)
        return rand, int(w * rand / h)

    def __call__(self, image, target):
        size = self.get_size(image.size)   # (height, width)
        image = F.resize(image, size)
        target = target.resize(image.size)
        return image, target
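A quick sanity check (my own, not part of the repo): on a 2048x1024 cityscapes image, the sampled value becomes the new height, i.e. the shorter side, and the width is scaled by the same factor.

from PIL import Image

resize = RandomResize(800, 1024)
image = Image.new("RGB", (2048, 1024))   # (width, height), cityscapes-sized
h, w = resize.get_size(image.size)       # F.resize expects (height, width)
assert 800 <= h <= 1024 and w == 2 * h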

In build.py, modify build_transforms

if "cityscapes" in cfg.DATASETS.TRAIN[0]:
    if is_train:
        transform = T.Compose(
            [
                T.RandomResize(800, 1024),
                T.RandomHorizontalFlip(flip_prob),
                T.ToTensor(),
                normalize_transform,
            ]
        )
    else:
        transform = T.Compose(
            [
                T.ToTensor(),
                normalize_transform,
            ]
        )
else: #...

@ranjiewwen
Author

Thanks @henrywang1. I will try to train again and hope for a good result!

@xllau

xllau commented Mar 1, 2019

> [quoting @ranjiewwen's results table above]
I am wondering what the mAP in your results is. Is it the bbox mAP?

@ranjiewwen
Author

> I am wondering what the mAP in your results is. Is it the bbox mAP?

mAP is for the bbox. You can read the original Mask R-CNN paper, or read the evaluation code in coco_eval.py.
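For illustration (my own snippet, not the repo's coco_eval.py; file names are placeholders), the bbox mAP is COCOeval's stats[0], the AP averaged over IoU 0.50:0.95:

from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO("instancesonly_filtered_gtFine_val.json")  # ground-truth annotations
coco_dt = coco_gt.loadRes("bbox_predictions.json")        # detection results
coco_eval = COCOeval(coco_gt, coco_dt, iouType="bbox")    # use "segm" for segAP
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()
print("bbox mAP:", coco_eval.stats[0])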

@zimenglan-sysu-512
Contributor

zimenglan-sysu-512 commented Mar 14, 2019

Hi @ranjiewwen,
have you reproduced the results on the cityscapes dataset? Following the steps in the Mask R-CNN paper, I only get 0.250 using the fine dataset and 0.293 using fine + coco.

After setting both MAX_SIZE_TRAIN and MAX_SIZE_TEST to 2048 and re-training, I get 0.316 using fine and 0.358 using fine + coco.
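For reference, one way to set this without editing the yaml, using the same command-line override syntax as the training command earlier in this thread (the INPUT.* keys are defined in maskrcnn_benchmark/config/defaults.py):

--config-file "../configs/cityscapes/e2e_mask_rcnn_R_50_FPN_1x_cocostyle.yaml" INPUT.MAX_SIZE_TRAIN 2048 INPUT.MAX_SIZE_TEST 2048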

@zimenglan-sysu-512
Contributor

Hi @henrywang1,
what value did you set for MAX_SIZE_TRAIN and MAX_SIZE_TEST? Is it 2048?

@henrywang1
Contributor

Hi @zimenglan-sysu-512,
I followed the settings in the paper, so I hard-coded MIN/MAX_SIZE_TRAIN (as described in #259 (comment)).

I just noticed that my previous reply was incomplete: for testing, the paper says inference is on a single scale of 1024 pixels, so the transform has to be

            transform = T.Compose(
                [
                    T.Resize(1024, 1024),  # min_size = max_size = 1024: fixed test scale
                    T.ToTensor(),
                    normalize_transform,
                ]
            )

For other settings or the training log, you can send me an e-mail.
