
Does anyone have problems in training from scratch with GCNet on ImageNet? #10

Open
implus opened this issue May 16, 2019 · 17 comments

@implus

implus commented May 16, 2019

Using the best setting of GC-ResNet50 and training it from scratch on ImageNet, I found that it gets stuck at a high loss in the early epochs before the training loss starts to decline normally. As a result, the final accuracy is much lower than the original ResNet50. Note that one difference from the original paper is that the GC modules are embedded in every bottleneck, exactly as SE does, for a fair comparison.

Does anyone have the same problem?

This may be because the authors report their ImageNet results under a fine-tuning setting, which is not very common when validating modules on ImageNet benchmarks. At least all the other modules (SE, SK, BAM, CBAM, AA) follow a training-from-scratch setting.

@lxtGH

lxtGH commented May 19, 2019

Did you try fine-tuning the ResNet?

@kfxw

kfxw commented Jun 4, 2019

I have a similar problem. When training from scratch, the model converges very slowly.

@ZzzjzzZ

ZzzjzzZ commented Jun 4, 2019

> I have a similar problem. When training from scratch, the model converges very slowly.

Me too.

@xvjiarui
Owner

xvjiarui commented Jul 3, 2019

Sorry for the late reply.

At first, we didn't have enough resources for training from scratch.
Later, we tried training the whole network from scratch on ImageNet and did not observe a similar issue. I suggest training it for 110 or 120 epochs to see the final performance.

Note that we use the same augmentation method as SENet.
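For reference, SENet-style ImageNet augmentation usually means a random-size crop to 224 plus a random horizontal flip. A minimal torchvision sketch, assuming that recipe (not necessarily the authors' exact pipeline):

```python
# Sketch of SENet-style ImageNet training augmentation (assumed recipe:
# random-size crop + horizontal flip + standard normalization).
import torchvision.transforms as T

train_transform = T.Compose([
    T.RandomResizedCrop(224),        # random-size crop to 224x224
    T.RandomHorizontalFlip(),        # random horizontal flip
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406],   # standard ImageNet statistics
                std=[0.229, 0.224, 0.225]),
])
```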

@kfxw

kfxw commented Jul 3, 2019

@xvjiarui
Hi! In terms of the fusion setting, did the case you mentioned use the 'add' one or the 'scale' one?

@xvjiarui
Owner

xvjiarui commented Jul 3, 2019

We use the 'add' one by default. The 'scale' one, on the other hand, is similar to SENet. Neither of them should have convergence issues.

@kfxw

kfxw commented Jul 3, 2019

@xvjiarui
Thanks for your reply. By the way, would you mind sharing the classification training code in this repo? That would be of great help.

@xvjiarui
Owner

xvjiarui commented Jul 3, 2019

> @xvjiarui
> Thanks for your reply. By the way, would you mind sharing the classification training code in this repo? That would be of great help.

Our internal code base is used for classification training. We may try to release a cleaned version in the future, but it is not on the schedule yet.

The block structure is the same. You could simply add it to your own code.

@taoxinlily

> Using the best setting of GC-ResNet50 and training it from scratch on ImageNet, I found that it gets stuck at a high loss in the early epochs before the training loss starts to decline normally. As a result, the final accuracy is much lower than the original ResNet50. Note that one difference from the original paper is that the GC modules are embedded in every bottleneck, exactly as SE does, for a fair comparison.
>
> Does anyone have the same problem?
>
> This may be because the authors report their ImageNet results under a fine-tuning setting, which is not very common when validating modules on ImageNet benchmarks. At least all the other modules (SE, SK, BAM, CBAM, AA) follow a training-from-scratch setting.

Hi! Did you solve the problem?

@Shiro-LK

@xvjiarui
I am also trying to use the GC block in classifiers such as ResNet, VGG16, etc., and I would like to be sure I am doing it right.
First: in the ResNet backbone, we just need to apply the global context block before the downsample in the Bottleneck/BasicBlock class, correct?
Second: for the global context module, the inplanes parameter is the depth of the feature map that feeds the GC module, and the planes parameter is equal to inplanes // 16, is that right?
Regarding the "pool" parameter, I suppose "att" is better? And for the "fusions" parameter, is "channel add" also better? Why does this parameter take only a list? I am not sure I understand.

@xvjiarui
Owner

For all the yes/no questions, the answer is yes; your understanding is correct.
The fusions parameter is a list because multiple fusion methods can be used together.
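For reference, here is a minimal PyTorch sketch of a GC-style context block following the parameters discussed above (inplanes, planes = inplanes // 16, pool='att', fusions as a list). It is an illustrative re-implementation, not the repo's exact code:

```python
import torch
import torch.nn as nn

class ContextBlockSketch(nn.Module):
    """Illustrative GC block: attention (or avg) pooling + fusion branches."""

    def __init__(self, inplanes, ratio=1 / 16, pool='att', fusions=('channel_add',)):
        super().__init__()
        planes = max(int(inplanes * ratio), 1)        # e.g. inplanes // 16
        self.pool = pool
        if pool == 'att':
            self.conv_mask = nn.Conv2d(inplanes, 1, kernel_size=1)
            self.softmax = nn.Softmax(dim=2)
        else:                                          # 'avg'
            self.avg_pool = nn.AdaptiveAvgPool2d(1)

        def transform():
            # bottleneck transform shared by both fusion styles
            return nn.Sequential(
                nn.Conv2d(inplanes, planes, kernel_size=1),
                nn.LayerNorm([planes, 1, 1]),
                nn.ReLU(inplace=True),
                nn.Conv2d(planes, inplanes, kernel_size=1))

        # 'fusions' is a list/tuple so both branches can be enabled at once
        self.channel_add_conv = transform() if 'channel_add' in fusions else None
        self.channel_mul_conv = transform() if 'channel_mul' in fusions else None

    def spatial_pool(self, x):
        b, c, h, w = x.size()
        if self.pool == 'att':
            # attention pooling: softmax over all spatial positions
            mask = self.conv_mask(x).view(b, 1, h * w)        # [B, 1, HW]
            mask = self.softmax(mask).unsqueeze(-1)           # [B, 1, HW, 1]
            feat = x.view(b, 1, c, h * w)                     # [B, 1, C, HW]
            context = torch.matmul(feat, mask).view(b, c, 1, 1)
        else:
            context = self.avg_pool(x)                        # [B, C, 1, 1]
        return context

    def forward(self, x):
        context = self.spatial_pool(x)
        out = x
        if self.channel_mul_conv is not None:                 # 'scale'-style fusion
            out = out * torch.sigmoid(self.channel_mul_conv(context))
        if self.channel_add_conv is not None:                 # 'add'-style fusion
            out = out + self.channel_add_conv(context)
        return out
```

In a ResNet bottleneck, a block like this would typically be applied to the residual-branch output (after the last 1x1 conv, before the shortcut addition), which is the placement asked about above.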

@ma-xu

ma-xu commented Feb 28, 2020

> For all the yes/no questions, the answer is yes; your understanding is correct.
> The fusions parameter is a list because multiple fusion methods can be used together.
@xvjiarui
Hi, thanks a lot for your great work; I appreciate it. However, I tried to train the network on ImageNet and GC achieves worse performance than the original ResNet. I then followed the paper and fine-tuned the ResNet50 for another 40 epochs using a cosine schedule, and the performance is still poor. Could you please share your fine-tuning code? I would cite your work in my research. Thanks a lot.
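For reference, a 40-epoch fine-tune with a cosine schedule could be set up roughly as below (assumed hyperparameters, with a plain ResNet-50 standing in for GC-ResNet-50; not the paper's exact recipe):

```python
import torch
import torchvision

# Assumed setup for illustration: a pretrained ResNet-50 stands in for a
# GC-ResNet-50 initialized from ResNet-50 weights; LR/momentum/weight decay
# are guesses, not the paper's exact values.
model = torchvision.models.resnet50(weights="IMAGENET1K_V1")
optimizer = torch.optim.SGD(model.parameters(), lr=0.01,
                            momentum=0.9, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=40)

for epoch in range(40):
    # train_one_epoch(model, optimizer)   # hypothetical training helper
    scheduler.step()
```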

@Shiro-LK

Shiro-LK commented Mar 5, 2020

@xvjiarui

Hi, thanks for your reply.
Do you still have the model trained on ImageNet with the GC block?

@xvjiarui
Owner

xvjiarui commented Mar 5, 2020

Hi, @13952522076
Sorry for the late reply.
Currently I am too busy to release that part of the code.
If the issue is overfitting, I suggest adopting the augmentations in the original paper as well as dropout on the GC block branch.
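As an illustration (the exact placement the authors used is not stated here), dropout could be appended to the channel-add branch of the GC block, e.g.:

```python
import torch.nn as nn

# Hypothetical example: dropout at the end of the GC block's channel_add
# branch, so the context signal is randomly dropped before being added back.
inplanes, planes = 256, 16
channel_add_conv = nn.Sequential(
    nn.Conv2d(inplanes, planes, kernel_size=1),
    nn.LayerNorm([planes, 1, 1]),
    nn.ReLU(inplace=True),
    nn.Conv2d(planes, inplanes, kernel_size=1),
    nn.Dropout2d(p=0.1),   # assumed rate; tune as needed
)
```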

@xvjiarui
Owner

xvjiarui commented Mar 5, 2020

Hi, @Shiro-LK
The models are not available for now. I will let you know when I train them again.

@ma-xu

ma-xu commented Mar 5, 2020

> Hi, @13952522076
> Sorry for the late reply.
> Currently I am too busy to release that part of the code.
> If the issue is overfitting, I suggest adopting the augmentations in the original paper as well as dropout on the GC block branch.

Thanks a lot for your reply. It doesn't look like an overfitting issue (judging from the train loss vs. val loss). Anyway, I appreciate your work; it helped a lot. 😺

@atnegam

atnegam commented May 31, 2022

@ma-xu Hello, excuse me, has the problem been solved?
