
I changed the backbone and head of yolov5s; training completes, but an error occurs during detection #658

Closed · Polary-L opened this issue Aug 7, 2020 · 9 comments
Labels: question (Further information is requested)

Comments


Polary-L commented Aug 7, 2020

❔Question

Here is the changed yolov5s.yaml:

# parameters
nc: 1  # number of classes
na: 3  # number of anchors
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple

# anchors
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 backbone
backbone:
  # [from, number, module, args]:640
  [[-1, 1, Focus, [64, 3]],  # 0-P1/2:320
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4:160
  #  [-1, 3, BottleneckCSP, [128]],
   [-1, 3, GhostBottleneck, [128, 3, 1]], 
   [-1, 1, GhostBottleneck, [256, 3, 2]], #3-P3/8:80
   [-1, 9, GhostBottleneck, [256, 3, 1]], 
   [-1, 1, SELayer, [256,16]],
   [-1, 1, GhostBottleneck, [512, 3, 2]], #6-P4/16:40
   [-1, 9, GhostBottleneck, [512, 3, 1]],
   [-1, 1, SELayer, [512,16]],
   [-1, 1, Conv, [1024, 3, 2]],  # 9-P5/32:20
   [-1, 1, SPP, [1024, [5, 9, 13]]],
   #  [-1, 3, BottleneckCSP, [1024, False]],  # 11
   [-1, 3, SELayer, [1024,16]],
  ]

# YOLOv5 head PANET
head:
  [
   [-1, 3, GhostBottleneck, [1024, 3, 1]], #12

   [-1, 1, DWConv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']], #14 :40
   [[-1, 8], 1, Concat, [1]],  # cat backbone P4
   [-1, 1, DWConv, [512, 3, 1]],
   [-1, 3, GhostBottleneck, [512, 3, 1]],

   [-1, 1, DWConv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']], #19/ :160
   [[-1, 5], 1, Concat, [1]],  # cat backbone P3
   [-1, 1, DWConv, [256, 3, 1]],
   [-1, 3, GhostBottleneck, [256, 3, 1]],
   [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], #  23/(P3/8-small)

   [-1, 1, DWConv, [256, 3, 2]],
   [[-1, 18], 1, Concat, [1]],  # cat head P4
   [-1, 1, DWConv, [512, 3, 1]],
   [-1, 3, GhostBottleneck, [512, 3, 1]],
   [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], #  28/(P4/16-medium)

   [-1, 1, DWConv, [512, 3, 2]],
   [[-1, 13], 1, Concat, [1]],  # cat head P5
   [-1, 1, DWConv, [1024, 3, 1]],
   [-1, 3, GhostBottleneck, [1024, 3, 1]],
   [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1]], #  33/(P5/32-large)

   [[23, 28, 33], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]

and the training process completes with:

                 from  n    params  module                                  arguments                     
 0                -1  1      3520  models.common.Focus                     [3, 32, 3]                    
 1                -1  1     18560  models.common.Conv                      [32, 64, 3, 2]                
 2                -1  1      3440  models.common.GhostBottleneck           [64, 64, 3, 1]                
 3                -1  1     18784  models.common.GhostBottleneck           [64, 128, 3, 2]               
 4                -1  3     32928  models.common.GhostBottleneck           [128, 128, 3, 1]              
 5                -1  1      2048  models.common.SELayer                   [128, 16]                     
 6                -1  1     66240  models.common.GhostBottleneck           [128, 256, 3, 2]              
 7                -1  3    115008  models.common.GhostBottleneck           [256, 256, 3, 1]              
 8                -1  1      8192  models.common.SELayer                   [256, 16]                     
 9                -1  1   1180672  models.common.Conv                      [256, 512, 3, 2]              
10                -1  1    656896  models.common.SPP                       [512, 512, [5, 9, 13]]        
11                -1  1     32768  models.common.SELayer                   [512, 16]                     
12                -1  1    142208  models.common.GhostBottleneck           [512, 512, 3, 1]              
13                -1  1      1024  n DWConv at 0x7efff986a9d               [512, 256, 1, 1]              
14                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']          
15           [-1, 8]  1         0  models.common.Concat                    [1]                           
16                -1  1      5120  n DWConv at 0x7efff986a9d               [512, 256, 3, 1]              
17                -1  1     38336  models.common.GhostBottleneck           [256, 256, 3, 1]              
18                -1  1       512  n DWConv at 0x7efff986a9d               [256, 128, 1, 1]              
19                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']          
20           [-1, 5]  1         0  models.common.Concat                    [1]                           
21                -1  1      2560  n DWConv at 0x7efff986a9d               [256, 128, 3, 1]              
22                -1  1     10976  models.common.GhostBottleneck           [128, 128, 3, 1]              
23                -1  1      2322  torch.nn.modules.conv.Conv2d            [128, 18, 1, 1]               
24                -1  1     10624  n DWConv at 0x7efff986a9d               [18, 128, 3, 2]               
25          [-1, 18]  1         0  models.common.Concat                    [1]                           
26                -1  1      2816  n DWConv at 0x7efff986a9d               [256, 256, 3, 1]              
27                -1  1     38336  models.common.GhostBottleneck           [256, 256, 3, 1]              
28                -1  1      4626  torch.nn.modules.conv.Conv2d            [256, 18, 1, 1]               
29                -1  1     21248  n DWConv at 0x7efff986a9d               [18, 256, 3, 2]               
30          [-1, 13]  1         0  models.common.Concat                    [1]                           
31                -1  1      5632  n DWConv at 0x7efff986a9d               [512, 512, 3, 1]              
32                -1  1    142208  models.common.GhostBottleneck           [512, 512, 3, 1]              
33                -1  1      9234  torch.nn.modules.conv.Conv2d            [512, 18, 1, 1]               
34      [23, 28, 33]  1      1026  models.yolo.Detect                      [1, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [18, 18, 18]]
Model Summary: 243 layers, 2.57786e+06 parameters, 2.57786e+06 gradients

Optimizer groups: 81 .bias, 87 conv.weight, 75 other
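As a quick sanity check on the head wiring: with nc: 1 and na: 3, each detection nn.Conv2d above gets na * (nc + 5) = 18 output channels, which matches the [128, 18, 1, 1], [256, 18, 1, 1] and [512, 18, 1, 1] rows in the summary.

# Detection-head channel count implied by this yaml (nc = 1 class, na = 3 anchors per scale)
nc, na = 1, 3
out_channels = na * (nc + 5)   # 5 = box (x, y, w, h) + objectness
assert out_channels == 18      # matches the Conv2d [*, 18, 1, 1] layers in the build log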

Additional context

Then I used best.pt to run detection, and this error occurred: "RuntimeError: shape '[16, 16, 5, 5]' is invalid for input of size 400".
I saw this error in your yolov3 repo, and you gave suggestions in https://github.com/ultralytics/yolov3/issues/1395.
How can this be fixed in yolov5 when the model has been changed?
Another question: my test_batch0_pred.jpg doesn't show any predictions. I trained the new network on images of different sizes; could that be the cause?
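On the numbers in the RuntimeError above: they are consistent with a grouped (depthwise) conv being rebuilt as an ungrouped one during fusing, since a 16-channel depthwise 5x5 layer only stores 16 × 1 × 5 × 5 = 400 weights, while the target shape [16, 16, 5, 5] needs 6400. A minimal illustration (plain PyTorch, not yolov5 code):

import torch.nn as nn

# Depthwise conv: weight shape is [out, in // groups, k, k] -> only 400 values here
dw = nn.Conv2d(16, 16, kernel_size=5, groups=16, bias=False)
print(dw.weight.shape, dw.weight.numel())      # torch.Size([16, 1, 5, 5]) 400

# Ungrouped conv of the same size: 6400 values, i.e. the shape the fuse step tried to fill
full = nn.Conv2d(16, 16, kernel_size=5, bias=False)
print(full.weight.shape, full.weight.numel())  # torch.Size([16, 16, 5, 5]) 6400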

Polary-L added the question label Aug 7, 2020
Polary-L closed this as completed Aug 7, 2020
Polary-L reopened this Aug 7, 2020
glenn-jocher (Member) commented

@Polary-L if your model trains correctly then it will also operate correctly with test.py and detect.py.

Your yaml model definition is not correct though; it has redundant nn.Conv2d() ops, which we eliminated in v2.0. You should start from a clean slate: clone the current repo and go from there.

Polary-L (Author) commented

> @Polary-L if your model trains correctly then it will also operate correctly with test.py and detect.py.
>
> Your yaml model definition is not correct though; it has redundant nn.Conv2d() ops, which we eliminated in v2.0. You should start from a clean slate: clone the current repo and go from there.

This problem is the same as this one: https://github.com/ultralytics/yolov3/issues/883
If I comment out line 113, fusedconv.weight.copy_(torch.mm(w_bn, w_conv).view(fusedconv.weight.size())), in torch_utils.py, then "Fusing layers..." completes. Do you have any idea how to fix it?

glenn-jocher (Member) commented

> You should start from a clean slate: clone the current repo and go from there.

Originally posted by @glenn-jocher in #658 (comment)

zhepherd commented

> If I comment out line 113, fusedconv.weight.copy_(torch.mm(w_bn, w_conv).view(fusedconv.weight.size())), in torch_utils.py, then "Fusing layers..." completes. Do you have any idea how to fix it?

@Polary-L I have the same error as you. Did you fix it? How?

glenn-jocher (Member) commented

@zhepherd if you are running into difficulties with fusing custom models, I would recommend you simply convert the fuse() method into a pass-through entity:

yolov5/models/yolo.py

Lines 160 to 170 in 702c4fa

    def fuse(self):  # fuse model Conv2d() + BatchNorm2d() layers
        print('Fusing layers... ')
        for m in self.model.modules():
            if type(m) is Conv and hasattr(Conv, 'bn'):
                m._non_persistent_buffers_set = set()  # pytorch 1.6.0 compatability
                m.conv = fuse_conv_and_bn(m.conv, m.bn)  # update conv
                delattr(m, 'bn')  # remove batchnorm
                m.forward = m.fuseforward  # update forward
        self.info()
        return self

For example:

    def fuse(self):  # fuse model Conv2d() + BatchNorm2d() layers
        print('Skipping fuse... ')
        # print('Fusing layers... ')
        # for m in self.model.modules():
        #     if type(m) is Conv and hasattr(Conv, 'bn'):
        #         m._non_persistent_buffers_set = set()  # pytorch 1.6.0 compatability
        #         m.conv = fuse_conv_and_bn(m.conv, m.bn)  # update conv
        #         delattr(m, 'bn')  # remove batchnorm
        #         m.forward = m.fuseforward  # update forward
        # self.info()
        return self
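If fusing is still wanted for inference speed, another option is a groups-aware fuse_conv_and_bn(), so that depthwise DWConv weights keep their original shape when they are copied. A rough sketch of the idea (newer versions of the helper do something similar, but verify against your own torch_utils.py before relying on it):

import torch
import torch.nn as nn

def fuse_conv_and_bn(conv, bn):
    # Fuse a Conv2d + BatchNorm2d pair into a single Conv2d.
    # groups=conv.groups keeps the fused weight the same shape as the original
    # (e.g. [16, 1, 5, 5] for a depthwise layer), so the .view() no longer fails.
    fusedconv = nn.Conv2d(conv.in_channels,
                          conv.out_channels,
                          kernel_size=conv.kernel_size,
                          stride=conv.stride,
                          padding=conv.padding,
                          groups=conv.groups,
                          bias=True).requires_grad_(False).to(conv.weight.device)

    # fold the BN scale into the conv weights
    w_conv = conv.weight.clone().view(conv.out_channels, -1)
    w_bn = torch.diag(bn.weight.div(torch.sqrt(bn.eps + bn.running_var)))
    fusedconv.weight.copy_(torch.mm(w_bn, w_conv).view(fusedconv.weight.shape))

    # fold the BN shift into the conv bias
    b_conv = torch.zeros(conv.weight.size(0), device=conv.weight.device) if conv.bias is None else conv.bias
    b_bn = bn.bias - bn.weight.mul(bn.running_mean).div(torch.sqrt(bn.running_var + bn.eps))
    fusedconv.bias.copy_(torch.mm(w_bn, b_conv.reshape(-1, 1)).reshape(-1) + b_bn)

    return fusedconv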

Polary-L (Author) commented

> @zhepherd if you are running into difficulties with fusing custom models, I would recommend you simply convert the fuse() method into a pass-through entity: [...]

I solved it in the same way and it works.

Polary-L (Author) commented

@glenn-jocher
I've trained yolov5 (with a GhostNet backbone) on the CrowdHuman dataset.
Although the number of parameters is reduced by 50%, both GFLOPS and test time increase.
Are there other model-level ways to improve test speed? Pruning yolov5s, or anything else?

glenn-jocher (Member) commented

reduce image size, increase batch size, use faster hardware.
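A rough illustration of the image-size point (generic PyTorch, not yolov5 code): halving the input side roughly quarters the convolution work, which is usually the easiest speed win.

import time
import torch
import torch.nn as nn

# Toy conv stack, just to show how strongly inference time scales with input area.
model = nn.Sequential(nn.Conv2d(3, 32, 3, 2, 1), nn.ReLU(),
                      nn.Conv2d(32, 64, 3, 2, 1), nn.ReLU()).eval()

with torch.no_grad():
    for size in (640, 320):
        x = torch.zeros(8, 3, size, size)          # dummy batch of 8 images
        t0 = time.time()
        for _ in range(5):
            model(x)
        print(f'{size}x{size}: {(time.time() - t0) / 5 * 1e3:.1f} ms per batch')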

zaghdoud2019 commented

> here is the changed yolov5s.yaml [...]

Please, where do you define 'SELayer'?
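For context, SELayer is not part of the stock yolov5 repo; it appears to be a custom squeeze-and-excitation module the author added to models/common.py. A typical implementation (a sketch of the standard SE block, not necessarily the author's exact code) is shown below; with bias=False its parameter counts (2048 for [128, 16], 8192 for [256, 16], 32768 for [512, 16]) match the build log above.

import torch.nn as nn

class SELayer(nn.Module):
    # Standard squeeze-and-excitation block: global-average-pool the feature map,
    # squeeze channels by `reduction`, re-expand, and rescale the input channel-wise.
    def __init__(self, channel, reduction=16):
        super().__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channel, channel // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channel // reduction, channel, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        y = self.avg_pool(x).view(b, c)
        y = self.fc(y).view(b, c, 1, 1)
        return x * y.expand_as(x)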
