MBConvBlockWithoutDepthwise stride implemented in 1x1 projection, wasting expansion arithmetic #660
Comments
The reduction in OPs is substantial: moving the stride from the projection convolution to the expansion reduces the total number of operations in EfficientEdgeTPU-S by about 24%.
At this point, it should be apparent that Google has no intention of responding to this issue. I will nevertheless leave this comment here in case it helps anyone who stumbles upon it: Google actually fixed this design flaw in the updated implementation of the MBConvBlockWithoutDepthwise block as it is used in MobileDets. In that work, they renamed the block "Fused convolution layer." The implementation can be found here: https://github.com/tensorflow/models/blob/2986bcafb9eaa8fed4d78f17a04c4c5afc8f6691/research/object_detection/models/ssd_mobiledet_feature_extractor.py#L142-L147 Notice that the stride is now implemented in the expansion convolution.
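To illustrate why moving the stride is safe, here is a minimal NumPy sketch (not the TPU repo's or MobileDets' code) that checks that subsampling in the 1x1 projection gives the same result as striding the 3x3 expansion. The helper `conv2d_valid`, the tensor shapes, and the use of ReLU as a stand-in for the pointwise ops between the two convolutions are all assumptions made for the example.

```python
import numpy as np


def conv2d_valid(x, w, stride=1):
    """Naive 'valid' cross-correlation. x: (H, W, Cin), w: (k, k, Cin, Cout)."""
    k = w.shape[0]
    h_out = (x.shape[0] - k) // stride + 1
    w_out = (x.shape[1] - k) // stride + 1
    out = np.empty((h_out, w_out, w.shape[3]))
    for i in range(h_out):
        for j in range(w_out):
            patch = x[i * stride:i * stride + k, j * stride:j * stride + k, :]
            # Contract the patch against the kernel over (k, k, Cin).
            out[i, j] = np.tensordot(patch, w, axes=([0, 1, 2], [0, 1, 2]))
    return out


def relu(t):
    return np.maximum(t, 0.0)


# Hypothetical shapes, chosen only to keep the example small.
rng = np.random.default_rng(0)
x = rng.standard_normal((9, 9, 4))
w_expand = rng.standard_normal((3, 3, 4, 16))   # 3x3 expansion
w_project = rng.standard_normal((1, 1, 16, 8))  # 1x1 projection

# Stride in the projection: expand at full resolution, then subsample.
a = conv2d_valid(relu(conv2d_valid(x, w_expand, stride=1)), w_project, stride=2)
# Stride in the expansion: only the positions that survive are ever computed.
b = conv2d_valid(relu(conv2d_valid(x, w_expand, stride=2)), w_project, stride=1)

same = np.allclose(a, b)
print(a.shape, same)
```

The sketch uses 'valid' padding so that the sampled positions line up exactly. With a framework's 'same' padding the two variants may sample pixels that are offset by one, and normalization statistics computed during training would be taken over different activations, so an existing checkpoint should not be assumed to transfer unchanged.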
Hi @andravin, thanks for the great point. I have prepared a fix internally, which should go out in the next release.
Fixed in 32572cb
PiperOrigin-RevId: 349201465
I noticed exactly the same issue recently when I looked into the Efficientnet_EdgeTPU repo. Thanks, @andravin, for pointing out the issue here.
MBConvBlockWithoutDepthwise implements stride in the 1x1 projection convolution. When stride=2, the projection discards 3/4ths of the activations produced by the expansion. It would be equivalent to implement stride on the 3x3 expansion convolution instead, and this would reduce the total block arithmetic almost by a factor of 4.
tpu/models/official/efficientnet/efficientnet_model.py
Lines 422 to 442 in 8462d08
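The "almost a factor of 4" estimate can be checked with a rough multiply-accumulate count. The sketch below is an illustration only: the functions `conv_macs` and `block_macs` and the layer shape are hypothetical, not taken from efficientnet_model.py, and the count ignores batch norm, activations, and any residual connection.

```python
def conv_macs(h_out, w_out, k, c_in, c_out):
    """Multiply-accumulates for a dense k x k convolution with an h_out x w_out output."""
    return h_out * w_out * k * k * c_in * c_out


def block_macs(h, w, c_in, c_out, expand_ratio, stride, stride_in_expansion):
    """MACs for a 3x3 expansion followed by a 1x1 projection.

    Assumes 'same' padding and that h and w are divisible by stride.
    """
    c_mid = c_in * expand_ratio
    h_out, w_out = h // stride, w // stride
    if stride_in_expansion:
        # The expansion already runs at the reduced resolution.
        expand = conv_macs(h_out, w_out, 3, c_in, c_mid)
    else:
        # The expansion runs at full resolution; the projection then subsamples.
        expand = conv_macs(h, w, 3, c_in, c_mid)
    project = conv_macs(h_out, w_out, 1, c_mid, c_out)
    return expand + project


# Hypothetical layer shape, chosen only for illustration.
shape = dict(h=56, w=56, c_in=32, c_out=48, expand_ratio=8, stride=2)
before = block_macs(stride_in_expansion=False, **shape)
after = block_macs(stride_in_expansion=True, **shape)
ratio = before / after
print(before, after, round(ratio, 2))
```

The ratio falls short of exactly 4 because the projection cost is the same in both variants; only the 3x3 expansion, which dominates the block, shrinks by the square of the stride.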