This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

problem when set fix_gamma=True in batchnorm #9624

Closed
solin319 opened this issue Jan 30, 2018 · 8 comments

Comments

@solin319
Contributor

If fix_gamma is true, gamma is set to 1 and its gradient to 0.
But the value of gamma still changes during parameter updates, so the gamma saved in the param file is not 1. This causes problems when converting MXNet parameters to other deep-learning platforms.
The problem is caused by the default weight decay in the SGD optimizer: weight decay shrinks gamma on every step even though its gradient is zero.
We must define the gamma variable with wd_mult=0 to keep gamma fixed at 1 during training.

Can MXNet set the weight decay of gamma to 0 automatically when fix_gamma=True?
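A minimal sketch of the drift and of the workaround described above (layer and variable names are illustrative): vanilla SGD applies w -= lr * (grad + wd * w), so even with a zero gradient gamma shrinks by a factor of (1 - lr * wd) per step.

```python
import mxnet as mx

# Why gamma drifts: SGD applies w -= lr * (grad + wd * w), so the
# weight-decay term shrinks gamma even when its gradient is zero.
gamma = mx.nd.ones((1,))
grad = mx.nd.zeros((1,))              # fix_gamma=True zeroes the gradient
opt = mx.optimizer.SGD(learning_rate=0.1, wd=1e-4)
state = opt.create_state(0, gamma)
for _ in range(1000):
    opt.update(0, gamma, grad, state)
print(gamma.asscalar())               # ~0.990, no longer 1

# The workaround: declare gamma explicitly with wd_mult=0 so the
# optimizer skips weight decay for it and fix_gamma really holds.
data = mx.sym.Variable('data')
g = mx.sym.Variable('bn_gamma', wd_mult=0.0)
bn = mx.sym.BatchNorm(data=data, gamma=g, fix_gamma=True, name='bn')
```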

@piiswrong
Contributor

This is a legacy design defect. When fix_gamma is true, there shouldn't be a gamma parameter at all.
In the future this could be solved by creating a new operator, batch_norm, and deprecating BatchNorm.

@rajanksin
Contributor

@sandeep-krishnamurthy : Tag: Bug

@anushthakalia

@solin319 how did you check the value of gamma? I suspect I am facing the same issue and want to verify, but unfortunately I couldn't find gamma in either arg_params or aux_params.
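For reference, a minimal sketch of one way to inspect gamma, assuming a Module-style checkpoint (the prefix 'model' and epoch 0 are hypothetical): gamma for a BatchNorm layer is stored in arg_params under a name ending in '_gamma', while moving_mean and moving_var live in aux_params.

```python
import mxnet as mx

# Hypothetical checkpoint prefix and epoch; adjust to your model.
sym, arg_params, aux_params = mx.model.load_checkpoint('model', 0)

# BatchNorm gamma lives in arg_params (e.g. 'bn_gamma');
# moving_mean/moving_var live in aux_params.
for name, arr in arg_params.items():
    if name.endswith('_gamma'):
        print(name, arr.asnumpy())
```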

@vandanavk
Contributor

@mxnet-label-bot add [Operator]

@Vikas-kum
Contributor

@solin319 I believe the fix is already merged, so this can be closed.
@anirudh2290 can we close this one?

@anirudh2290
Member

@Vikas89 the fix is only for the CoreML converter. The operator hasn't been fixed yet.

@kohillyang

@solin319 Does it mean that the parameters will be updated even when grad_req is set to "null", as long as wd_mult is not set to zero?

I think this behavior is unexpected.

@szha
Member

szha commented Sep 26, 2020

I think @wkcn fixed this issue in #18500. Now the batchnorm op responds to grad_req=null correctly.
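A quick way to verify the fixed behavior, sketched in Gluon (the layer and hyperparameters are illustrative, and this assumes an MXNet build that includes #18500): after setting grad_req='null' on gamma, a training step with nonzero weight decay should leave gamma at its initial value of 1.

```python
import mxnet as mx
from mxnet.gluon import nn

net = nn.BatchNorm()
net.initialize()
# Freeze gamma: request no gradient (and no optimizer update) for it.
net.collect_params('.*gamma').setattr('grad_req', 'null')

trainer = mx.gluon.Trainer(net.collect_params(), 'sgd',
                           {'learning_rate': 0.1, 'wd': 1e-4})
x = mx.nd.random.normal(shape=(4, 3))
with mx.autograd.record():
    y = net(x)
y.backward()
trainer.step(batch_size=4)
print(net.gamma.data())  # still all ones if grad_req='null' is honored
```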

szha closed this as completed Sep 26, 2020