question about normalized MSE #8
Thanks for reporting this discrepancy. The version used in the paper is:

```python
# computed only once before training, on a fixed set of activations
mean_activations = original_input.mean(dim=0)  # averaging over the batch dimension
baseline_mse = (original_input - mean_activations).pow(2).mean()

# computed on each batch during training and testing
actual_mse = (reconstruction - original_input).pow(2).mean()
normalized_mse = actual_mse / baseline_mse
```
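For concreteness, here is a minimal NumPy sketch of the scheme described above (the function and variable names are illustrative, not from the repo). The key point is that the denominator is a single scalar computed once from a fixed set of activations, not per batch:

```python
import numpy as np

def normalized_mse(reconstruction, original_input, baseline_mse):
    # Per-batch MSE of the reconstruction, scaled by the fixed baseline
    actual_mse = np.mean((reconstruction - original_input) ** 2)
    return actual_mse / baseline_mse

# Fixed set of activations, e.g. shape (batch, d_model)
rng = np.random.default_rng(0)
original_input = rng.normal(size=(1024, 16))

# Baseline: MSE of always predicting the per-feature mean activation,
# computed once before training
mean_activations = original_input.mean(axis=0)
baseline_mse = np.mean((original_input - mean_activations) ** 2)

# A perfect reconstruction scores 0; predicting the mean scores 1
mean_pred = np.broadcast_to(mean_activations, original_input.shape)
assert np.isclose(normalized_mse(original_input, original_input, baseline_mse), 0.0)
assert np.isclose(normalized_mse(mean_pred, original_input, baseline_mse), 1.0)
```

Under this normalization, any reconstruction scoring below 1 is doing better than the trivial mean predictor.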
Got it. It matches the code in
lukaemon added a commit to lukaemon/mission-sae that referenced this issue (Jul 9, 2024).
In paper 2.1:
In readme example:
Which is the same as in loss.py:
The way I understand it, normalized MSE is the MSE divided by the baseline reconstruction error of always predicting the mean activations. What did I miss? Did I misunderstand the paper or code? Thanks for your time.
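One identity that may help reconcile the two formulations (an observation, not stated in the thread): the baseline error of always predicting the mean activations is exactly the biased per-feature variance of the activations, averaged over features. A quick NumPy check:

```python
import numpy as np

rng = np.random.default_rng(1)
acts = rng.normal(size=(512, 8))  # (batch, d_model) activations

# MSE of the constant mean predictor
baseline_mse = np.mean((acts - acts.mean(axis=0)) ** 2)

# Biased (ddof=0) per-feature variance, averaged over features
mean_variance = acts.var(axis=0).mean()

assert np.isclose(baseline_mse, mean_variance)
```

So dividing by the baseline MSE and dividing by the mean activation variance are the same operation, as long as both use the same fixed set of activations.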