Fix gated forward functions #295

Merged on Sep 20, 2024 (5 commits)


callummcdougall
Contributor

2 fixes:

  1. In the forward method, when self.use_error_term is True, steps such as centering and the input activation function weren't being applied correctly for non-standard architectures. I think this was just a mistake, and it is now fixed.
  2. In the encode_gated method, the hook hook_sae_acts_post pointed to the activations post-ReLU (or whatever the activation function is), not to the actual output of the method, i.e. the post-ReLU activations multiplied by the masking values. I think a long-term solution would have 2 separate hooks, one for each of these (e.g. hook_sae_acts_post and hook_sae_mag_post), but if we just have a single hook called hook_sae_acts_post, then I think it makes a lot more sense for it to refer to the output of the encoder.
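
To make the second point concrete, here is a minimal, dependency-free sketch of where the two hook points would sit in a gated encoder. It is not the SAELens implementation: the function signature, the plain-list inputs, and the `hooks` dict are hypothetical stand-ins, and mapping hook_sae_mag_post to the unmasked magnitudes is an assumption about the two-hook design floated above.

```python
# Hypothetical sketch of hook placement in a gated encoder.
# Not the SAELens code: names other than the two hook names are made up.

def relu(xs):
    """Elementwise ReLU over a list of floats."""
    return [max(0.0, x) for x in xs]


def encode_gated(gate_pre, mag_pre, hooks):
    """Toy gated encoder over pre-computed pre-activations.

    gate_pre: pre-activations of the gating path
    mag_pre:  pre-activations of the magnitude path
    hooks:    dict that records intermediate values by hook name
    """
    # Binary mask: 1.0 where the gate fires, 0.0 elsewhere.
    active = [1.0 if g > 0 else 0.0 for g in gate_pre]

    # Post-activation magnitudes, before masking.
    magnitudes = relu(mag_pre)
    # Assumed meaning of the proposed second hook: the unmasked magnitudes.
    hooks["hook_sae_mag_post"] = magnitudes

    # Encoder output: magnitudes multiplied by the mask.
    feature_acts = [a * m for a, m in zip(active, magnitudes)]
    # After this PR, hook_sae_acts_post refers to the encoder output.
    hooks["hook_sae_acts_post"] = feature_acts
    return feature_acts


hooks = {}
out = encode_gated([1.0, -1.0, 2.0], [0.5, 3.0, -4.0], hooks)
```

In this toy input the second feature has a large magnitude but a closed gate, so the two hook values differ there, which is exactly the case where pointing hook_sae_acts_post at the pre-mask activations would be misleading.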

@jbloomAus
Owner

Thanks!

@callummcdougall
Contributor Author

np! also let me know if there are things I can do to not require formatting PRs in the future - would this involve something like using the same workspace config files as this repo does?

@jbloomAus
Owner

@callummcdougall run make format and make check-ci, nothing more to it!

@jbloomAus jbloomAus merged commit a708220 into jbloomAus:main Sep 20, 2024
5 checks passed
zhenningdavidliu pushed a commit to decandido/SAELens that referenced this pull request Oct 4, 2024
* support seqpos slicing

* fix forward functions for gated

* remove seqpos changes

* fix formatting (remove my changes)

* format

---------

Co-authored-by: jbloomAus
tom-pollak pushed a commit to tom-pollak/SAELens that referenced this pull request Oct 22, 2024