Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Smoke tests with tiny gpt2, fix CCSReporter #149

Merged
merged 10 commits into from
Mar 24, 2023
Merged

Smoke tests with tiny gpt2, fix CCSReporter #149

merged 10 commits into from
Mar 24, 2023

Conversation

thejaminator
Copy link
Collaborator

@thejaminator thejaminator commented Mar 23, 2023

This MR adds a smoke test for CCSReporter with tinygpt2

Subsequent issues:

  • Fix EigenReporter for tinygpt2
  • Make the tests work for tiny deberta (or some other MLM?)

@thejaminator thejaminator changed the title WIP: Smoke tests with tiny models Smoke tests with tiny gpt2 Mar 24, 2023
but you'll need to make deberta fp32 instead of fp16
because pytorch cpu doesn't support fp16
"""

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

basically the deberta model itself does some fp16 operations that aren't supported. but tiny-gpt doesn't. so that why tiny-gpt works.

i'll try to figure that out in another MR. i think we could convert fp16 models to fp32 if its CPU, but seems hacky?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually don't view this as hacky. In general float16 isn't supported on CPUs. We could check the whether we're running on CPU right when we call AutoModel.from_pretrained and tell HF to always load the model as float32 in that case, and use the dtype of the checkpoint otherwise.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ just made this change, so if you want to add a deberta test you can. I think tiny gpt2 is probably sufficient for now though

u[:] = torch.einsum("...ij,...j->...i", A, V[..., k, :])

RuntimeError: select(): index 1 out of range for tensor of size [1, 2]
at dimension 0
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EigenReporterConfig is broken for tiny-gpt2. didn't investigate, would like to push the fix for CCS first

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for noticing this, I just pushed a fix

@thejaminator thejaminator changed the title Smoke tests with tiny gpt2 Smoke tests with tiny gpt2, fix CCSReporter Mar 24, 2023
@norabelrose norabelrose self-requested a review March 24, 2023 10:45
Copy link
Member

@norabelrose norabelrose left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After my changes it LGTM

@norabelrose norabelrose merged commit ad2a088 into main Mar 24, 2023
@norabelrose norabelrose deleted the smoke-test branch March 24, 2023 11:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants