Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ignore embeddings in extraction process #273

Merged
merged 4 commits into from
Jul 27, 2023
Merged

Conversation

lauritowal
Copy link
Collaborator

No description provided.

@lauritowal lauritowal changed the title do not extract embeddings anymore ignore embeddings in extraction process Jul 19, 2023
@lauritowal lauritowal mentioned this pull request Jul 19, 2023
Copy link
Collaborator

@AlexTMallen AlexTMallen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a couple changes. Most importantly, the changes to the handling of labels fixes extraction on deberta. Waiting for another review.

inputs: dict[str, Tensor | None] = dict(input_ids=ids.long())
if is_enc_dec or has_lm_preds:
inputs["labels"] = labels
inputs = dict(input_ids=ids.long(), labels=labels)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change does not work with e.g. microsoft/deberta-v2-xxlarge-mnli, which does not expect labels.

@@ -192,11 +191,9 @@ def extract_hiddens(
template_path=cfg.template_path,
rank=rank,
world_size=world_size,
seed=cfg.seed,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seed is used in load_prompts, why was it removed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AlexTMallen what do you mean, it is still there or not?

Copy link
Member

@norabelrose norabelrose left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with Alex's changes

@norabelrose norabelrose merged commit d45cbbd into main Jul 27, 2023
6 checks passed
@norabelrose norabelrose deleted the remove-embeddings branch July 27, 2023 20:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants