Refactor train eval #147
Conversation
Looks like type tests are failing. The type of device should probably be str |
I think this looks good overall! I just want to make the train/val change for Eval. I can write the code for this.
        output directory. Defaults to False.
    """

    data: Extract
Currently extract requires there to be both a train and a validation split, which I don't think we should mandate when running Eval (I'd like to be able to run TruthfulQA without having to make a fake training split). I think we should modify the __post_init__ of Evaluate to only extract the validation dataset.
Just for the others, from our conversation in the chat: your change requests should probably be implemented as a new feature on their own, since the current pull request is only about refactoring, but we can add them soon after merging.
"""Get a list of indices of hidden layers given a `DatasetDict`.""" | ||
layers = [ | ||
int(feat[len("hidden_") :]) | ||
for feat in ds["train"].features |
ds might not contain a "train" split, so we should just take ds.values().pop().features or the equivalent.
Don't you always still get at least a "train" split if there is no explicit split?
I'm thinking about the refactor I want to do to support evaluations on a single split; in the case of, e.g., TruthfulQA there is only a validation split. This can wait for the next PR though.
devices = select_usable_devices(cfg.num_gpus)
num_devices = len(devices)
_, _, test_x0, test_x1, _, test_labels = self.prepare_data(
I think we should split up the extraction of train and test data so we don't have to do this.
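One way to avoid unpacking a six-tuple and discarding the train half is to have the preparation step take a split name, as in this hypothetical sketch (`prepare_data` and the `hiddens` layout here are illustrative, not the repo's actual API):

```python
# Hypothetical refactor: prepare one split at a time, so Eval can ask
# for just the test data instead of discarding the train arrays.

def prepare_data(hiddens: dict, split: str) -> tuple:
    """Return (x0, x1, labels) for a single named split."""
    return hiddens[split]

hiddens = {
    "train": ("train_x0", "train_x1", "train_labels"),
    "test": ("test_x0", "test_x1", "test_labels"),
}

# Eval only needs the test split:
test_x0, test_x1, test_labels = prepare_data(hiddens, "test")
```

The caller then reads naturally in both Train (which requests both splits) and Eval (which requests one), and no `_` placeholders are needed.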
Force-pushed from 0c8bc0f to f43d354
^ Sorry, my bad, I was branching off from you and forgot to switch my branch when pushing. Corrected it; ignore the previous push.
Create dataclasses for writing to CSV, refactor CSV logging, fix eval CSV columns
LGTM. We can support single splits in the next PR.