-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create dataclasses for writing to CSV, refactor CSV logging, fix eval CSV columns #155
Conversation
self.eval_result.acc, | ||
self.eval_result.cal_acc, | ||
self.eval_result.auroc, | ||
self.eval_result.ece, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
previously the columns were ["layer", "loss", "acc", "cal_acc", "auroc"] for whatever Eval logged. So i think there was a bug where acc was written to to "loss" instead, and "cal_acc" to "acc" instead, etc.
# and don't want to fail the test | ||
pass | ||
# We should still have results for layer 1, 3, even though layer 2 failed | ||
with open(tmp_path / "eval.csv", "r") as f: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test for kabooms
row = to_csv_line(row) | ||
writer.writerow(row) | ||
if debug: | ||
save_debug_log(dataset, out_dir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
factored this out so its much easier to test
The layer field is used to sort the logs by layer.""" | ||
Log = TypeVar("Log", EvalLog, ElicitLog) | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this typevar has to be bounded on EvalLog, ElicitLog
since we call .layer on it (to sort the CSV)
to_csv_line: A function that converts a Log to a list of strings. | ||
This has to be injected in because the Run class does not know | ||
the extra options e.g. skip_baseline to apply to function. | ||
csv_columns: The columns of the CSV file.""" | ||
self.out_dir = assert_type(Path, self.out_dir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i made func have a signature like
Callable[[int], Log]
previously it was something like
Callable[[int, int, list[int]], Log] # layer, device, list of devices
which could be pretty confusing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
previously func wasn't annotated so pyright wouldn't catch keyword mispelling / mistakes in partial
.
we could also make it a paramspec instead so you indiate that is a function that has keywords world_size, devices, but i didn't want to complicate things
def evaluate_reporter(self, layer: int, devices: list[str], world_size: int): | ||
def evaluate_reporter( | ||
self, layer: int, devices: list[str], world_size: int = 1 | ||
) -> EvalLog: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it now returns a dataclass instead of a list[str].
Which i think is nicer since
- Whoever wants to call Evaluate manually (e.g. in a notebook) gets back a dataclass rather than a list of unlabelled strings
- The responsbility of writing a CSV format is done somewhere else.
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the improvements! Looks good to me, I've added a few comments, but I'll merge it into the refactor-train-eval branch later.
self.out_dir = assert_type(Path, self.out_dir) | ||
# Should we write to different CSV files for elicit vs eval? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we write to different CSV files for elicit vs eval?
we are writing to different CSV files, since for eval the out_dir is usually by default under ..../transfer_eval... However, we could rename "eval.csv" to something like "transfer_eval.csv", maybe.
writer.writerow(row) | ||
if self.cfg.debug: | ||
save_debug_log(self.dataset, self.out_dir) | ||
iterator: Iterator[Log] = tqdm( # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the type: ignore necessary here?
Depends on #147
Closes #153