
Report all metrics at end of benchmark #1333

Closed
guarin opened this issue Jul 19, 2023 · 3 comments · Fixed by #1706

Comments

@guarin
Contributor

guarin commented Jul 19, 2023

We should print all metrics at the end of a benchmark again to make them easier to extract from the logs.

TODO:

  • Collect metrics over a whole benchmark (online, knn, linear, and finetune eval)
  • Print metrics at end of benchmark script
  • Optional: also print them as a markdown table (a rough sketch follows below)
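The markdown-table idea could look roughly like this (a minimal sketch, not the final implementation; the print_metrics_as_markdown helper and the metric names/values below are hypothetical placeholders):

from typing import Dict

def print_metrics_as_markdown(metrics: Dict[str, Dict[str, float]]) -> None:
    """Print a {eval_name: {metric_name: value}} mapping as a markdown table."""
    print("| eval | metric | value |")
    print("|------|--------|-------|")
    for eval_name, eval_metrics in metrics.items():
        for metric_name, value in eval_metrics.items():
            print(f"| {eval_name} | {metric_name} | {value:.4f} |")

# Example with made-up numbers:
print_metrics_as_markdown(
    {
        "knn": {"val_top1": 0.623, "val_top5": 0.851},
        "linear": {"val_top1": 0.701, "val_top5": 0.895},
    }
)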
@EricLiclair
Contributor

@guarin I'm planning to take this up. It seems like a good first issue and I'd get some understanding of the codebase. Can you assign it to me and also, maybe, point out a few entry points where I could start so as to speed up my understanding? Thanks.

@guarin
Contributor Author

guarin commented Oct 21, 2024

Hi @EricLiclair, the idea is that the main.py script in the benchmarks should aggregate all evaluation metrics and print them as a table at the end. The relevant function calls are here:

knn_eval.knn_eval(
    model=model,
    num_classes=num_classes,
    train_dir=train_dir,
    val_dir=val_dir,
    log_dir=method_dir,
    batch_size_per_device=batch_size_per_device,
    num_workers=num_workers,
    accelerator=accelerator,
    devices=devices,
)
if skip_linear_eval:
    print_rank_zero("Skipping linear eval.")
else:
    linear_eval.linear_eval(
        model=model,
        num_classes=num_classes,
        train_dir=train_dir,
        val_dir=val_dir,
        log_dir=method_dir,
        batch_size_per_device=batch_size_per_device,
        num_workers=num_workers,
        accelerator=accelerator,
        devices=devices,
        precision=precision,
    )
if skip_finetune_eval:
    print_rank_zero("Skipping fine-tune eval.")
else:
    finetune_eval.finetune_eval(
        model=model,
        num_classes=num_classes,
        train_dir=train_dir,
        val_dir=val_dir,
        log_dir=method_dir,
        batch_size_per_device=batch_size_per_device,
        num_workers=num_workers,
        accelerator=accelerator,
        devices=devices,
        precision=precision,
    )
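One way main.py could aggregate the results (a sketch only, assuming each eval function is changed to return a dict of metric names to values as described below; the metrics variable is new and the call arguments are elided, they stay the same as above):

metrics = {}
metrics["knn"] = knn_eval.knn_eval(...)  # same keyword arguments as above
if skip_linear_eval:
    print_rank_zero("Skipping linear eval.")
else:
    metrics["linear"] = linear_eval.linear_eval(...)  # same keyword arguments as above
if skip_finetune_eval:
    print_rank_zero("Skipping fine-tune eval.")
else:
    metrics["finetune"] = finetune_eval.finetune_eval(...)  # same keyword arguments as above

# At the very end of the benchmark, report everything in one place:
for eval_name, eval_metrics in metrics.items():
    for metric_name, value in eval_metrics.items():
        print_rank_zero(f"{eval_name} {metric_name}: {value}")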

So the knn_eval, linear_eval, and finetune_eval functions should each return their metrics instead of just printing them. See for example here:

for metric in ["val_top1", "val_top5"]:
    print_rank_zero(f"knn {metric}: {max(metric_callback.val_metrics[metric])}")

@EricLiclair
Contributor

Hi @guarin,
I've added a draft PR here - #1706 for imagenet/vitb16; requesting comments on the approach.

Based on your comments, I'll update it (if needed) and add similar changes for imagenet/resnet50.
(Or let me know if I should raise another PR for imagenet/resnet50.)
