
Report all metrics at end of benchmark #1333

Closed
guarin opened this issue Jul 19, 2023 · 3 comments · Fixed by #1706

Comments

@guarin
Contributor

guarin commented Jul 19, 2023

We should print all metrics at the end of a benchmark again to make them easier to extract from the logs.

TODO:

  • Collect metrics over a whole benchmark (online, knn, linear, and finetune eval)
  • Print metrics at end of benchmark script
  • Optional: also print them as a markdown table (a rough sketch follows below)
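The markdown-table idea could look roughly like this (a minimal sketch, not the final implementation; the print_metrics_as_markdown helper and the metric names/values below are hypothetical placeholders):

from typing import Dict

def print_metrics_as_markdown(metrics: Dict[str, Dict[str, float]]) -> None:
    """Print a {eval_name: {metric_name: value}} mapping as a markdown table."""
    print("| eval | metric | value |")
    print("|------|--------|-------|")
    for eval_name, eval_metrics in metrics.items():
        for metric_name, value in eval_metrics.items():
            print(f"| {eval_name} | {metric_name} | {value:.4f} |")

# Example with made-up numbers:
print_metrics_as_markdown(
    {
        "knn": {"val_top1": 0.623, "val_top5": 0.851},
        "linear": {"val_top1": 0.701, "val_top5": 0.895},
    }
)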
@EricLiclair
Contributor

@guarin I'm planning to take this up. It seems like a good first issue and I'd get some understanding of the codebase. Can you assign it to me and also, maybe, point out a few entry points where I could start so as to speed up my understanding? Thanks.

@guarin
Contributor Author

guarin commented Oct 21, 2024

Hi @EricLiclair, the idea is that the main.py script in the benchmarks should aggregate all evaluation metrics and print them as a table at the end. The relevant function calls are here:

knn_eval.knn_eval(
    model=model,
    num_classes=num_classes,
    train_dir=train_dir,
    val_dir=val_dir,
    log_dir=method_dir,
    batch_size_per_device=batch_size_per_device,
    num_workers=num_workers,
    accelerator=accelerator,
    devices=devices,
)
if skip_linear_eval:
    print_rank_zero("Skipping linear eval.")
else:
    linear_eval.linear_eval(
        model=model,
        num_classes=num_classes,
        train_dir=train_dir,
        val_dir=val_dir,
        log_dir=method_dir,
        batch_size_per_device=batch_size_per_device,
        num_workers=num_workers,
        accelerator=accelerator,
        devices=devices,
        precision=precision,
    )
if skip_finetune_eval:
    print_rank_zero("Skipping fine-tune eval.")
else:
    finetune_eval.finetune_eval(
        model=model,
        num_classes=num_classes,
        train_dir=train_dir,
        val_dir=val_dir,
        log_dir=method_dir,
        batch_size_per_device=batch_size_per_device,
        num_workers=num_workers,
        accelerator=accelerator,
        devices=devices,
        precision=precision,
    )
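One way main.py could aggregate the results (a sketch only, assuming each eval function is changed to return a dict of metric names to values as described below; the metrics variable is new and the call arguments are elided, they stay the same as above):

metrics = {}
metrics["knn"] = knn_eval.knn_eval(...)  # same keyword arguments as above
if skip_linear_eval:
    print_rank_zero("Skipping linear eval.")
else:
    metrics["linear"] = linear_eval.linear_eval(...)  # same keyword arguments as above
if skip_finetune_eval:
    print_rank_zero("Skipping fine-tune eval.")
else:
    metrics["finetune"] = finetune_eval.finetune_eval(...)  # same keyword arguments as above

# At the very end of the benchmark, report everything in one place:
for eval_name, eval_metrics in metrics.items():
    for metric_name, value in eval_metrics.items():
        print_rank_zero(f"{eval_name} {metric_name}: {value}")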

So the knn_eval, linear_eval, and finetune_eval functions should each return their metrics instead of just printing them. See for example here:

for metric in ["val_top1", "val_top5"]:
    print_rank_zero(f"knn {metric}: {max(metric_callback.val_metrics[metric])}")

@EricLiclair
Contributor

Hi @guarin,
I've added a draft PR here - #1706 for imagenet/vitb16; requesting comments on the approach.

Based on your comments, I'll update it (if needed) and add similar changes for imagenet/resnet50.
(Or let me know if I should raise another PR for imagenet/resnet50.)
