This repo contains the package to compute the evaluation metrics for the TopCoW2024 challenge on grand-challenge (GC).
At the root folder, there is a `pyproject.toml` config file that can set up the evaluation project folder as a local pip module called `topcow24_eval` for running the evaluations in your Python project. To set up and install the `topcow24_eval` package:
```bash
# from topcow24_eval root
bash ./setup.sh
# activate the env with topcow24_eval installed
source env_py310/bin/activate
```
First go to `topcow24_eval/configs.py` and configure the `track`, `task`, and `expected_num_cases`. The `expected_num_cases` is required and must match the number of cases to evaluate, i.e. the number of ground-truth cases etc. See below.
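For orientation, a minimal `configs.py` might look like the sketch below. Only the four setting names come from this README; the value types (e.g. whether track and task are strings or enums) are assumptions, so take the authoritative values from the file itself:

```python
# hypothetical sketch of topcow24_eval/configs.py -- value types are assumptions
track = "CT"            # which imaging track to evaluate
task = 1                # 1 = CoW segmentation, 2 = bounding box, 3 = graph classification
expected_num_cases = 5  # must equal the number of ground-truth cases to evaluate
need_crop = True        # Task 1 only: evaluate on the cropped ROI (see below)
```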
When not in a Docker environment, the paths of pred, gt, roi etc. are set by default to be on the same level as the package dir `topcow24_eval`:
```
# mkdir and put your gt, pred etc like this:
├── ground-truth
├── predictions
├── topcow24_eval
```
Simply put the files of ground-truth and predictions in the folders `ground-truth/` and `predictions/`, and run `python3 topcow24_eval/evaluation.py`.
You can also specify your own custom paths for the ground-truth, predictions etc. when you instantiate the evaluation object:
```python
# example from topcow24_eval/test_evaluation.py
evalRun = TopCoWEvaluation(
    track,
    task,
    expected_num_cases,
    need_crop,
    predictions_path=TESTDIR / "task_1_seg_predictions/",
    ground_truth_path=TESTDIR / "task_1_seg_ground-truth/",
    output_path=output_path,
    roi_path=TESTDIR / "task_1_seg_roi-metadata/",
)
```
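After constructing the object, the test presumably triggers the run via a method on `evalRun` (an evalutils-style `evalRun.evaluate()` is an assumption here); see `topcow24_eval/test_evaluation.py` for the exact invocation.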
The naming of gt and pred files can be arbitrary as long as their filelist dataframes are ordered the same way by `.sort_values()`!
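In other words, cases are paired by sort order, not by exact name matching. A minimal sketch of that pairing idea, assuming plain pandas filelist dataframes (the actual code in this repo may differ):

```python
# illustrative: gt and pred files are matched row by row after sorting
from pathlib import Path
import pandas as pd

gt_df = pd.DataFrame({"fname": [p.name for p in Path("ground-truth").iterdir()]})
pred_df = pd.DataFrame({"fname": [p.name for p in Path("predictions").iterdir()]})

# both filelists must sort into the same case order and have the same length
gt_df = gt_df.sort_values("fname").reset_index(drop=True)
pred_df = pred_df.sort_values("fname").reset_index(drop=True)
assert len(gt_df) == len(pred_df)

for gt_name, pred_name in zip(gt_df["fname"], pred_df["fname"]):
    print(f"evaluating pair: {gt_name} <-> {pred_name}")
```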
The accepted file formats for ground-truth and predictions are:
- NIfTI (`.nii.gz`, `.nii`) or SimpleITK-compatible images (`.mha`) for images and masks
- `.txt`, `.json` for bounding box (`roi-metadata/` only allows `.txt` for roi-txt; see below)
- `.yml`, `.json` for graph/edge-list
Optionally, if you evaluate for Task-1-CoW-Segmentation, you can decide whether to evaluate on the cropped region (ROI) of the ground-truth/prediction. Whether to crop the images for evaluations is set by `need_crop` in `configs.py`. If `need_crop` is `False`, then `roi_path` will be ignored and no cropping will be done. If `need_crop` is `True` and `roi_path` has roi-txt files, then the evaluations will be performed on the cropped gt and pred. It has no effect on Task 2 or Task 3.
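Conceptually, the crop is a region-of-interest extraction applied to both gt and pred before scoring. A sketch with SimpleITK, where the ROI start index and size are assumed to be what the roi-txt file encodes (see `topcow24_eval/utils/test_crop_sitk.py` and `test_crop_gt_and_pred.py` for the real behavior):

```python
# illustrative ROI crop with SimpleITK; how index/size are parsed from a
# roi-txt file is an assumption -- the repo's utils are the source of truth
import SimpleITK as sitk

def crop_to_roi(image: sitk.Image, index, size) -> sitk.Image:
    # extract the sub-volume starting at voxel `index` with voxel extent `size`
    return sitk.RegionOfInterest(image, size, index)

gt = sitk.ReadImage("ground-truth/case_001.nii.gz")  # hypothetical filename
gt_roi = crop_to_roi(gt, index=(10, 20, 30), size=(64, 64, 64))
```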
Afterwards, make sure to put the roi-txt files in the folder `roi-metadata/` (or you can supply your own `roi_path` when calling `TopCoWEvaluation()`):
```
# note the new roi-metadata/
├── ground-truth
├── predictions
├── roi-metadata
├── topcow24_eval
```
The naming of roi-txt files can be arbitrary as long as their filelist dataframe is ordered by `.sort_values()` the same way as gt or pred!
In `topcow24_eval/metrics/seg_metrics/`, you will find our implementations for evaluating the submitted segmentation predictions. There are seven evaluation metrics with equal weights for the multi-class (CoW anatomical vessels) segmentation task; an illustrative sketch of the first one follows the list:
- Class-average Dice similarity coefficient
- Centerline Dice (clDice) on merged binary mask
- Class-average 0-th Betti number error
- Class-average Hausdorff Distance 95% Percentile (HD95)
- Average F1 score (harmonic mean of the precision and recall) for detection of the "Group 2 CoW components"
- Variant-balanced graph classification accuracy
- Variant-balanced topology match rate
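As an illustration of the first metric, class-average Dice over a multi-class mask can be sketched as below. This is the textbook definition; the repo's exact conventions (e.g. for labels missing from both masks) are pinned down by `test_cls_avg_dice.py`:

```python
# illustrative class-average Dice for integer label masks (0 = background)
import numpy as np

def class_average_dice(gt: np.ndarray, pred: np.ndarray) -> float:
    labels = np.union1d(np.unique(gt), np.unique(pred))
    labels = labels[labels != 0]  # skip the background label
    dices = []
    for lbl in labels:
        g, p = (gt == lbl), (pred == lbl)
        # lbl is present in at least one mask, so the denominator is positive;
        # a label missing from one of the two masks simply scores 0
        dices.append(2.0 * np.logical_and(g, p).sum() / (g.sum() + p.sum()))
    return float(np.mean(dices)) if dices else 1.0
```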
In `topcow24_eval/metrics/box_metrics/`, you will find our implementations for evaluating bounding box predictions (a plain-IoU sketch follows):
- Boundary Intersection over Union (IoU) and IoU
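For reference, plain IoU between two axis-aligned 3D boxes is the volume of their intersection over the volume of their union. Boundary IoU additionally restricts the computation to a thin shell around each box surface, which the sketch below omits; `test_boundary_iou_from_tuple.py` documents the actual metric:

```python
# illustrative plain IoU for axis-aligned 3D boxes given as (min_corner, max_corner)
import numpy as np

def box_iou(box_a, box_b) -> float:
    (a_min, a_max), (b_min, b_max) = box_a, box_b
    a_min, a_max = np.asarray(a_min, float), np.asarray(a_max, float)
    b_min, b_max = np.asarray(b_min, float), np.asarray(b_max, float)
    # overlap extent per axis, clipped to 0 when the boxes are disjoint
    overlap = np.clip(np.minimum(a_max, b_max) - np.maximum(a_min, b_min), 0, None)
    inter = overlap.prod()
    union = (a_max - a_min).prod() + (b_max - b_min).prod() - inter
    return float(inter / union) if union > 0 else 0.0

# two unit cubes shifted by half a side along x -> IoU = 0.5 / 1.5 = 1/3
print(box_iou(((0, 0, 0), (1, 1, 1)), ((0.5, 0, 0), (1.5, 1, 1))))
```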
In `topcow24_eval/metrics/edg_metrics/`, you will find our implementations for evaluating the graph classification task (a balanced-accuracy sketch follows):
- Variant-balanced accuracy
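Balanced accuracy here presumably means averaging the per-variant accuracies, so that rare CoW variants count as much as common ones; a minimal sketch of that idea (the repo's aggregation code in `topcow24_eval/aggregate/` is the source of truth):

```python
# illustrative balanced accuracy: mean of per-(ground-truth-)variant accuracies
from collections import defaultdict

def variant_balanced_accuracy(gt_variants, pred_variants) -> float:
    correct, total = defaultdict(int), defaultdict(int)
    for gt, pred in zip(gt_variants, pred_variants):
        total[gt] += 1
        correct[gt] += int(gt == pred)
    per_variant = [correct[v] / total[v] for v in total]
    return sum(per_variant) / len(per_variant)

# the rare variant "B" weighs as much as the common variant "A" -> 0.5
print(variant_balanced_accuracy(["A", "A", "A", "B"], ["A", "A", "A", "A"]))
```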
The documentation for our code comes in the form of unit tests. Please check our test cases to see the expected inputs and outputs, expected behaviors, and calculations. The files with names that follow the form `test_*.py` contain the test cases for the evaluation metrics:
- Dice
- clDice
- Betti-0 number error
- HD and HD95
- graph classification
- detections
- topology matching
- boundary IoU
Test asset files used in the test cases are stored in the folder `test_assets/`. Simply invoke the tests by running `pytest .`:
```
# simply run pytest
$ pytest .
topcow24_eval/aggregate/test_aggregate_all_detection_dicts.py ... [ 2%]
topcow24_eval/aggregate/test_aggregate_all_graph_dicts.py ... [ 4%]
topcow24_eval/aggregate/test_aggregate_all_topo_dicts.py . [ 4%]
topcow24_eval/aggregate/test_edge_list_to_variant_str.py . [ 5%]
topcow24_eval/metrics/box_metrics/test_boundary_iou_from_tuple.py ........... [ 13%]
topcow24_eval/metrics/box_metrics/test_boundary_points_with_distances.py ..... [ 16%]
topcow24_eval/metrics/box_metrics/test_iou_dict_from_files.py .... [ 19%]
topcow24_eval/metrics/edg_metrics/test_edge_dict_to_list.py .. [ 20%]
topcow24_eval/metrics/edg_metrics/test_graph_dict_from_files.py .. [ 22%]
topcow24_eval/metrics/seg_metrics/graph_classification/test_edge_criteria.py .. [ 23%]
topcow24_eval/metrics/seg_metrics/graph_classification/test_generate_edgelist.py ... [ 25%]
topcow24_eval/metrics/seg_metrics/graph_classification/test_graph_classification.py . [ 26%]
topcow24_eval/metrics/seg_metrics/test_clDice.py .... [ 29%]
topcow24_eval/metrics/seg_metrics/test_cls_avg_b0.py ............... [ 39%]
topcow24_eval/metrics/seg_metrics/test_cls_avg_dice.py ............. [ 48%]
topcow24_eval/metrics/seg_metrics/test_cls_avg_hd95.py ............. [ 57%]
topcow24_eval/metrics/seg_metrics/test_detection_grp2_labels.py ........ [ 63%]
topcow24_eval/metrics/seg_metrics/test_generate_cls_avg_dict.py .......... [ 70%]
topcow24_eval/metrics/seg_metrics/topology_matching/test_check_LR_flip.py .. [ 71%]
topcow24_eval/metrics/seg_metrics/topology_matching/test_topology_matching.py ........ [ 77%]
topcow24_eval/test_evaluation_task_1_seg.py .. [ 78%]
topcow24_eval/test_evaluation_task_2_box.py . [ 79%]
topcow24_eval/test_evaluation_task_3_edg.py . [ 79%]
topcow24_eval/test_score_case.py .... [ 82%]
topcow24_eval/utils/test_crop_gt_and_pred.py . [ 83%]
topcow24_eval/utils/test_crop_sitk.py .... [ 86%]
topcow24_eval/utils/test_utils_box.py .......... [ 93%]
topcow24_eval/utils/test_utils_edge.py .. [ 94%]
topcow24_eval/utils/test_utils_mask.py ...... [ 98%]
topcow24_eval/utils/test_utils_neighborhood.py .. [100%]
======================================================== 144 passed, 15 warnings in 10.54s =========================================================
```