"None" ensembling for classfication accuracy #290

derpyplops · 2023-08-31T15:19:18Z

Closes NOT-372

Currently, accuracy under "none" ensembling is identical to accuracy under "partial" ensembling.

This PR implements a reasonable interpretation of what "no ensembling" means for classification accuracy: for both accuracy and calibrated accuracy, inference uses only the positive hiddens. I also added logging for cal_thresh.
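To make the distinction concrete, here is a minimal sketch, not the code in elk/metrics/eval.py. It assumes binary labels and assumes that "using the positive hiddens" means thresholding the positive-class logit at zero, whereas "partial" compares the two logits of each contrast pair. The function names and the toy data are hypothetical.

```python
from typing import List, Tuple

Pair = Tuple[float, float]  # (negative-class logit, positive-class logit)


def accuracy_partial(logits: List[Pair], labels: List[int]) -> float:
    """Assumed 'partial' behaviour: predict 1 when the positive logit beats the negative one."""
    preds = [int(pos > neg) for neg, pos in logits]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def accuracy_none(logits: List[Pair], labels: List[int]) -> float:
    """Assumed 'none' behaviour: ignore the negative logit and threshold the positive one at 0."""
    preds = [int(pos > 0.0) for _, pos in logits]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Toy data, made up for illustration only.
toy_logits = [(-1.0, 2.0), (1.5, 0.5), (-2.0, -1.0), (0.5, -0.5)]
toy_labels = [1, 1, 0, 0]

partial_acc = accuracy_partial(toy_logits, toy_labels)
none_acc = accuracy_none(toy_logits, toy_labels)
```

On this toy data the two modes disagree on the second and third examples, so they no longer report the same accuracy.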

CLAassistant · 2023-08-31T20:05:29Z

All committers have signed the CLA.

black'd

elk/metrics/eval.py

AlexTMallen

I am a bit confused about why we evaluate only the positive examples in the "none" ensembling case; it seems slightly arbitrary, but fine. I will also note that this function has always handled, and still handles, LM logits incorrectly. LM logits are log probs, so applying a sigmoid to them does not make sense and gives us slightly different ensembled results. Reporter logits, by contrast, are log odds, and applying a sigmoid to log odds gives probabilities.
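A small numeric illustration of the log-prob versus log-odds point, as a standalone sketch rather than anything taken from eval.py (the helper names are hypothetical). A sigmoid inverts the log-odds transform, but applied to a log probability it returns p / (1 + p), which is never above 0.5.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def log_odds(p: float) -> float:
    return math.log(p / (1.0 - p))


p = 0.8

# Sigmoid is the inverse of the log-odds transform, so p is recovered.
from_log_odds = sigmoid(log_odds(p))

# Sigmoid of a log prob gives p / (1 + p) instead of p.
from_log_prob = sigmoid(math.log(p))
```

Here from_log_odds recovers 0.8, while from_log_prob is 0.8 / 1.8, roughly 0.444.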

elk/metrics/eval.py

derpyplops force-pushed the not-372-none-ensembling-for-accuracy branch from a38e2a3 to a63995c August 31, 2023 20:25

derpyplops marked this pull request as ready for review August 31, 2023 20:27

add none in acc

728cd42

black'd

derpyplops force-pushed the not-372-none-ensembling-for-accuracy branch from a63995c to 728cd42 August 31, 2023 20:28

derpyplops requested review from norabelrose and AlexTMallen September 1, 2023 07:47

lauritowal approved these changes Sep 3, 2023

elk/metrics/eval.py (resolved)

lauritowal self-requested a review September 3, 2023 19:05

AlexTMallen approved these changes Sep 5, 2023

elk/metrics/eval.py (resolved)

derpyplops merged commit 14669b1 into EleutherAI:main Sep 7, 2023
