Use train data for computing normalization #30

FabienRoger · 2023-02-03T13:18:01Z

No description provided.
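For readers without the diff at hand, here is a minimal sketch of what the title describes: the mean and standard deviation are computed on the training split only and then reused to normalize every other split. The function names, the `eps` parameter, and the random data are assumptions made for illustration; they are not taken from `elk/train.py` or `elk/utils_evaluation/utils_evaluation.py`.

```python
import numpy as np


def fit_normalization(train_hiddens: np.ndarray, eps: float = 1e-8):
    """Compute per-feature mean and std from the training split only.

    Hypothetical helper, not the repository's actual function.
    """
    mean = train_hiddens.mean(axis=0, keepdims=True)
    std = train_hiddens.std(axis=0, keepdims=True)
    return mean, std + eps  # eps guards against zero-variance features


def apply_normalization(hiddens: np.ndarray, mean: np.ndarray, std: np.ndarray) -> np.ndarray:
    """Normalize any split with statistics that were fitted on the training split."""
    return (hiddens - mean) / std


# Synthetic stand-ins for hidden states (assumed shape: examples x features).
rng = np.random.default_rng(0)
train = rng.normal(loc=2.0, scale=3.0, size=(200, 4))
test = rng.normal(loc=2.0, scale=3.0, size=(50, 4))

mean, std = fit_normalization(train)
train_normed = apply_normalization(train, mean, std)
test_normed = apply_normalization(test, mean, std)  # no test statistics are used
```

With this arrangement the training split ends up with zero mean and unit variance per feature, while the test split is only approximately standardized, which is the expected cost of not looking at test data.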

lauritowal

@FabienRoger @norabelrose
It seems that DeBERTa performs worse after these changes.

For boolq it was 70% before; now:

model,prefix,method,prompt_level,train,test,accuracy,std,language_model_type,layer,loss
deberta-v2-xxlarge-mnli,normal,ccs,all,boolq,boolq,0.551499992609024,0.051188381401737586,encoder,-1.0,0.37969053238630296
deberta-v2-xxlarge-mnli,normal,lr,all,boolq,boolq,0.6075,0.06677387213573883,encoder,-1.0,0.0

For imdb it was 89% before; now:

model,prefix,method,prompt_level,train,test,accuracy,std,language_model_type,layer,loss
deberta-v2-xxlarge-mnli,normal,ccs,all,imdb,imdb,0.8411666552225748,0.15101562370137744,encoder,-1.0,0.8821464618047078
deberta-v2-xxlarge-mnli,normal,lr,all,imdb,imdb,0.9106666666666665,0.07123591478710409,encoder,-1.0,0.0

We should check what is happening there (?)
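To make the comparison easier to repeat, the rows above can be parsed and set against the earlier figures. This is only a sketch: the "before" values are the 70% and 89% quoted in this comment, and since the comment does not say which method they refer to, the change is computed for both `ccs` and `lr`. The helper name `accuracy_changes` is hypothetical.

```python
import csv
import io

HEADER = "model,prefix,method,prompt_level,train,test,accuracy,std,language_model_type,layer,loss"
ROWS = [
    "deberta-v2-xxlarge-mnli,normal,ccs,all,boolq,boolq,0.551499992609024,0.051188381401737586,encoder,-1.0,0.37969053238630296",
    "deberta-v2-xxlarge-mnli,normal,lr,all,boolq,boolq,0.6075,0.06677387213573883,encoder,-1.0,0.0",
    "deberta-v2-xxlarge-mnli,normal,ccs,all,imdb,imdb,0.8411666552225748,0.15101562370137744,encoder,-1.0,0.8821464618047078",
    "deberta-v2-xxlarge-mnli,normal,lr,all,imdb,imdb,0.9106666666666665,0.07123591478710409,encoder,-1.0,0.0",
]

# Earlier accuracies as quoted in the comment (method not specified there).
BEFORE = {"boolq": 0.70, "imdb": 0.89}


def accuracy_changes(header, rows, before):
    """Return (dataset, method, accuracy, change vs. the quoted earlier figure)."""
    reader = csv.DictReader(io.StringIO("\n".join([header, *rows])))
    changes = []
    for row in reader:
        accuracy = float(row["accuracy"])
        changes.append((row["test"], row["method"], accuracy, accuracy - before[row["test"]]))
    return changes


changes = accuracy_changes(HEADER, ROWS, BEFORE)
for dataset, method, accuracy, delta in changes:
    print(f"{dataset:5s} {method:3s} accuracy={accuracy:.3f} change={delta:+.3f}")
```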

elk/train.py

elk/utils_evaluation/utils_evaluation.py

FabienRoger · 2023-02-06T22:44:34Z

> For boolq it was 70% before; now:

I'm not seeing that. I got 59% for both runs.

> For imdb it was 89% before; now:

I'm not seeing that either. I got 88% for both runs.

lauritowal · 2023-02-06T22:48:38Z

@FabienRoger Thanks for running that too! Are you sure you are not loading an older model? Which model type are you using? I'll try it again later.

FabienRoger · 2023-02-06T22:49:35Z

deberta-v2-xxlarge-mnli,normal,ccs,all,boolq,boolq,0.5864999830722809,0.0585042759828347,encoder,-1.0,0.9317224085330963

FabienRoger · 2023-02-07T18:17:26Z

I fixed a dumb bug, and now the results are:
On boolq, accuracy drops from 65% to 62%.
On imdb, it stays at 88%.

norabelrose

LGTM

Use train data for computing normalization

43bde99

FabienRoger marked this pull request as ready for review February 3, 2023 13:20

FabienRoger requested a review from norabelrose February 3, 2023 13:20

norabelrose requested a review from lauritowal February 3, 2023 19:14

lauritowal requested changes Feb 3, 2023

elk/train.py (outdated, resolved)

elk/utils_evaluation/utils_evaluation.py (outdated, resolved)

lauritowal requested changes Feb 3, 2023

elk/utils_evaluation/utils_evaluation.py (resolved)

FabienRoger added 3 commits February 6, 2023 20:18

Merge branch 'main' of github.com:EleutherAI/elk into normalization-using-train

f986876

Add the include_test_set flag

484fb03

relabel + comment options

d75ce8e

FabienRoger requested a review from lauritowal February 6, 2023 20:33

Fix mistake

79fc98e

Merge branch 'main' of github.com:EleutherAI/elk into normalization-using-train

f930044

norabelrose approved these changes Feb 9, 2023

norabelrose merged commit 0af2021 into main Feb 9, 2023

norabelrose mentioned this pull request Feb 9, 2023

Normalization should use the training data to compute means and stds #5

Closed
