
truthfulqa_mc2 is Nan, while truthfulqa_mc1 is 1.00 #714

Open
chi2liu opened this issue Jul 31, 2023 · 5 comments

Comments


chi2liu commented Jul 31, 2023

I fine-tuned a model based on llama-2-hf and ran the evaluation with the command below, and truthfulqa_mc2 is NaN while truthfulqa_mc1 is 1.00.

What does that mean?

python main.py --model hf-causal-experimental --model_args pretrained=../mamba-gpt-7b-v2 --tasks anli_r1,anli_r2,anli_r3,arc_challenge,arc_easy,boolq,hellaswag,openbookqa,piqa,record,rte,truthfulqa_mc,wic,winogrande --device cuda:0

hf-causal-experimental (pretrained=../mamba-gpt-7b-v2), limit: None, provide_description: False, num_fewshot: 0, batch_size: None

Task           Version  Metric    Value    Stderr
anli_r1        0        acc       0.3340   ± 0.0149
anli_r2        0        acc       0.3340   ± 0.0149
anli_r3        0        acc       0.3350   ± 0.0136
arc_challenge  0        acc       0.2270   ± 0.0122
                        acc_norm  0.2270   ± 0.0122
arc_easy       0        acc       0.2508   ± 0.0089
                        acc_norm  0.2508   ± 0.0089
boolq          1        acc       0.3783   ± 0.0085
hellaswag      0        acc       0.2504   ± 0.0043
                        acc_norm  0.2504   ± 0.0043
openbookqa     0        acc       0.2760   ± 0.0200
                        acc_norm  0.2760   ± 0.0200
piqa           0        acc       0.4951   ± 0.0117
                        acc_norm  0.4951   ± 0.0117
record         0        f1        0.1186   ± 0.0032
                        em        0.1151   ± 0.0032
rte            0        acc       0.5271   ± 0.0301
truthfulqa_mc  1        mc1       1.0000   ± 0.0000
                        mc2       NaN      ± NaN
wic            0        acc       0.5000   ± 0.0141
winogrande     0        acc       0.4957   ± 0.0141
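For context on how an mc2 of NaN can arise: the TruthfulQA mc2 metric is, roughly, the normalized probability mass the model assigns to the true answer options. The sketch below is a simplified stand-in for the harness's metric (names like `mc2_score` are my own, not the harness's API); it shows that if every answer's log-likelihood is -inf or NaN (as happens when a broken checkpoint emits NaN logits), the normalization becomes 0/0 and the score is NaN, while an argmax-based metric like mc1 can still return a degenerate value.

```python
import numpy as np

def mc2_score(ll_true, ll_false):
    """Simplified MC2-style score: total normalized probability mass
    assigned to the true answer options, given per-option log-likelihoods."""
    p_true = np.exp(np.array(ll_true, dtype=np.float64))
    p_false = np.exp(np.array(ll_false, dtype=np.float64))
    # If all log-likelihoods are -inf or NaN (e.g. NaN logits from a
    # corrupted model), the denominator is 0 or NaN and the score is NaN.
    return float(np.sum(p_true / (np.sum(p_true) + np.sum(p_false))))

# Healthy model: finite log-likelihoods give a finite score in (0, 1).
print(mc2_score([-1.0, -2.0], [-1.5, -3.0]))

# Broken model: all-(-inf) log-likelihoods give 0/0 = NaN.
print(mc2_score([float("-inf")] * 2, [float("-inf")] * 2))  # → nan
```

Combined with the near-random accuracy on every other task above, this pattern suggests the checkpoint's outputs are degenerate rather than a bug in the metric itself.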
@505707566

I have the same issue! But in my code I did some operations that change or move the LoRA weights.
Have you solved it?
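Since moving or merging LoRA weights can silently corrupt a checkpoint, a quick sanity check is to scan the weights for non-finite values before evaluating. This is a minimal sketch using a toy dict of arrays as a stand-in for a real model's state dict (`find_nonfinite` and the weight names are hypothetical, not part of any library):

```python
import numpy as np

def find_nonfinite(params: dict) -> list:
    """Return the names of weight arrays containing NaN/Inf values --
    a quick sanity check after merging or moving adapter weights."""
    return [name for name, w in params.items() if not np.isfinite(w).all()]

# Toy state dict: one healthy weight, one corrupted the way a bad merge might.
state = {"lora_A": np.ones((2, 2)), "lora_B": np.full((2, 2), np.nan)}
print(find_nonfinite(state))  # → ['lora_B']
```

With a real model you would run the same check over its state dict; any hit explains NaN metrics downstream.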


@lintangsutawika
Contributor

This issue should be solved in the main branch.

@hahmad2008

@lintangsutawika I used the main branch and the issue is still there.
Opened issue #1340

@choco9966

choco9966 commented Apr 23, 2024

@lintangsutawika How can this be fixed? Can you share the PR? Thanks.

@haileyschoelkopf
Contributor

@choco9966 can you share a public model + sample command that reproduces this issue?
