
Fix triviaqa task #525

Merged
merged 4 commits into EleutherAI:master from seopbo:fix-triviaqa on Jun 14, 2023
Conversation

seopbo
Contributor

@seopbo seopbo commented May 26, 2023

  • Fix triviaqa task
  • Test the task code using gpt-neox-20b
  • The command below is my test command
python main.py \
--model hf-causal-experimental \
--model_args use_accelerate=True,pretrained=/mount/lm_storage/checkpoints/gpt-neox-20b \
--tasks triviaqa \
--num_fewshot 5 \
--batch_size 16 \
--limit 100
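For context on what this test exercises, here is a minimal sketch of the alias-based exact-match scoring that TriviaQA-style tasks typically use (SQuAD-style normalization of the greedy completion, then comparison against the gold answer aliases). The function names and the usage example are illustrative assumptions, not code from this PR.

import re
import string

def normalize(text):
    # SQuAD-style normalization: lowercase, drop punctuation and
    # articles, and collapse whitespace.
    text = text.lower()
    text = "".join(ch for ch in text if ch not in set(string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, aliases):
    # 1.0 if the normalized prediction equals any normalized gold alias.
    pred = normalize(prediction)
    return float(any(pred == normalize(alias) for alias in aliases))

# Illustrative usage:
print(exact_match("The Eiffel Tower.", ["Eiffel Tower", "Tour Eiffel"]))  # -> 1.0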


to: @StellaAthena
resolved: #456

@CLAassistant

CLAassistant commented May 26, 2023

CLA assistant check
All committers have signed the CLA.

@StellaAthena
Member

@seopbo Thanks for the contribution! Can you explain how this fixes TriviaQA?

@seopbo
Contributor Author

seopbo commented Jun 1, 2023

@seopbo Thanks for the contribution! Can you explain how this fixes TriviaQA?

I implemented this following our previous discussion (#456 (comment)).
With this code, gpt-neox-20b scores:

  • 0 shot: 0.2705
  • 5 shot: 0.3818

In the gpt-neox-20b paper, the scores are:

  • 0 shot: 0.259
  • 5 shot: 0.347

to: @StellaAthena

@haileyschoelkopf
Contributor

Thanks very much for this contribution, @seopbo! It is very appreciated :)

Will merge this shortly. With this change, LLaMA-7B achieves 40.5% instead of the 50% reported in the paper, and I'd like to confirm first that we can't get any closer to their setup or other published setups. (The paper seems to imply the prompt is "Q: {question}\nA:".)
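For reference, a minimal sketch of that prompt format written as harness-style doc_to_text / doc_to_target hooks; the hook names and document fields here are assumptions for illustration, not the exact code in this PR.

def doc_to_text(doc):
    # Few-shot examples use the same template, so the prompt ends with
    # "A:" and the model is asked to complete the answer.
    return "Q: " + doc["question"] + "\nA:"

def doc_to_target(doc):
    # Gold answer; the leading space lets it concatenate cleanly after "A:".
    return " " + doc["answer"]["value"]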

@haileyschoelkopf
Contributor

Thanks again! Changed this so it should use the filtered dev set, following LLaMA.
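For anyone reproducing this, a hedged sketch of pulling the TriviaQA dev split from the Hugging Face Hub; treating the "rc.nocontext" config as the filtered set LLaMA evaluated on is an assumption here, not something stated in this thread.

from datasets import load_dataset

# "rc" is the filtered reading-comprehension subset of TriviaQA;
# "rc.nocontext" drops the evidence documents, which closed-book
# evaluation does not need (assumption: this is the config the task uses).
dev = load_dataset("trivia_qa", "rc.nocontext", split="validation")
print(len(dev), dev[0]["question"])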

@haileyschoelkopf haileyschoelkopf merged commit b018a7d into EleutherAI:master Jun 14, 2023
2 checks passed
@wwngh1233

Can you update the new performance of LLaMA-7B on TriviaQA?

@seopbo seopbo deleted the fix-triviaqa branch June 29, 2023 01:15
qmdnls pushed a commit to qmdnls/lm-evaluation-harness that referenced this pull request Aug 17, 2023
LZY-the-boys pushed a commit to LZY-the-boys/lm-evaluation-harness-fast that referenced this pull request Sep 12, 2023