Hi,

I am trying to understand the best way to provide a separate LLM benchmark that evaluates specifically against Turkish datasets.

This can be done by automatically translating the datasets referenced by the tasks in `lm_eval/tasks`. This is the approach taken by making changes to files like `lm_eval/tasks/arc/arc_easy.yaml`, as in https://github.com/malhajar17/lm-evaluation-harness_turkish/blob/main/lm_eval/tasks/arc_tr/arc_easy.yaml

In this file, a reference to the new dataset is given. However, the `doc_to_text` and `doc_to_target` fields still include English keywords like `Question:`.

What is the best practice for supporting other languages? Can we add more datasets to `arc_easy.yaml` and make it use a different processing method for each dataset?
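For reference, a fully localized task config might look like the sketch below. This is only an illustration of the idea (translating the prompt keywords along with the dataset reference); the dataset path and the Turkish prompt strings are hypothetical placeholders, not the contents of the linked repository.

```yaml
# Hypothetical Turkish ARC-Easy task config for lm-evaluation-harness.
# dataset_path and the translated prompt strings are illustrative only.
task: arc_tr_easy
dataset_path: my-org/arc-easy-tr   # hypothetical translated dataset
output_type: multiple_choice
training_split: train
validation_split: validation
test_split: test
# Prompt keywords translated into Turkish instead of the English defaults:
doc_to_text: "Soru: {{question}}\nCevap:"
doc_to_target: "{{choices.label.index(answerKey)}}"
doc_to_choice: "{{choices.text}}"
```

With a separate per-language YAML like this, each dataset keeps its own `doc_to_text`/`doc_to_target` templates rather than sharing one processing method inside a single `arc_easy.yaml`.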