Merge branch 'main' into chat_template
Showing 84 changed files with 1,063 additions and 24 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,47 @@ | ||
# COPAL | ||
|
||
### Paper | ||
|
||
Title: `COPAL-ID: Indonesian Language Reasoning with Local Culture and Nuances` | ||
|
||
Abstract: `https://arxiv.org/abs/2311.01012` | ||
|
||
`COPAL-ID is an Indonesian causal commonsense reasoning dataset that captures local nuances. It provides a more natural portrayal of day-to-day causal reasoning within the Indonesian (especially Jakartan) cultural sphere. Professionally written and validated from scratch by natives, COPAL-ID is more fluent and free from awkward phrases, unlike the translated XCOPA-ID.` | ||
|
||
Homepage: `https://github.com/haryoa/copal-id` | ||
|
||
|
||
### Citation | ||
|
||
``` | ||
@article{wibowo2023copal, | ||
title={COPAL-ID: Indonesian Language Reasoning with Local Culture and Nuances}, | ||
author={Wibowo, Haryo Akbarianto and Fuadi, Erland Hilman and Nityasya, Made Nindyatama and Prasojo, Radityo Eko and Aji, Alham Fikri}, | ||
journal={arXiv preprint arXiv:2311.01012}, | ||
year={2023} | ||
} | ||
``` | ||
|
||
### Groups and Tasks | ||
|
||
#### Groups | ||
|
||
* `copal_id` | ||
|
||
#### Tasks | ||
|
||
* `copal_id_standard`: `Standard version of the COPAL dataset; uses formal language and fewer local nuances` | ||
* `copal_id_colloquial`: `Colloquial version of the COPAL dataset; uses informal language and more local nuances` | ||
|
||
### Checklist | ||
|
||
For adding novel benchmarks/datasets to the library: | ||
* [x] Is the task an existing benchmark in the literature? | ||
* [x] Have you referenced the original paper that introduced the task? | ||
* [x] If yes, does the original paper provide a reference implementation? If so, have you checked against the reference implementation and documented how to run such a test? | ||
|
||
|
||
If other tasks on this dataset are already supported: | ||
* [ ] Is the "Main" variant of this task clearly denoted? | ||
* [ ] Have you provided a short sentence in a README on what each new variant adds / evaluates? | ||
* [ ] Have you noted which, if any, published evaluation setups are matched by this variant? |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
include: standard.yaml | ||
task: copal_id_colloquial | ||
task_alias: colloquial | ||
test_split: test_colloquial |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
group: copal_id | ||
task: copal_id_standard | ||
task_alias: standard | ||
dataset_path: haryoaw/COPAL | ||
dataset_name: id | ||
output_type: multiple_choice | ||
test_split: test | ||
doc_to_text: !function utils.doc_to_text_id | ||
doc_to_target: label | ||
doc_to_choice: !function utils.doc_to_choice | ||
metric_list: | ||
  - metric: acc | ||
metadata: | ||
  version: 1.0 |
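The colloquial config sets only three keys and pulls everything else from `standard.yaml` through `include`. A minimal sketch of that relationship, assuming `include` behaves like a shallow merge in which the including file's keys win (the loader itself is not part of this diff, and `resolve_include` is a hypothetical helper):

```python
# Sketch only: models `include` as a shallow dict merge. This is an assumption
# about the config loader, which is not shown in this diff.
standard = {
    "group": "copal_id",
    "task": "copal_id_standard",
    "task_alias": "standard",
    "dataset_path": "haryoaw/COPAL",
    "dataset_name": "id",
    "output_type": "multiple_choice",
    "test_split": "test",
}

# The three keys set directly in the colloquial config.
colloquial_overrides = {
    "task": "copal_id_colloquial",
    "task_alias": "colloquial",
    "test_split": "test_colloquial",
}


def resolve_include(base, overrides):
    """Hypothetical helper: keys from the including file override the included one."""
    merged = dict(base)
    merged.update(overrides)
    return merged


colloquial = resolve_include(standard, colloquial_overrides)
print(colloquial["task"], colloquial["test_split"], colloquial["dataset_path"])
```

Under that assumption, the colloquial task shares the dataset, prompt functions, and metric with the standard task and differs only in its name, alias, and test split.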
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
from functools import partial | ||
|
||
|
||
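def convert_choice(choice): | ||
    # Lowercase the first character so the choice reads as a continuation of the prompt. | ||
    return choice[0].lower() + choice[1:] | ||
|
||
|
||
def doc_to_text(doc, connector): | ||
    # Pick the connector word for this question type ("cause" or "effect"). | ||
    conn = connector[doc["question"]] | ||
    # Drop the premise's last character (assumed to be its final punctuation) and append the connector. | ||
    return doc["premise"].strip()[:-1] + f" {conn}" | ||
|
||
|
||
def doc_to_choice(doc): | ||
    return [convert_choice(doc["choice1"]), convert_choice(doc["choice2"])] | ||
|
||
|
||
# Indonesian connectors: "karena" (because) for causes, "maka" (so) for effects. | ||
doc_to_text_id = partial( | ||
    doc_to_text, | ||
    connector={ | ||
        "cause": "karena", | ||
        "effect": "maka", | ||
    }, | ||
) |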
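To make the prompt construction concrete, here is a small sketch that runs the helpers above on a made-up record. The sample sentences are invented for illustration and are not taken from the COPAL-ID dataset; the field names are the ones the helpers and `standard.yaml` already reference.

```python
from functools import partial


def convert_choice(choice):
    return choice[0].lower() + choice[1:]


def doc_to_text(doc, connector):
    conn = connector[doc["question"]]
    return doc["premise"].strip()[:-1] + f" {conn}"


def doc_to_choice(doc):
    return [convert_choice(doc["choice1"]), convert_choice(doc["choice2"])]


doc_to_text_id = partial(
    doc_to_text,
    connector={"cause": "karena", "effect": "maka"},
)

# Invented example record (not from the dataset).
sample_doc = {
    "premise": "Jalanan macet total.",
    "question": "cause",
    "choice1": "Ada kecelakaan di depan.",
    "choice2": "Semua orang libur.",
    "label": 0,
}

prompt = doc_to_text_id(sample_doc)
choices = doc_to_choice(sample_doc)
print(prompt)   # Jalanan macet total karena
print(choices)  # ['ada kecelakaan di depan.', 'semua orang libur.']
```

The premise loses its final period and gains the connector, so each lowercased choice can be scored as a natural continuation of the same sentence.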
11 changes: 11 additions & 0 deletions
lm_eval/tasks/mmlu/continuation/_continuation_template_yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
dataset_path: hails/mmlu_no_train # a copy of `cais/mmlu` with no auxiliary_train split | ||
output_type: multiple_choice | ||
test_split: test | ||
fewshot_split: dev | ||
fewshot_config: | ||
  sampler: first_n | ||
doc_to_text: "Question: {{question.strip()}}\nAnswer:" | ||
doc_to_choice: "{{choices}}" | ||
doc_to_target: "{{answer}}" | ||
metadata: | ||
  version: 0.0 |
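The template above presents each question without lettered options and uses the full answer strings as the candidate continuations. A plain-Python sketch of what the three templates evaluate to for one record follows; it mimics the Jinja expressions by hand rather than calling the harness, the sample record is invented, and `first_n` is a hypothetical stand-in for the sampler of the same name (assumed here to take the first `n` documents of the `dev` split).

```python
def render_doc_to_text(doc):
    # Mirrors "Question: {{question.strip()}}\nAnswer:"
    return "Question: " + doc["question"].strip() + "\nAnswer:"


def render_doc_to_choice(doc):
    # Mirrors "{{choices}}": the answer strings themselves are the candidates.
    return list(doc["choices"])


def render_doc_to_target(doc):
    # Mirrors "{{answer}}": treated here as an index into the choices.
    return int(doc["answer"])


def first_n(dev_docs, n):
    # Assumption: the sampler takes the first n dev documents in order.
    return dev_docs[:n]


# Invented record using the field names from the template.
sample_doc = {
    "question": "  What is 2 + 2?  ",
    "choices": ["3", "4", "5", "6"],
    "answer": 1,
}

prompt = render_doc_to_text(sample_doc)
candidates = render_doc_to_choice(sample_doc)
gold = candidates[render_doc_to_target(sample_doc)]
print(repr(prompt))
print(candidates, gold)
```

Because the candidates are the answer texts, the model is compared on how likely it finds each full answer after `Answer:`, not on which option letter it picks.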
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
group: mmlu_continuation | ||
task: | ||
- mmlu_continuation_stem | ||
- mmlu_continuation_other | ||
- mmlu_continuation_social_sciences | ||
- mmlu_continuation_humanities |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
"dataset_name": "abstract_algebra" | ||
"description": "The following are questions (with answers) about abstract\ | ||
\ algebra.\n\n" | ||
"group": "mmlu_continuation_stem" | ||
"include": "_continuation_template_yaml" | ||
"task": "mmlu_continuation_abstract_algebra" |
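Every per-subject file in this directory follows the same five-key pattern, varying only the subject name and its category. A minimal sketch of that pattern is below; `make_subject_config` is a hypothetical helper written for illustration, and the script that actually generated these files is not shown in this diff.

```python
def make_subject_config(subject, category):
    """Hypothetical helper: build the five keys seen in each per-subject file."""
    readable = subject.replace("_", " ")
    return {
        "dataset_name": subject,
        "description": f"The following are questions (with answers) about {readable}.\n\n",
        "group": f"mmlu_continuation_{category}",
        "include": "_continuation_template_yaml",
        "task": f"mmlu_continuation_{subject}",
    }


config = make_subject_config("abstract_algebra", "stem")
for key, value in config.items():
    print(f"{key}: {value!r}")
```

The subject and category pairs used in the usage line match the `abstract_algebra` file above; other subjects would pass their own category, such as `other` or `social_sciences`.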
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
"dataset_name": "anatomy" | ||
"description": "The following are questions (with answers) about anatomy.\n\ | ||
\n" | ||
"group": "mmlu_continuation_stem" | ||
"include": "_continuation_template_yaml" | ||
"task": "mmlu_continuation_anatomy" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
"dataset_name": "astronomy" | ||
"description": "The following are questions (with answers) about astronomy.\n\ | ||
\n" | ||
"group": "mmlu_continuation_stem" | ||
"include": "_continuation_template_yaml" | ||
"task": "mmlu_continuation_astronomy" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
"dataset_name": "business_ethics" | ||
"description": "The following are questions (with answers) about business\ | ||
\ ethics.\n\n" | ||
"group": "mmlu_continuation_other" | ||
"include": "_continuation_template_yaml" | ||
"task": "mmlu_continuation_business_ethics" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
"dataset_name": "clinical_knowledge" | ||
"description": "The following are questions (with answers) about clinical\ | ||
\ knowledge.\n\n" | ||
"group": "mmlu_continuation_other" | ||
"include": "_continuation_template_yaml" | ||
"task": "mmlu_continuation_clinical_knowledge" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
"dataset_name": "college_biology" | ||
"description": "The following are questions (with answers) about college\ | ||
\ biology.\n\n" | ||
"group": "mmlu_continuation_stem" | ||
"include": "_continuation_template_yaml" | ||
"task": "mmlu_continuation_college_biology" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
"dataset_name": "college_chemistry" | ||
"description": "The following are questions (with answers) about college\ | ||
\ chemistry.\n\n" | ||
"group": "mmlu_continuation_stem" | ||
"include": "_continuation_template_yaml" | ||
"task": "mmlu_continuation_college_chemistry" |
6 changes: 6 additions & 0 deletions
lm_eval/tasks/mmlu/continuation/mmlu_college_computer_science.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
"dataset_name": "college_computer_science" | ||
"description": "The following are questions (with answers) about college\ | ||
\ computer science.\n\n" | ||
"group": "mmlu_continuation_stem" | ||
"include": "_continuation_template_yaml" | ||
"task": "mmlu_continuation_college_computer_science" |
6 changes: 6 additions & 0 deletions
lm_eval/tasks/mmlu/continuation/mmlu_college_mathematics.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
"dataset_name": "college_mathematics" | ||
"description": "The following are questions (with answers) about college\ | ||
\ mathematics.\n\n" | ||
"group": "mmlu_continuation_stem" | ||
"include": "_continuation_template_yaml" | ||
"task": "mmlu_continuation_college_mathematics" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
"dataset_name": "college_medicine" | ||
"description": "The following are questions (with answers) about college\ | ||
\ medicine.\n\n" | ||
"group": "mmlu_continuation_other" | ||
"include": "_continuation_template_yaml" | ||
"task": "mmlu_continuation_college_medicine" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
"dataset_name": "college_physics" | ||
"description": "The following are questions (with answers) about college\ | ||
\ physics.\n\n" | ||
"group": "mmlu_continuation_stem" | ||
"include": "_continuation_template_yaml" | ||
"task": "mmlu_continuation_college_physics" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
"dataset_name": "computer_security" | ||
"description": "The following are questions (with answers) about computer\ | ||
\ security.\n\n" | ||
"group": "mmlu_continuation_stem" | ||
"include": "_continuation_template_yaml" | ||
"task": "mmlu_continuation_computer_security" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
"dataset_name": "conceptual_physics" | ||
"description": "The following are questions (with answers) about conceptual\ | ||
\ physics.\n\n" | ||
"group": "mmlu_continuation_stem" | ||
"include": "_continuation_template_yaml" | ||
"task": "mmlu_continuation_conceptual_physics" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
"dataset_name": "econometrics" | ||
"description": "The following are questions (with answers) about econometrics.\n\ | ||
\n" | ||
"group": "mmlu_continuation_social_sciences" | ||
"include": "_continuation_template_yaml" | ||
"task": "mmlu_continuation_econometrics" |
6 changes: 6 additions & 0 deletions
lm_eval/tasks/mmlu/continuation/mmlu_electrical_engineering.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
"dataset_name": "electrical_engineering" | ||
"description": "The following are questions (with answers) about electrical\ | ||
\ engineering.\n\n" | ||
"group": "mmlu_continuation_stem" | ||
"include": "_continuation_template_yaml" | ||
"task": "mmlu_continuation_electrical_engineering" |
6 changes: 6 additions & 0 deletions
lm_eval/tasks/mmlu/continuation/mmlu_elementary_mathematics.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
"dataset_name": "elementary_mathematics" | ||
"description": "The following are questions (with answers) about elementary\ | ||
\ mathematics.\n\n" | ||
"group": "mmlu_continuation_stem" | ||
"include": "_continuation_template_yaml" | ||
"task": "mmlu_continuation_elementary_mathematics" |