Klokan-qa task #1657

hynky1999 · 2024-04-01T20:20:13Z

What is Klokan-qa

A dataset of mathematical questions in Czech for elementary and high-school students
~850 questions
The questions were written in Czech rather than translated, which makes this the only publicly available non-translated math dataset in Czech

CLAassistant · 2024-04-01T20:20:20Z

All committers have signed the CLA.

haileyschoelkopf

Thank you for the PR, this looks like a cool dataset!

A few questions before we could merge:

Is there a (small? open weights) model that could be run on this task to get nonzero performance, to check that the implementation is correct?
Could you add a lm_eval/tasks/klokan-qa/README.md file that follows the typical template format, mentioning the source of this dataset (and a link to the source that introduced the dataset) and any other relevant details about the task implementation?

haileyschoelkopf · 2024-04-02T14:31:32Z

lm_eval/tasks/klokan-qa/klokan-qa.yaml

+ - "</s>"
+ - "<|im_end|>"
+ do_sample: true
+ temperature: 0.0000001


Why not temperature: 0.0 and do_sample: false?

hynky1999 · 2024-04-07

Yeah, my bad
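
A small sketch of why the reviewer's suggestion is the cleaner setting: sampling at a near-zero temperature already collapses onto the highest-scoring token, so it behaves like greedy decoding while still going through the sampling path. The logits and helper names below are made up for illustration and are not part of lm-evaluation-harness.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_choice(logits):
    """Index of the highest score, i.e. the token greedy decoding would pick."""
    return max(range(len(logits)), key=lambda i: logits[i])

# Hypothetical scores for three candidate tokens.
logits = [2.0, 1.5, 0.3]

probs = softmax_with_temperature(logits, 1e-7)
best = greedy_choice(logits)
```

In this toy case all probability mass lands on `best`, so drawing a sample would return the same token as greedy decoding. Note that the helper cannot be called with a temperature of exactly 0.0 (division by zero), which is one reason to request greedy decoding explicitly rather than approximate it with a tiny temperature.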

haileyschoelkopf

Took another look and had some other comments that might be helpful before merge! Lmk what you think.

haileyschoelkopf · 2024-04-07T20:30:59Z

lm_eval/tasks/klokan-qa/klokan-qa.yaml

+
+ - name: "flexbile-extract"
+ filter:
+ - function: "regex"


Could this use the existing multiple-choice filter? Or was this meant to be regex_pattern: "([A-E])"?

lm-evaluation-harness/lm_eval/tasks/mmlu/flan_cot_zeroshot/_mmlu_flan_cot_zeroshot_template_yaml

Lines 13 to 20 in e9a4054

  - name: "flexible-extract"
    filter:
      - function: !function utils.MultiChoiceRegexFilter
        group_select: -1
        ignore_case: true
        ignore_punctuation: true
        regex_pattern: "(\\([A-Z]\\))"
      - function: "take_first"
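
To make the difference between the two patterns concrete, here is a minimal sketch using only Python's `re` module. It does not call `utils.MultiChoiceRegexFilter`; the helper `extract_last` and both sample responses are hypothetical, and it assumes that `group_select: -1` means "take the last match".

```python
import re

BARE_LETTER = r"([A-E])"          # the pattern asked about in the review
PAREN_LETTER = r"(\([A-Z]\))"     # the pattern from the MMLU template

def extract_last(pattern, text):
    """Return the last regex match in text, or None if nothing matches."""
    matches = re.findall(pattern, text)
    return matches[-1] if matches else None

# Hypothetical model outputs, one with a parenthesized letter and one without.
with_parens = (
    "Brno a Ostrava jsou velká města, ale hlavní město je Praha. "
    "Správná odpověď: (B) Praha. Díky!"
)
plain = "Praha je hlavní město České republiky. Správná odpověď: B."

bare_on_parens = extract_last(BARE_LETTER, with_parens)
paren_on_parens = extract_last(PAREN_LETTER, with_parens)
bare_on_plain = extract_last(BARE_LETTER, plain)
paren_on_plain = extract_last(PAREN_LETTER, plain)
```

In this sketch the bare pattern picks up the capital "D" of "Díky" in the first response, while the parenthesized pattern finds "(B)". On the second response the roles flip: the bare pattern finds "B" and the parenthesized one finds nothing, because the few-shot answers in this task end with an unparenthesized letter.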

haileyschoelkopf · 2024-04-07T20:34:21Z

lm_eval/tasks/klokan-qa/klokan-qa.yaml

+ "Jsi expert v matematice, logice, českém jazyce a všeobecných znalostech.\nDostaneš otázku s 5 možnými odpověďmi (A, B, C, D, E) a tvým úkolem je vybrat správnou odpověď.\nNejprve bys měl o odpovědi přemýšlet a poté vypsat písmeno správné odpovědi.\nJazyk otázek a odpovědí je čeština.\n\n\
+ Q: Jak se jmenuje hlavní město České republiky?\n(A) Brno\n(B) Praha\n(C) Ostrava\n(D): Plzeň\n(E) Liberec\nA: Přestože Brno a Ostrava jsou velká města, Praha je hlavní město České republiky, správná odpověď je Praha. Správná odpověď: B.\n\n\
+ Q: Mařenka má 5 jablíček a 3 hrušky. Kolik má ovoce?\n(A) 5\n(B) 1\n(C) 3\n(D) 2\n(E) 8\nA: Jak jablka, tak hrušky jsou ovoce a tedy 3 + 5 = 8. Správná odpověď: E\n\n\
+ Q: {{question}}\n(A) {{A}}\n(B) {{B}}\n(C) {{C}}\n(D) {{D}}\n(E) {{E}}"


Realized this might be missing the trailing \nA:?

Suggested change

- Q: {{question}}\n(A) {{A}}\n(B) {{B}}\n(C) {{C}}\n(D) {{D}}\n(E) {{E}}"
+ Q: {{question}}\n(A) {{A}}\n(B) {{B}}\n(C) {{C}}\n(D) {{D}}\n(E) {{E}}\nA:"

Also, if you intend people to only ever evaluate the task with these specific few-shot samples, consider setting num_fewshot: 0 in this config and adding

metadata:
  version: 3.0
  num_fewshot: 2

The first setting makes the task only ever run "zero-shot" (no extra few-shot examples added to your prompt here), and the second makes the harness print that it is actually a 2-shot prompt.
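
The effect of the trailing `\nA:` can be checked by rendering the last line of the prompt by hand. The sketch below uses plain string replacement instead of the harness's own template rendering, and the `render` helper and the sample question are hypothetical.

```python
def render(template, fields):
    """Fill {{name}} placeholders with plain string replacement."""
    out = template
    for key, value in fields.items():
        out = out.replace("{{" + key + "}}", value)
    return out

# Final question line of the prompt, with the suggested trailing "\nA:".
TEMPLATE = (
    "Q: {{question}}\n(A) {{A}}\n(B) {{B}}\n(C) {{C}}\n(D) {{D}}\n(E) {{E}}\nA:"
)

# Made-up sample; real documents come from the dataset.
doc = {
    "question": "Kolik je 2 + 2?",
    "A": "1",
    "B": "2",
    "C": "3",
    "D": "4",
    "E": "5",
}

prompt = render(TEMPLATE, doc)
```

With the suffix in place the rendered prompt ends in `A:`, the same cue that opens each worked answer in the two in-prompt examples, so the model is asked to continue an answer rather than to start a new line of the question.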

klokan

146242c

hynky1999 requested review from haileyschoelkopf and lintangsutawika as code owners April 1, 2024 20:20

haileyschoelkopf requested changes Apr 2, 2024

hynky1999 added 3 commits April 7, 2024 20:54

add flexible extract + no temperature

68a7708

Merge branch 'EleutherAI:main' into klokan-qa

f6b5518

Merge branch 'klokan-qa' of github.com:hynky1999/lm-evaluation-harnes…

83c981e

…s into klokan-qa

hynky1999 requested a review from haileyschoelkopf April 7, 2024 19:13

haileyschoelkopf requested changes Apr 7, 2024

Update lm_eval/tasks/klokan-qa/klokan-qa.yaml

8b6c749

Co-authored-by: Hailey Schoelkopf <[email protected]>
