mela #1970

Open · wants to merge 1 commit into main
Conversation

Geralt-Targaryen

Add the ACL 2024 benchmark MELA (Multilingual Evaluation of Linguistic Acceptability).
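For anyone who wants to try the new task, here is a minimal sketch using the harness's Python entry point. The task name `mela` is assumed from this PR (it may be registered as a group with per-language subtasks), and the checkpoint below is only a placeholder.

```python
# Minimal sketch, assuming this PR registers the task under the name "mela".
# Uses lm-evaluation-harness's Python API; adjust model_args to the checkpoint you want.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                     # HuggingFace backend
    model_args="pretrained=bigscience/bloomz-7b1",  # placeholder; any HF checkpoint works
    tasks=["mela"],                                 # assumed task/group name from this PR
    num_fewshot=0,                                  # zero-shot; use 2 for two-shot
)
print(results["results"])
```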

@CLAassistant

CLAassistant commented Jun 16, 2024

CLA assistant check
All committers have signed the CLA.

@StellaAthena
Member

@Geralt-Targaryen Thanks for the contribution! Can you see about reproducing some of the scores reported in Table 3 to validate that the implementation is working correctly?

@Geralt-Targaryen
Author

> @Geralt-Targaryen Thanks for the contribution! Can you see about reproducing some of the scores reported in Table 3 to validate that the implementation is working correctly?

Yes, here are results for a few models from our original implementation and from the eval harness implementation:

| Model | Shots | Original (reported in the paper) | lm-eval-harness |
|---|---|---|---|
| BLOOMZ 7B | 0 | 5.85 | 5.99 ± 0.85 |
| BLOOMZ 7B | 2 | 4.31 | 4.11 ± 0.87 |
| mT0 13B | 0 | 6.62 | 7.72 ± 0.88 |
| mT0 13B | 2 | 7.70 | 5.82 ± 0.75 |
| mTk 13B | 0 | 2.24 | 3.16 ± 1.01 |
| mTk 13B | 2 | 12.05 | 12.26 ± 0.98 |

As we explained in the paper, linguistic acceptability is a task with large performance variation. Fluctuations resulting from the choice of in-context examples, floating-point precision, and prompt formatting are expected. One small difference between the two implementations is that our original version used two newlines after the task description, whereas the eval harness appears to collapse multiple newlines after the task description into one.
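To make the formatting point concrete, here is a tiny illustrative snippet (not harness code) showing the two prompt layouts being compared; the description and example strings are made-up placeholders, not the actual MELA prompt.

```python
# Illustrative only; the strings below are placeholders, not the real MELA prompt.
description = "Decide whether the sentence is linguistically acceptable."
sentence = "The cat sat on the mat."

# Original MELA implementation: a blank line between the task description and the example.
prompt_original = f"{description}\n\n{sentence}"

# What the eval harness appears to produce: the extra newline is collapsed into one.
prompt_harness = f"{description}\n{sentence}"

print(repr(prompt_original))
print(repr(prompt_harness))
```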
