Repository for EMNLP-2023 Findings Paper - xDial-Eval: A Multilingual Open-Domain Dialogue Evaluation Benchmark

e0397123/xDial-Eval

xDial-Eval

Changelog

[25/10/2023] Add data to the repository.

[27/10/2023] Add code for zero-shot inference with open-source LLMs to the repository.

Prerequisites

  • Python 3.8+ and PyTorch 1.13.1+
  • See requirments.txt

Data Format

  1. The CSV files in the turn-level data include the columns [lang]_ctx, [lang]_res, and ratings, where [lang] is the language code.
  2. The CSV files in the dialogue-level data include the columns [lang]_dial and ratings, where [lang] is the language code.
  3. The utterances within [lang]_ctx and [lang]_dial are delimited by \n.
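The format above can be read with a few lines of standard-library Python. This is a minimal sketch rather than code from the repository: the "en" language prefix, the sample row, the helper name load_turn_level, and the rating column name "ratings" are all assumptions made for illustration, and it assumes the \n delimiter is a real newline character inside a quoted CSV field. If the files store a literal backslash followed by "n", split on "\\n" instead.

```python
import csv
import io

# Hypothetical turn-level sample: the "en" prefix and all values are made up.
SAMPLE_CSV = (
    'en_ctx,en_res,ratings\n'
    '"Hi, how are you?\nFine, thanks. And you?",'
    '"Pretty good, just got back from a run.",4.0\n'
)


def load_turn_level(handle, lang):
    """Read turn-level rows and split each context into a list of utterances."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append({
            # Assumes utterances are separated by a newline character.
            "context": row[f"{lang}_ctx"].split("\n"),
            "response": row[f"{lang}_res"],
            # Assumes the rating column is named "ratings".
            "rating": float(row["ratings"]),
        })
    return rows


examples = load_turn_level(io.StringIO(SAMPLE_CSV), "en")
print(examples[0]["context"])
```

For a file on disk, pass open(path, newline="", encoding="utf-8") in place of the io.StringIO object. Dialogue-level files could be handled the same way by splitting the [lang]_dial column.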

Original English Data

Sources

Note: to access the Human-Eval data, please contact the original authors of "Human Evaluation of Conversations is an Open Problem: comparing the sensitivity of various methods for evaluating dialogue agents". Once you have obtained their permission, you may contact me to obtain the multilingual extension of the Human-Eval data.

Zero-shot Inference with Open-source LLMs

We currently include scripts for zero-shot inference with Llama-2, Baichuan-2, Phoenix, and Alpaca. The scripts can be adapted to other open-source LLMs.

The Python scripts can be found in zeroshot_inference, and the shell scripts are in scripts/zeroshot_inference.

Example execution: bash zeroshot_inference/turn/infer_alpaca.sh
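For readers adapting the scripts to another model, the sketch below shows one way a turn-level example might be turned into a zero-shot prompt. It is not taken from the repository: the function name build_turn_prompt, the speaker labels, the 1 to 5 scale, and the prompt wording are all hypothetical, and the call to the LLM itself is omitted.

```python
def build_turn_prompt(context, response):
    """Format a dialogue context and a candidate response into one prompt string."""
    # Alternate hypothetical speaker labels A and B across the context utterances.
    history = "\n".join(
        f"Speaker {'A' if i % 2 == 0 else 'B'}: {utt}"
        for i, utt in enumerate(context)
    )
    # The instruction wording and the 1-5 scale are illustrative assumptions.
    return (
        "Rate the quality of the response to the dialogue context "
        "on a scale of 1 to 5.\n\n"
        f"Context:\n{history}\n\n"
        f"Response: {response}\n\n"
        "Score:"
    )


prompt = build_turn_prompt(
    ["Hi, how are you?", "Fine, thanks. And you?"],
    "Pretty good, just got back from a run.",
)
print(prompt)
```

The returned string would then be passed to whichever model is being evaluated, and the generated score parsed from its output.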

Please cite us if you find our benchmark useful

@inproceedings{zhang-etal-2023-xdial,
    title = "x{D}ial-Eval: A Multilingual Open-Domain Dialogue Evaluation Benchmark",
    author = "Zhang, Chen  and
      D{'}Haro, Luis  and
      Tang, Chengguang  and
      Shi, Ke  and
      Tang, Guohua  and
      Li, Haizhou",
    editor = "Bouamor, Houda  and
      Pino, Juan  and
      Bali, Kalika",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2023",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.findings-emnlp.371",
    doi = "10.18653/v1/2023.findings-emnlp.371",
    pages = "5579--5601",
}

Acknowledgement

We thank all the authors for kindly making their data publicly available. In the same spirit, we make our multilingual extension publicly available as well. We hope our data can further benefit researchers working on multilingual open-domain dialogue systems and evaluation metrics.
