By Ali Borji, Mehrdad Mohammadian
ResearchGate: link
SSRN Electronic Journal Preprint: link
Link to paper: to be announced
In total, our dataset contains 1002 question-answer pairs, organized into 27 categories that can be used to assess the key abilities of large language models. The figure below shows the number of questions per category.
To access the dataset, see the data folder or download it from the release section. Both json and csv formats are provided for all categories, so you can use whichever suits your needs. For categories or questions that do not require an answer, the answer is set to "NONE".
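As a minimal sketch of how the csv files might be read, the snippet below parses question-answer pairs and maps the "NONE" placeholder to Python's `None`. The column names (`category`, `question`, `answer`) and the sample rows are assumptions made for illustration; check the actual files in the data folder for the real headers.

```python
import csv
import io
from collections import Counter

# Hypothetical column names; the real headers may differ.
CATEGORY_COL = "category"
QUESTION_COL = "question"
ANSWER_COL = "answer"


def load_pairs(csv_text):
    """Parse CSV text into a list of dicts, turning the "NONE" placeholder into None."""
    pairs = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        answer = row[ANSWER_COL]
        pairs.append({
            "category": row[CATEGORY_COL],
            "question": row[QUESTION_COL],
            "answer": None if answer.strip() == "NONE" else answer,
        })
    return pairs


def count_by_category(pairs):
    """Count how many questions fall into each category."""
    return Counter(p["category"] for p in pairs)


# Made-up sample rows, not taken from the dataset.
SAMPLE = (
    "category,question,answer\n"
    "math,What is 2 + 3?,5\n"
    "humor,Tell me a joke.,NONE\n"
)

pairs = load_pairs(SAMPLE)
counts = count_by_category(pairs)
```

To use this on a real file, read its contents first (for example with `open(path, encoding="utf-8").read()`) and pass the text to `load_pairs`.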
If you are interested in contributing to the dataset, please open an issue or send an email. We encourage you to add question-answer pairs in any category and language.
SSRN preprint:
@misc{BorjiMohammadianWordsmiths,
author = {Borji, Ali and Mohammadian, Mehrdad},
year = {2023},
month = {06},
pages = {},
title = {Battle of the Wordsmiths: Comparing ChatGPT, GPT-4, Claude, and Bard},
journal = {SSRN Electronic Journal},
doi = {10.2139/ssrn.4476855}
}