Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mandarin homophones eval 🇨🇳 #689

Open
ofou opened this issue Apr 15, 2023 · 0 comments
Open

Mandarin homophones eval 🇨🇳 #689

ofou opened this issue Apr 15, 2023 · 0 comments

Comments

@ofou
Copy link

ofou commented Apr 15, 2023

Homophones are two or more words having the same pronunciation but different meanings, for example, 'rose' (flower) and 'rose' (rise) in English. Currently, I'm learning Mandarin using ChatGPT, and I realized it makes mistakes when identifying tones in Mandarin. These are the four tones in Mandarin:

Tone Tone Description Examples
High tone Flat and high pitch 妈妈 (māma) - mother
Rising tone Starts low and rises to a high pitch 麻 (má) - numb, hemp
Falling-rising tone Starts high, falls, then rises again 你好 (nǐ hǎo) - hello
Falling tone Starts high and falls to a low pitch 不 (bù) - not

The same sound for practical purposes is just the same Pinyin (romanization of Mandarin, ex. nǐ hǎo), but as you can see, ChatGPT and GPT-4 both make errors when differentiating tones.

Here are some examples:

Screenshot 2023-04-15 at 14 22 26

As you can see, there are many errors in 2, 3, 4, and so on. I've tried many times.

Also, GPT-4 makes the same type of mistakes. It seems to be unable to differentiate tones accurately. I highlighted some errors, they should have had the same Pinyin like the other examples.

Screenshot 2023-04-15 at 14 27 34

I've compiled a list of Homophones in Mandarin, to provide some examples for Evals.
Is this something of interest to OpenAI? I'll submit a PR if so.

Let me know! 🙋🏻‍♂️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant