CLDR-14276 Updating ku_arab (ckb) to match official character #823

Closed
wants to merge 1 commit into from

Conversation

jwtiyar

@jwtiyar jwtiyar commented Nov 3, 2020

CLDR-14276
The data currently provided is wrong and is not yet used for ku_arab (ckb), the Central Kurdish language. The problem with ە and ك is resolved by the government statements that I link below.
U+0647 is used for ھ and U+06D5 is used for ە, and ك is changed to ک, which is represented by U+06A9.
https://unicode.ekrg.org/index.html

Best Regards.

@srl295
Member

srl295 commented Nov 14, 2020

Hi, the commit will need to start with CLDR-14276 otherwise it looks OK formally.

And, the ticket will need to be accepted.

@jwtiyar jwtiyar changed the title CLDR-14276 Updating ku_arab (ckb) to match official character CLDR - 14276 Updating ku_arab (ckb) to match official character Nov 14, 2020
@jwtiyar
Author

jwtiyar commented Nov 15, 2020

Thank you. You mean the title?

@jwtiyar jwtiyar changed the title CLDR - 14276 Updating ku_arab (ckb) to match official character CLDR-14276 Updating ku_arab (ckb) to match official character Nov 15, 2020
@srl295
Member

srl295 commented Nov 16, 2020

Thank you. You mean the title?

Not just the title, but also the actual commit message. The commit has:

Updating ku_arab (ckb) to match official character

Can you amend the commit, and force-push an updated commit?

@macchiati
Member

The Jira ticket needs to be explicit as to the exact changes being requested.

@jwtiyar
Author

jwtiyar commented Nov 16, 2020

The Jira ticket needs to be explicit as to the exact changes being requested.

I already mentioned that two characters changed: (ە) and (ک).

Can you amend the commit, and force-push an updated commit?

I don't know how to edit my commit.
Also, in the CLDR repo the Windows keyboard layout already uses the right characters, but this file and the OSX keyboard do not. I pushed another commit for the OSX keyboard earlier.
https://github.com/unicode-org/cldr/blob/master/keyboards/windows/ckb-t-k0-windows.xml

@layik
Contributor

layik commented Jul 4, 2021

Can I help here? @macchiati

@jwtiyar
Author

jwtiyar commented Nov 30, 2021

Do we have to wait until the world ends? @macchiati @srl295
I opened this PR when CLDR was at v38.1, and now it is at v40. If anything is wrong or it is not possible to merge, please tell us.

@srl295
Member

srl295 commented Dec 1, 2021

@jwtiyar I've moved the ticket back to New since it now has a specific list of characters

@srl295 srl295 self-assigned this Dec 1, 2021
@DavidLRowe
Contributor

In the current pull request, it seems that (in the file exemplars/main/ku_Arab.xml):
(a) the order of U+0632 (Arabic letter zain) and U+0695 (Arabic letter reh with small v below) is swapped, that is, currently the order is U+0632 U+0695 and in this pull request the order is U+0695 U+0632.
(b) U+0643 (Arabic letter kaf) is replaced by U+06A9 (Arabic letter keheh).
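
To double-check the code points named in (a) and (b), their official character names can be looked up locally. This is only a sketch using Python's standard `unicodedata` module; the list `CODE_POINTS` and the helper `describe` are illustrative names, not part of CLDR or of this pull request.

```python
import unicodedata

# Code points discussed in this review; the list itself is illustrative.
CODE_POINTS = [0x0632, 0x0695, 0x0643, 0x06A9, 0x0647, 0x06D5]


def describe(cp: int) -> str:
    """Return 'U+XXXX NAME' for a single code point."""
    return f"U+{cp:04X} {unicodedata.name(chr(cp))}"


for cp in CODE_POINTS:
    print(describe(cp))
```

Printing the names next to the code points makes it easier to see which letters a diff actually swaps or replaces, since several of these glyphs look alike in many fonts.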

In CLDR-14276 (Updating ckb (ku_arab) characters for exemplars, status NEW) you mention U+0647 (Arabic letter heh) and U+06D5 (Arabic letter ae), but those seem to exist in both the existing ku_Arab.xml file and the one in this pull request, so it's not clear why these characters are mentioned in the CLDR ticket.

But on another level, I'm wondering if the correct file is being targeted. (I know that my understanding of how macrolanguages are handled is incomplete, so I don't know the answer to the questions I raise, and I certainly could be totally mistaken!)

"ku" is a macrolanguage for Kurdish and includes:
"kmr" Northern Kurdish
"ckb" Central Kurdish
"sdh" Southern Kurdish

It seems that "kmr" is the representative language for "ku" (based on finding "ckb", "sdh" and "ku" in likelySubtags.xml, but not "kmr", however perhaps that is not a valid conclusion?). Also, common/main/ku.xml has:
soranî
kurdî
(and no entry for "kmr" or "sdh")

Currently in CLDR there are a number of files related to Kurdish:

common/annotations/ku.xml
common/collation/ku.xml
common/main/ku.xml
common/main/ku_TR.xml
exemplars/main/ku_Arab.xml

common/annotations/ckb.xml
common/annotationsDerived/ckb.xml
common/main/ckb.xml
common/main/ckb_IQ.xml
common/main/ckb_IR.xml
common/subdivisions/ckb.xml

seed/main/sdh.xml
seed/main/sdh_IQ.xml
seed/main/sdh_IR.xml

Perhaps this PR should introduce a new file: exemplars/main/ckb_Arab.xml, rather than modify ku_Arab.xml?

@DavidLRowe
Contributor

Also, I notice a difference in the order of the characters between the document https://unicode.ekrg.org/ku_unicodes.html:
U+0648
U+06C6
U+0647
U+06D5

and this pull request:
U+0647
U+06D5
U+0648
U+06C6

I don't know if this difference is significant or not.
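
As a rough way to separate "different characters" from "same characters in a different order", the two sequences quoted above can be compared both as sets and as lists. This is a minimal sketch; `doc_order`, `pr_order`, and `compare_orders` are illustrative names, and whether order matters for exemplar data is exactly the open question here.

```python
# Order listed in the ekrg.org document, as quoted above.
doc_order = [0x0648, 0x06C6, 0x0647, 0x06D5]
# Order used in this pull request, as quoted above.
pr_order = [0x0647, 0x06D5, 0x0648, 0x06C6]


def compare_orders(a, b):
    """Return (same_members, same_sequence) for two code point lists."""
    return (set(a) == set(b), a == b)


same_members, same_sequence = compare_orders(doc_order, pr_order)
print(same_members, same_sequence)
```

If the exemplar list is consumed as an unordered set, only the first result would matter; if some consumer relies on the written order, the second one would.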

Also, I note that there is no index exemplar.

@jwtiyar
Author

jwtiyar commented Dec 2, 2021

In the current pull request, it seems that (in the file exemplars/main/ku_Arab.xml): (a) the order of U+0632 (Arabic letter zain) and U+0695 (Arabic letter reh with small v below) is swapped, that is, currently the order is U+0632 U+0695 and in this pull request the order is U+0695 U+0632. (b) U+0643 (Arabic letter kaf) is replaced by U+06A9 (Arabic letter keheh).

In CLDR-14276 (Updating ckb (ku_arab) characters for exemplars, status NEW) you mention U+0647 (Arabic letter heh) and U+06D5 (Arabic letter ae), but those seem to exist in both the existing ku_Arab.xml file and the one in this pull request, so it's not clear why these characters are mentioned in the CLDR ticket.

But on another level, I'm wondering if the correct file is being targeted. (I know that my understanding of how macrolanguages are handled is incomplete, so I don't know the answer to the questions I raise, and I certainly could be totally mistaken!)

"ku" is a macrolanguage for Kurdish and includes: "kmr" Northern Kurdish "ckb" Central Kurdish "sdh" Southern Kurdish

It seems that "kmr" is the representative language for "ku" (based on finding "ckb", "sdh" and "ku" in likelySubtags.xml, but not "kmr", however perhaps that is not a valid conclusion?). Also, common/main/ku.xml has: soranî kurdî (and no entry for "kmr" or "sdh")

Currently in CLDR there are a number of files related to Kurdish:

common/annotations/ku.xml common/collation/ku.xml common/main/ku.xml common/main/ku_TR.xml exemplars/main/ku_Arab.xml

common/annotations/ckb.xml common/annotationsDerived/ckb.xml common/main/ckb.xml common/main/ckb_IQ.xml common/main/ckb_IR.xml common/subdivisions/ckb.xml

seed/main/sdh.xml seed/main/sdh_IQ.xml seed/main/sdh_IR.xml

Perhaps this PR should introduce a new file: exemplars/main/ckb_Arab.xml, rather than modify ku_Arab.xml?

Thank you, @DavidLRowe.
For point (a): yes, both exist, and I just swapped them to match the official order we have, nothing else.
For point (b): yes, U+0643 is wrong and should not be used for ckb, as I mentioned before. The correct one is U+06A9, although U+0643 can be used as the shifted key for U+06A9, because it is an Arabic character, not a ckb character.

For targeting the correct language, I agree with you that the correct one is ckb_arab (or just ckb), but I didn't change the file name because ku_arab already existed in CLDR. If it can be changed, that would be better and more recognizable for us as Kurds.
CKB = written in Arabic script
KMR = written in Latin script
As for the Kurdish files you listed, CLDR itself is inconsistent about which one to use: it uses different names for the same language (ckb_iq, ku_arab, ckb), and I don't know why this happened.

KU is a macrolanguage for Kurdish and should not be used for any of them.
If we can change the names that exist in CLDR, then I can change them all and make a new PR for each of them. What do you suggest?

@DavidLRowe
Contributor

I do not fully understand the complexities of how macrolanguages are handled. @macchiati will need to comment.

@srl295
Member

srl295 commented Dec 2, 2021

@jwtiyar Hi. you mention ckb and 'central kurdish' - my understanding is that Central or Sorani Kurdish should be in ckb.xml and not ku_Arab.xml

Update: Oh, you are updating the exemplars/main/ku_Arab.xml - I think instead you should update common/main/ckb.xml

@macchiati
Member

macchiati commented Dec 2, 2021 via email

@jwtiyar
Author

jwtiyar commented Dec 2, 2021

@jwtiyar Hi. you mention ckb and 'central kurdish' - my understanding is that Central or Sorani Kurdish should be in ckb.xml and not ku_Arab.xml

Update: Oh, you are updating the exemplars/main/ku_Arab.xml - I think instead you should update common/main/ckb.xml

I know, but ckb does not exist in the exemplars/main folder.

@srl295
Member

srl295 commented Dec 2, 2021

@jwtiyar Hi. you mention ckb and 'central kurdish' - my understanding is that Central or Sorani Kurdish should be in ckb.xml and not ku_Arab.xml
Update: Oh, you are updating the exemplars/main/ku_Arab.xml - I think instead you should update common/main/ckb.xml

I know, but ckb does not exist in the exemplars/main folder.

Correct. exemplars is for locales that don't have enough data to be in seed or main.

Please redo this PR to apply to https://github.com/unicode-org/cldr/blob/main/common/main/ckb.xml

@jwtiyar
Author

jwtiyar commented Dec 2, 2021

Yes, see
https://cldr.unicode.org/index/cldr-spec/picking-the-right-language-code

ku_Arab is the equivalent of kmr_Arab, and would not be Central or
Sorani Kurdish.

Mark

On Thu, Dec 2, 2021 at 7:36 AM Steven R. Loomis @.***>
wrote:

@jwtiyar https://github.com/jwtiyar Hi. you mention ckb and 'central
kurdish' - my understanding is that Central or Sorani Kurdish should be in
ckb.xml and not ku_Arab.xml


Hey @macchiati,
Thank you for your comment. No, ku_arab is not kmr_arab; treating it that way is completely wrong. ku should not refer to Kurmanji Kurdish (kmr).
ku_arab is also wrong, but it is what CLDR provides. I used ku_arab because in GNU/Linux distros ku_arab refers to ckb, not kmr_arab.

Conclusion:
If you agree, I can help fix the language names, because it is very complicated here: sometimes ckb, ckb_iq, and ku_arab are all used for the same language.

ckb = Central Kurdish (Sorani)
kmr = Northern Kurdish (Kurmanji)

@macchiati
Member

macchiati commented Dec 2, 2021 via email

@jwtiyar
Author

jwtiyar commented Dec 5, 2021

The choice of ku = kmr is deliberate. The standard usage in CLDR is as described on https://cldr.unicode.org/index/cldr-spec/picking-the-right-language-code. So we will not accept PRs that violate that policy. Note, however, that a client that needs to use kmr instead of ku can do that by remapping CLDR data as they need to. That is done, for example, with iw and he by some clients.
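
The client-side remapping described in that comment could be sketched roughly as follows. This is only an illustration under stated assumptions: the table `CLDR_LANGUAGE_ALIASES` and the function `to_cldr_locale` are hypothetical names, not a CLDR or ICU API, and only the two pairs mentioned in the comment (kmr/ku and iw/he) are included.

```python
# Hypothetical client-side alias table: requested language code -> code used by CLDR.
# "kmr" -> "ku" follows the macrolanguage policy described above;
# "iw" -> "he" is the other example given in the comment.
CLDR_LANGUAGE_ALIASES = {"kmr": "ku", "iw": "he"}


def to_cldr_locale(locale_id: str) -> str:
    """Map the language part of an underscore-separated locale ID to CLDR's code."""
    lang, sep, rest = locale_id.partition("_")
    return CLDR_LANGUAGE_ALIASES.get(lang, lang) + sep + rest


print(to_cldr_locale("kmr_Arab"))  # looks up CLDR data under ku_Arab
print(to_cldr_locale("ckb_IQ"))    # unchanged, ckb has its own locale data
```

A client that prefers to expose kmr to its users would apply such a mapping before loading CLDR data, leaving the CLDR file names untouched.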

So it means we can't work like that. As I said, using ku for kmr is wrong, and this is my language.

Also you can check here
https://util.unicode.org/UnicodeJsps/languageid.jsp?a=kmr
and
https://util.unicode.org/UnicodeJsps/languageid.jsp?a=ku

Unicode uses ku for Kurdish, not for Northern Kurdish.
It uses kmr for Northern Kurdish, as I said before.

@jwtiyar
Author

jwtiyar commented Dec 5, 2021

Another suggestion would be the following:
ckb_arab
ckb_latn
kmr_latn
kmr_arab

This makes sense and is correct.

jwtiyar added a commit to jwtiyar/cldr that referenced this pull request Dec 9, 2021
Maybe I made a mistake to merge this file with the one from OSX folder.
This file has independent PR Here:
unicode-org#823
@macchiati
Member

macchiati commented Feb 2, 2022

The policy on use of macrolanguages is not going to change. If the goal is to affect the exemplar characters for ckb, then the right target is ckb.xml

<exemplarCharacters>[ئ ا ب پ ت ج چ ح خ د ر ز ڕ ژ س ش ع غ ف ڤ ق ک گ ل ڵ م ن ھ ە و ۆ ی ێ]</exemplarCharacters>
<exemplarCharacters type="auxiliary">[\u200E\u200F \u064B \u064C \u064D \u064E \u064F \u0650 \u0651 \u0652 ء آ أ ؤ إ ة ث ذ ص ض ط ظ ك ه ى ي]</exemplarCharacters>
<exemplarCharacters type="index" draft="unconfirmed">[ئ ا ب پ ت ج چ ح خ د ر ز ڕ ژ س ش ع غ ف ڤ ق ک گ ل ڵ م ن ھ ە و ۆ ی ێ]</exemplarCharacters>
<exemplarCharacters type="numbers">[\u200E\u200F \- ‑ , ٫ ٬ . % ٪ ‰ ؉ + 0٠ 1١ 2٢ 3٣ 4٤ 5٥ 6٦ 7٧ 8٨ 9٩]</exemplarCharacters>
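
To check which bucket a given letter falls into, the bracketed lists can be read as sets of characters. The sketch below is a deliberately simplified reader, not a real UnicodeSet parser: it only handles space-separated characters and `\uXXXX` escapes, and the two sample strings are abbreviated stand-ins for the main and auxiliary lists, written with escapes (an assumption made for readability). `parse_simple_exemplars` is an illustrative name.

```python
import re


def parse_simple_exemplars(text: str) -> set:
    """Read a simplified '[a b \\uXXXX]' exemplar list into a set of characters."""
    inner = text.strip()
    if inner.startswith("[") and inner.endswith("]"):
        inner = inner[1:-1]
    # Decode \uXXXX escapes, then keep every non-space character.
    decoded = re.sub(
        r"\\u([0-9A-Fa-f]{4})",
        lambda m: chr(int(m.group(1), 16)),
        inner,
    )
    return {ch for ch in decoded if not ch.isspace()}


# Abbreviated samples only, not the full ckb.xml data.
main_sample = parse_simple_exemplars(r"[\u0642 \u06A9 \u06AF]")
auxiliary_sample = parse_simple_exemplars(r"[\u0638 \u0643 \u0647]")

print("\u06A9" in main_sample)       # keheh in the main sample
print("\u0643" in main_sample)       # kaf in the main sample
print("\u0643" in auxiliary_sample)  # kaf in the auxiliary sample
```

Real exemplar data can also contain ranges, multi-character strings in braces, and other escapes, which this sketch does not attempt to handle.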

@macchiati macchiati closed this Feb 2, 2022
@layik
Contributor

layik commented Feb 2, 2022

@jwtiyar let me know if you need any help. Looks like another PR on a ckb.xml target?

@jwtiyar
Author

jwtiyar commented Feb 3, 2022

The policy on use of macrolanguages is not going to change. If the goal is to affect the exemplar characters for ckb, then the right target is ckb.xml

<exemplarCharacters>[ئ ا ب پ ت ج چ ح خ د ر ز ڕ ژ س ش ع غ ف ڤ ق ک گ ل ڵ م ن ھ ە و ۆ ی ێ]</exemplarCharacters>
<exemplarCharacters type="auxiliary">[\u200E\u200F \u064B \u064C \u064D \u064E \u064F \u0650 \u0651 \u0652 ء آ أ ؤ إ ة ث ذ ص ض ط ظ ك ه ى ي]</exemplarCharacters>
<exemplarCharacters type="index" draft="unconfirmed">[ئ ا ب پ ت ج چ ح خ د ر ز ڕ ژ س ش ع غ ف ڤ ق ک گ ل ڵ م ن ھ ە و ۆ ی ێ]</exemplarCharacters>
<exemplarCharacters type="numbers">[\u200E\u200F \- ‑ , ٫ ٬ . % ٪ ‰ ؉ + 0٠ 1١ 2٢ 3٣ 4٤ 5٥ 6٦ 7٧ 8٨ 9٩]</exemplarCharacters>

So what will happen to ku_arab?
And do you want me to create a new ckb.xml under "exemplars/main", since that directory does not have a ckb.xml?

@jwtiyar
Author

jwtiyar commented Feb 3, 2022

@jwtiyar let me know if you need any help. Looks like another PR on a ckb.xml target?

Thanks, Kak Hiwa, but honestly these people have worn me out. Every place has its own name for Kurdish, and they will not change it. I don't know what to do with them. Do you have any ideas?
