Accept BCP 47 language tags in the statuses API #23541

yheuhtozr · 2023-02-12T05:21:02Z

Pitch

Currently, I see three parameters that take an ISO 639 string in the statuses API.

POST /api/v1/statuses: language
POST /api/v1/statuses/:id/translate: lang
PUT /api/v1/statuses/:id: language

I suggest expanding them to accept any well-formed BCP 47 string or, if there are compatibility concerns, adding another parameter (such as locale?) that can store one. The ActivityPub content property readily accepts BCP 47, so I think this is just a Mastodon API restriction.
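To illustrate what "well-formed" could mean on the API side, here is a minimal sketch in Python of a syntactic check based on the `langtag` grammar in RFC 5646. It is not Mastodon's code: the function name `is_well_formed_bcp47` is hypothetical, and the pattern is a simplification that deliberately leaves out grandfathered tags and private-use-only tags. It checks shape only, not whether the subtags are registered.

```python
import re

# Simplified form of the RFC 5646 "langtag" grammar. Assumption: grandfathered
# tags (e.g. "i-klingon") and private-use-only tags (e.g. "x-foo") are left out.
_LANGTAG = re.compile(
    r"""
    (?:[a-z]{2,3}(?:-[a-z]{3}){0,3}|[a-z]{4,8})   # language, optional extlang
    (?:-[a-z]{4})?                                # script, e.g. Hant
    (?:-(?:[a-z]{2}|[0-9]{3}))?                   # region, e.g. BR or 419
    (?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*      # variants, e.g. 1901
    (?:-[0-9a-wy-z](?:-[a-z0-9]{2,8})+)*          # extensions
    (?:-x(?:-[a-z0-9]{1,8})+)?                    # private use
    """,
    re.IGNORECASE | re.VERBOSE | re.ASCII,
)


def is_well_formed_bcp47(tag: str) -> bool:
    """Return True if `tag` matches the simplified langtag syntax above."""
    return _LANGTAG.fullmatch(tag) is not None
```

With this sketch, `is_well_formed_bcp47("pt-BR")` and `is_well_formed_bcp47("zh-Hant-TW")` return `True`, while `is_well_formed_bcp47("en_US")` returns `False`.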

Motivation

ISO 639 alone is not a "practical" approximation of the concrete languages people perceive
- In short, while a single ISO 639 code may look like a reasonable identifier in most of Europe, it often doesn't in other parts of the world. This is because of history, politics (ISO standards are voted on by countries), or simply a mismatch between technology and culture. Sometimes multiple subtags combined are the "idiom" for a user-perceived language.
- I have already found many people reporting various issues in Add more options to language selection #18538.
- To add another example: "Spanish" variants across the Atlantic have already diverged enough that a random everyday word in one region is a profanity in another (example), which would seriously harm the usability of a profanity filter (Scunthorpe problem).
- Machine translation providers such as Google Translate and DeepL already distinguish AmE (en-US) from BrE (en-GB) and Brazilian Portuguese (pt-BR) from European Portuguese (pt-PT), which the Mastodon API cannot accept (edit: this was already reported in Consider the country when translating with DeepL #22707).
It affects a wider audience than Mastodon
- Other software, including Friendica, Pleroma, GoToSocial..., and various client apps rely on the Mastodon API for compatibility, so the restriction keeps a much wider user base from implementing more flexible language selection on their own.
- Even if Mastodon has difficulty putting it to use anytime soon due to other design issues (though I'd like to see it happen), the API change is a helpful step that I think is "relatively easy" to do.
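As a rough illustration of how a consumer that only understands some tags (a translation backend, or a client that expects bare ISO 639 codes) could still cope with richer input, here is a sketch in Python of the truncating "lookup" fallback described in RFC 4647. The function name `lookup_language` and the supported lists in the usage note are hypothetical and do not reflect any provider's real list.

```python
from typing import Iterable, Optional


def lookup_language(tag: str, supported: Iterable[str]) -> Optional[str]:
    """Return the closest supported tag by progressively truncating `tag`.

    Comparison is case-insensitive; the supported tag is returned as given.
    """
    wanted = {s.lower(): s for s in supported}
    subtags = tag.split("-")
    while subtags:
        candidate = "-".join(subtags).lower()
        if candidate in wanted:
            return wanted[candidate]
        subtags.pop()
        # RFC 4647 lookup also drops a trailing single-character subtag
        # (such as "x" or an extension singleton) when truncating.
        if subtags and len(subtags[-1]) == 1:
            subtags.pop()
    return None
```

For example, `lookup_language("pt-BR", ["pt-BR", "pt-PT", "en"])` returns `"pt-BR"`, and `lookup_language("es-419", ["es", "en"])` falls back to `"es"`, so accepting full tags in the API would not have to break consumers that only know the primary language subtag.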


yheuhtozr · 2023-02-26T15:27:24Z

I wonder if this is an adoptable solution, and if so, whether there is anything I can help with.

rschiang · 2023-07-08T14:56:09Z

I guess we could bring this issue up on IRC? There must be a primary discussion space for Mastodon devs, and we’ll need some attention before we can even make the case for a merge.

yheuhtozr · 2023-07-26T13:26:53Z

@rschiang Hi, I'm not very familiar with the actual ecosystem of Mastodon development. I am definitely ready to make some kind of pitch if you can point me to the right place to do it.

yheuhtozr · 2023-11-11T17:01:25Z

Just from my other comment:

The new edition of ISO 639 has just been published, substantially expanded, so I can back up my opinion below with official evidence. While the full text is proprietary, please allow me to cite its section 6.2.1:

Where spoken intelligibility between language varieties is marginal, the existence of a common literature or of a common ethnolinguistic identity with a central language variety that both speaker communities understand is a strong indicator that they should nevertheless be considered language varieties of the same individual language.

which essentially means that they publicly admit to grouping several practically unintelligible "dialects" under a single ISO 639 code (the situation has always existed, but it is now explicitly ratified). Behind the scenes, this assumes the existence of other methods for specifying subdivisions, such as IETF language tags or the upcoming ISO 21636 framework, so in such cases we will need the help of combined language tags.

yheuhtozr added the suggestion Feature suggestion label Feb 12, 2023

yheuhtozr linked a pull request Mar 29, 2023 that will close this issue

Update language attribute restrictions in API mastodon/documentation#1193

Open

VyrCossont mentioned this issue Aug 3, 2023

[feature] Support any valid BCP 47 tag as a posting language superseriousbusiness/gotosocial#2066

Closed
