Update Hebrew language code to he per IANA registry (openai#401)
* Update Hebrew language code to he per IANA registry

Per the [IANA registry](https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry), `iw` was deprecated as the code for Hebrew in 1989, and the preferred code is `he`.

The correct subtag: 
```
%%
Type: language
Subtag: he
Description: Hebrew
Added: 2005-10-16
Suppress-Script: Hebr
%%
``` 
And the deprecation:
```
%%
Type: language
Subtag: iw
Description: Hebrew
Added: 2005-10-16
Deprecated: 1989-01-01
Preferred-Value: he
Suppress-Script: Hebr
%%
```
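As an illustration of how the two records above relate, here is a minimal sketch that parses `%%`-delimited records and collects each deprecated subtag with its `Preferred-Value`. The names `parse_records` and `SAMPLE` are hypothetical and not part of this commit, and the parser assumes one `Key: value` field per line, as in the excerpts quoted here; it does not handle folded continuation lines.

```python
from typing import Dict, List

# The two records quoted in the commit message, copied verbatim.
SAMPLE = """\
%%
Type: language
Subtag: he
Description: Hebrew
Added: 2005-10-16
Suppress-Script: Hebr
%%
Type: language
Subtag: iw
Description: Hebrew
Added: 2005-10-16
Deprecated: 1989-01-01
Preferred-Value: he
Suppress-Script: Hebr
%%
"""


def parse_records(text: str) -> List[Dict[str, str]]:
    """Split '%%'-delimited text into one dict of fields per record."""
    records = []
    for chunk in text.split("%%"):
        fields = {}
        for line in chunk.strip().splitlines():
            key, sep, value = line.partition(":")
            if sep:  # skip lines that are not 'Key: value'
                fields[key.strip()] = value.strip()
        if fields:
            records.append(fields)
    return records


records = parse_records(SAMPLE)

# Map every deprecated subtag to the subtag that replaces it.
preferred = {
    r["Subtag"]: r["Preferred-Value"]
    for r in records
    if "Deprecated" in r and "Preferred-Value" in r
}
print(preferred)
```

For these two records, the only deprecated subtag is `iw`, and it maps to `he`.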

* Update Hebrew ISO code to he

Per discussion, it's OK to make this change without backwards compatibility.
altryne authored Dec 7, 2022
1 parent fd8f80c commit b9265e5
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion whisper/tokenizer.py
@@ -28,7 +28,7 @@
     "hi": "hindi",
     "fi": "finnish",
     "vi": "vietnamese",
-    "iw": "hebrew",
+    "he": "hebrew",
     "uk": "ukrainian",
     "el": "greek",
     "ms": "malay",
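
Because the change replaces the `iw` key instead of keeping it as an alias, code that still passes `iw` would need to map it to `he` itself. A minimal sketch of such a caller-side shim follows; `DEPRECATED_LANGUAGE_CODES` and `normalize_language_code` are hypothetical names for illustration and are not part of `whisper/tokenizer.py`.

```python
# Hypothetical caller-side shim, not part of the whisper package.
# Maps deprecated language codes to their preferred replacements.
DEPRECATED_LANGUAGE_CODES = {"iw": "he"}


def normalize_language_code(code: str) -> str:
    """Return the preferred code for a possibly deprecated language code."""
    code = code.strip().lower()
    # Codes that are not deprecated pass through unchanged.
    return DEPRECATED_LANGUAGE_CODES.get(code, code)


print(normalize_language_code("iw"))
print(normalize_language_code("uk"))
```

Keeping the mapping in the caller leaves the library's language table with a single key per language, which matches the decision in this commit to drop backwards compatibility.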
