Fix English heteronyms, make the dictionary hot-reloadable, add name matching #869

Merged
merged 5 commits into RVC-Boss:main from Optimize-English-G2P
Mar 25, 2024

KamioRinn
Contributor

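  1. Fix the incorrect pronunciation of the English heteronyms "read" and "complex".
  2. Make the hot dictionary hot-reloadable so that users can edit it at any time.
  3. Add an English name dictionary. The data comes from the Facebook massive dump (533M users), filtered to names that occur more than 2,000 times across the UK, the US, and Canada.
  4. Add name matching: words in which only the first letter is capitalized are matched against the English name dictionary first.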
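
The hot-reload behaviour in item 2 can be illustrated with a minimal sketch. It is not the code from this PR: the `HotDict` class, the one-entry-per-line file format (`WORD PH1 PH2 ...`), and the use of the file's modification time to detect edits are all assumptions made for illustration.

```python
import os


class HotDict:
    """Hypothetical pronunciation dictionary that re-reads its file after edits."""

    def __init__(self, path):
        self.path = path
        self._mtime = None  # modification time of the version currently cached
        self._entries = {}

    @staticmethod
    def _parse(text):
        # Assumed format: one entry per line, "WORD PH1 PH2 ...".
        entries = {}
        for line in text.splitlines():
            parts = line.split()
            if len(parts) >= 2:
                entries[parts[0].lower()] = parts[1:]
        return entries

    def _refresh(self):
        try:
            mtime = os.stat(self.path).st_mtime_ns
        except FileNotFoundError:
            # A missing file behaves like an empty dictionary.
            self._mtime, self._entries = None, {}
            return
        if mtime != self._mtime:
            with open(self.path, encoding="utf-8") as f:
                self._entries = self._parse(f.read())
            self._mtime = mtime

    def get(self, word):
        """Return the phoneme list for `word`, or None if it has no entry."""
        self._refresh()  # cheap stat call on every lookup, reparse only on change
        return self._entries.get(word.lower())
```

Checking the modification time on each lookup keeps the dictionary current without a restart, at the cost of one `os.stat` call per word. A real implementation might instead reload once per request.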

@KamioRinn
Contributor Author

  1. Correct the name-matching priority so that it comes after the official dictionary and the custom dictionary.
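
The resulting lookup order can be sketched as follows. This is an illustrative sketch rather than the PR's implementation: the function names, the plain-`dict` representation of the three dictionaries, and the exact test for "only the first letter is capitalized" are assumptions.

```python
def is_name_candidate(word):
    """True when only the first letter is uppercase, e.g. 'Alice' but not 'NASA' or 'alice'."""
    return len(word) > 1 and word[0].isupper() and word[1:].islower()


def lookup(word, official, custom, names):
    """Resolve `word` in priority order: official dict, custom dict, then name dict.

    All three dictionaries are assumed to map lowercase words to phoneme lists.
    Returns None when nothing matches, leaving the fallback to the caller.
    """
    key = word.lower()
    if key in official:
        return official[key]
    if key in custom:
        return custom[key]
    # The name dictionary is consulted last, and only for capitalized words.
    if is_name_candidate(word) and key in names:
        return names[key]
    return None
```

With this ordering, a capitalized word that is also an ordinary dictionary word (for example "Read" at the start of a sentence) still resolves through the official dictionary, and the name dictionary only fills gaps.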

@RVC-Boss RVC-Boss merged commit 6ccfd36 into RVC-Boss:main Mar 25, 2024
@hyhuc0079

hyhuc0079 commented Mar 26, 2024

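So with this change, I no longer need to remove samples from my Chinese training set that have English names mixed in?
For example:
bonasera,bonasera,我到底是做了什么让你如此的不尊重我? ("Bonasera, Bonasera, what have I ever done to make you treat me so disrespectfully?")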

@KamioRinn
Contributor Author

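So with this change, I no longer need to remove samples from my Chinese training set that have English names mixed in? For example: bonasera,bonasera,我到底是做了什么让你如此的不尊重我?

Yes, that can be recognized.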

@hyhuc0079

Can this be copied directly over the 0306fix2 version? It only affects inference and doesn't require retraining, right?

RVC-Boss added a commit that referenced this pull request Apr 19, 2024
* Update README

* Optimize-English-G2P

* docs: change akward expression

* docs: update Changelog_KO.md

* Fix CN punc in EN,add 's match

* Adjust normalize and g2p logic

* Update zh_CN.json

* Update README (#827)

Update README.md
Update some outdated file paths and commands

* Fix English heteronyms, make the dictionary hot-reloadable, add name matching (#869)

* Fix homograph dict

* Add JSON in dict

* Adjust hot dict to hot reload

* Add English name dict

* Adjust get name dict logic

* Make API Great Again (#894)

* Add zh/jp/en mix

* Optimize code readability and formatted output.

* Try OGG streaming

* Add stream mode arg

* Add media type arg

* Add cut punc arg

* Eliminate punc risk

* Update README (#895)

* Update README

* Update README

* update README

* update README

* fix typo s/Licence /License (#904)

* fix reformat cmd (#917)

Co-authored-by: starylan <[email protected]>

* Update README.md

* Normalize chinese arithmetic operations (#947)

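* Change the mask strategy used during training and inference to fix the repetition that occurs when batch_size > 1

* Sync with the main branch and add a "keep random" option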

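* Fix missing uvr5 models when running colab_webui.ipynb in Colab (#968)

Downloading the uvr5 models with git in Colab fails with:
fatal: destination path 'uvr5_weights' already exists and is not an empty directory.
Deleting the uvr5_weights folder that was originally downloaded from this repository before the download fixes the problem.

* [ASR] Fix FasterWhisper failing to iterate over the input path (#956)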

* remove glob

* rename

* reset mirror pos

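* Revert the mask strategy;
Revert the pad strategy;
Add padding_mask in T2SBlock to reduce the effect of padding;
Expose the repetition_penalty parameter so users can tune the strength of the repetition penalty themselves;
Add the parallel_infer parameter to turn parallel inference on or off; when off, behavior matches version 0307;
Add a "keep random" option in the webui;
Sync with the main branch.

* Remove unused comments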

---------

Co-authored-by: Lion <[email protected]>
Co-authored-by: RVC-Boss <[email protected]>
Co-authored-by: KamioRinn <[email protected]>
Co-authored-by: Pengoose <[email protected]>
Co-authored-by: Yuan-Man <[email protected]>
Co-authored-by: XXXXRT666 <[email protected]>
Co-authored-by: KamioRinn <[email protected]>
Co-authored-by: Lion-Wu <[email protected]>
Co-authored-by: digger yu <[email protected]>
Co-authored-by: SapphireLab <[email protected]>
Co-authored-by: starylan <[email protected]>
Co-authored-by: shadow01a <[email protected]>
@KamioRinn KamioRinn deleted the Optimize-English-G2P branch July 14, 2024 21:07
3 participants