DiffSinger (XMTech maintained version)

This is a fork of the OpenVPI-maintained DiffSinger, maintained by XMTech; the code was cloned on 2023-10-08. Compared with the upstream repository, this branch offers faster training, a complete workflow for migrating licensed voicebanks from concatenative synthesis engines, and simpler training steps. In the early stages we will keep the code compatible with the upstream repository, and later versions will add more features.

Notice of Incompatible Update

To make future updates and maintenance easier, the DiffSinger branch maintained by XMTech will soon diverge functionally from upstream, and compatibility with the OpenVPI DiffSinger repository will no longer be guaranteed.
We will do our best to keep models exported from this branch compatible with the OpenVPI ecosystem.
【New】
1. Added support for the refinegan vocoder; the default recommended vocoder is now Kouon_Vocoder_refinegan.
(ONNX version: OpenUtau Dependencies; public beta QQ group: 749073684)
2. Added support for multi-langs project dictionaries; the dictionary in the default configuration file is now the multi-langs three-part Chinese dictionary.
【Changed】
1. Removed the separate preprocessing step: training now runs preprocessing automatically when the corresponding binary folder does not exist.
2. Renamed the checkpoints directory to ckpt so that the model folder opens normally in jupyter-lab.
3. Improved the ONNX export process.
【Removed】
1. Completely removed the remaining DiffSpeech components.
2. The variance model now retains only the tension parameter (Tension).
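
The automatic preprocessing behaviour listed under the changes above can be pictured roughly as follows. This is only a minimal sketch under stated assumptions: the names `ensure_binarized` and `fake_preprocessing` are hypothetical and are not taken from this repository, and the actual training entry point may check different conditions.

```python
from pathlib import Path
from typing import Callable


def ensure_binarized(binary_dir: Path,
                     run_preprocessing: Callable[[Path], None]) -> bool:
    """Run preprocessing only when the binary folder is missing.

    Returns True if preprocessing was triggered, False if it was skipped.
    Hypothetical helper: it illustrates the idea, not the repository's code.
    """
    if binary_dir.is_dir():
        # Binarized data already exists, so training can start directly.
        return False
    run_preprocessing(binary_dir)
    return True


def fake_preprocessing(binary_dir: Path) -> None:
    # Stand-in for the real binarization step (assumption):
    # it only creates the output folder.
    binary_dir.mkdir(parents=True, exist_ok=True)
```

Calling `ensure_binarized` twice on the same folder would trigger preprocessing on the first call and skip it on the second, which is the behaviour a training script could rely on before loading data.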

References

DiffSinger: Original, OpenVPI maintained version
DiffSinger_XMTech_Maintained_Version: GitHub, 启智 OpenI
DiffSinger_Toolkit (voicebank creation helper tool): 启智 OpenI
OpenUtau: GitHub
OpenUtau for DiffSinger (version maintained by 白糖の正义铃): GitHub
XMTech VCS Vocoder (community edition): 启智 OpenI

License

This branch is licensed under the Apache 2.0 License.

English translation: GeekCodeX (智谱)

DiffSinger (OpenVPI maintained version)

This is a refactored and enhanced version of DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism based on the original paper and implementation, which provides:

Cleaner code structure: useless and redundant files are removed and the others are re-organized.
Better sound quality: the sampling rate of synthesized audio is raised to 44.1 kHz from the original 24 kHz.
Higher fidelity: improved acoustic models and diffusion sampling acceleration algorithms are integrated.
More controllability: introduced variance models and parameters for prediction and control of pitch, energy, breathiness, etc.
Production compatibility: functionalities are designed to match the requirements of production deployment and the SVS communities.

[Figures: Overview | Variance Model | Acoustic Model]

User Guidance

Chinese Tutorials: Text, Video

Installation & basic usage: See Getting Started
Dataset creation pipelines & tools: See MakeDiffSinger
Best practices & tutorials: See Best Practices
Editing configurations: See Configuration Schemas
Deployment & production: OpenUTAU for DiffSinger, DiffScope (under development)
Communication groups: QQ Group (907879266), Discord server

Progress & Roadmap

Progress since we forked into this repository: See Releases
Roadmap for future releases: See Project Board
Thoughts, proposals & ideas: See Discussions

Architecture & Algorithms

TBD

Development Resources

TBD

References

Original DiffSinger: paper, implementation
HiFi-GAN and NSF for waveform reconstruction
pc-ddsp for waveform reconstruction
DDIM for diffusion sampling acceleration
PNDM for diffusion sampling acceleration
DPM-Solver++ for diffusion sampling acceleration
UniPC for diffusion sampling acceleration
RMVPE and yxlllc's fork for pitch extraction

Disclaimer

Any organization or individual is prohibited from using any functionalities included in this repository to generate someone's speech without his/her consent, including but not limited to government leaders, political figures, and celebrities. If you do not comply with this item, you could be in violation of copyright laws.

License

This forked DiffSinger repository is licensed under the Apache 2.0 License.

