Compare to VoiceFlow TTS #66

Open
Liujingxiu23 opened this issue Mar 20, 2024 · 1 comment

Comments

@Liujingxiu23

Thank you for your work and for sharing it!
Matcha-TTS and VoiceFlow-TTS (https://github.com/X-LANCE/VoiceFlow-TTS) seem very similar.
What are the main differences between the two methods?
And how do they compare on voice quality (for example, prosody) and on inference speed?

@shivammehta25
Owner

They are! I met the author @cantabile-kwok just last week (super nice guy). It is interesting that we both made design decisions to improve speed relative to plain conditional flow matching. Their way of speeding things up is to "rectify" the paths learned by flow matching, which is a two-step approach and quite effective. We felt that the same speedup could be achieved by improving the architecture instead, so we improved the U-Net architecture and got a similar speedup.
Both are trying to solve a similar problem in different ways, and you can certainly "rectify" the paths with Matcha-TTS's architecture for an even greater speedup :)
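
To make the "rectifying" idea concrete, here is a minimal toy sketch in plain Python. It is not code from Matcha-TTS or VoiceFlow-TTS: `curved_velocity` is a made-up 1-D stand-in for a trained vector field, `straight_velocity` is the field a second-stage model would ideally learn for this toy, and all function names are hypothetical. It only illustrates why straighter paths need fewer ODE solver steps.

```python
import random


def euler_sample(velocity, x0, n_steps):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed-step Euler."""
    x, dt = x0, 1.0 / n_steps
    for i in range(n_steps):
        x += dt * velocity(x, i * dt)
    return x


def curved_velocity(x, t):
    # Hypothetical stand-in for a first-stage flow-matching model.
    # Its trajectories are x(t) = x0 + (x0 + 1) * t**2, i.e. curved in time,
    # and they end at x1 = 2 * x0 + 1.
    return 2.0 * t * (x + 1.0) / (1.0 + t * t)


def make_reflow_pairs(velocity, n_pairs, n_steps, seed=0):
    """Step 1 of the two-step idea: sample noise x0, solve the ODE accurately,
    and store the (noise, output) pairs produced by the first-stage model."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        x0 = rng.gauss(0.0, 1.0)
        pairs.append((x0, euler_sample(velocity, x0, n_steps)))
    return pairs


def reflow_targets(pairs, seed=0):
    """Step 2: regression examples (x_t, t, target) for retraining, where x_t
    lies on the straight line between x0 and x1 and the target is x1 - x0."""
    rng = random.Random(seed)
    examples = []
    for x0, x1 in pairs:
        t = rng.random()
        x_t = (1.0 - t) * x0 + t * x1
        examples.append((x_t, t, x1 - x0))
    return examples


def straight_velocity(x, t):
    # Assumption: the closed-form field that fits the targets above for this
    # toy coupling (x1 = 2 * x0 + 1). A real model would have to learn it.
    return (x + 1.0) / (1.0 + t)


if __name__ == "__main__":
    x0 = 0.5
    print("curved, 1 step:    ", euler_sample(curved_velocity, x0, 1))
    print("curved, 1000 steps:", euler_sample(curved_velocity, x0, 1000))
    print("straight, 1 step:  ", euler_sample(straight_velocity, x0, 1))
```

In this toy, a single Euler step on the curved field does not move the sample at all (the velocity is zero at t=0), while a single step on the straightened field lands on the same endpoint that the curved field only reaches with many steps. The architectural route described above is complementary: it makes each step cheaper rather than reducing the number of steps.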

Hope that helps.

Shivam
