Implement BigVGAN #52

deviandice · 2023-03-03T04:21:28Z

deviandice commented

2023-03-03 04:21:28 +00:00

I implemented BigVGAN over here using your fork as a base. It's way better. Go implement it. Credit would be appreciated. Just throw this model file in models/tortoise.
https://disk.yandex.com/d/fOjzTs8HQiFVdg
https://github.com/deviandice/tortoise-tts-BigVGAN

deviandice changed title from ~~BigVGAN~~ to Implement BigVGAN

2023-03-03 04:22:35 +00:00

mrq commented

2023-03-03 04:43:20 +00:00

Naisu, I'll play around with it whenever I get a chance to (probably tomorrow evening).

~~If you don't mind, to make my life a little easier (and it'll retain credit to you in the commit history), can you fork the mrq/tortoise-tts repo, apply your changes to it, then do a pull request? It's not a big deal, I can probably figure out what to slap back in.~~

Ah, it seems fairly simple to re-implement. I'll play around with it in a separate branch, then merge it.

mrq commented

2023-03-03 06:59:44 +00:00

Very nice, implemented in mrq/tortoise-tts commit aca32a71f7, and added a toggle (default enabled) in commit 740b5587df.

In some of my comparisons there's definitely a noticeable improvement, but in others it's only slightly perceptible. For example, the treble isn't so bad with it enabled (but you really have to tune your ears to it):

with BigVGAN: https://files.catbox.moe/e53ngv.wav
without BigVGAN: https://files.catbox.moe/81nxcr.wav

I know it's not a giant improvement, but it's another nice QoL uplift.
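
For anyone who wants a rough number to go with the listening test, here is a minimal sketch of a crude treble measure. It is not part of the tortoise-tts code: `read_mono_samples`, `treble_ratio`, and `sine` are hypothetical helper names made up for illustration. It assumes the clips are 16-bit PCM WAV files; Python's `wave` module does not handle floating-point WAV data, so a clip saved that way would need converting first.

```python
import math
import struct
import wave


def read_mono_samples(path):
    """Read a 16-bit PCM WAV file; return (samples in [-1, 1), sample rate).

    Assumption: 16-bit PCM only. For multi-channel files, only the first
    channel is kept.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        channels = wav.getnchannels()
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    values = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return [v / 32768.0 for v in values[::channels]], rate


def treble_ratio(samples):
    """Energy of the first difference divided by energy of the signal.

    The first difference acts as a crude high-pass filter, so a larger
    ratio means relatively more high-frequency content.
    """
    energy = sum(s * s for s in samples)
    if energy == 0.0:
        return 0.0
    diff_energy = sum((b - a) ** 2 for a, b in zip(samples, samples[1:]))
    return diff_energy / energy


def sine(freq_hz, rate, count):
    """Synthetic tone for sanity-checking the metric (not real speech)."""
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(count)]
```

Running `treble_ratio(read_mono_samples(path)[0])` on each downloaded clip gives two numbers to compare. A single ratio like this says nothing about whether the treble sounds better or worse, only how much high-frequency energy is there relative to the whole signal, so it is a supplement to listening rather than a replacement.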

mrq closed this issue

2023-03-03 06:59:44 +00:00

mrq added the enhancement label 2023-03-03 07:00:06 +00:00

deviandice commented

2023-03-03 10:12:17 +00:00

Thanks for implementing this so quickly, and it's pretty neato that it's having a noticeable effect.
