CVVP latents missing #80

yqxtqymn · 2023-03-07T07:35:42Z

yqxtqymn commented

2023-03-07 07:35:42 +00:00

Since a recent update, the code won't calculate CVVP latents.

Requesting weighing against CVVP weight, but voice latents are missing some extra data. Please regenerate your voice latents.

Tried regenerating latents, a CVVP weight of 1, and both bigvgan and univnet; all produce the same problem.

mrq commented

2023-03-07 13:24:24 +00:00

Per the wiki (https://git.ecker.tech/mrq/ai-voice-cloning/wiki/Settings#settings):

Slimmer Computed Latents: falls back to the original, 12.9KiB way of storing latents (without the extra bits required for using the CVVP model).

You shouldn't even be using the CVVP.

mrq closed this issue

2023-03-07 13:24:24 +00:00

yqxtqymn commented

2023-03-07 18:20:03 +00:00

Why shouldn't I use CVVP? I had some good generations with it in the 0.8-1 range.

mrq commented

2023-03-07 19:17:41 +00:00

Per https://github.com/neonbjb/tortoise-tts#v24-2022517:

Removed CVVP model. Found that it does not, in fact, make an appreciable difference in the output.

Per https://nonint.com/2022/04/25/tortoise-architectural-design-doc/:

CVVP’s contribution to Tortoise is minor. It could be entirely ommitted and you would still be left with a very good TTS program. I do not have a way to quantify its contribution to Tortoise, but I have subjectively been able to tell outputs that were generated with CVVP in the picture versus those that were generated without CVVP.

Per my own findings, it increases generation time for little to no gain. I tried it in many of my early tests looking for quality uplifts and found none. If it works for you, great, but it's not something I'll endorse (for lack of a better term).

Anyways, I've made the message more detailed; it was written before I added the setting that governs exporting the data for comparing against the CVVP. The setting isn't on by default because of mrq/tortoise-tts#10 (it greatly increases output size if embedding metadata + latents is enabled, which it is by default).
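
For anyone landing here later, a rough sketch of the kind of check that produces this message. This is not the repo's actual code: `VoiceLatents`, `cvvp_data`, and `check_cvvp_request` are hypothetical names, and the extra hint at the end of the message is illustrative wording. It only assumes what is described above, namely that latents saved in the slimmer format lack the extra data the CVVP needs.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class VoiceLatents:
    """Hypothetical container; the real project stores latents differently."""
    conditioning: Sequence[float]
    # Extra data needed to weigh candidates against the CVVP model.
    # None stands in for latents saved in the slimmer format.
    cvvp_data: Optional[Sequence[float]] = None


def check_cvvp_request(latents: VoiceLatents, cvvp_weight: float) -> None:
    """Raise if CVVP weighing is requested but the latents cannot support it."""
    if cvvp_weight > 0 and latents.cvvp_data is None:
        # Illustrative wording, not the project's actual message.
        raise ValueError(
            "Requesting weighing against CVVP weight, but voice latents are "
            "missing some extra data. Please regenerate your voice latents "
            "with 'Slimmer Computed Latents' disabled."
        )
```

With a CVVP weight of 0 the check passes either way; with a nonzero weight it only passes once the latents have been regenerated in the full format.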

yqxtqymn commented

2023-03-07 20:19:13 +00:00

I appreciate the detailed response. Was not aware support got dropped in the original Tortoise.

CVVP's impact is small, but it's nice to have and play around with when you're trying to get a generation just right.

As far as I can tell CVVP is completely broken since yesterday's commit. If it's a small change to get it functional again (even if not officially endorsed) then I would consider it worth it. Of course it's up to you whether or not you wish to carry it into the future.

If you'd like I could give it a go and see if I can get it working?

mrq commented

2023-03-07 21:16:48 +00:00

Slimmer Computed Latents: falls back to the original, 12.9KiB way of storing latents (without the extra bits required for using the CVVP model).

Disable the setting.

yqxtqymn commented

2023-03-07 22:03:20 +00:00

I tried it before and it wasn't working for some reason. Anyway it's working now. Thanks! :)
