Is there any way to save the voice from a random generation? #264

wiznat · 2023-06-12T21:45:49Z

wiznat commented

2023-06-12 21:45:49 +00:00

I noticed that in the original TTTS repo the author notes:

Random voice:
For those in the ML space: this is created by projecting a random vector onto the voice conditioning latent space.

Full disclosure, I am not well versed in ML, but is there any hacky way to save whatever this random vector is and create a voice with it?

If so, where in the code does the random generation happen?

mrq commented

2023-06-12 23:10:10 +00:00

If you have `Embed Output Metadata` enabled in settings, the latents used for that generation are "embedded" in the resulting sound file. You can load that file in `Utilities > Import / Analyze` and the web UI can rip the latents back out, and you can then place them in a new folder under `./voices/`.

However, if I remember right, they're rather *sensitive* and may not sound all that similar across generations, so your mileage will vary if you're rerolling the dice for a new voice and want to keep using it.
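
For the last step (putting the ripped latents into their own folder), here is a minimal sketch using only the Python standard library. The function name `install_voice_latents` and the voice name in the usage line are hypothetical. The sketch keeps whatever file name the exported latents already have, since this thread does not say which file name the web UI expects.

```python
import shutil
from pathlib import Path


def install_voice_latents(latents_file, voice_name, voices_dir="./voices"):
    """Copy an exported latents file into its own folder under voices_dir."""
    source = Path(latents_file)
    if not source.is_file():
        raise FileNotFoundError(f"latents file not found: {source}")

    # One folder per voice under ./voices/, as described in the comment above.
    voice_folder = Path(voices_dir) / voice_name
    voice_folder.mkdir(parents=True, exist_ok=True)

    # Assumption: the original file name is kept unchanged.
    target = voice_folder / source.name
    shutil.copy2(source, target)
    return target
```

Usage would look something like `install_voice_latents("exported_latents.pth", "my_random_voice")`, where both arguments are placeholders for your own file and voice name.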

FrioGlakka commented

2023-06-14 15:19:53 +00:00

I can testify that the saved random voices aren't "solid", or whatever you'd call it.

What I mean is that I had some nice random voices saved, and sometimes they would change gender when generating audio.

So your best bet would probably be to generate as much audio as possible from the random voice you like, and then use those clips as a custom voice to generate latents from.
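
A rough sketch of the clip-gathering part, assuming the generated clips are `.wav` files somewhere under a results folder and that a custom voice is just a folder of clips under `./voices/`. The function name `collect_clips_as_voice`, the folder names, and the renumbering scheme are all made up for illustration; generating the latents from the clips is left to the web UI.

```python
import shutil
from pathlib import Path


def collect_clips_as_voice(results_dir, voice_name, voices_dir="./voices"):
    """Copy every .wav clip found under results_dir into voices_dir/voice_name."""
    # Build the full list first, so copies made below are never picked up again.
    clips = sorted(Path(results_dir).rglob("*.wav"))
    if not clips:
        raise ValueError(f"no .wav clips found under {results_dir}")

    voice_folder = Path(voices_dir) / voice_name
    voice_folder.mkdir(parents=True, exist_ok=True)

    copied = []
    for index, clip in enumerate(clips):
        # Renumber the copies so same-named clips from different subfolders do not collide.
        target = voice_folder / f"{voice_name}_{index:04d}.wav"
        shutil.copy2(clip, target)
        copied.append(target)
    return copied
```

It is probably worth listening to the clips first and deleting the ones where the voice drifted (the gender flips, for example) before collecting them, so the custom voice is built only from the takes you actually like.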
