Speed Increase? #374

maki6003 · 2023-09-09T11:15:25Z

maki6003 commented

2023-09-09 11:15:25 +00:00

Is there any way to get a good result at a better speed?

The quality I get from Bark is broken and meh, but its speed is pretty decent.

I wish there was a way to get 11labs speeds, or even Bark speed but with TTS inference.
It's too slow to generate a single audio clip, and I'm on a 3090. It can take over 200 seconds to get one output that I won't even like.

Unless something is wrong on my end, it's just very slow.

Also, is there a way to make Bark better?

drew commented

2023-09-11 16:54:55 +00:00

Check out issue #363 (https://git.ecker.tech/mrq/ai-voice-cloning/issues/363).

Basically, you want to use low-quality params and then run the output through RVC. This gives the best speed + quality in my testing. Also check out this video: https://www.youtube.com/watch?v=IcpRfHod1ic
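
For anyone following along, the two-stage idea (fast, rough TTS first, then RVC on the result) can be sketched roughly like this. It is only an illustration: `fake_tts` and `fake_rvc` are hypothetical stand-ins I made up so the snippet runs on its own, not functions from ai-voice-cloning or RVC, and the list of floats is just a placeholder for a waveform.

```python
from dataclasses import dataclass
from typing import Callable, List

# Placeholder type for a mono waveform (an assumption for this sketch).
Audio = List[float]


@dataclass
class TwoStagePipeline:
    # Stage 1: text -> rough speech, e.g. a TTS model on a fast preset.
    synthesize: Callable[[str], Audio]
    # Stage 2: audio -> audio, e.g. a voice-conversion model such as RVC.
    convert: Callable[[Audio], Audio]

    def run(self, text: str) -> Audio:
        rough = self.synthesize(text)  # quick, lower-quality pass
        return self.convert(rough)     # clean-up / voice conversion pass


# Hypothetical stand-ins so the sketch runs without any model installed.
def fake_tts(text: str) -> Audio:
    # One "sample" per word; a real TTS stage would return actual audio.
    return [0.1 * len(word) for word in text.split()]


def fake_rvc(audio: Audio) -> Audio:
    # Pretend conversion: a real stage would re-voice the input audio.
    return [sample * 0.5 for sample in audio]


pipeline = TwoStagePipeline(synthesize=fake_tts, convert=fake_rvc)
result = pipeline.run("hello there")
```

The point of keeping the stages as two separate callables is that each one can be swapped or tuned on its own: the first only has to be fast and intelligible, and the second only ever sees audio, never text.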

Qual commented

2023-11-11 02:34:39 +00:00

> check out this issue here #363
>
> basically you wanna use low quality params and then use rvc. This gives you the best speed+quality in my testing. Also check out this video https://www.youtube.com/watch?v=IcpRfHod1ic

There is something I'm not sure I understand yet, and I hope you can enlighten me.
If I finetune a model in ai-voice-cloning, I can generate "ready for RVC" audio files, and my finetuned model already does a good job on those. But I can't use my finetuned model directly in RVC, right? Do I need to train another model using RVC itself? (It's 15 hours of audio.)

So I need two models: one for ai-voice-cloning, so that even the "very-fast" preset produces good enough output, and one trained on the same dataset in RVC to do its inference job on that output? That means the two models do completely different jobs: one "creates audio from text" and the other does "audio processing on audio"?

I'm sorry if this sounds stupid. I'm learning all of this on the fly, and there are so many things to learn just to get started.
