Speed Increase? #374

Open
opened 2023-09-09 11:15:25 +00:00 by maki6003 · 2 comments

Is there any way to get good output at a better speed?

The quality I get from Bark is broken and meh, but its speed is pretty decent.

I wish there were a way to get ElevenLabs speeds, or even Bark speed but with TTS inference.
It's too slow to generate one audio clip, and I'm on a 3090. It can take over 200 seconds to get one output that I won't even like.

Unless something is wrong, it's just very slow.

Also, is there a way to make Bark better?


Check out issue #363: https://git.ecker.tech/mrq/ai-voice-cloning/issues/363

Basically, you want to use low-quality parameters and then run the result through RVC. This gives the best balance of speed and quality in my testing. Also check out this video: https://www.youtube.com/watch?v=IcpRfHod1ic
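
A rough sketch of the two-stage idea, in case it helps picture it: a fast, low-quality text-to-speech pass followed by a voice-conversion pass over the resulting audio. Everything named here (`Audio`, `synthesize_then_convert`, `fake_fast_tts`, `fake_voice_conversion`) is hypothetical and only illustrates the data flow; none of it is the actual ai-voice-cloning or RVC API, and the stand-in stages do no real audio work.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Audio:
    """Minimal stand-in for an audio clip (hypothetical type)."""
    samples: List[float]
    sample_rate: int


# Hypothetical stage signatures. Stage 1 maps text to audio,
# stage 2 maps audio to audio. These are not real library APIs.
TextToSpeech = Callable[[str], Audio]
VoiceConversion = Callable[[Audio], Audio]


def synthesize_then_convert(text: str, tts: TextToSpeech, vc: VoiceConversion) -> Audio:
    """Run a fast TTS pass, then hand its output to voice conversion."""
    draft = tts(text)   # quick, rough-quality speech
    return vc(draft)    # second pass reworks the draft audio


def fake_fast_tts(text: str) -> Audio:
    """Placeholder TTS: emits one silent sample per character."""
    return Audio(samples=[0.0] * len(text), sample_rate=22050)


def fake_voice_conversion(audio: Audio) -> Audio:
    """Placeholder conversion: returns a copy of the input audio."""
    return Audio(samples=list(audio.samples), sample_rate=audio.sample_rate)


if __name__ == "__main__":
    result = synthesize_then_convert("hello", fake_fast_tts, fake_voice_conversion)
    print(len(result.samples), result.sample_rate)
```

The point of keeping the stages as separate callables is that the first one can be tuned for speed while the second one is responsible for the final voice quality.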


> Check out issue #363.
>
> Basically, you want to use low-quality parameters and then run the result through RVC. This gives the best balance of speed and quality in my testing. Also check out this video: https://www.youtube.com/watch?v=IcpRfHod1ic

There is something I'm not sure I understand yet, and I hope you can enlighten me.
If I finetune a model in ai-voice-cloning, I can generate "ready for RVC" audio files, and my finetuned model already does a good job at that. But I can't use my finetuned model directly in RVC, right? Do I need to train another model in RVC itself? (It's 15 hours of audio.)

So I need two models: one for ai-voice-cloning, so that even the "very-fast" preset produces good enough output, and one trained on the same dataset in RVC to do its inference on that output? Does that mean the two models do completely different jobs, one "creating audio from text" and the other "processing audio into audio"?

I'm sorry if this sounds stupid. I'm learning this on the fly, and there are so many things to learn just to get started.

Reference: mrq/ai-voice-cloning#374