Actually, I just realized it's $0.20/hr: $10 buys 100 compute units, and a T4 uses 2 compute units per hour, so that's 50 hours. $10/50 = $0.20/hr.
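For anyone who wants to redo that math with different numbers, here's a quick sketch. The figures are the ones from my post ($10 pack, 100 units, 2 units per hour on a T4); the function name and parameters are just made up for illustration, and the burn rate may differ for other GPUs or change over time.

```python
def hourly_cost(pack_price_usd, units_in_pack, units_per_hour):
    """Return (cost per hour in USD, total hours) for a prepaid compute pack."""
    hours = units_in_pack / units_per_hour   # how long the pack lasts
    return pack_price_usd / hours, hours     # price spread over those hours


# Numbers from the post: $10 pack, 100 units, T4 burning 2 units/hour.
rate, hours = hourly_cost(10.0, 100, 2)
print(f"{hours:.0f} hours at ${rate:.2f}/hr")
```

Swap in a different `units_per_hour` to compare GPU tiers.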
Actually, I'm not sure what tech clonedub is built on. Meta and 11labs both have similar capabilities now, I guess.
What a journey this has been so far! I can now hear bad recording practices in every piece of audio I listen to, particularly on YouTube... There is a new web app, clonedub.com (using what I assume is Google's…
I am able to run multiple instances on Google Colab, using the same model stored on Google Drive.
I originally was doing this but for my purposes the speed was not adequate and too…
Well, I don't have any examples on hand to share, but, truthfully, an untrained ear won't hear the nuances anyway. If you consume AI vocal content like I do, then you've probably picked up on the…
Upon further testing, I've found much of the poorer audio quality is due to phasing in the output, which is a kind of audio artifact in this case because of the natural variance in voice…
There's a relatively new TTS called Balacoon, aimed at low end devices. I tried it out on my desktop and it was faster than RT.
How was the quality?
The quality is fine. It comes…
It's all going to depend on what the needs of the audio are. As mentioned previously, my use case is audiobooks, which is a far cry from everyday AI raps and meme content on YouTube (which it's…
The bottleneck is largely at the sample generation, afaik. Higher quality outputs necessarily require more inference time; that's the precise trade-off, and cutting corners, other than…
There's a relatively new TTS called Balacoon, aimed at low end devices. I tried it out on my desktop and it was faster than RT. I'm not sure to what degree everything is open source, but the dev…
Ran into this in the Colab and needed to reclone the repo. When everything is getting set up, there's a link in the modules folder that pulls dlas from another repo. It may not have been able to…
One thing I haven't tried (yet) is combining models and latents from different speakers. Can I get the prosody of one speaker in the voice of another?
Simple audio remastering tools: vocalremover.org (great for first pass); Adobe Enhance Speech (great but tends to distort pitch); Castofly (better than Adobe imo); UVR (best for removing…
I've found it best when running into CUDA memory allocation errors to just restart everything. In fact, I run into that issue mostly when trying to do multiple tasks within the same session (i.e.…
If you have a large dataset, go to your dataset and rename the audio folder so it doesn't get seen by the UI. Select 10-50 audio samples from the dataset's audio folder and put these in the voices…
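If you'd rather not pick the samples by hand, something like this could do the selection step. It's only a sketch: the function name, the extension list, and both folder paths are placeholders I made up, so point them at wherever your renamed audio folder and your voices folder actually live. It picks files at random, which may not be what you want if some clips are much cleaner than others.

```python
import random
import shutil
from pathlib import Path

# Assumed set of audio extensions; adjust to whatever your dataset uses.
AUDIO_EXTS = {".wav", ".mp3", ".flac"}


def copy_sample(src_dir, dst_dir, n=25, seed=0):
    """Copy up to n randomly chosen audio files from src_dir into dst_dir."""
    src, dst = Path(src_dir), Path(dst_dir)
    files = sorted(
        p for p in src.iterdir()
        if p.is_file() and p.suffix.lower() in AUDIO_EXTS
    )
    # Seeded so the same subset is picked on every run.
    picked = random.Random(seed).sample(files, min(n, len(files)))
    dst.mkdir(parents=True, exist_ok=True)
    for p in picked:
        shutil.copy2(p, dst / p.name)   # originals stay where they are
    return [p.name for p in picked]


# Example call (both paths are hypothetical):
# copy_sample("my_dataset/audio_hidden", "my_voice_folder", n=30)
```

Keep `n` somewhere in the 10-50 range mentioned above.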