helloitsme
  • Joined on 2023-05-25
helloitsme commented on issue mrq/ai-voice-cloning#273 2023-06-23 06:42:07 +00:00
Maybe dumb question; Could running multiple instances share the same models/resources?

Actually, I just realized it's $0.20/hr: $10 buys 100 compute units, and a T4 uses 2 units per hour, so that's 50 hours... $10/50 = $0.20/hr
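
The arithmetic in that comment can be checked with a short sketch. The figures ($10, 100 units, 2 units per hour on a T4) are taken from the comment as written; the function name `hourly_cost` is made up for illustration, and actual Colab pricing and consumption rates may differ.

```python
def hourly_cost(price_usd: float, units_purchased: float, units_per_hour: float) -> float:
    """Return the effective cost per hour of runtime.

    All inputs are the commenter's figures, not official pricing.
    """
    # Hours of runtime the purchase covers: 100 units / 2 units per hour = 50 hours.
    hours = units_purchased / units_per_hour
    # Price spread over those hours: $10 / 50 hours.
    return price_usd / hours


t4_rate = hourly_cost(price_usd=10.0, units_purchased=100.0, units_per_hour=2.0)
print(f"${t4_rate:.2f}/hr")  # prints "$0.20/hr"
```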

helloitsme commented on issue mrq/ai-voice-cloning#277 2023-06-23 06:35:54 +00:00
Google Translatotron 3: Speech to Speech Translation with Monolingual Data

Actually, I don't know for sure what tech is underneath clonedub. I guess Meta and 11labs both have similar capabilities now.

helloitsme opened issue mrq/ai-voice-cloning#277 2023-06-21 19:33:53 +00:00
Google Translatotron 3: Speech to Speech Translation with Monolingual Data
helloitsme commented on issue mrq/ai-voice-cloning#253 2023-06-21 19:21:45 +00:00
Results, Retrospectives, and Recommendations

What a journey this has been so far! I can now hear bad recording practices in every piece of audio I hear, particularly on YouTube... There is a new web app, clonedub.com (using what I assume is Google's…

helloitsme commented on issue mrq/ai-voice-cloning#273 2023-06-21 18:45:17 +00:00
Maybe dumb question; Could running multiple instances share the same models/resources?

I am able to run multiple instances on Google Colab, using the same model stored on Google Drive.

I originally was doing this but for my purposes the speed was not adequate and too…

helloitsme commented on issue mrq/ai-voice-cloning#273 2023-06-20 21:23:12 +00:00
Maybe dumb question; Could running multiple instances share the same models/resources?

I am able to run multiple instances on Google Colab, using the same model stored on Google Drive.

helloitsme commented on issue mrq/ai-voice-cloning#253 2023-06-17 23:26:40 +00:00
Results, Retrospectives, and Recommendations

Well, I don't have any examples on hand to share, but, truthfully, an untrained ear won't hear the nuances anyway. If you consume AI vocal content like I do, then you've probably picked up on the…

helloitsme commented on issue mrq/ai-voice-cloning#253 2023-06-15 12:15:51 +00:00
Results, Retrospectives, and Recommendations

Upon further testing, I've found much of the poorer audio quality is due to phasing in the output, which is a kind of audio artifact in this case because of the natural variance in voice…

helloitsme commented on issue mrq/ai-voice-cloning#152 2023-06-15 01:17:25 +00:00
VALL-E Integration (and In Response To TorToiSe: a Quick Retrospective)

There's a relatively new TTS called Balacoon, aimed at low-end devices. I tried it out on my desktop and it was faster than real time.

How was the quality?

The quality is fine. It comes…

helloitsme commented on issue mrq/ai-voice-cloning#253 2023-06-15 01:12:07 +00:00
Results, Retrospectives, and Recommendations

It's all going to depend on what the needs of the audio are. As mentioned previously, my use case is audiobooks, which is a far cry from everyday AI raps and meme content on YouTube (which it's…

helloitsme commented on issue mrq/ai-voice-cloning#183 2023-06-09 22:50:39 +00:00
generating voice clip is so much slower compared to using original Tortoise TTS

The bottleneck is largely at the sample generation, afaik. Because higher quality outputs necessarily require more inference time, that's the precise trade-off, and cutting corners, other than…

helloitsme commented on issue mrq/ai-voice-cloning#152 2023-06-06 12:28:58 +00:00
VALL-E Integration (and In Response To TorToiSe: a Quick Retrospective)

There's a relatively new TTS called Balacoon, aimed at low-end devices. I tried it out on my desktop and it was faster than real time. I'm not sure to what degree everything is open source, but the dev…

helloitsme commented on issue mrq/ai-voice-cloning#254 2023-06-05 22:05:11 +00:00
No module named 'dlas'

Ran into this in the Colab; I needed to reclone the repo. When everything is getting set up, there's a link in the modules folder that pulls dlas from another repo. It may not have been able to…

helloitsme commented on issue mrq/ai-voice-cloning#253 2023-06-05 09:10:03 +00:00
Results, Retrospectives, and Recommendations

One thing I haven't tried (yet) is combining models and latents from different speakers. Can I get the prosody of one speaker in the voice of another?

helloitsme commented on issue mrq/ai-voice-cloning#253 2023-06-04 04:21:57 +00:00
Results, Retrospectives, and Recommendations

Simple audio remastering tools: vocalremover.org (great for first pass); Adobe Enhance Speech (great but tends to distort pitch); Castofly (better than Adobe imo); UVR (best for removing…

helloitsme opened issue mrq/ai-voice-cloning#253 2023-06-04 00:11:53 +00:00
Results, Retrospectives, and Recommendations
helloitsme opened issue mrq/ai-voice-cloning#251 2023-05-26 10:11:17 +00:00
"Iterations" when generating
helloitsme commented on issue mrq/ai-voice-cloning#244 2023-05-25 22:23:06 +00:00
Step by step data prep and training/finetuning guide?

I've found it best, when running into CUDA memory allocation errors, to just restart everything. In fact, I run into that issue mostly when trying to do multiple tasks within the same session (i.e.…

helloitsme commented on issue mrq/ai-voice-cloning#225 2023-05-25 22:07:21 +00:00
Requesting tips to make inference as fast as possible

If you have a large dataset, go to your dataset and rename the audio folder so it doesn't get seen by the UI. Select 10-50 audio samples from the dataset's audio folder, put these in the voices…
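
The sample-selection step described above could be scripted roughly as follows. This is a minimal sketch, not part of the project: the function name `copy_random_samples`, the `*.wav` pattern, and both directory arguments are assumptions, and because the comment is cut off before naming the full destination, the target folder is left for the caller to supply.

```python
import random
import shutil
from pathlib import Path


def copy_random_samples(src_dir, dst_dir, count=30, pattern="*.wav", seed=None):
    """Copy a random subset of audio files from src_dir into dst_dir.

    src_dir and dst_dir are placeholders: point them at the renamed
    dataset audio folder and at whichever voice folder the UI reads.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    files = sorted(src.glob(pattern))  # sorted so a fixed seed is reproducible
    rng = random.Random(seed)
    # Never ask for more files than exist.
    chosen = rng.sample(files, min(count, len(files)))
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for f in chosen:
        target = dst / f.name
        shutil.copy2(f, target)  # copy, so the original dataset stays intact
        copied.append(target)
    return copied


# Example call (hypothetical paths):
# copy_random_samples("dataset/audio_hidden", "path/to/voice_folder", count=30)
```

A `count` between 10 and 50 matches the range suggested in the comment; copying rather than moving keeps the full dataset available for later training runs.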