"Iterations" when generating #251

Open
opened 2023-05-26 10:11:17 +00:00 by helloitsme · 1 comment

I read in the wiki that Iterations are for improving the actual sound quality of your output audio... However, what is it actually doing to achieve that? Is it just noise reduction? I ask because I notice it skews toward more tinny-sounding output.

Owner

From what I remember, the "iterations" setting in the web UI determines how many steps the diffusion sampler runs on the codes output by the AR (desu the names I went with are a bit confusing). My understanding was that it only really affects the actual sound quality, since it determines what goes into the final waveform, but that was with the old vocoder.
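To make the "number of steps" idea concrete, here is a toy sketch. It is not the repo's diffusion sampler: `toy_sampler`, `max_error`, `step_size`, and the sample `codes` values are all hypothetical stand-ins. It only shows the general shape of an iterative refinement loop, where each extra step moves the output a bit closer to what the conditioning implies and the later steps change less and less.

```python
import random


def toy_sampler(codes, iterations, seed=0, step_size=0.1):
    """Toy stand-in for a diffusion sampler (NOT the real one).

    Starts from seeded random noise and, on every step, nudges the
    estimate a fixed fraction of the way toward the conditioning values.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in codes]  # initial noise
    for _ in range(iterations):
        # One refinement step: close step_size of the remaining gap.
        x = [xi + step_size * (ci - xi) for xi, ci in zip(x, codes)]
    return x


def max_error(codes, iterations, seed=0):
    """Largest remaining gap between the toy output and the conditioning."""
    out = toy_sampler(codes, iterations, seed)
    return max(abs(o - c) for o, c in zip(out, codes))


# Hypothetical conditioning values standing in for the AR output.
codes = [0.2, -0.5, 0.9, 0.0]

if __name__ == "__main__":
    for n in (5, 30, 60, 200):
        print(n, max_error(codes, n))
```

In this toy the remaining error shrinks by a constant factor per step, so going from 5 to 60 steps matters far more than going from 60 to 200. Whether the real sampler behaves the same way depends on its scheduler and on the vocoder.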

In my sparse generations after swapping the vocoder with BigVGAN, it seems that a high "iterations" value isn't all that necessary.


In other words, these days, as long as you are using the default vocoder setting (BigVGAN), you shouldn't need to stress about setting "iterations" as high as possible. I think I just leave it at 60, and it sounds decent enough compared to other values with the same seed. VoiceFixer has been failing me more and more, with some odd crackle at the end.

Reference: mrq/ai-voice-cloning#251