All finetuned models are unstable when synthesizing lengthy content #224


pheonis · 2023-05-01T07:11:23Z

pheonis commented

2023-05-01 07:11:23 +00:00

I discovered an intriguing phenomenon while working on fine-tuned AI language models: When attempting to produce lengthier text using models previously shared on this platform, I observed instability in their performance. The majority of outputs generated contained
1: noticeable "artifacts,"
2: repetitions, or garbled sound following sentences.
3: Sometimes sentences were completely omitted from synthesis.

It's almost as though these systems struggle with generating coherent long-form content without losing clarity.

While this is the case with the finetuned models, interestingly the autoregressive model worked flawlessly without any of the above issues.

I don't know what can be done so that we get models that are as close as possible to the autoregressive model and produce outputs without any of the issues mentioned.

Also worth mentioning: all the finetuned models that I trained have a mel loss ce value between 0.2 and 0.8 and were trained for 100 to 300 epochs.


pheonis changed title from ~~All finetuned models are unstable when creting lengthy content~~ to All finetuned models are unstable when creating lengthy content

2023-05-01 07:11:40 +00:00

pheonis changed title from ~~All finetuned models are unstable when creating lengthy content~~ to All finetuned models are unstable when synthesizing lengthy content

2023-05-01 07:12:08 +00:00

FrioGlakka commented

2023-05-01 10:04:08 +00:00

Not a solution, but might be a temporary "workaround":

I've noticed that adding the line delimiter character as often as possible (after every period or comma) essentially turns your large request into a lot of small requests. For my use case, this works great.


👍 1

pheonis commented

2023-05-01 12:21:32 +00:00

> Not a solution, but might be a temporary "workaround":
>
> I've noticed that adding the line delimiter character as often as possible (after every period or comma) essentially turns your large request into a lot of small requests. For my use case, this works great.

What are you doing to achieve this? The line delimiter by default is set to "\n" in the repo. Do you change this?


FrioGlakka commented

2023-05-02 20:34:04 +00:00

> What are you doing to achieve this? The line delimiter by default is set to "\n" in the repo. Do you change this?

I did not change it; it's still "\n".
But I make sure to add a new line after every period or comma (the default line delimiter "\n" means new line).

This way, longer texts get cut up into smaller texts and are combined into one wav automatically.
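For anyone who wants to automate this instead of editing the text by hand, here is a rough sketch of the preprocessing step. The function names `add_line_breaks` and `split_into_requests` are made up for illustration and are not part of the repo; the only thing taken from this thread is that the delimiter is "\n" and that breaks go after every period or comma.

```python
import re


def add_line_breaks(text: str, delimiter: str = "\n") -> str:
    """Put the line delimiter after every period or comma that is followed by whitespace.

    Hypothetical helper, not part of the repo. Requiring trailing whitespace
    keeps numbers such as "0.2" in one piece.
    """
    # Keep the punctuation mark and swap the whitespace after it for the delimiter.
    return re.sub(r"([.,])\s+", lambda m: m.group(1) + delimiter, text.strip())


def split_into_requests(text: str, delimiter: str = "\n") -> list[str]:
    """Return the short pieces that a long text would be synthesized as."""
    broken = add_line_breaks(text, delimiter)
    return [piece for piece in broken.split(delimiter) if piece.strip()]


if __name__ == "__main__":
    long_text = "This is a long paragraph, with several clauses. It keeps going, and going."
    # Paste the result into the text box instead of the original paragraph.
    print(add_line_breaks(long_text))
```

Splitting after every comma makes the pieces very short, which may affect prosody; dropping the comma from the pattern (`[.]` instead of `[.,]`) would split on sentences only.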

