training model from scratch #381

phoenix7 · 2023-09-13T16:45:51Z

phoenix7 commented

2023-09-13 16:45:51 +00:00

How long does it take to train the autoregressive network (autoregressive.pth) from scratch, rather than fine-tuning the pre-trained model? Is it feasible with more than 1,000 hours of data?

mrq commented

2023-09-13 17:38:40 +00:00

I haven't tested training a TorToiSe model from scratch. For reference, [the original repo's README](https://github.com/neonbjb/tortoise-tts#training) says:

> These models were trained on my "homelab" server with 8 RTX 3090s over the course of several months. They were trained on a dataset consisting of ~50k hours of speech data

You will always be better off just fine-tuning the existing weights, and from my experiments with VALL-E, you *need* a lot of hours of speech and unique speakers for a serviceable zero-shot model.
