training model from scratch #381

Open
opened 2023-09-13 16:45:51 +00:00 by phoenix7 · 1 comment

How long does it take to train the autoregressive network (autoregressive.pth) from scratch, rather than fine-tuning the pre-trained model? Is it feasible with more than 1000 hours of data?

Owner

I haven't tested training a TorToiSe model from scratch. For reference, [the original repo's README](https://github.com/neonbjb/tortoise-tts#training) says:

> These models were trained on my "homelab" server with 8 RTX 3090s over the course of several months. They were trained on a dataset consisting of ~50k hours of speech data

You will always be better off fine-tuning the existing weights. From my experiments with VALL-E, you *need* many hours of speech and many unique speakers to get a serviceable zero-shot model.
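To put the README figures in perspective, here is a rough back-of-envelope sketch. The README only says "several months", so the month counts below are illustrative placeholders rather than reported values, and the helper names `gpu_hours` and `data_fraction` are made up for this example. It only converts the quoted hardware into GPU-hours and compares 1000 hours of audio against the ~50k hours quoted above; it does not predict how long a real run would take.

```python
HOURS_PER_MONTH = 30 * 24  # rough month length in hours (assumption)


def gpu_hours(num_gpus: int, months: float) -> float:
    """GPU-hours if `num_gpus` cards run continuously for `months` months."""
    return num_gpus * months * HOURS_PER_MONTH


def data_fraction(own_hours: float, reference_hours: float = 50_000) -> float:
    """Size of your dataset relative to the ~50k hours quoted in the README."""
    return own_hours / reference_hours


if __name__ == "__main__":
    # "Several months" is not specified, so try a few hypothetical durations.
    for months in (2, 3, 6):
        total = gpu_hours(8, months)
        print(f"8 GPUs for {months} months: about {total:,.0f} GPU-hours")

    frac = data_fraction(1000)
    print(f"1000 h is {frac:.0%} of the ~50k h dataset")
```

Even under the shortest placeholder duration, the original run is on the order of ten thousand GPU-hours, and a 1000-hour dataset is a small fraction of what the original models were trained on, which is consistent with the advice to fine-tune instead.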

Reference: mrq/ai-voice-cloning#381