Very Bad Training Time & Results. #403

Open
opened 2023-10-05 23:01:55 +07:00 by FortermalGreek · 2 comments

I tried to train a model using a 7-minute audio file: 500 epochs, with a batch size of 110 and a gradient accumulation size of 110.

The estimated wait time was 3 days, which is really weird: I have seen others with similar specs to my PC whose training only took a few hours, or even a few minutes.

I ended up waiting the 3 days, and the resulting model was extremely bad.

I played a lot with the fine-tuning settings, and it stayed really bad. I reinstalled the software multiple times and even factory-reset my PC; nothing changed.

Google Colab seems to work fine, but Colab is by nature really slow: it takes 11 hours for the same dataset (ironically, still way faster). The results seemed somewhat good on Colab.

Is there any other fork of tortoise that may work, or what else should I do?
Any help is appreciated, here are my specs:

  • Intel Core i5-9600K @ 3.70GHz
  • 16GB RAM
  • RTX 3070
  • Windows 10

This might be related to #399, where using the "latest" drivers is actually a detriment when training close to max VRAM usage. I'd suggest either:

  • downgrading your drivers
  • reducing your batch size
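As a back-of-the-envelope sketch of why reducing batch size matters here: a 7-minute dataset sliced into short clips yields on the order of a hundred samples, so a batch size of 110 combined with a gradient accumulation size of 110 asks for far more samples per optimizer step than the dataset even contains. The clip length below is an assumption for illustration, not the actual DLAS slicing logic:

```python
# Illustrative sanity check (assumed ~5-second average clips;
# actual counts depend on how the transcription slices the audio).
dataset_minutes = 7
avg_clip_seconds = 5
samples = dataset_minutes * 60 // avg_clip_seconds

batch_size = 110
grad_accum = 110
effective_batch = batch_size * grad_accum  # samples consumed per optimizer step

print(samples)          # ~84 clips in the whole dataset
print(effective_batch)  # 12100 -- orders of magnitude larger than the dataset
```

With numbers like these, dropping the batch size (and gradient accumulation) to something well below the dataset size is the usual first fix, independent of any driver issue.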

@mrq I tried downgrading to NVIDIA driver 531.79 and played a lot with the fine-tuning; I'm still really unhappy with the wait time.

I also tried using DL Art School (from GitHub) and the wait time was unusually long there too, so the issue must have to do with DLAS. Are there any alternatives to DLAS, or what else should I do? :c

Reference: mrq/ai-voice-cloning#403