utf-8 codec can't decode byte 0x81 in position 2 #166
Reference: mrq/ai-voice-cloning#166
I'm getting the following error when trying to start training.
This is likely caused by invalid UTF-8 characters in your train.txt or validation.txt files. Are you training a language other than English?
Nope, I'm currently training only English; I'll try German later. However, the error appears so quickly that I don't know whether it has even started reading the train.txt or validation.txt file.
Here's the full error when starting to train, maybe there's something else there.
If it helps, I could also post the train.txt and validation.txt files; however, I can't imagine what the non-UTF-8 character would be.
If you have Notepad++ you can open up those two files, then go to Encoding>Convert to UTF8, save them and see if there's any difference.
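If Notepad++ is not at hand, the same check can be sketched in Python. This is a minimal sketch, not part of ai-voice-cloning: the helper name `find_invalid_utf8` is hypothetical, and the file names are assumed to be relative to the current directory, so adjust them to wherever your train.txt and validation.txt actually live. It reports the offset and value of every byte that cannot be decoded as UTF-8.

```python
from pathlib import Path


def find_invalid_utf8(data: bytes) -> list:
    """Return (offset, byte value) for each position where UTF-8 decoding fails."""
    problems = []
    pos = 0
    while pos < len(data):
        try:
            # Decode the remainder; success means no further invalid bytes.
            data[pos:].decode("utf-8")
            break
        except UnicodeDecodeError as err:
            # err.start is relative to the slice, so shift it by pos.
            bad = pos + err.start
            problems.append((bad, data[bad]))
            # Skip the offending byte and keep scanning.
            pos = bad + 1
    return problems


if __name__ == "__main__":
    # Assumed paths: replace with the real locations of your dataset files.
    for name in ("train.txt", "validation.txt"):
        path = Path(name)
        if not path.exists():
            print(f"{name}: not found")
            continue
        problems = find_invalid_utf8(path.read_bytes())
        if not problems:
            print(f"{name}: valid UTF-8")
        for offset, value in problems:
            print(f"{name}: invalid byte 0x{value:02x} at offset {offset}")
```

If both files come back as valid UTF-8, the bytes that fail to decode are coming from somewhere other than these two files.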
Tried it, same outcome with the same error.
Somehow I missed the
[Training] [2023-03-23T01:40:04.035070] ModuleNotFoundError: No module named 'dlas'
bit above. You might need to re-run the setup script. If that doesn't fix it, I could try training with your dataset if it's small enough to upload.
Looks like it's working currently; at least a new, smaller sample worked (with only around 30 wav files).
What I did was run the setup again (setup-cuda.bat) and then update.bat. It downloaded some new files, so maybe that's what fixed it. It's working for now; I'll write again if everything is OK.
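To confirm that re-running the setup actually resolved the `No module named 'dlas'` error, a quick check along these lines may help. It is only a sketch: the helper name `is_importable` is hypothetical, and it assumes you run it with the same Python interpreter (for example, the project's virtual environment) that the training script uses.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be found by this interpreter."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # 'dlas' is the module named in the ModuleNotFoundError above.
    for name in ("dlas",):
        status = "found" if is_importable(name) else "MISSING"
        print(f"{name}: {status}")
```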
Well, it doesn't work, at least not now. The first try worked, but when I tried again, the utf-8 error came a bit later in the process, right before the training begins.
#159
https://git.ecker.tech/mrq/ai-voice-cloning/wiki/Issues#local_state-k-v-grad_accum_step-indexerror-list-index-out-of-range
Training seems to run, but I got this.
It said it finished training. Is it done? And why did I get the utf-8 error again?
Does it happen with other python packages?
I don't know. How could I test it? Sadly, I'm not very familiar with Python.
If you check the
training/<voice name>/finetune/models/
directory, is there a 800_gpt.pth
file there? Can you use it to generate samples?
In that folder there are 129 files with that structure.
I can generate samples with the voice. However, I have around 40 minutes of clean voice samples; I transcribed and sliced everything and trained on it (with the error above). I can generate samples from the result, but despite playing around with the iterations, samples and temperature, the output is really bad, so I think something went wrong during training.