modified conversion scripts to not care about bitrate and formats, since torchaudio.load handles all of that anyway and it all gets resampled regardless

mrq 2023-02-15 04:44:14 +00:00
parent 2e777e8a67
commit b721e395b5
3 changed files with 3 additions and 2 deletions

@@ -195,6 +195,7 @@ You'll be presented with a bunch of options in the default `Generate` tab, but d
- this is a very tricky setting to suggest, as there's not necessarily a go-to solution
+ some samples seem to work best if it's just one whole chunk
+ other voices seem to work better if i split it up more
+ I'm *very* sure the best way to go about it is for it to compute latents per sentence, then average, but that's tedious.
- the best advice is to just play around with it a bit; pick the lowest chunk size you can make, and if a voice doesn't quite replicate right, increase the chunk count.
* `(Re)Compute Voice Latents`: regenerates a voice's cached latents.
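
The "compute latents per sentence, then average" idea from the added line could look roughly like the sketch below. This is not the project's code: `compute_latent` is a hypothetical stand-in for whatever turns one audio chunk into a latent vector, and latents are modelled as plain lists of floats purely for illustration.

```python
from typing import Callable, List, Sequence


def average_latents(
    chunks: Sequence[Sequence[float]],
    compute_latent: Callable[[Sequence[float]], List[float]],
) -> List[float]:
    """Compute one latent per chunk, then return their element-wise mean.

    `compute_latent` is a hypothetical callback, not a real API.
    """
    latents = [compute_latent(chunk) for chunk in chunks]
    if not latents:
        raise ValueError("need at least one chunk")

    dim = len(latents[0])
    if any(len(latent) != dim for latent in latents):
        raise ValueError("all latents must have the same length")

    # zip(*latents) groups the i-th component of every latent together.
    return [sum(component) / len(latents) for component in zip(*latents)]
```

With one chunk per sentence, every sentence contributes equally to the final latent no matter how long it is, which is the point of averaging instead of feeding everything in as one big chunk.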

@@ -1,4 +1,4 @@
@echo off
rm .\in\.gitkeep
rm .\out\.gitkeep
-for %%a in (".\in\*.*") do ffmpeg -i "%%a" -ar 22050 -ac 1 -c:a pcm_f32le ".\out\%%~na.wav"
+for %%a in (".\in\*.*") do ffmpeg -i "%%a" -ac 1 ".\out\%%~na.wav"

convert/convert.sh Normal file → Executable file
@@ -1 +1 @@
-for a in $(find "in/" -maxdepth 1 -not -name '.gitkeep' -type f); do ffmpeg -i "$a" -ar 22050 -ac 1 -c:a pcm_f32le "out/$(basename $a).wav"; done
+for a in $(find "in/" -maxdepth 1 -not -name '.gitkeep' -type f); do ffmpeg -i "$a" -ac 1 "out/$(basename $a).wav"; done
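
For context on why the `-ar 22050` and `-c:a pcm_f32le` flags can go: a rough sketch of the load-then-resample step the commit message refers to. This is an illustration, not the project's actual loading code. The function name is made up, and the 22050 Hz target is an assumption taken from the removed `-ar 22050` flag.

```python
# Assumption: target rate mirrors the old `-ar 22050` flag; the rate the
# model actually resamples to may differ.
TARGET_SAMPLE_RATE = 22050


def load_mono_resampled(path, target_sr=TARGET_SAMPLE_RATE):
    """Load any supported audio file as a mono float32 tensor at target_sr.

    Hypothetical helper for illustration only.
    """
    # Imported lazily so this sketch can be read and imported without
    # torchaudio installed.
    import torchaudio

    # With default arguments, torchaudio.load returns a float32 tensor of
    # shape (channels, frames) and the file's own sample rate.
    waveform, source_sr = torchaudio.load(path)

    # Downmix to mono in case the file was not converted with `-ac 1`.
    if waveform.size(0) > 1:
        waveform = waveform.mean(dim=0, keepdim=True)

    # Resample only when the source rate differs from the target.
    if source_sr != target_sr:
        waveform = torchaudio.functional.resample(waveform, source_sr, target_sr)

    return waveform
```

Since the sample format and rate get normalized at load time like this, the scripts only need ffmpeg to produce a readable mono `.wav`.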