modified conversion scripts to stop forcing a sample rate and sample format, since torchaudio.load handles those and everything gets resampled anyway

2023-02-15 04:44:14 +00:00 · b721e395b5
commit b721e395b5
parent 2e777e8a67
3 changed files with 3 additions and 2 deletions
--- a/README.md
+++ b/README.md
@@ -195,6 +195,7 @@ You'll be presented with a bunch of options in the default `Generate` tab, but d
 	- this is a very tricky setting to suggest, as there's not necessarily a go-to solution
 		+ some samples seem to work best if it's just one whole chunk
 		+ other voices seem to work better if i split it up more
+		+ I'm *very* sure the best way to go about it is for it to compute latents per sentence, then average, but that's tedious.
 	- the best advice is to just play around with it a bit; pick the lowest chunk size you can make, and if a voice doesn't quite replicate right, increase the chunk count.
 * `(Re)Compute Voice Latents`: regenerates a voice's cached latents.

--- a/convert/convert.bat
+++ b/convert/convert.bat
@@ -1,4 +1,4 @@
 @echo off
 rm .\in\.gitkeep
 rm .\out\.gitkeep
-for %%a in (".\in\*.*") do ffmpeg -i "%%a" -ar 22050 -ac 1 -c:a pcm_f32le ".\out\%%~na.wav"
+for %%a in (".\in\*.*") do ffmpeg -i "%%a" -ac 1 ".\out\%%~na.wav"
--- a/convert/convert.sh
+++ b/convert/convert.sh
@@ -1 +1 @@
-for a in $(find "in/" -maxdepth 1 -not -name '.gitkeep' -type f); do ffmpeg -i "$a" -ar 22050 -ac 1 -c:a pcm_f32le "out/$(basename $a).wav"; done
+for a in $(find "in/" -maxdepth 1 -not -name '.gitkeep' -type f); do ffmpeg -i "$a" -ac 1 "out/$(basename $a).wav"; done
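
The `convert.sh` loop above splits on whitespace, so a file such as `in/my sample.mp3` would likely be passed to ffmpeg in pieces. Below is a minimal sketch of a whitespace-safe variant, not the repository's script. The function names `out_path` and `convert_all` are hypothetical, and stripping the original extension from the output name is an assumption made to match what `%%~na` does in `convert.bat` (the shell script in the diff keeps the extension). It assumes bash and that it is run from the `convert/` directory.

```shell
#!/usr/bin/env bash

# Hypothetical helper: map an input path to its output .wav path,
# dropping the original extension (assumed, to mirror %%~na in convert.bat).
out_path() {
  local name
  name="$(basename "$1")"
  printf 'out/%s.wav\n' "${name%.*}"
}

# Hypothetical wrapper: convert every regular file in in/ (except .gitkeep)
# to mono WAV, leaving sample rate and sample format at ffmpeg's defaults.
convert_all() {
  find in/ -maxdepth 1 -type f -not -name '.gitkeep' -print0 |
    while IFS= read -r -d '' src; do
      # -nostdin keeps ffmpeg from consuming the file list on stdin.
      ffmpeg -nostdin -i "$src" -ac 1 "$(out_path "$src")"
    done
}
```

Calling `convert_all` runs the conversion; `out_path` can be checked on its own without ffmpeg installed.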