forked from mrq/tortoise-tts
modified conversion scripts to not give a shit about bitrate and formats since torchaudio.load handles all of that anyways, and it all gets resampled anyways
parent 2e777e8a67
commit b721e395b5
@ -195,6 +195,7 @@ You'll be presented with a bunch of options in the default `Generate` tab, but d
- this is a very tricky setting to suggest, as there's not necessarily a go-to solution
+ some samples seem to work best if it's just one whole chunk
+ other voices seem to work better if i split it up more
+ I'm *very* sure the best way to go about it is for it to compute latents per sentence, then average, but that's tedious.
- the best advice is to just play around with it a bit; pick the lowest chunk size you can make, and if a voice doesn't quite replicate right, increase the chunk count.
* `(Re)Compute Voice Latents`: regenerates a voice's cached latents.
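The "compute latents per chunk, then average" idea mentioned in the list above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the project's implementation: `split_into_chunks`, `average_latents`, and the `toy_encode` stand-in are hypothetical names, and a real latent encoder would return model-specific tensors rather than two-element lists.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def split_into_chunks(samples: Sequence[float], chunk_count: int) -> List[Sequence[float]]:
    """Split samples into at most chunk_count contiguous, nearly equal chunks."""
    if chunk_count < 1 or len(samples) == 0:
        raise ValueError("need at least one chunk and one sample")
    size = -(-len(samples) // chunk_count)  # ceiling division
    return [samples[i:i + size] for i in range(0, len(samples), size)]


def toy_encode(chunk: Sequence[float]) -> Vector:
    """Hypothetical stand-in for a latent encoder: [mean, peak] of the chunk."""
    return [sum(chunk) / len(chunk), max(abs(x) for x in chunk)]


def average_latents(
    samples: Sequence[float],
    chunk_count: int,
    encode_chunk: Callable[[Sequence[float]], Vector] = toy_encode,
) -> Vector:
    """Encode each chunk separately, then average the latents element-wise."""
    latents = [encode_chunk(chunk) for chunk in split_into_chunks(samples, chunk_count)]
    dim = len(latents[0])
    return [sum(latent[i] for latent in latents) / len(latents) for i in range(dim)]
```

With this shape, a higher chunk count means each latent is computed from a shorter slice of audio before averaging, which is one plausible reason different voices respond differently to the setting.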
@ -1,4 +1,4 @@
@echo off
rm .\in\.gitkeep
rm .\out\.gitkeep
-for %%a in (".\in\*.*") do ffmpeg -i "%%a" -ar 22050 -ac 1 -c:a pcm_f32le ".\out\%%~na.wav"
+for %%a in (".\in\*.*") do ffmpeg -i "%%a" -ac 1 ".\out\%%~na.wav"
2
convert/convert.sh
Normal file → Executable file
@ -1 +1 @@
-for a in $(find "in/" -maxdepth 1 -not -name '.gitkeep' -type f); do ffmpeg -i "$a" -ar 22050 -ac 1 -c:a pcm_f32le "out/$(basename $a).wav"; done
+for a in $(find "in/" -maxdepth 1 -not -name '.gitkeep' -type f); do ffmpeg -i "$a" -ac 1 "out/$(basename $a).wav"; done
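As a side note, the shell loop above word-splits the output of `find`, so input names containing spaces would likely break. A rough Python equivalent of the new behaviour, assuming the same `in/` and `out/` layout, might look like the sketch below. `build_commands` is a hypothetical helper and not part of the repository; only the `-i` and `-ac 1` arguments are taken from the script, and the output name mirrors the shell version by keeping the original extension before `.wav`.

```python
from pathlib import Path
from typing import List


def build_commands(in_dir: str = "in", out_dir: str = "out") -> List[List[str]]:
    """Build one ffmpeg argument list per regular file in in_dir, skipping .gitkeep."""
    commands = []
    for src in sorted(Path(in_dir).iterdir()):
        if not src.is_file() or src.name == ".gitkeep":
            continue
        # Mirrors "out/$(basename $a).wav": the original extension is kept.
        dst = Path(out_dir) / (src.name + ".wav")
        # Downmix to mono only; sample rate and codec are left to ffmpeg defaults.
        commands.append(["ffmpeg", "-i", str(src), "-ac", "1", str(dst)])
    return commands


# Example use (requires ffmpeg on PATH):
#   import subprocess
#   for cmd in build_commands():
#       subprocess.run(cmd, check=True)
```

Passing each command as an argument list avoids shell quoting entirely, which is why filenames with spaces are safe here.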