|
69f140ba45
|
fix oversight with phonemizing French because espeak defines French as fr-fr instead of fr (even though Spain Spanish is es and not es-sp or some shit, but Portugal Portuguese is pt-pt)
|
2024-09-13 12:53:36 -05:00 |
|
|
eac353cd0b
|
busy work and cleanup while I wait for 1TB of audio to quantize... again.
|
2024-08-06 20:23:33 -05:00 |
|
|
491ae2a684
|
some insanity for sanity checks (some phonemes from phonemizing Japanese are not in my tokenizer...)
|
2024-07-22 00:30:40 -05:00 |
|
|
ad024f400f
|
actually pass language into dataset process script, fix coercing Japanese into hiragana because espeak does not like kanji
|
2024-07-21 23:21:37 -05:00 |
|
|
234f9efc6e
|
ugh
|
2024-06-09 11:39:43 -05:00 |
|
|
4f5c9e518a
|
actually use the passed-through sample rate from encode for DAC because it does its own resampling I guess
|
2024-04-18 13:32:41 -05:00 |
|
|
78378ed1ce
|
overhauled dataloading code to be marginally faster, mostly cleaned up, and can leverage a metadata json to help things out
|
2023-08-26 19:53:23 -05:00 |
|
|
608c1970eb
|
ops
|
2023-08-03 20:36:19 -05:00 |
|
|
bf8cedc9dd
|
Rewrite init
|
2023-08-02 21:53:35 +00:00 |
|