| 11fa3da665 | some cleanup, fixed the wrapper attention to explicitly use other sdpa backends | 2024-08-03 19:51:00 -05:00 |
| 9564ecda43 | wrapper attention class for other sdpa backends + xformers seems to have broke... | 2024-08-03 15:12:11 -05:00 |
| ad024f400f | actually pass language into dataset process script, fix coercing japanese into hiragana because espeak does not like kanji | 2024-07-21 23:21:37 -05:00 |
| 7b210d9738 | sanity cleanup | 2024-07-04 15:58:08 -05:00 |
| db62e55a38 | oops, I forgot to use the new thing for audio_backend | 2024-07-04 14:54:11 -05:00 |
| 7feeb944a0 | probably insane with even entertaining going this route | 2024-06-03 20:26:27 -05:00 |
| ddbacde0d1 | DAC just doesn't work well enough...... | 2024-05-25 11:07:52 -05:00 |
| 74e531d391 | ugh | 2024-05-18 12:02:56 -05:00 |
| 59ef9461f8 | ugh | 2024-05-18 10:13:58 -05:00 |
| d9aabfa3ae | final tweaks, hopefully, again | 2024-05-15 23:04:19 -05:00 |
| 2437a86efa | ugh | 2024-05-12 13:02:15 -05:00 |
| 4f1593c8db | a bunch of shit to salvage my old encodec-quantized audio because dac-encoded audio just does not want to converge | 2024-05-12 10:17:29 -05:00 |
| c6e0f905b5 | final tweaks (again) before training restarts | 2024-05-08 02:11:38 -05:00 |
| 8aa1b2dabf | documentation update | 2024-05-04 21:03:46 -05:00 |
| caad7ee3c9 | final tweaks, hopefully | 2024-04-28 22:28:29 -05:00 |
| ffc334cf58 | added dataset transcription helper script (now I don't ever have to touch ai-voice-cloning) (to-do: unify scripts into the module) | 2024-04-21 17:43:20 -05:00 |
| 071fb97777 | dataset preparation script updates, caved and am using HF tokenizer now | 2024-04-21 14:49:18 -05:00 |
| a8ffa88844 | it slipped my mind that technically DAC can be used at any sample rate, since it models waveforms; make it a config YAML option to allow this behavior | 2024-04-19 18:36:54 -05:00 |
| 00804a47e9 | Forgot to copy intermediary dataset conversion script | 2024-04-18 21:34:28 -05:00 |
| 4f5c9e518a | actually use the passed-through sample rate from encode for DAC because it does its own resampling I guess | 2024-04-18 13:32:41 -05:00 |
| 09cda7d3f9 | added sampling by speaker group name (might be better to de-emphasize the LibriVox/Audiobooks that are in large numbers, and emphasize the smaller pools), log cleanup | 2023-10-16 19:30:38 -05:00 |
| 2deb995cc9 | updated setup script | 2023-10-06 20:08:28 -05:00 |
| 1fd91b6437 | cleanup | 2023-10-06 10:13:54 -05:00 |
| 3db7e7dea1 | implicitly load checkpoint if deepspeed checkpoint not found, updated setup script to grab the diskcached dataloader things | 2023-10-06 10:02:45 -05:00 |
| 2f2505b12f | updated setup script | 2023-10-06 08:08:28 -05:00 |
| 153f8b293c | added min-x and min-y arguments to plot.py, helper script to download from my existing checkpoint | 2023-10-04 19:41:37 -05:00 |
| 5ac119a6e7 | added light web UI (need to port the telemetry disabling bandaids from aivc) | 2023-09-09 16:17:20 -05:00 |
| 4613781e23 | integrated plot script, added tts-c task token to help the model be able to mix between normal VALL-E and VALL-E continuous | 2023-09-02 16:29:53 -05:00 |
| f7e942ec99 | modified plotting script to be more agnostic to X | 2023-09-02 13:59:43 -05:00 |
| 21e5d250cc | fixed up plot script that I forgot about | 2023-09-02 13:31:04 -05:00 |
| 5c8694db8e | nasty bandaid if there's no validation dataset specified during training (for example, during finetunes) | 2023-08-30 18:23:05 -05:00 |
| 7b3be3d7bf | added helper scripts to process LibriTTS/LibriLight, detect duplicate speaker+books between them, and script to directly phonemize and quantize LibriTTS | 2023-08-26 10:21:12 -05:00 |
| bf8cedc9dd | Rewrite init | 2023-08-02 21:53:35 +00:00 |