ab0abd2b12
fixes fixes fixes (a quarter of my recently processed audio returned zeroed tensors......)
2025-02-22 09:07:33 -06:00

50506e5ebc
oops
2025-02-20 20:55:58 -06:00

fc1ec2019d
added option to buffer process jobs across multiple speakers to maybe squeeze out some throughput for vall_e.emb.process (in the event of lots of speakers with low file counts, such as Emilia)
2025-02-20 14:56:32 -06:00

ce1ca0124a
lol...
2025-02-20 13:40:36 -06:00

92139b6da9
additional cruft; added a note in the documentation to be aware of NUMA node topology when running vall_e.emb.process with more than one process
2025-02-18 19:56:30 -06:00

596c2df11c
added arg to skip processing speakers with not enough utterances, for whenever I get around to processing my subset of Emilia for nvidia/audio-codec-44khz (because Emilia has a ton of low-utterance-count speakers, and right now my focus with the nemo model is on getting it to actually speak without many problems rather than feeding it a gorillion speakers)
2025-02-18 10:49:21 -06:00

8331eee6fa
added arg to limit vall_e.emb.process batch size, since there are some speaker groups in LibriLight/Speech/whatever that have 10K utterances and I'm growing impatient
2025-02-18 10:19:17 -06:00

8f86cf0e4e
possible logic optimization so I don't spend another 15 minutes simply iterating back to the point I was at in vall_e.emb.process
2025-02-16 11:34:05 -06:00

c0b46b82eb
tweaks
2025-02-10 21:48:29 -06:00

d6a679ca5c
tweaks
2025-02-10 20:53:08 -06:00
276a2342a4
tweaks to the processing script
2025-02-10 19:18:13 -06:00

953015748f
ugh
2025-02-07 20:49:28 -06:00

ed94b261dc
could have sworn I had 'vall_e.emb.process --dtype' working; also a possible RAM optimization so I can stop locking up my server when firing off four encoding processes
2025-02-07 18:52:19 -06:00

67a9401cce
oops
2025-02-06 15:14:14 -06:00

712ce4af5d
maybe fixed errors with the DAC backend; added an option to limit by duration in emb.process (because I only really need short utterances right now and I'm not ready to spend a week on processing everything again)
2025-02-06 12:37:18 -06:00

299cc88821
re-added amp encoding/decoding for audio; possibly a bad idea to ignore using amp instead if requested
2025-02-05 21:55:06 -06:00

7592befc53
updated vall_e.emb.process to allow for batched processing, plus some typo fixes (it's painfully slow on my 7900XTX...)
2025-02-05 21:13:20 -06:00

bef43a0c18
added experimental entropix sampling support
2024-10-11 21:18:26 -05:00

52299127ab
fix vall_e.emb.process
2024-10-08 20:00:34 -05:00

0656a762af
fix vall_e.emb.transcriber
2024-10-08 19:24:43 -05:00
32287710a2
moved prints to use logger, edited readme (fused_attn doesn't seem stable for training)
2024-08-29 13:27:16 -05:00

79a6781c9e
fix vall_e.data --action=hdf5 to actually transcribe, because past me completely forgot I had already tried putting the transcribe/process dataset scripts inside the module before
2024-08-08 07:51:42 -05:00

613024ec0d
ugh
2024-08-06 20:35:15 -05:00

eac353cd0b
busy work and cleanup while I wait for 1TB of audio to quantize... again.
2024-08-06 20:23:33 -05:00

b6ba2cc8e7
tweaked vall_e.emb.process to process audio one file at a time instead of all the files for a given speaker, to avoid OOMing with --low-memory on systems with less memory
2024-08-06 14:24:40 -05:00

9710b06b74
tweaks and things
2024-08-06 08:17:25 -05:00

134dac8c2b
re-adapted process_libritts.py to a 'better' way (better because it processes without needing to shuffle a bunch of things and adapt to cope or something)
2024-08-05 20:34:58 -05:00

3f73fcca29
oops
2024-08-05 20:12:13 -05:00

597441e48b
moved the transcribe and process dataset scripts to vall_e/emb within the module itself; argparse-ified the transcription script
2024-08-05 19:40:50 -05:00