|
5ac119a6e7
|
added light web UI (need to port the telemetry disabling bandaids from aivc)
|
2023-09-09 16:17:20 -05:00 |
|
|
10c34c5b98
|
added a length-based decay factor for repetition penalty
|
2023-09-08 21:02:00 -05:00 |
|
|
14c78bae39
|
added lots of sampling options (top-k/top-p, repetition penalty, length penalty)
|
2023-09-08 20:30:54 -05:00 |
|
|
b2907ae7e0
|
seems that my PromEmbedding/RespEmbedding doesn't actually work all that well, naively using dedicated MultiEmbeddings for AR/NAR in the monolithic model is the best way to go
|
2023-09-08 01:03:24 -05:00 |
|
|
ab5134f385
|
tweaks and fixes
|
2023-09-07 17:08:38 -05:00 |
|
|
e40c0d34a0
|
somewhat got recurrent forward working (it's as accurate as chunkwise forward: it's not accurate at all), added option to use AMP instead of blanket setting the weight's dtype
|
2023-09-01 20:58:29 -05:00 |
|
|
6455a2f9d7
|
I think I fixed a bug?
|
2023-08-24 23:33:36 -05:00 |
|
|
4585824cd3
|
tweaks, including exporting on save/quit
|
2023-08-23 16:43:03 -05:00 |
|
|
524d289c9c
|
Forgot to re-add in setting the weight's dtype on model load
|
2023-08-22 22:57:23 -05:00 |
|
|
7b1b82e0e5
|
inferencing cleanup
|
2023-08-20 21:36:02 -05:00 |
|
|
a47029065b
|
I don't know if the lack of start/stop tokens being added was causing my inference tests to fail, but it seems better now
|
2023-08-20 19:21:54 -05:00 |
|
|
bbb0563b3d
|
pseudocode polyfill stub some other flavor of working on adding the tasks
|
2023-08-18 22:22:13 -05:00 |
|
|
ced31fd9b7
|
removed the sampler as it's very misleading
|
2023-08-18 14:47:48 -05:00 |
|
|
1e3e1d9315
|
tweaks
|
2023-08-15 21:58:16 -05:00 |
|
|
13571380be
|
made exporter make more sense
|
2023-08-13 22:56:28 -05:00 |
|
|
d7deaf6def
|
distributed training works now (hopefully)
|
2023-08-13 22:07:45 -05:00 |
|
|
d1b9770d41
|
set model to eval when inferencing (very important)
|
2023-08-05 04:29:05 +00:00 |
|
|
c85101403f
|
big cleanup
|
2023-08-03 20:26:36 -05:00 |
|
|
bf8cedc9dd
|
Rewrite init
|
2023-08-02 21:53:35 +00:00 |
|