Commit Graph

14 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| mrq | 9e65e05e83 | more windows specific fixes, limit gradio to <5.0.0 on linux (it works on windows, but not on my linux machine tm) | 2024-11-04 18:00:33 -06:00 |
| mrq | c83670c38c | Windows specific fixes (to-do: find libespeak-ng.dll automatically because it cannot be trusted to do it by default) | 2024-11-03 19:19:15 -06:00 |
| mrq | c09133d00f | added safetensors support (with metadata) and feed whatever torch.load/torch.save into it | 2024-08-03 23:15:20 -05:00 |
| mrq | ff97e7480d | fixed pip shitting itself on setup | 2024-06-13 13:03:36 -05:00 |
| mrq | da8242d086 | finally got around to removing omegaconf | 2024-06-07 20:23:53 -05:00 |
| mrq | 33b7f81b94 | small cleanups | 2024-05-04 22:37:22 -05:00 |
| mrq | cce929e136 | nasty hotfix for transformer's Mixtral throwing an error when batch sizes > 1 | 2024-01-26 19:41:12 -06:00 |
| mrq | 0db3203b21 | added LLaMA/Mixtral (if experts>1) model arches, utilize XMoE's loss as well, set MoE frequency to 1 to make every layer MoE'd for RetNet, etc. (going to do tests without burning out again to see how things go) | 2023-12-22 19:27:36 -06:00 |
| mrq | 12cfc9e502 | added prodigyopt as a dependency because I keep forgetting | 2023-10-04 19:42:56 -05:00 |
| mrq | c0b25541e3 | restructured some things with the model to remove dead weights | 2023-09-20 19:10:59 -05:00 |
| mrq | b6c9686f7d | Do not install DeepSpeed under Windows (to-do: default backend to use local if on Windows) | 2023-08-24 14:27:36 -05:00 |
| mrq | 2e03e5ac93 | Fixed an issue with having fairseq installed at all will brick logging | 2023-08-02 22:57:10 -05:00 |
| mrq | 0f9b81de75 | oops | 2023-08-02 18:12:36 -05:00 |
| mrq | bf8cedc9dd | Rewrite init | 2023-08-02 21:53:35 +00:00 |
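
Two of the entries above amount to platform-conditional dependencies: 9e65e05e83 caps gradio below 5.0.0 on Linux only, and b6c9686f7d skips DeepSpeed entirely on Windows. A minimal sketch of how such pins can be expressed with PEP 508 environment markers follows; the package name and the choice of `setup.py` (versus `pyproject.toml` or `requirements.txt`) are assumptions, not the repository's actual packaging layout.

```python
# Hypothetical setup.py excerpt; the real project's packaging may differ.
from setuptools import setup, find_packages

setup(
    name="vall-e",  # assumed project name
    packages=find_packages(),
    install_requires=[
        # Commit 9e65e05e83: cap gradio below 5.0.0 only on Linux,
        # leave it unpinned elsewhere (it reportedly works on Windows).
        "gradio<5.0.0; sys_platform == 'linux'",
        "gradio; sys_platform != 'linux'",
        # Commit b6c9686f7d: do not install DeepSpeed under Windows.
        "deepspeed; platform_system != 'Windows'",
    ],
)
```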
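
The to-do in c83670c38c (find libespeak-ng.dll automatically) could look roughly like the sketch below. It assumes phonemizer is the phonemization backend, that it honors the `PHONEMIZER_ESPEAK_LIBRARY` environment variable, and that eSpeak NG sits in its default Program Files install location; none of that is confirmed by the commit itself.

```python
# Hypothetical sketch: locate libespeak-ng.dll on Windows and point the
# phonemizer at it instead of trusting the default library lookup.
import os
import sys
from pathlib import Path


def find_espeak_dll() -> str | None:
    """Return the first libespeak-ng.dll found in the usual install locations."""
    if sys.platform != "win32":
        return None
    candidates = [
        Path(os.environ.get("ProgramFiles", r"C:\Program Files")) / "eSpeak NG" / "libespeak-ng.dll",
        Path(os.environ.get("ProgramFiles(x86)", r"C:\Program Files (x86)")) / "eSpeak NG" / "libespeak-ng.dll",
    ]
    for path in candidates:
        if path.is_file():
            return str(path)
    return None


dll = find_espeak_dll()
if dll and "PHONEMIZER_ESPEAK_LIBRARY" not in os.environ:
    # Only set it when the user has not already pointed at a specific build.
    os.environ["PHONEMIZER_ESPEAK_LIBRARY"] = dll
```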
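
Commit c09133d00f describes feeding whatever previously went through torch.save/torch.load into safetensors while carrying metadata along. A rough sketch of such a shim is below; the function names, the JSON-encoded metadata convention, and the fall-back-to-torch.save behavior are illustrative assumptions, not the repository's actual implementation (safetensors itself only stores flat dicts of tensors plus string-to-string metadata).

```python
# Hypothetical wrapper around torch.save/torch.load that prefers safetensors.
import json
import torch
from safetensors import safe_open
from safetensors.torch import load_file, save_file


def save_state(state: dict, path: str, metadata: dict | None = None) -> None:
    """Use safetensors when the payload is a flat dict of tensors, else fall back to torch.save."""
    if all(isinstance(v, torch.Tensor) for v in state.values()):
        # safetensors metadata must be str -> str, so JSON-encode anything richer.
        meta = {k: json.dumps(v) for k, v in (metadata or {}).items()}
        save_file({k: v.contiguous() for k, v in state.items()}, path, metadata=meta)
    else:
        torch.save({"state": state, "metadata": metadata}, path)


def load_state(path: str, device: str = "cpu") -> tuple[dict, dict]:
    """Load either format back, returning (state, metadata)."""
    if path.endswith(".safetensors"):
        state = load_file(path, device=device)
        with safe_open(path, framework="pt", device=device) as f:
            meta = {k: json.loads(v) for k, v in (f.metadata() or {}).items()}
        return state, meta
    payload = torch.load(path, map_location=device)
    return payload["state"], payload.get("metadata") or {}
```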