Commit Graph

7 Commits

Author | SHA1 | Message | Date
mrq | d1ad634ea9 | added japanese preprocessor for tokenizer | 2023-03-17 20:03:02 +00:00
mrq | af78e3978a | deduce if preprocessing text by checking the JSON itself instead | 2023-03-16 14:41:04 +00:00
mrq | 1f674a468f | added flag to disable preprocessing (because some IPAs will turn into ASCII, implicitly enable for using the specific ipa.json tokenizer vocab) | 2023-03-16 04:33:03 +00:00
Johan Nordberg | 491fe7f6d3 | Remove some assumptions about working directory. This allows cli tool to run when not standing in repository dir | 2022-05-29 01:10:19 +00:00
James Betker | 0570034eda | Automatically pick batch size based on available GPU memory | 2022-05-13 10:30:02 -06:00
James Betker | 00e84bbd86 | fix paths | 2022-05-02 20:56:28 -06:00
James Betker | 23a3d5d00b | Move everything into the tortoise/ subdirectory. For eventual packaging. | 2022-05-01 16:24:24 -06:00