Commit Graph

6 Commits

Author        SHA1        Message                            Date
James Betker  52410fd9d9  256-bpe tokenizer                  2021-12-25 08:52:08 -07:00
James Betker  e55d949855  GrandConjoinedDataset              2021-12-23 14:32:33 -07:00
James Betker  b9de8a8eda  More fixes                         2021-12-22 19:21:29 -07:00
James Betker  191e0130ee  Another fix                        2021-12-22 18:30:50 -07:00
James Betker  6c6daa5795  Build a bigger, better tokenizer   2021-12-22 17:46:18 -07:00
James Betker  c737632eae  Train and use a bespoke tokenizer  2021-12-22 15:06:14 -07:00