Commit Graph

2 Commits

Author SHA1 Message Date
James Betker
0dca36946f Hard Routing mods
- Turns out my custom convolution was RIDDLED with backwards bugs, which is
   why the existing implementation wasn't working so well.
- Implements the switch logic from both Mixture of Experts and Switch Transformers
  for testing purposes.
2021-02-02 20:35:58 -07:00
James Betker
96bc80313c Add switch norm, up dropout rate, detach selector 2021-01-26 09:31:53 -07:00