- Add LowDimRRDB; essentially a "normal RRDB" but the RDB blocks process at a low dimension using PixelShuffle
- Add switching wrappers around it
- Add support for switching on top of multi-headed inputs and outputs
- Moves PixelUnshuffle to arch_util
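
For reference, a minimal sketch of what a pixel-unshuffle operation does: each r x r spatial patch is folded into channels. It is written in plain Python over nested lists so it runs without dependencies. The function name `pixel_unshuffle` and the `[C][H][W]` list layout are assumptions for illustration, not the code that lives in arch_util; the channel ordering is meant to mirror torch.nn.PixelUnshuffle.

```python
def pixel_unshuffle(x, r):
    """Fold each r x r spatial patch into channels.

    x is a nested list shaped [C][H][W]; the result is shaped
    [C * r * r][H // r][W // r]. Illustrative only.
    """
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    if height % r or width % r:
        raise ValueError("height and width must be divisible by r")
    out = []
    for c in range(channels):
        for i in range(r):          # row offset inside the patch
            for j in range(r):      # column offset inside the patch
                out.append([
                    [x[c][h * r + i][w * r + j] for w in range(width // r)]
                    for h in range(height // r)
                ])
    return out


# One 4x4 channel becomes four 2x2 channels.
image = [[[0, 1, 2, 3],
          [4, 5, 6, 7],
          [8, 9, 10, 11],
          [12, 13, 14, 15]]]
folded = pixel_unshuffle(image, 2)
```

The output has a quarter of the spatial positions and four times the channels, so no pixel values are lost.
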
Renames AttentiveRRDB to SwitchedRRDB. Moves SwitchedConv to
an external repo (neonbjb/switchedconv). Switches RDB blocks instead
of conv blocks. Works well!
After doing some thinking and reading on the subject, it occurred to me that
I was treating the generator like a discriminator by focusing the network
complexity at the feature levels. It makes far more sense to process each conv
level equally for the generator, hence the FlatProcessorNet in this commit. This
network borrows some of the residual pass-through logic from RRDB, which makes
the gradient path exceptionally short for pretty much all model parameters, and
it can be trained in O1 optimization mode without overflows again.
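
A toy sketch of why residual pass-through helps the gradient, assuming scalar "layers" that just multiply by a weight. The helper names and the 0.2 residual scale are assumptions for illustration and are not taken from FlatProcessorNet. With the pass-through, the input-to-output gradient stays near 1 across ten blocks; without it, the gradient all but vanishes.

```python
def residual_block(x, weight, scale=0.2):
    # The block's own contribution is scaled down and added to the input,
    # so the input always has a direct path to the output.
    return x + scale * (weight * x)


def residual_stack(x, weights, scale=0.2):
    for w in weights:
        x = residual_block(x, w, scale)
    return x


def plain_stack(x, weights):
    # Same toy layers, but without the pass-through.
    for w in weights:
        x = w * x
    return x


def finite_difference(fn, x, eps=1e-6):
    # Numerical estimate of d fn / d x at x.
    return (fn(x + eps) - fn(x)) / eps


weights = [0.1] * 10  # hypothetical small layer weights
grad_residual = finite_difference(lambda x: residual_stack(x, weights), 1.0)
grad_plain = finite_difference(lambda x: plain_stack(x, weights), 1.0)
```

Here `grad_residual` comes out a little above 1, while `grad_plain` is on the order of 1e-10.
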