Got rid of the converged multiplexer bases but kept the configurable architecture. The
new multiplexers look a lot like the old ones.
Took some cues from the transformer architecture: translate the image to a higher filter-space
and stay there for the duration of the model's computation. Also perform convs after each
switch to allow the model to anneal issues that arise.
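
Rough sketch of that data flow in NumPy, not the actual model code. Everything named here is an assumption made for illustration: `Switch`, `SwitchedModel`, the channel count, the number of paths, the softmax-weighted multiplexer, and the residual add. It also uses 1x1 convs throughout to keep the example short; the real convs are presumably spatial.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1x1(x, w):
    # Pointwise conv: x is (C_in, H, W), w is (C_out, C_in) -> (C_out, H, W).
    return np.einsum("oc,chw->ohw", w, x)

def softmax(x, axis=0):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def make_weights(c_out, c_in):
    # Random weights stand in for trained parameters.
    return rng.normal(0.0, 1.0 / np.sqrt(c_in), size=(c_out, c_in))

class Switch:
    """Hypothetical switch: a multiplexer blends K candidate transforms per pixel."""

    def __init__(self, channels, num_paths):
        self.paths = [make_weights(channels, channels) for _ in range(num_paths)]
        self.mux = make_weights(num_paths, channels)   # multiplexer head
        self.post = make_weights(channels, channels)   # conv applied after the switch

    def __call__(self, x):
        sel = softmax(conv1x1(x, self.mux), axis=0)            # (K, H, W), sums to 1 over K
        cands = np.stack([np.maximum(conv1x1(x, w), 0.0)       # (K, C, H, W)
                          for w in self.paths])
        switched = (sel[:, None] * cands).sum(axis=0)          # (C, H, W)
        # Post-switch conv plus a residual add (the residual is an assumption).
        return x + conv1x1(switched, self.post)

class SwitchedModel:
    def __init__(self, in_channels=3, channels=16, num_switches=3, num_paths=4):
        self.lift = make_weights(channels, in_channels)     # image -> filter-space, done once
        self.switches = [Switch(channels, num_paths) for _ in range(num_switches)]
        self.project = make_weights(in_channels, channels)  # filter-space -> image, done once

    def __call__(self, image):
        x = conv1x1(image, self.lift)
        for switch in self.switches:
            x = switch(x)            # channel count never changes between switches
        return conv1x1(x, self.project)

model = SwitchedModel()
out = model(rng.normal(size=(3, 8, 8)))
print(out.shape)
```

The point is only the shape of the computation: one lift at the start, one projection at the end, and a conv following every switch so later layers can smooth over whatever the selection step gets wrong.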