diff --git a/codes/models/composable/README.md b/codes/models/composable/README.md
new file mode 100644
index 00000000..58fb318c
--- /dev/null
+++ b/codes/models/composable/README.md
@@ -0,0 +1,61 @@
+This directory contains code for my vision of how building machine learning models should actually work: models should be
+composed from primitives that are defined by your inputs and outputs, and agnostic to everything else.
+
+Building a composable model requires you to make a few decisions:
+
+1. What inputs can you provide? More is always better.
+2. What type of outputs do you expect?
+3. **Very roughly** how much compute do you want to throw at the problem?
+
+## Some basic concepts
+
+### Structure
+
+Before we go much further, I want to define an important notion of "structure", which is used throughout this document.
+"Structure" refers to the total size of the variable dimensions of your inputs and outputs, i.e. the product of their lengths.
+
+For example, a 256x256px input image has a "structural" dimension of 256*256=65,536.
+
+I use three classifications for structure in this document. 
+
+1. **Pointwise** data has a structural dimension of O(1).
+2. **Low structure** data has a structural dimension below roughly 1,000 *(at least for us mere mortals)*.
+3. **Highly structured** data has a structural dimension above roughly 1,000.
+
+These thresholds follow from power-law costs. Dense computation in machine learning is essential for good performance,
+but consumes O(N²) compute and memory in the structural dimension N (for the 256x256 image above, N² is about 4.3 billion). It is therefore impractical to perform dense computation on highly structured data,
+and we must reduce it first. Composable models take care of this for you, as long as you declare the structure of each
+input you provide.
+
+### Alignment
+
+
+## Building a composable model
+
+Composable models are built by simply defining your inputs, your outputs, and the compute you wish to expend. 
+
+Let's come up with a fairly preposterous toy problem to show the power of composable models. Say you have a dataset
+consisting of pictures of animals, a textual description of each animal, a label for each animal, and an audio clip
+of the sounds that animal was making when the picture was taken. You want to build a model that predicts the sound. Here
+is how you would do it:
+
+```python
+image_input = Input(structure=HighStructure, dimensions=2, compute=Medium)  # pictures of animals
+text_input = Input(structure=LowStructure, dimensions=1, discrete=True, compute=High)  # textual descriptions
+class_input = Input(structure=Point, discrete=True, compute=Medium)  # animal labels
+
+sound_output = Output(structure=HighStructure, dimensions=1)  # audio clip to predict
+
+model = UniversalModel(unaligned_inputs=[image_input, text_input, class_input], aligned_inputs=[], outputs=[sound_output], compute=Medium)
+```
+
+
+Once you've made these decisions, you can build a composable model.
+
+There are four "types" of composable models:
+1. Fan-in models, which reduce a highly structured input (e.g. images) to a less structured output. Applications are classifiers, coarse object detection and speech recognition.
+2. U models, which efficiently process highly structured inputs. Applications are generative models and fine object detection.
+3. Straight models, which perform dense computation on less structured data. Think text inputs and outputs. Applications are text generation.
+4. Fan-out models, which take low structure inputs and produce highly structured outputs. Applications are generative networks like GANs, though I recommend you actually use diffusion models for this purpose.
+
+A composable model consists of three parts:
\ No newline at end of file
diff --git a/codes/models/composable/__init__.py b/codes/models/composable/__init__.py
new file mode 100644
index 00000000..e69de29b