Scope
Reproduces the CECI (Cross-Encoded Component Interchange) splicing experiment: hot-swapping FFN layers between two independently UGT-trained models and measuring the resulting change in perplexity (PPL). The experiment validates the bilateral UGT requirement: component interchange succeeds only when both models share the same UGT taxonomic basis, which supports the claim that the basis encodes functional semantics.
Hardware target
- Two UGT-trained models at 135M scale (SmolLM2-135M).
- GPU: any CUDA 12.x with ≥4GB VRAM.
- Or CPU-only: ~5 minutes for the 135M splice test.
Prerequisites
- Python 3.10+, PyTorch 2.x, transformers.
- Two UGT-trained models (Phase 5, k=256, 100K steps).
- See scripts/e2e_pipeline.py for the full UGT training pipeline.
Step 1: Train two UGT models (if not already available)
python scripts/e2e_pipeline.py --model SmolLM2-135M --k 256 --steps 100000 \
--output-a ../models/ugt_model_a --output-b ../models/ugt_model_b
Step 2: Run the CECI splice test
python scripts/ceci_splice_test.py --model-a ../models/ugt_model_a \
--model-b ../models/ugt_model_b --layers 0,4,8,12,16,20,24,28
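The internals of scripts/ceci_splice_test.py are not shown here. As a rough illustration of what an FFN splice involves, the sketch below copies the FFN parameters for selected layers from one state dict into a copy of another. It assumes Llama-style parameter names of the form `model.layers.{i}.mlp.`, which is an assumption about the checkpoints, and `splice_ffn_state` is a hypothetical helper, not a function from the repository.

```python
from typing import Dict, Iterable, Mapping, TypeVar

T = TypeVar("T")


def splice_ffn_state(
    state_a: Mapping[str, T],
    state_b: Mapping[str, T],
    layers: Iterable[int],
    prefix_template: str = "model.layers.{i}.mlp.",  # assumed key layout
) -> Dict[str, T]:
    """Return a copy of state_a whose FFN entries at `layers` come from state_b."""
    # The trailing ".mlp." keeps layer 1 from also matching layers 10, 11, ...
    prefixes = tuple(prefix_template.format(i=i) for i in layers)
    spliced = dict(state_a)  # shallow copy; state_a itself is left unchanged
    for key in state_a:
        if key.startswith(prefixes):
            if key not in state_b:
                raise KeyError(f"donor model has no parameter named {key!r}")
            spliced[key] = state_b[key]
    return spliced


# Toy state dicts with strings standing in for weight tensors.
state_a = {
    "model.layers.0.self_attn.q_proj.weight": "A0-attn",
    "model.layers.1.mlp.up_proj.weight": "A1",
    "model.layers.10.mlp.up_proj.weight": "A10",
}
state_b = {
    "model.layers.0.self_attn.q_proj.weight": "B0-attn",
    "model.layers.1.mlp.up_proj.weight": "B1",
    "model.layers.10.mlp.up_proj.weight": "B10",
}

spliced = splice_ffn_state(state_a, state_b, layers=[1])
print(spliced["model.layers.1.mlp.up_proj.weight"])   # taken from model B
print(spliced["model.layers.10.mlp.up_proj.weight"])  # still from model A
```

With real checkpoints the values would be tensors, and the spliced dict could be loaded back into model A with PyTorch's `load_state_dict` before evaluating PPL. Attention weights, embeddings, and norms are deliberately left untouched so that only the FFN is exchanged.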
Step 3: Expected output
- Bilateral UGT (both models UGT-trained): 7/7 layers pass. Mean ΔPPL = -0.11 (slight improvement from component diversity).
- Unilateral UGT (one model UGT-trained, one vanilla): FFN transfer fails. ΔPPL > +50, confirming the bilateral requirement.
- Random splice (different layers): ΔPPL > +100, expected failure.
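The ΔPPL figures above can be read as the spliced model's perplexity minus the unspliced baseline's, so a negative value means the splice helped. The sketch below shows that arithmetic under this assumed sign convention, starting from per-token negative log-likelihoods. The function names and the `tolerance` pass threshold are hypothetical; the actual pass criterion used by the test script is not stated here.

```python
import math
from typing import Sequence


def perplexity(token_nlls: Sequence[float]) -> float:
    """Perplexity is exp of the mean per-token negative log-likelihood (nats)."""
    if not token_nlls:
        raise ValueError("need at least one token NLL")
    return math.exp(sum(token_nlls) / len(token_nlls))


def delta_ppl(baseline_nlls: Sequence[float], spliced_nlls: Sequence[float]) -> float:
    """Spliced PPL minus baseline PPL; negative means the splice improved PPL."""
    return perplexity(spliced_nlls) - perplexity(baseline_nlls)


def mean_delta(deltas: Sequence[float]) -> float:
    """Average ΔPPL over the spliced layers."""
    return sum(deltas) / len(deltas)


def layer_passes(delta: float, tolerance: float = 1.0) -> bool:
    """Hypothetical pass rule: degradation must stay within `tolerance` PPL points."""
    return delta <= tolerance


# Made-up NLL values, for illustration only.
baseline = [2.0, 2.2, 1.8]
spliced = [2.0, 2.1, 1.8]
d = delta_ppl(baseline, spliced)
print(f"delta PPL = {d:+.3f}, pass = {layer_passes(d)}")
```

Because perplexity is exponential in the mean NLL, small NLL shifts at high baseline loss produce large ΔPPL, which is worth keeping in mind when comparing the +50 and +100 failure thresholds.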
Validation
The key validation: FFN transfer works only when both models share the UGT basis. This indicates that the basis encodes functional semantics and is not just a compression artifact. The bilateral requirement is the central architectural claim of Paper XI (Universal Geodesic Taxonomy).