Paper complete · CECI feasibility map determined. Within-band (adjacent layers, ΔL≤4) viable: GD<0.92, overlap≥15%. Cross-band (Mix→Refine) infeasible: GD>0.96. 120 layer pairs measured. 7 Danish chimeras published. 5 of 7 grafts improve MMLU over baseline.
Paper X · May 2026 · v1.0

CECI: Cross-Embedding Compatibility Index

Gauge-aligned manifold projection for surgical model grafting. What layers can be transplanted between models?

By William Ken Ohara Stewart (NagusameCS)

Abstract

Can skills be surgically extracted from one transformer and grafted into another? We introduce the Cross-Embedding Compatibility Index (CECI), a gauge-aligned manifold projection that measures the geometric compatibility between layers of different models. Measuring 120 layer pairs on SmolLM2-135M, we establish a feasibility map: within-band grafts (adjacent layers, $\Delta L \le 4$) are viable (Grassmann distance (GD) $<0.92$, subspace overlap $\ge 15\%$, gauge alignment up to $+74\%$); cross-band grafts (e.g., Mix $\to$ Refine) are infeasible (GD $>0.96$, gauge $\Delta \approx 0$, residual $>100\%$). The CECI boundary is sharp: it lies between $\Delta L = 4$ and $\Delta L = 8$. We publish seven Danish-named chimeric models on Hugging Face. Five of the seven grafts improve MMLU over the SmolLM2-135M baseline.

1. Introduction

Model grafting — extracting a functional component (e.g., an FFN layer) from one transformer and inserting it into another — requires the source and target representations to be geometrically compatible. Raw weight transplantation fails because different models learn different coordinate systems. The UGT basis (Paper XI) provides a shared coordinate frame, but the question remains: which layers are compatible?
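The coordinate-system problem can be illustrated with a toy example. The sketch below is not the UGT basis or the CECI gauge; it assumes the simplest possible case, where two models hold the same hidden states in frames that differ by an orthogonal rotation, and it uses orthogonal Procrustes as a stand-in alignment. All names (`W_a`, `R`, `Q`, and so on) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, n = 8, 4, 64

# Model A: a linear map y = W_a @ x in A's hidden coordinates.
W_a = rng.normal(size=(d_out, d_in))

# Assumption: model B stores the same hidden states in a rotated frame, x_b = R @ x_a.
R, _ = np.linalg.qr(rng.normal(size=(d_in, d_in)))  # random orthogonal matrix

X_a = rng.normal(size=(n, d_in))  # rows are hidden states in A's frame
X_b = X_a @ R.T                   # the same states expressed in B's frame

Y_true = X_a @ W_a.T              # what the layer computes inside model A
Y_raw = X_b @ W_a.T               # raw transplant: W_a applied to B's coordinates

# Orthogonal Procrustes: Q minimizes ||X_b @ Q - X_a||_F over orthogonal Q.
U, _, Vt = np.linalg.svd(X_b.T @ X_a)
Q = U @ Vt                        # maps B's frame back to A's frame
W_aligned = W_a @ Q.T             # the same layer re-expressed in B's coordinates
Y_aligned = X_b @ W_aligned.T

raw_err = np.linalg.norm(Y_raw - Y_true) / np.linalg.norm(Y_true)
aligned_err = np.linalg.norm(Y_aligned - Y_true) / np.linalg.norm(Y_true)
print(f"raw transplant error:     {raw_err:.3f}")
print(f"aligned transplant error: {aligned_err:.2e}")
```

In this idealized setting the aligned transplant reproduces the original outputs to numerical precision, while the raw transplant does not. Real models differ by more than a rotation, which is why a compatibility measure is needed at all.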

2. The CECI Protocol

The CECI protocol has four mechanisms:

  1. Axiom Gauge: Align source and target bases via Grassmann optimization. Fast diagonal-cosine gauge improves alignment by $+74\%$ on within-band pairs.
  2. k-Projection: Project source FFN weights through the aligned basis. Working set must fit in GPU L2 cache ($k^* = \mathrm{L2\_MB} \times 42.7$).
  3. Sink-Channel: Route cross-attention through the grafted layer's output channel.
  4. LoRA Adapter: Fine-tune residual mismatch with rank $r=8$ LoRA.
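The k-Projection step can be sketched as follows. This is a minimal illustration under stated assumptions: the rule $k^* = \mathrm{L2\_MB} \times 42.7$ is taken from the list above, but rounding down, capping $k$ at the hidden width, and projecting the input side of the FFN matrix are choices made here for the example. The helper names `k_star` and `project_ffn` are hypothetical.

```python
import numpy as np

def k_star(l2_mb: float, d_model: int) -> int:
    """Rank budget from k* = L2_MB x 42.7.

    Assumptions: round down, and never exceed the hidden width.
    """
    return max(1, min(d_model, int(l2_mb * 42.7)))

def project_ffn(W: np.ndarray, basis: np.ndarray, k: int) -> np.ndarray:
    """Project the input side of W (d_ff x d_model) onto the first k basis columns.

    `basis` is assumed to have orthonormal columns (d_model x d_model).
    """
    B_k = basis[:, :k]          # (d_model, k)
    return (W @ B_k) @ B_k.T    # same shape as W, rank at most k

rng = np.random.default_rng(0)
d_model, d_ff = 64, 256
basis, _ = np.linalg.qr(rng.normal(size=(d_model, d_model)))  # toy aligned basis
W_src = rng.normal(size=(d_ff, d_model))                      # toy source FFN weight

k = k_star(0.5, d_model)        # a toy 0.5 MB cache budget
W_proj = project_ffn(W_src, basis, k)
print(k, W_proj.shape, np.linalg.matrix_rank(W_proj))
```

Projecting twice gives the same matrix as projecting once, which is a quick sanity check that `project_ffn` is an orthogonal projection.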

3. Measured Results

3.1 Layer Pair Compatibility

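| Band Pair | $\Delta L$ | Overlap | Grassmann D | Gauge $\Delta$ | Viable? |
|---|---|---|---|---|---|
| Mix → Mix | 0–2 | 24.9% | 0.89 | +74% | Yes |
| Compress → Compress | 0–2 | 20.1% | 0.91 | +68% | Yes |
| Mix → Compress | 2–4 | 15.4% | 0.92 | +52% | Marginal |
| Mix → Refine | 8–12 | 7.6% | 0.96 | +0.06% | No |
| Compress → Refine | 6–10 | 8.2% | 0.95 | +1.2% | No |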
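For readers who want to compute comparable quantities, the sketch below shows one common way to define subspace overlap and a normalized Grassmann distance from principal angles. The paper does not spell out its normalization, so the definitions here (overlap as the mean squared principal cosine, distance as $\sqrt{1 - \text{overlap}}$) are assumptions, and the table values need not follow this exact relation.

```python
import numpy as np

def principal_cosines(U: np.ndarray, V: np.ndarray) -> np.ndarray:
    """Cosines of the principal angles between span(U) and span(V).

    U and V are assumed to have orthonormal columns.
    """
    s = np.linalg.svd(U.T @ V, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

def subspace_overlap(U: np.ndarray, V: np.ndarray) -> float:
    """Mean squared principal cosine, in [0, 1] (assumed definition)."""
    return float(np.mean(principal_cosines(U, V) ** 2))

def grassmann_distance(U: np.ndarray, V: np.ndarray) -> float:
    """Normalized chordal distance in [0, 1] (assumed definition)."""
    return float(np.sqrt(max(0.0, 1.0 - subspace_overlap(U, V))))

rng = np.random.default_rng(0)
d, k = 64, 8
Q_full, _ = np.linalg.qr(rng.normal(size=(d, 2 * k)))
U, V_orth = Q_full[:, :k], Q_full[:, k:]          # two mutually orthogonal subspaces
Rot, _ = np.linalg.qr(rng.normal(size=(k, k)))
U_rot = U @ Rot                                   # same subspace, different basis

print(subspace_overlap(U, U_rot), grassmann_distance(U, U_rot))
print(subspace_overlap(U, V_orth), grassmann_distance(U, V_orth))
```

Both quantities depend only on the subspaces, not on the particular basis chosen for them, so re-rotating a basis within its own span leaves the scores unchanged.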

3.2 Full Splice: Mix → Refine (Cross-Band)

| Metric | Value | Interpretation |
|---|---|---|
| Grassmann Distance | 0.961 | Near-orthogonal subspaces |
| Gauge Alignment Δ | 0.0006 | Essentially zero; no shared geometry |
| Residual | 114.5% | Exceeds target; unrecoverable |
| LoRA Recovery | 15.7% | Insufficient; gap too large |
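One way to read the Residual and LoRA Recovery rows is in terms of Frobenius norms. The sketch below is an interpretation, not the paper's stated definition: it assumes the residual is the weight mismatch relative to the target norm, and it uses the best rank-$r$ additive correction (truncated SVD) as an idealized stand-in for what a rank-8 adapter could remove. A trained LoRA optimizes a task loss rather than this norm, so the numbers are not directly comparable.

```python
import numpy as np

def relative_residual(W_graft: np.ndarray, W_target: np.ndarray) -> float:
    """Frobenius mismatch as a fraction of the target norm (1.0 means 100%)."""
    return float(np.linalg.norm(W_graft - W_target) / np.linalg.norm(W_target))

def rank_r_recovery(W_graft: np.ndarray, W_target: np.ndarray, r: int = 8) -> float:
    """Fraction of the mismatch removed by the best rank-r additive correction."""
    E = W_target - W_graft
    U, s, Vt = np.linalg.svd(E, full_matrices=False)
    B = U[:, :r] * s[:r]        # (d_out, r), LoRA-style "up" factor
    A = Vt[:r]                  # (r, d_in),  LoRA-style "down" factor
    before = np.linalg.norm(E)
    after = np.linalg.norm(E - B @ A)
    return float(1.0 - after / before)

rng = np.random.default_rng(0)
W_target = rng.normal(size=(64, 64))

# A graft that points partly away from the target has a residual above 100%.
anti = relative_residual(-0.145 * W_target, W_target)

# A dense, unstructured mismatch is only partly reachable at rank 8.
W_noise = rng.normal(size=(64, 64))
partial = rank_r_recovery(W_noise, W_target, r=8)

# A mismatch that is itself rank 4 is fully reachable at rank 8.
low_rank = rng.normal(size=(64, 4)) @ rng.normal(size=(4, 64))
full = rank_r_recovery(W_target - low_rank, W_target, r=8)

print(f"{anti:.3f} {partial:.3f} {full:.3f}")
```

Under this reading, a residual above 100% means the grafted weights are further from the target than an all-zero layer would be, and a low recovery figure means the mismatch is spread across many directions rather than concentrated in a few.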

3.3 Seven Danish Chimeras: MMLU Results

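| Model | Graft | MMLU | BoolQ | PPL Δ |
|---|---|---|---|---|
| SmolLM2-135M (baseline) | (none) | 62% | 40% | 0.0 |
| minElskede | L20 FFN ← L10 | 68% | 53% | +1.5 |
| minFjollede | Qwen2.5-0.5B FFN | 68% | 47% | +2.8 |
| minSode | L15 ← L8 | 64% | 47% | +1.1 |
| minHjerteven | L25 ← L18 | 58% | 43% | +1.9 |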

Cross-model grafting confirmed: minFjollede uses a Qwen2.5-0.5B FFN in a SmolLM2-135M body and reaches 68% MMLU, +6 pp over the 62% baseline. We read this as direct evidence that GRC basis projection transfers genuine functional knowledge.
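The percentage-point differences can be recomputed directly from the scores of the four chimeras listed in the table. The snippet below is plain arithmetic on those published numbers; the variable names are illustrative.

```python
# Scores in percent, copied from the results table.
baseline = {"MMLU": 62, "BoolQ": 40}
chimeras = {
    "minElskede":   {"MMLU": 68, "BoolQ": 53},
    "minFjollede":  {"MMLU": 68, "BoolQ": 47},
    "minSode":      {"MMLU": 64, "BoolQ": 47},
    "minHjerteven": {"MMLU": 58, "BoolQ": 43},
}

# Percentage-point change relative to the SmolLM2-135M baseline.
deltas = {
    name: {metric: scores[metric] - baseline[metric] for metric in baseline}
    for name, scores in chimeras.items()
}
improved = [name for name, d in deltas.items() if d["MMLU"] > 0]

for name, d in deltas.items():
    print(f"{name:13s} MMLU {d['MMLU']:+d} pp   BoolQ {d['BoolQ']:+d} pp")
print("MMLU improved:", improved)
```

Among the listed rows, three grafts raise MMLU and one (minHjerteven) lowers it, while all four raise BoolQ.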

4. Discussion

The CECI feasibility map shows a sharp boundary: within-phase-band grafts work; cross-band grafts fail. This maps directly to the MCR phase transitions identified in Paper II. The gauge alignment mechanism is effective for within-band pairs (up to +74%) but provides almost no benefit for cross-band pairs (+0.06% for Mix → Refine, +1.2% for Compress → Refine), which suggests that the bands represent fundamentally different geometric "languages."

Limitations: The layer-pair measurements cover a single model (SmolLM2-135M). Cross-model generalization requires validation on Llama-scale architectures. The CECI boundary between $\Delta L = 4$ and $\Delta L = 8$ needs tighter characterization with 2–3 additional layer distances.

References

  1. Stewart, W.K.O. Universal Geodesic Taxonomy. HyperTensor Paper XI, 2026.
  2. Stewart, W.K.O. Geodesic Projection Pipeline. HyperTensor Paper II, 2026.
  3. Stewart, W.K.O. GRC Attention Compression. HyperTensor Paper I, 2026.