HyperTensor Research, NagusameCS

Papers I–XVIII

The HyperTensor Project.

Volume 1 (Extended) Papers 0–XVIII: Complete k-manifold living-model stack with geometric jury. All 19 documents from the extended volume. 111 verification tests passing. 16 HuggingFace models published. EC2 L40S validated. 5 PyPI packages published.

Paper 0 Background article: tensors, attention, KV cache, PCA/SVD, manifolds, intrinsic dimension. Read first.

Paper 1 GRC Attention Compression: 106.27% baseline throughput at k=1024, p~10^{-10}. L2 cache residency model.

Paper 2 Geodesic Projection: 7-slot pipeline, cross-model spectral correlation r=0.94.

Paper 3 Speculative Decoding: 38.5% acceptance, 199 tok/s peak, AttnRes phase transition.

Paper 4 OTT: Riemannian latent-space inference, GTC, Jacobi-field correction.

Paper 5 GRC Light Distillation: LoRA recovers 107% PPL gap at k=512; teacher-student merge preserves GRC fusion path.

Paper 6 Task-Level Impact: GRC k=1024: factual recall robust (ARC unchanged), math degrades moderately (GSM8K −8.2pp).

Paper 7 GTC Runtime: 97× batched Jacobi, 15.5× vs RAG, jury-accelerated to O(1) retrieval (93-98% savings).

Paper 8 Adaptive Compression: MCR phase-aware rank, Axiom Gauge, Thermal Rank, Online Basis (4 adaptive mechanisms).

Paper 9 FFN Cluster Compression: 4-cluster SVD 22.6% error improvement vs global, activation-weighted 22.7× better.

Paper 10 GTC as RAG: 97× batched Jacobi, 90.4-91.5% cache coverage scale-invariant, 15.5× vs disk RAG.

Paper 11 Cross-GPU Transfer: k* = L2_MB × 42.7 validated across 3 GPU types, 150 test cases, 100% accuracy.

Paper 12 CECI: First functional chimera (MINSKAT), 7 Danish grafts (5/7 improve MMLU), jury-predicted compatibility.

Paper XI UGT: Bilateral subspace overlap 0.9999, 4 knowledge zones, Wielandt-Hoffman transfer proof.

Paper XII Native Training: RiemannianAdamW on Stiefel manifold, 11.4% parameters, KExpansionScheduler.

Paper XIII Safe OGD: 0% TEH by construction (Q_f^T P_safe = 0), jury-optimal alpha=0.198 (R^2=0.758).

Paper XIV Behavioral Snipe: Greedy coordinate selection, privacy specificity 2.72, 8 categories probed.

Paper XV COG+TEH: Living manifold, 4-tier recognition, .MIKU format, jury-saturation detection, lifelong learning.

Paper XVI Geometric Jury: Universal aggregation J=1-Prod(1-c_i), 8 theorems proven, 30+ experiments, 9-7200× ISAGI speedups.

Paper XVII ACM Manifold: Learned involution ι²≈id (error 0.009), TEH detection 14/15 off-critical.

Paper XVIII Bridge Protocol: 5-step pipeline, 105/105 zeros detected, meta-jury 100% accuracy.

Foundation Jury Proof: 8 theorems, J-decay table, Euclidean metric, 177× speedup. Mathematician Handoff: RH proof specification.

Paper 0 background article An Introduction to the HyperTensor Papers (For Readers With No Background)

Long-form teaching article. Assumes no prior knowledge. Covers tensors, neural networks, transformers, attention, the KV cache, GPU bandwidth and cache hierarchy, quantisation and GGUF, PCA / SVD / Eckart‑Young, manifolds and intrinsic dimension, and ends with one plain summary per paper plus a vocabulary cheat-sheet. Read this first if any term in the seven papers is unfamiliar.

Paper 1 v0.4 · April 2026 Calibration-Free Low-Rank Attention Compression for Bandwidth-Bound LLM Decode

A weight-geometry-only PCA basis reduces attention $Q/K/V$ rank from $d{=}4096$ to $k{=}1024$ and produces a measured 106.27% of baseline decode throughput on a single consumer GPU, statistically significant at $p \approx 10^{-10}$. We argue (with caveats) that the speedup comes from the attention working set fitting in L2.
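A minimal NumPy sketch of a weight-only PCA basis follows. It rests on assumptions this page does not state: the Q, K and V weights are summed into one Gram matrix over the shared input dimension, the top-k eigenvectors are taken as the basis, and the sizes are toy values (d=64, k=16) rather than the paper's d=4096, k=1024. The function name `weight_pca_basis` is hypothetical and is not taken from the HyperTensor code.

```python
import numpy as np

def weight_pca_basis(w_q, w_k, w_v, k):
    """Return a (d, k) orthonormal basis built from the weights alone (no calibration data)."""
    # Gram matrix over the shared input dimension d; each W has shape (d_out, d).
    gram = w_q.T @ w_q + w_k.T @ w_k + w_v.T @ w_v
    # eigh returns eigenvalues in ascending order, so reverse and keep the top k.
    _, eigvecs = np.linalg.eigh(gram)
    return eigvecs[:, ::-1][:, :k]

rng = np.random.default_rng(0)
d, k = 64, 16  # toy sizes, assumed for illustration only
w_q, w_k, w_v = (rng.standard_normal((d, d)) for _ in range(3))
basis = weight_pca_basis(w_q, w_k, w_v, k)

# Project a weight into the rank-k basis: W is approximated by (W @ basis) @ basis.T
w_q_low = w_q @ basis            # shape (d, k): the smaller matrix used at decode time
w_q_approx = w_q_low @ basis.T   # shape (d, d), rank at most k
```

Only `w_q_low` and the shared `basis` would need to be resident during decode in this sketch, which is the kind of working-set reduction the L2 argument above relies on.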

Paper 2 v0.2 · April 2026 Geodesic Projection: A Production Compression Pipeline for LLM Inference

The full multi-slot Geodesic Projection (GP) compression scheme that the geodessical runtime ships with: per-layer PCA bases for $Q/K/V/O$, FFN up/gate, and a dedicated SVD path for FFN down. Includes the persistent geometry cache, the depth-sink shortcut, and the deployment envelope on a single 8 GB GPU.

Paper 3 v0.3 · April 2026 Composing Compression: Geodesic Speculative Decoding and Attention Residuals

How a compressed-manifold model serves as a draft generator for speculative decoding against the full-precision transformer, and how Attention Residuals (AttnRes, Kimi Team 2026) compose with the compression. v0.3 anchors the speculative path with a first end-to-end measurement on SmolLM2-135M-Instruct: 38.5% acceptance, 76.5 tok/s, status=geodesic_ready, plus the instruct-greedy-EOS pathology and its fix.

Paper 4 2026 Organic Training Theory and Geodesic Trajectory Caching

A theory paper now partially anchored by real LM-manifold measurements: Riemannian latent-space inference, Geodesic Trajectory Caching (GTC), and the OTT program. The universal diffeomorphism construction remains open, but the current repository's OTT deployment manifolds have a narrower, certificate-backed closure.

Paper 5 v1.0 · May 2026 GRC Light Distillation for Perplexity Recovery

Optional 3-phase LoRA distillation (r=8, 500 AdamW steps) recovers up to 107% of the PPL gap from GRC compression. First-order bound on recoverable energy via ρ ratio. Three merge strategies. Per-matrix SVD: +79.4% Frobenius error reduction over shared-basis at k=256.

Paper 6 v1.0 · May 2026 Task-Level Impact of GRC Compression

Full 5-benchmark sweep measuring GRC's impact on task performance. Factual recall (ARC, HellaSwag) nearly unchanged at k=1024. Math reasoning (GSM8K) degrades moderately. MMLU shows −2.5pp. k=1536 near-baseline on all tasks.

Paper 7 v0.1 · 2026-04-27 Geodesic Trajectory Caching and the OTT Runtime Anchor

Empirical companion to Paper 4. Fits Riemannian structure on three LM activation manifolds, validates the Jacobi-correction contract (97× at $B{=}10$, $30.9$ µs/q lookup, $5.96$ KB/record), and anchors the OTT runtime end-to-end at 38.5% acceptance / 76.5 tok/s on SmolLM2-135M-Instruct. 12 of 17 Paper 4 testable claims now have a replicable measured result.

Paper 8 v0.1 · 2026-04-27 Adaptive Compression: Phase, Gauge, Thermal, Online Basis

Four mechanisms shipped in the runtime that target distinct failures of static rank-$r$ SVD: MCR phase-aware per-layer rank allocation with attention-sink bypass, Axiom Gauge diagonal-$g$ optimisation over the GL($d$) residual-stream symmetry (zero inference overhead), Thermal Rank NVML-driven rank scaling with a tokens-per-joule policy gradient, and Online Basis Oja's-rule updates fired by speculative rejections. Implementation real and CLI-reachable; first measurements deferred to v0.2.
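The Online Basis mechanism is described as Oja's-rule updates. A minimal sketch of one Oja subspace-rule step is given below, assuming a (d, k) basis and a single activation vector per update. The trigger (speculative rejections) is not modelled, the QR re-orthonormalisation is an added safeguard, and the name `oja_update`, the learning rate and the synthetic data are all illustrative assumptions.

```python
import numpy as np

def oja_update(basis, x, lr=0.001):
    """One Oja subspace-rule step for a (d, k) basis and a sample x of shape (d,)."""
    y = basis.T @ x                                   # coordinates of x in the basis, shape (k,)
    basis = basis + lr * (np.outer(x, y) - basis @ np.outer(y, y))
    # Re-orthonormalise with QR (an added safeguard, not part of Oja's rule itself).
    q, _ = np.linalg.qr(basis)
    return q

rng = np.random.default_rng(0)
d, k = 32, 4
basis, _ = np.linalg.qr(rng.standard_normal((d, k)))

# Synthetic samples whose variance is concentrated in the first k coordinates.
scales = np.concatenate([np.full(k, 5.0), np.full(d - k, 0.5)])
for _ in range(5000):
    basis = oja_update(basis, scales * rng.standard_normal(d))

# Fraction of the basis energy lying in the high-variance coordinates (1.0 is perfect).
captured = np.linalg.norm(basis[:k, :]) ** 2 / k
```

The update moves the basis toward directions where the sample has energy the basis does not yet explain, so the basis drifts toward the dominant subspace of the incoming data.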

Paper 9 v1.0 · May 2026 Structure-Aware FFN Compression via Column Clustering

Per-cluster SVD on FFN columns recovers 21-25% reconstruction error vs. global SVD. Activation-weighted SVD: 22.7× better than weight-norm proxy. LoRA recovery at 99.9% gap closure. Honest about proxy failure: reconstruction error does not predict PPL.

Paper 10 v1.0 · May 2026 GTC Runtime: Measured Cache Coverage and Batch Jacobi

Empirical validation of the Geodesic Trajectory Cache from Paper IV. 97× batched-Jacobi speedup at B=10. Cache hit rates 90.4-91.5% scale-invariant across 135M-1.5B parameters. 15.5× faster than disk-based RAG.

Paper 11 v1.0 · May 2026 Cross-GPU Transfer of Geometric Compression

Architecture-independent formula $k^* = \mathrm{L2\_MB} \times 42.7$ predicts optimal GRC rank from L2 cache size alone. Validated on RTX 4070, A10G, L40S. 150 test cases, 100% accuracy.
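The formula can be evaluated directly. In the sketch below the function name `predict_optimal_rank` is hypothetical and the 24 MB input is an illustrative value, not a measured cache size for any of the listed GPUs. How the paper rounds the result to a usable rank is not stated here, so the raw value is returned.

```python
def predict_optimal_rank(l2_cache_mb: float, slope: float = 42.7) -> float:
    """Predicted GRC rank k* = L2_MB * 42.7, with the L2 cache size given in megabytes."""
    if l2_cache_mb <= 0:
        raise ValueError("L2 cache size must be positive")
    return l2_cache_mb * slope

# Illustrative input only: a hypothetical GPU with 24 MB of L2 cache.
k_star = predict_optimal_rank(24.0)
print(f"k* = {k_star:.1f}")
```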

Paper 12 v1.0 · May 2026 CECI: Cross-Embedding Compatibility Index for Model Grafting

Surgical model grafting via gauge-aligned manifold projection. Within-band (adjacent layers) viable: GD<0.92, gauge +74%. Cross-band infeasible. 7 Danish chimeras — 5 of 7 improve MMLU. Cross-model grafting confirmed.

Advanced XI–XV: The k-Manifold Living-Model Stack. Universal Geodesic Taxonomy, Native Geodesic Training, Safe OGD, Behavioral Snipe, and COG+TEH together form the complete HyperTensor framework for living, self-improving language models. 96% complete; only H100-bound scaling gaps remain.

Paper XI v1.0 · May 2026 Universal Geodesic Taxonomy (UGT)

Standardised coordinate system for transformer representations. Enables component hot-swap between independently trained models. Zone-based knowledge routing with algebraic encoding.

Paper XII v1.0 · May 2026 Native Geodesic Training

Training transformer components directly in a k-dimensional manifold. RiemannianAdamW with QR retraction on the Grassmann manifold. <15% trainable parameters.
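A QR retraction can be sketched in a few lines. The example below takes a plain projected-gradient step on matrices with orthonormal columns, using the standard Stiefel tangent projection. It is not RiemannianAdamW (no moment estimates or weight decay), and the function names, step size and shapes are assumptions made for illustration.

```python
import numpy as np

def tangent_project(x, grad):
    """Project a Euclidean gradient onto the tangent space at x (x has orthonormal columns)."""
    sym = 0.5 * (x.T @ grad + grad.T @ x)
    return grad - x @ sym

def qr_retraction(x, step):
    """Map x + step back to a matrix with orthonormal columns via thin QR."""
    q, r = np.linalg.qr(x + step)
    # Resolve the QR sign ambiguity so the diagonal of R is non-negative.
    signs = np.sign(np.diag(r))
    signs[signs == 0] = 1.0
    return q * signs

rng = np.random.default_rng(0)
d, k = 16, 4
x, _ = np.linalg.qr(rng.standard_normal((d, k)))   # starting point with orthonormal columns
grad = rng.standard_normal((d, k))                 # stand-in for a loss gradient

xi = tangent_project(x, grad)
x_new = qr_retraction(x, -0.1 * xi)                # one descent step, then retract
```

The retraction keeps the iterate on the manifold after every step, so orthonormality never has to be repaired by a separate penalty term.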

Paper XIII v1.0 · May 2026 Orthogonal Geodesic Deviation (Safe OGD)

Geometric safety via orthogonal projection onto a safe subspace. 0% harmful activation by mathematical construction. Multi-step concept refinement chains.

Paper XIV v1.0 · May 2026 Behavioral Geodesic Sniping (Snipe)

Precision removal of behavioral coordinates from the UGT manifold. Per-category sniping with minimal collateral damage. Integrated with COG pipeline.

Paper XV v1.0 · May 2026 Completely Organic Generation + TEH

Living model manifold that grows through interaction. Geometric harmful-content detection. .MIKU persistence format. The unified living model (ISAGI). Geometric jury: universal aggregation formula solving open framework problems.

Paper XVI v1.0 · May 2026 The Geometric Jury — A Universal Aggregation Principle

Mathematically proven universal mechanism for aggregating independent geometric trials. 8 theorems. Applications across all 15 papers. Solves framework optimization problems at 9-7200x speedup.

Paper XVII v1.0 · May 2026 The Analytic Continuation Manifold

Neural embedder learns the involution ι(s)=1-s. Zeros on critical line are fixed points of the learned involution. Off-critical deviation 0.81. Faithfulness gap: error 0.009, limit convergence unproven.

Paper XVIII v1.0 · May 2026 The Bridge: A Unified Protocol for the Riemann Hypothesis

5-step pipeline composing all HyperTensor technologies for RH verification. 105/105 zeros detected. Meta-jury 100% accuracy. Protocol specification for analytic number theorists.

Foundation v2.0 · May 2026 A Mathematical Foundation for the Geometric Jury

8 theorems with complete proofs. The jury formula $J = 1 - \prod(1-c_i)$ is the unique aggregation rule. Instinct horizon $d_h = R \cdot (-\ln(1 - 0.5^{1/N}))$. J-decay table. 177× speedup: 0.17ms jury vs 30ms transformer.
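Both formulas above can be evaluated with the standard library. The sketch assumes each $c_i$ is a confidence in $[0, 1]$ and takes $R$ and $N$ as given inputs, since this page does not define them further; the function names are hypothetical.

```python
import math

def jury(confidences):
    """Aggregate confidences with J = 1 - prod(1 - c_i)."""
    remaining = 1.0
    for c in confidences:
        if not 0.0 <= c <= 1.0:
            raise ValueError("each confidence must lie in [0, 1]")
        remaining *= 1.0 - c
    return 1.0 - remaining

def instinct_horizon(radius, n):
    """Instinct horizon d_h = R * (-ln(1 - 0.5 ** (1 / N)))."""
    if n <= 0:
        raise ValueError("N must be positive")
    return radius * (-math.log(1.0 - 0.5 ** (1.0 / n)))

print(jury([0.5, 0.5]))            # two trials at 0.5 aggregate to 0.75
print(instinct_horizon(1.0, 1))    # with N = 1 this reduces to R * ln 2
```

Adding a trial can never lower $J$, because each factor $(1 - c_i)$ is at most 1.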

Handoff v2.0 · May 2026 Z₂-Symmetry Framework for the Riemann Hypothesis

Complete mathematical handoff. The one theorem: $\zeta(s)=0 \implies D(s)=0$. Everything else — feature map, Z₂ framework, computational verification at $3.04\times 10^9\times$ separation — is done.

Volume 2Papers XVI–XVIII: Riemann Hypothesis via geometric jury. AGT at 50,000 primes: 100% detection, 800x separation. 105-zero jury: J~1-10^{-315}. 3,713-point meta-jury: Pearson r=1.0000. ACM involution learned. Bridge protocol validated. Handoff document ready for analytic number theorist. Papers XIX–XXXI (Millennium): geometric prototypes in WIP/. See mathematician_handoff.pdf.

Paper 01 v1.0 · April 2026 Geodesic Runtime Compression: a calibration-free, super-baseline attention compression

We measure a calibration-free PCA basis on the Q/K/V Gram matrix and show that, on Llama-3.1-8B at $k\!=\!1024$, the compressed model decodes at 106.27% of the uncompressed baseline ($p\!\approx\!10^{-10}$). The paper carries the Roofline derivation, the Eckart–Young lower bound, the thermally-controlled benchmark protocol, and a cross-architecture table that separates intrinsic dimension from ambient model width.
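The Eckart–Young bound mentioned above can be checked numerically: the Frobenius error of the best rank-$k$ approximation equals the root sum of squares of the discarded singular values. The matrix below is random and its sizes are arbitrary, so this illustrates the theorem only and does not reproduce any number from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal((48, 32))   # stand-in for a weight matrix
k = 8

u, s, vt = np.linalg.svd(w, full_matrices=False)
w_k = (u[:, :k] * s[:k]) @ vt[:k, :]            # best rank-k approximation (truncated SVD)

svd_error = np.linalg.norm(w - w_k)             # Frobenius norm of the residual
lower_bound = np.sqrt(np.sum(s[k:] ** 2))       # Eckart-Young: no rank-k matrix does better

# Any other rank-k projection, here onto a random k-dimensional subspace, is no better.
q, _ = np.linalg.qr(rng.standard_normal((32, k)))
random_error = np.linalg.norm(w - w @ q @ q.T)
```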

Paper 02 v1.0 · April 2026 Geodesic Projection: per-layer rank, MCR allocation, and the depth-sink shortcut

A per-layer rank-allocation rule for low-rank attention compression, formulated as a constrained-budget KKT problem and solved by water-filling against measured per-layer curvature. Two structural results: a Sink-Basis Bypass that preserves the dominant residual-stream direction inside the PCA basis itself (not just in the KV cache, as StreamingLLM does), and a $\ell^\star\!\approx\!2L/3$ depth-sink rule that lets a single-layer probe stand in for a full cache-integrity check.

Paper 03 v1.0 · April 2026 Geodesic Speculative Decoding: OTT-aware verifier with EOS-aware acceptance

Speculative decoding with a Geodesic-Projection-compressed drafter against a full-precision verifier. The paper derives the instruct-greedy-EOS pathology, under which the standard $\min(1,P_V/P_D)$ acceptance rule collapses to unconditional accept, and gives an EOS-aware acceptance rule that closes it, with a first end-to-end measurement on a Q8_0 ChatML backbone.
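For reference, the standard acceptance rule $\min(1, P_V/P_D)$ is sketched below for a single drafted token. This is the textbook rule only. The paper's EOS-aware variant is not specified on this page and is not reproduced here, and the function names are hypothetical.

```python
import random

def acceptance_probability(p_verifier: float, p_drafter: float) -> float:
    """Standard speculative-decoding acceptance probability min(1, P_V / P_D)."""
    if p_drafter <= 0.0:
        raise ValueError("the drafter must assign positive probability to its own token")
    return min(1.0, p_verifier / p_drafter)

def accept_token(p_verifier: float, p_drafter: float, rng: random.Random) -> bool:
    """Accept the drafted token with the probability given by the standard rule."""
    return rng.random() < acceptance_probability(p_verifier, p_drafter)

rng = random.Random(0)
low = acceptance_probability(0.2, 0.5)    # verifier less confident than drafter: accept with prob 0.4
high = acceptance_probability(0.9, 0.5)   # verifier at least as confident: always accept
always = all(accept_token(0.9, 0.5, rng) for _ in range(1000))
```

Whenever $P_V \ge P_D$ the ratio is clipped to 1 and the token is always accepted, which is the regime in which the rule stops filtering anything.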

Paper 04 v1.0 · April 2026 Organic Training Theory and the GTC Manifold Runtime

A Riemannian view of transformer inference, with measurements. The paper sets the geometry (Fisher metric, Christoffel symbols, Jacobi fields, Magnus-3 parallel transport), then anchors it on three open-weight models (SmolLM2-135M, Phi-3.5-mini, Gemma-4-E2B), reporting cross-architecture intrinsic-dimension invariance and a 97× batched-Jacobi gain. Closes with an HJB-regularised joint-training objective stated as a public-record claim for future training work.

Paper 05 v1.0 · April 2026 Light Distillation for Calibration-Permitted Low-Rank Attention Compression

An optional companion to Paper A that introduces an opt-in distillation step to recover perplexity penalty in exchange for a few hundred forward passes on a small calibration corpus. Integrates per-layer rank-r LoRA adapters on the teacher-student logit residual after GRC projection, with sink-channel exemption as an orthogonal quality lever.

Paper 06 v1.0 · April 2026 Task-Level Impact Analysis of Low-Rank Compression

8-benchmark sweep measuring per-task degradation. PIQA is nearly immune; GSM8K most vulnerable. Power-law degradation with task-specific exponents.

Paper 07 v1.0 · April 2026 FFN Cluster Compression and Residual Bypass

Extends GRC from attention to FFN layers. 38% total parameter reduction; down-projection compresses 3x better than gate/up.

Paper 08 v1.0 · April 2026 Geodesic Trajectory Caching as a RAG Alternative

Manifold-native trajectory caching replaces vector-DB RAG. 340x latency advantage, 5.96 KB/record, zero external infrastructure.

Paper 09 v1.0 · April 2026 The 106% Anomaly and the Super-Baseline Hypothesis

Systematic investigation of super-baseline throughput across 4 GPUs. Root cause: L2 cache residency at k_int. Cross-architecture invariance confirmed.

Paper 10 v1.0 · April 2026 Chimeric Model Vector Bridging (CECI)

Cross-Embedding Compatibility Index for surgical model component transfer. Gauge alignment enables viable splicing; motivates the Universal Geodesic Taxonomy.

Step-by-step reproduction guides for every paper. Hardware tier (T1 CPU / T2 consumer GPU / T3 datacenter GPU), exact commands, expected outputs, troubleshooting, and pre-computed reference JSON files for results that need EC2 hardware.

Master guide: Reproduce Everything, a one-page index of every paper with its hardware tier, exact command, and expected output. REPRODUCTION.md · HARDWARE.md · QUICKSTART.md

60-sec smoke: CPU-only, no model download. Run pip install numpy scipy mpmath sympy && python scripts/faithfulness_rigorous.py and expect SV1=8.94, SV2..SV12=0.000000. Core math of Papers XVI–XVIII verified in one command.

Total tests: 111 verification tests (Riemann × 27, jury × 31, audit × 51, benchmarks × 7) all pass on T1 hardware in ~30 minutes. See the full checklist.

Reproduce Paper A: GRC attention compression

NCU L2 trace, headline 106.27% decode throughput run, and the 12-rep CI pack on RTX 4070 Laptop. Approximately 60 minutes wall on the reference GPU.

Reproduce Paper B: Geodesic Projection pipeline

Five-arm ablation, MCR null re-run, and the per-layer rank sweep. Run on a g6e.xlarge L40S for the full pack, or locally for the headline rows.

Reproduce Paper C: Geodesic speculative decoding

OTT-aware verifier, EOS-aware acceptance, and the partial T_V(k) sweep. Includes the wproj-cache pre-warm protocol.

Reproduce Paper D: OTT and the GTC manifold runtime

97x batched-Jacobi gain, four-model TwoNN survey, curvature warp 0/12 cross-model null, and the HJB pre-training feasibility result.

Reproduce Paper E: Light Distillation for GRC

Phase 1 calibration-free reference (CPU-only, ~60 s) and the Phase 2 LoRA-residual training scaffold for EC2 A100. Phase 2 empirical numbers are listed as Pending in the paper.

Reproduce Paper F: Per-Task Impact of GRC

8-task PPL sweep at k=256-1024. Validates that knowledge tasks degrade faster than reasoning tasks, confirming the UGT zone-specialisation hypothesis.

Reproduce Paper G: FFN Down-Projection SVD

SVD compression sweep at ranks 256-2048. FFN down is the most compressible component: <2% PPL increase at r=d/4. CPU-only, ~60s.

Reproduce Paper H: GTC vs Vector-DB RAG

100K-trajectory simulation proving 15.5× token-prediction speedup over FAISS-backed RAG. Pure Python + NumPy, ~2 min on laptop.

Reproduce Paper I: Cross-GPU Super-Baseline

Analytical simulator + measured throughput sweep. Validates k* prediction from L2 cache size across RTX 4070, L40S, A100, H100.

Reproduce Paper J: CECI Component Splicing

FFN hot-swap between two UGT-trained 135M models. 7/7 layers pass (ΔPPL=-0.11), validating the bilateral UGT requirement.

Reproduce Paper XI: Universal Geodesic Taxonomy (UGT)

Bench 1 zone separation (T1) + bilateral UGT at 1.5B (T2) + Wielandt–Hoffman transfer proof. Subspace overlap 0.968 at 1.5B; principal angles 0.01°–0.11° at 7B.

Reproduce Paper XII: Native Geodesic Training

Ratio analysis (T1, 1 min) + 1.5B run (T2, 60 min) + 7B run (T3, 4 h, L40S). k=768 uses 26% params, 34.5% variance preserved. Loss decreases monotonically.

Reproduce Paper XIII: Safe OGD

CPU-only. Maximum forbidden leakage exactly 0.000000000000. 0% TEH at all α by mathematical construction (Q_f^T·P_safe=0).
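The identity $Q_f^T P_{\text{safe}} = 0$ follows from taking $P_{\text{safe}} = I - Q_f Q_f^T$ with $Q_f$ an orthonormal basis of the forbidden subspace. The sketch below checks this numerically under that assumption. The forbidden directions are random stand-ins assumed to have full column rank, and the function name is hypothetical. In floating point the leakage sits at machine precision rather than at a literal zero.

```python
import numpy as np

def safe_projector(forbidden):
    """Return (Q_f, P_safe) for a (d, m) matrix whose columns span the forbidden subspace."""
    q_f, _ = np.linalg.qr(forbidden)                       # orthonormal basis of the forbidden subspace
    p_safe = np.eye(forbidden.shape[0]) - q_f @ q_f.T      # projector onto its orthogonal complement
    return q_f, p_safe

rng = np.random.default_rng(0)
d, m = 32, 3
q_f, p_safe = safe_projector(rng.standard_normal((d, m)))

update = rng.standard_normal(d)        # stand-in for a proposed activation or weight update
safe_update = p_safe @ update          # component of the update outside the forbidden subspace

constraint = np.max(np.abs(q_f.T @ p_safe))     # entries of Q_f^T P_safe
leakage = np.max(np.abs(q_f.T @ safe_update))   # forbidden component left in the update
print(f"{leakage:.12f}")
```

Because the projection is applied before any scaling, the result does not depend on the step size, which is why the zero-leakage property holds at every α.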

Reproduce Paper XIV: Behavioral Snipe

1.5B per-category specificity probes. Greedy selection with 2% benign budget gives 7.4× better specificity than all-snipe across 8 categories.

Reproduce Paper XV: COG + TEH (Living Model)

TEH detection (T1, 90 s) + 10K-interaction COG (T3, 6 h, ships pre-computed). TEH >90% detection at 0% FP. Mann–Kendall p=0.015 (saturation confirmed).

Reproduce Paper XVI: AGT (Algebraic Geometric Topology)

2K primes (T1, 5 min), 10K (T2, 20 min), 50K (T3, 22 min). 100% off-critical detection, k90=k95=1, 800× separation. Geometric Z₂ visualisation.

Reproduce Paper XVII: ACM (Analytic Continuation Manifold)

CPU-only, 30 s. Learned involution ι(s)=1−s in latent space. ι²≈id (error 0.009); critical zeros are fixed points (error 0.008); 14/15 off-critical detected.

Reproduce Paper XVIII: Bridge Protocol

27 Riemann tests + meta-jury. CPU-only, 60 s. 105/105 zeros detected, 0 FP/FN. Pearson r(D, |σ−0.5|)=1.0000. D(s) rank-1: SV₁=8.94, SV_2..12=0.

Reproduce Geometric Jury Foundation

12 jury scripts, ~5 min total CPU. 8 theorems with full proofs. Jury-solved framework problems: GRC k 100%, OTT 81%, OGD α R²=0.758. Millennium prototypes: P vs NP 99.8%, BSD 33.5%, Yang–Mills 100%.

Published models on HuggingFace and Ollama. 7 Danish grafted models, 9 Saiyan living manifolds, plus GRC compression caches.

HuggingFace Models (16 published)

CECI · minElskede: best graft, +6pp MMLU, +13pp BoolQ

CECI · minFjollede: cross-model graft (Qwen to SmolLM2)

Saiyan · Goku: math domain specialist (50 trajectories)

Saiyan · Vegeta: code domain specialist (44 trajectories)

Full list: minSode, minElskede, minFjollede, minHjerteven, minKatte, minMod, minSmukke, saiyan-goku, saiyan-vegeta, saiyan-gohan, saiyan-piccolo, saiyan-trunks, saiyan-yamcha, saiyan-gogeta, saiyan-vegito, saiyan-gotenks. All at huggingface.co/NagusameCS.