18 Papers · May 2026 · 15/18 at 100%

Geometry-aware inference for large language models.

Documentation and research for the HyperTensor project. Papers I–XV engineer the k-manifold living-model stack: compression, safety, behavioural control, and adaptive intelligence. Papers XVI–XVIII pursue a geometric attack on the Riemann Hypothesis via Z2 symmetry and the geometric jury principle.

By William Ken Ohara Stewart (NagusameCS) · 2026 · Source

The HyperTensor Project.

Volume 1 (Extended) · Papers 0–XVIII: Complete k-manifold living-model stack with geometric jury. All 19 documents from the extended volume. 111 verification tests passing. 16 HuggingFace models published. EC2 L40S validated. 5 PyPI packages published.
Paper 0 · Background article: tensors, attention, KV cache, PCA/SVD, manifolds, intrinsic dimension. Read first.
Paper 1 · GRC Attention Compression: 106.27% baseline throughput at k=1024, p~10^{-10}. L2 cache residency model.
Paper 2 · Geodesic Projection: 7-slot pipeline, cross-model spectral correlation r=0.94.
Paper 3 · Speculative Decoding: 38.5% acceptance, 199 tok/s peak, AttnRes phase transition.
Paper 4 · OTT: Riemannian latent-space inference, GTC, Jacobi-field correction.
Paper 5 · GRC Light Distillation: LoRA recovers 107% PPL gap at k=512; teacher-student merge preserves GRC fusion path.
Paper 6 · Task-Level Impact: GRC k=1024: factual recall robust (ARC unchanged), math degrades moderately (GSM8K −8.2pp).
Paper 7 · GTC Runtime: 97× batched Jacobi, 15.5× vs RAG, jury-accelerated to O(1) retrieval (93-98% savings).
Paper 8 · Adaptive Compression: MCR phase-aware rank, Axiom Gauge, Thermal Rank, Online Basis (4 adaptive mechanisms).
Paper 9 · FFN Cluster Compression: 4-cluster SVD 22.6% error improvement vs global, activation-weighted 22.7× better.
Paper 10 · GTC as RAG: 97× batched Jacobi, 90.4-91.5% cache coverage scale-invariant, 15.5× vs disk RAG.
Paper 11 · Cross-GPU Transfer: k* = L2_MB × 42.7 validated across 3 GPU types, 150 test cases, 100% accuracy.
Paper 12 · CECI: First functional chimera (MINSKAT), 7 Danish grafts (5/7 improve MMLU), jury-predicted compatibility.
Paper XI · UGT: Bilateral subspace overlap 0.9999, 4 knowledge zones, Wielandt-Hoffman transfer proof.
Paper XII · Native Training: RiemannianAdamW on Stiefel manifold, 11.4% parameters, KExpansionScheduler.
Paper XIII · Safe OGD: 0% TEH by construction (Q_f^T P_safe = 0), jury-optimal alpha=0.198 (R^2=0.758).
Paper XIV · Behavioral Snipe: Greedy coordinate selection, privacy specificity 2.72, 8 categories probed.
Paper XV · COG+TEH: Living manifold, 4-tier recognition, .MIKU format, jury-saturation detection, lifelong learning.
Paper XVI · Geometric Jury: Universal aggregation J=1-Prod(1-c_i), 8 theorems proven, 30+ experiments, 9-7200x ISAGI speedups.
Paper XVII · ACM Manifold: Learned involution ι²≈id (error 0.009), TEH detection 14/15 off-critical.
Paper XVIII · Bridge Protocol: 5-step pipeline, 105/105 zeros detected, meta-jury 100% accuracy.
Foundation · Jury Proof: 8 theorems, J-decay table, Euclidean metric, 177× speedup. Mathematician Handoff: RH proof specification.
Paper 0 background article An Introduction to the HyperTensor Papers (For Readers With No Background)

Long-form teaching article. Assumes no prior knowledge. Covers tensors, neural networks, transformers, attention, the KV cache, GPU bandwidth and cache hierarchy, quantisation and GGUF, PCA / SVD / Eckart‑Young, manifolds and intrinsic dimension, and ends with one plain summary per paper plus a vocabulary cheat-sheet. Read this first if any term in the papers is unfamiliar.

Paper 1 v0.4 · April 2026 Calibration-Free Low-Rank Attention Compression for Bandwidth-Bound LLM Decode

A weight-geometry-only PCA basis reduces attention $Q/K/V$ rank from $d{=}4096$ to $k{=}1024$ and produces a measured 106.27% of baseline decode throughput on a single consumer GPU, statistically significant at $p \approx 10^{-10}$. We argue (with caveats) that the speedup comes from the attention working set fitting in L2.
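To make the "weight-geometry-only" idea concrete, here is a minimal NumPy sketch that derives a shared rank-$k$ basis from the Q/K/V weights alone, with no calibration data. It is an illustration under stated assumptions, not the paper's implementation: the helper names `shared_pca_basis` and `compress`, the choice to stack the three matrices before the SVD, and the toy sizes (64 and 16 standing in for $d{=}4096$ and $k{=}1024$) are all hypothetical.

```python
import numpy as np

def shared_pca_basis(w_q, w_k, w_v, k):
    """Return a (d, k) orthonormal basis for the top-k input directions of the
    stacked Q/K/V weights. Hypothetical helper, not the paper's API."""
    stacked = np.concatenate([w_q, w_k, w_v], axis=0)   # (3 * d_out, d)
    # Right singular vectors are the principal directions of the input space.
    _, _, vt = np.linalg.svd(stacked, full_matrices=False)
    return vt[:k].T                                      # (d, k)

def compress(w, basis):
    """Project a (d_out, d) weight onto the basis, giving a (d_out, k) factor."""
    return w @ basis

rng = np.random.default_rng(0)
d, k = 64, 16   # toy sizes; the text uses d=4096, k=1024

# Synthetic weights that share a rank-k row space, plus a little noise.
core = rng.standard_normal((d, k))
w_q, w_k, w_v = (
    rng.standard_normal((d, k)) @ core.T + 0.01 * rng.standard_normal((d, d))
    for _ in range(3)
)

basis = shared_pca_basis(w_q, w_k, w_v, k)
w_q_small = compress(w_q, basis)

# Relative Frobenius error of the rank-k reconstruction W ≈ (W P) P^T.
rel_err = np.linalg.norm(w_q - w_q_small @ basis.T) / np.linalg.norm(w_q)
print(f"relative reconstruction error: {rel_err:.4f}")
```

At decode time the projected factor replaces the full matrix, so the working set shrinks roughly in proportion to $k/d$; whether that is enough to fit in L2 depends on the model and GPU.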

Paper 2 v0.2 · April 2026 Geodesic Projection: A Production Compression Pipeline for LLM Inference

The full multi-slot Geodesic Projection (GP) compression scheme that the geodessical runtime ships with: per-layer PCA bases for $Q/K/V/O$, FFN up/gate, and a dedicated SVD path for FFN down. Includes the persistent geometry cache, the depth-sink shortcut, and the deployment envelope on a single 8 GB GPU.

Paper 3 v0.3 · April 2026 Composing Compression: Geodesic Speculative Decoding and Attention Residuals

How a compressed-manifold model serves as a draft generator for speculative decoding against the full-precision transformer, and how Attention Residuals (AttnRes, Kimi Team 2026) compose with the compression. v0.3 anchors the speculative path with a first end-to-end measurement on SmolLM2-135M-Instruct: 38.5% acceptance, 76.5 tok/s, status=geodesic_ready, plus the instruct-greedy-EOS pathology and its fix.
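For intuition about what an acceptance rate like 38.5% buys, the sketch below evaluates the standard expected-tokens-per-verification formula from the speculative decoding literature. It assumes each drafted token is accepted independently with probability `alpha` and that the target model always contributes one token per pass; the summary does not say how acceptance is measured, so the draft length `gamma` and the independence assumption are illustrative only.

```python
def expected_tokens_per_step(alpha: float, gamma: int) -> float:
    """Expected tokens emitted per target-model verification pass.

    alpha: per-token acceptance probability (assumed independent per token).
    gamma: number of tokens the draft model proposes per pass (assumed).
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    if gamma < 0:
        raise ValueError("gamma must be non-negative")
    # Geometric series 1 + alpha + ... + alpha**gamma.
    return (1.0 - alpha ** (gamma + 1)) / (1.0 - alpha)

for gamma in (1, 2, 4, 8):
    tokens = expected_tokens_per_step(0.385, gamma)
    print(f"gamma={gamma}: {tokens:.3f} tokens per verification pass")
```

Under these assumptions the gain saturates quickly as `gamma` grows, because the series is bounded by $1/(1-\alpha)$.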

Paper 4 2026 Organic Training Theory and Geodesic Trajectory Caching

A theory paper now partially anchored by real LM-manifold measurements: Riemannian latent-space inference, Geodesic Trajectory Caching (GTC), and the OTT program. The universal diffeomorphism construction remains open, but the current repository's OTT deployment manifolds have a narrower, certificate-backed closure.

Paper 5 v1.0 · May 2026 GRC Light Distillation for Perplexity Recovery

Optional 3-phase LoRA distillation (r=8, 500 AdamW steps) recovers up to 107% of the PPL gap from GRC compression. First-order bound on recoverable energy via ρ ratio. Three merge strategies. Per-matrix SVD: +79.4% Frobenius error reduction over shared-basis at k=256.
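A recovery figure above 100% can look odd, so here is a small worked calculation. It assumes the usual definition of gap recovery (the fraction of the perplexity increase caused by compression that distillation removes); the summary does not state the definition, and the perplexity values below are made up for illustration.

```python
def ppl_gap_recovery(ppl_baseline: float, ppl_compressed: float,
                     ppl_recovered: float) -> float:
    """Fraction of the compression-induced perplexity gap that was closed.

    Assumed definition: (compressed - recovered) / (compressed - baseline).
    A value above 1.0 means the recovered model beats the uncompressed baseline.
    """
    gap = ppl_compressed - ppl_baseline
    if gap <= 0:
        raise ValueError("compressed perplexity must exceed the baseline")
    return (ppl_compressed - ppl_recovered) / gap

# Hypothetical numbers: compression raises PPL from 10.0 to 12.0, and
# distillation brings it down to 9.86, slightly below the baseline.
recovery = ppl_gap_recovery(10.0, 12.0, 9.86)
print(f"gap recovered: {recovery:.0%}")
```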

Paper 6 v1.0 · May 2026 Task-Level Impact of GRC Compression

Full 5-benchmark sweep measuring GRC's impact on task performance. Factual recall (ARC, HellaSwag) nearly unchanged at k=1024. Math reasoning (GSM8K) degrades moderately. MMLU shows −2.5pp. k=1536 near-baseline on all tasks.

Paper 7 v0.1 · 2026-04-27 Geodesic Trajectory Caching and the OTT Runtime Anchor

Empirical companion to Paper 4. Fits Riemannian structure on three LM activation manifolds, validates the Jacobi-correction contract (97× at $B{=}10$, $30.9$ µs/q lookup, $5.96$ KB/record), and anchors the OTT runtime end-to-end at 38.5% acceptance / 76.5 tok/s on SmolLM2-135M-Instruct. 12 of 17 Paper 4 testable claims now have a replicable measured result.

Paper 8 v0.1 · 2026-04-27 Adaptive Compression: Phase, Gauge, Thermal, Online Basis

Four mechanisms shipped in the runtime that target distinct failures of static rank-$r$ SVD: MCR phase-aware per-layer rank allocation with attention-sink bypass, Axiom Gauge diagonal-$g$ optimisation over the GL($d$) residual-stream symmetry (zero inference overhead), Thermal Rank NVML-driven rank scaling with a tokens-per-joule policy gradient, and Online Basis Oja's-rule updates fired by speculative rejections. Implementation real and CLI-reachable; first measurements deferred to v0.2.
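Of the four mechanisms, the Online Basis update is the easiest to show in isolation. The sketch below is a plain NumPy version of Oja's subspace rule on synthetic data. It does not model the runtime's trigger (speculative rejections) or its data source; the function name `oja_update`, the learning rate, and the toy distribution are assumptions.

```python
import numpy as np

def oja_update(w, x, lr):
    """One Oja's-rule step for a (d, k) basis w and a sample x of shape (d,)."""
    y = w.T @ x                                   # (k,) projection coefficients
    # Hebbian term x y^T minus the decay term that keeps the columns bounded.
    return w + lr * (np.outer(x, y) - w @ np.outer(y, y))

rng = np.random.default_rng(0)
scales = np.array([2.0, 0.3, 0.3, 0.3])   # most variance lies along axis 0
w = 0.1 * rng.standard_normal((4, 1))     # small random initial direction

for _ in range(3000):
    x = scales * rng.standard_normal(4)
    w = oja_update(w, x, lr=0.005)

# Cosine between the learned direction and the dominant axis.
alignment = abs(w[0, 0]) / np.linalg.norm(w)
print(f"alignment with dominant axis: {alignment:.3f}")
```

The update needs only one sample at a time and no stored covariance, which is what makes it a candidate for adjusting a basis during inference.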

Paper 9 v1.0 · May 2026 Structure-Aware FFN Compression via Column Clustering

Per-cluster SVD on FFN columns reduces reconstruction error by 21-25% relative to global SVD. Activation-weighted SVD: 22.7× better than the weight-norm proxy. LoRA recovery at 99.9% gap closure. Honest about proxy failure: reconstruction error does not predict PPL.

Paper 10 v1.0 · May 2026 GTC Runtime: Measured Cache Coverage and Batch Jacobi

Empirical validation of the Geodesic Trajectory Cache from Paper 4. 97× batched-Jacobi speedup at B=10. Cache hit rates 90.4-91.5% scale-invariant across 135M-1.5B parameters. 15.5× faster than disk-based RAG.

Paper 11 v1.0 · May 2026 Cross-GPU Transfer of Geometric Compression

Architecture-independent formula $k^* = \mathrm{L2\_MB} \times 42.7$ predicts optimal GRC rank from L2 cache size alone. Validated on RTX 4070, A10G, L40S. 150 test cases, 100% accuracy.
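The formula is a single multiplication, which the sketch below spells out. The constant 42.7 comes from the text; the function name `predicted_rank` and the example cache sizes are illustrative and are not hardware specifications for the GPUs listed.

```python
L2_TO_RANK = 42.7  # slope quoted in the text: rank units per MB of L2 cache

def predicted_rank(l2_mb: float) -> float:
    """Predicted GRC rank k* for a GPU with l2_mb megabytes of L2 cache."""
    if l2_mb <= 0:
        raise ValueError("L2 size must be positive")
    return l2_mb * L2_TO_RANK

# Illustrative cache sizes only; check the vendor datasheet for a real GPU.
for l2_mb in (24, 36, 48):
    print(f"L2 = {l2_mb} MB -> k* = {predicted_rank(l2_mb):.1f}")
```

In practice the result would presumably be rounded to a rank the kernels support; the summary does not say how.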

Paper 12 v1.0 · May 2026 CECI: Cross-Embedding Compatibility Index for Model Grafting

Surgical model grafting via gauge-aligned manifold projection. Within-band (adjacent layers) viable: GD<0.92, gauge +74%. Cross-band infeasible. 7 Danish chimeras — 5 of 7 improve MMLU. Cross-model grafting confirmed.


Advanced XI–XV · The k-Manifold Living-Model Stack. Universal Geodesic Taxonomy, Native Geodesic Training, Safe OGD, Behavioral Snipe, and COG+TEH, forming the complete HyperTensor framework for living, self-improving language models. 96% complete; only H100-bound scaling gaps remain.
Paper XI v1.0 · May 2026 Universal Geodesic Taxonomy (UGT)

Standardised coordinate system for transformer representations. Enables component hot-swap between independently trained models. Zone-based knowledge routing with algebraic encoding.

Paper XII v1.0 · May 2026 Native Geodesic Training

Training transformer components directly in a k-dimensional manifold. RiemannianAdamW with QR retraction on the Grassmann manifold. <15% trainable parameters.
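A QR retraction keeps a basis orthonormal after each optimiser step. The sketch below shows it with plain Riemannian gradient descent rather than RiemannianAdamW, to keep it short. It uses the Stiefel tangent-space projection as an assumption (the index entry for this paper names the Stiefel manifold, this abstract the Grassmann manifold); the function names and the toy objective are hypothetical.

```python
import numpy as np

def qr_retract(y):
    """Map a full-rank (d, k) matrix back to orthonormal columns via thin QR."""
    q, r = np.linalg.qr(y)
    # Flip column signs so diag(R) > 0, which makes the factorisation unique.
    signs = np.sign(np.diag(r))
    signs[signs == 0] = 1.0
    return q * signs

def riemannian_sgd_step(u, euclid_grad, lr):
    """Project the gradient onto the tangent space at u, step, then retract."""
    sym = (u.T @ euclid_grad + euclid_grad.T @ u) / 2.0
    tangent = euclid_grad - u @ sym          # Stiefel tangent-space projection
    return qr_retract(u - lr * tangent)

rng = np.random.default_rng(0)
d, k = 8, 3
a = np.diag(np.arange(d, 0, -1, dtype=float))   # toy symmetric matrix
u = qr_retract(rng.standard_normal((d, k)))     # random orthonormal start

# Toy objective: maximise trace(U^T A U), i.e. minimise its negative.
for _ in range(50):
    grad = -2.0 * a @ u
    u = riemannian_sgd_step(u, grad, lr=0.02)

orth_error = np.abs(u.T @ u - np.eye(k)).max()
print(f"max deviation from orthonormality: {orth_error:.2e}")
```

Only the $d \times k$ basis is updated here, which is the sense in which training in a k-dimensional manifold touches a small share of the parameters.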

Paper XIII v1.0 · May 2026 Orthogonal Geodesic Deviation (Safe OGD)

Geometric safety via orthogonal projection onto a safe subspace. 0% harmful activation by mathematical construction. Multi-step concept refinement chains.
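The "by construction" claim rests on a linear-algebra identity that is easy to check numerically. The sketch below assumes the forbidden directions are given as an orthonormal basis `q_f` and that the safe projector is its orthogonal complement, $P_{\text{safe}} = I - Q_f Q_f^\top$; this reading of the symbols, the function name, and the toy dimensions are assumptions, not the paper's code.

```python
import numpy as np

def safe_projector(q_f):
    """P_safe = I - Q_f Q_f^T for an orthonormal (d, m) forbidden basis Q_f."""
    d = q_f.shape[0]
    return np.eye(d) - q_f @ q_f.T

rng = np.random.default_rng(0)
d, m = 16, 3
q_f, _ = np.linalg.qr(rng.standard_normal((d, m)))   # toy forbidden directions
p_safe = safe_projector(q_f)

h = rng.standard_normal(d)      # stand-in for a hidden-state vector
h_safe = p_safe @ h             # component of h inside the safe subspace

# Largest remaining component along any forbidden direction.
leak = np.abs(q_f.T @ h_safe).max()
print(f"max forbidden-direction component after projection: {leak:.2e}")
```

The guarantee covers only directions spanned by `q_f`; how well that basis captures harmful behaviour is a separate empirical question.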

Paper XIV v1.0 · May 2026 Behavioral Geodesic Sniping (Snipe)

Precision removal of behavioral coordinates from the UGT manifold. Per-category sniping with minimal collateral damage. Integrated with COG pipeline.

Paper XV v1.0 · May 2026 Completely Organic Generation + TEH

Living model manifold that grows through interaction. Geometric harmful-content detection. .MIKU persistence format. The unified living model (ISAGI). Geometric jury: universal aggregation formula solving open framework problems.

Paper XVI v1.0 · May 2026 The Geometric Jury — A Universal Aggregation Principle

Mathematically proven universal mechanism for aggregating independent geometric trials. 8 theorems. Applications across all 15 papers. Solves framework optimization problems at 9-7200x speedup.

Paper XVII v1.0 · May 2026 The Analytic Continuation Manifold

Neural embedder learns the involution ι(s)=1-s. Zeros on critical line are fixed points of the learned involution. Off-critical deviation 0.81. Faithfulness gap: error 0.009, limit convergence unproven.

Paper XVIII v1.0 · May 2026 The Bridge: A Unified Protocol for the Riemann Hypothesis

5-step pipeline composing all HyperTensor technologies for RH verification. 105/105 zeros detected. Meta-jury 100% accuracy. Protocol specification for analytic number theorists.

Foundation v2.0 · May 2026 A Mathematical Foundation for the Geometric Jury

8 theorems with complete proofs. The jury formula $J = 1 - \prod(1-c_i)$ is the unique aggregation rule. Instinct horizon $d_h = R \cdot (-\ln(1 - 0.5^{1/N}))$. J-decay table. 177× speedup: 0.17ms jury vs 30ms transformer.
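The two formulas above can be evaluated directly. The sketch below implements them as written and then checks one consistency property: if each juror's confidence decays as $c(d) = e^{-d/R}$, then $N$ identical jurors at distance $d_h$ give exactly $J = 0.5$. That exponential confidence model is an assumption inferred from the shape of the horizon formula; it is not stated in the summary. The function names are hypothetical.

```python
import math

def jury(confidences):
    """Aggregate confidences with J = 1 - prod(1 - c_i)."""
    remaining_doubt = 1.0
    for c in confidences:
        if not 0.0 <= c <= 1.0:
            raise ValueError("each confidence must lie in [0, 1]")
        remaining_doubt *= 1.0 - c
    return 1.0 - remaining_doubt

def instinct_horizon(radius: float, n: int) -> float:
    """d_h = R * (-ln(1 - 0.5 ** (1 / N))) as given in the text."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return radius * (-math.log(1.0 - 0.5 ** (1.0 / n)))

# Two jurors at 50% confidence each combine to 75%.
print(jury([0.5, 0.5]))

# Assumption: per-juror confidence c(d) = exp(-d / R).
R, N = 1.0, 7
d_h = instinct_horizon(R, N)
c_at_horizon = math.exp(-d_h / R)
j_at_horizon = jury([c_at_horizon] * N)
print(f"d_h = {d_h:.4f}, J at horizon = {j_at_horizon:.4f}")

# The quoted speedup is the ratio of the two latencies in the text.
print(f"30 ms / 0.17 ms = {30 / 0.17:.1f}")
```

Under that assumption the horizon is the distance at which the jury as a whole is at even odds, which shrinks as $N$ grows.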

Handoff v2.0 · May 2026 Z₂-Symmetry Framework for the Riemann Hypothesis

Complete mathematical handoff. The one theorem: $\zeta(s)=0 \implies D(s)=0$. Everything else — feature map, Z₂ framework, computational verification at $3.04\times 10^9\times$ separation — is done.

Volume 2 · Papers XVI–XVIII: Riemann Hypothesis via geometric jury. AGT at 50,000 primes: 100% detection, 800x separation. 105-zero jury: J~1-10^{-315}. 3,713-point meta-jury: Pearson r=1.0000. ACM involution learned. Bridge protocol validated. Handoff document ready for analytic number theorist. Papers XIX–XXXI (Millennium): geometric prototypes in WIP/. See mathematician_handoff.pdf.

Installable Python packages from the HyperTensor ecosystem. Available on PyPI.

hypertensor-framework v1.1.0

Riemannian geometry for transformer compression, speculative decoding, safety, and the Riemann Hypothesis.

pip install hypertensor-framework

hypersort v0.1.0

O(1) instant sort via Riemannian Comparison Manifold. Cross-language: Python, JavaScript, Java.

pip install hypersort

hypertensor-core v1.0.0

Geometric core: Riemannian metrics, hallucination guards, geodesic trajectory analysis. Imported as hypercore.

pip install hypertensor-core

@nagusamecs/hypersort npm

JavaScript/TypeScript package. GitHub Packages registry.

npm install @nagusamecs/hypersort

/sort live demo

Real-time visual comparison: HyperSort vs QuickSort vs MergeSort vs Native.