Reproduce XI–XVIII · Living Stack + Riemann

Reproduce Papers XI–XVIII

William Ken Ohara Stewart (NagusameCS Independent Research)

HyperTensor Project · May 2026 · REPRODUCTION.md · HARDWARE.md · Zenodo DOI

Scope

This guide reproduces the eight papers that form the second half of the HyperTensor volume: the k-manifold living-model stack (Papers XI–XV) and the geometric Riemann framework (Papers XVI–XVIII). Most paths run on a consumer GPU; the four Riemann verification scripts are CPU-only and finish in 15 seconds.

Every command below is meant to be run from the repository root after the Path A or Path B setup. The hardware tier (T1 / T2 / T3) for each paper is in the spec table at the top of its block. T3 (datacenter GPU) results all ship pre-computed in benchmarks/ — you can verify the JSON outputs without re-running.
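Checking the pre-computed results needs nothing beyond the Python standard library. The sketch below is one way to do it, assuming only that the files under benchmarks/ are valid JSON. It makes no assumption about their schema, and the helper name summarize_json is hypothetical rather than part of the repository.

```python
import json
from pathlib import Path


def summarize_json(root="benchmarks"):
    """Return {path: sorted top-level keys, or the type name} for each JSON file under root."""
    summary = {}
    for path in sorted(Path(root).rglob("*.json")):
        with path.open(encoding="utf-8") as fh:
            data = json.load(fh)  # raises json.JSONDecodeError on a corrupt file
        if isinstance(data, dict):
            summary[str(path)] = sorted(data)
        else:
            summary[str(path)] = type(data).__name__
    return summary


if __name__ == "__main__":
    for name, keys in summarize_json().items():
        print(name, "->", keys)
```

Run from the repository root, this lists every result file with its top-level keys, which is usually enough to locate the figure quoted in a given "Verifies" paragraph.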

Jump to:
  1. Paper XI — UGT
  2. Paper XII — Native Training
  3. Paper XIII — Safe OGD
  4. Paper XIV — Snipe
  5. Paper XV — COG + TEH
  6. Paper XVI — AGT
  7. Paper XVII — ACM
  8. Paper XVIII — Bridge Protocol
  9. Jury Foundation (auxiliary)
  10. Cross-cutting audits

Prerequisites

Disclaimer (Papers XVI–XVIII). The Riemann papers are presented as a geometric visualisation of the functional equation's Z₂ symmetry, not as a contribution to analytic number theory. See the explicit disclaimers in each paper's abstract.

Paper XI — Universal Geodesic Taxonomy (UGT)

Hardware: T1 (zone analysis) · T2 (1.5B bilateral) · T3 (7B bilateral)
Tier: A / B
Wall-clock: 5 min (T1) · 90 min (T2) · 90 min (T3)
Output: benchmarks/xi_transfer_proof.json · benchmarks/bilateral_*/
# Bench 1 — zone separation analysis (T1, 60 s)
python scripts/benchmarks_quick.py

# Bilateral UGT at 1.5B (T2)
python scripts/close_xi_bilateral_ec2.py

# Wielandt–Hoffman transfer proof (T1, 30 s)
python scripts/xi_transfer_proof.py

# Bilateral UGT at 7B (T3, L40S, 28-32 GB VRAM)
python scripts/bilateral_7b.py --quick
Verifies. 4 knowledge zones (syntax / factual / reasoning / creative) are measurably separated; mean zone separation ≈ 0.114; bilateral subspace overlap 0.968 at 1.5B; principal angles 0.01°–0.11° at 7B (4-bit + checkpointing on L40S).
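The principal-angle and subspace-overlap figures can be illustrated on synthetic data. This is a minimal sketch, not the code in close_xi_bilateral_ec2.py. It assumes the two bases are given as column matrices and that "overlap" means the mean squared cosine of the principal angles, a definition this guide does not spell out.

```python
import numpy as np
from scipy.linalg import subspace_angles


def subspace_overlap(A, B):
    """Principal angles (degrees) and mean squared cosine between span(A) and span(B)."""
    angles = subspace_angles(A, B)  # radians, largest angle first
    overlap = float(np.mean(np.cos(angles) ** 2))  # ASSUMED overlap definition
    return np.degrees(angles), overlap


rng = np.random.default_rng(0)
d, k = 256, 8
A = rng.standard_normal((d, k))
B = A + 1e-3 * rng.standard_normal((d, k))  # slightly perturbed copy of A

angles_deg, overlap = subspace_overlap(A, B)
print("largest principal angle (deg):", angles_deg.max())
print("overlap:", overlap)
```

Two nearly identical subspaces give angles close to 0° and an overlap close to 1, which is the regime the 1.5B and 7B results describe.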

Paper XII — Native Geodesic Training

Hardware: T1 (ratio analysis) · T2 (1.5B run) · T3 (7B run)
Tier: A / B
Wall-clock: 1 min (T1) · 60 min (T2) · 4 h (T3)
Output: benchmarks/native_15b_v2/
python scripts/benchmarks_quick.py        # Bench 5: ratio analysis (T1)
python scripts/native_15b.py              # Initial 1.5B run
python scripts/native_15b_v2.py           # Production 1.5B run (recommended)
python scripts/native_7b_final.py         # 7B run (T3, L40S)
Verifies. Compression ratios analytically match W_native = B C Bᵀ. At k=768 on d=1536: 26% parameters (3.8× compression), 34.5% variance preserved. Loss decreases monotonically. KExpansionScheduler auto-grows k from initial budget. PPL parity at k ≥ 256 requires H100-class VRAM (or 2× L40S).

Paper XIII — Safe Orthogonal Geodesic Deviation (Safe OGD)

Hardware: T1
Tier: A
Wall-clock: 90 s
Output: benchmarks/safe_ogd_results.json
python scripts/benchmarks_quick.py    # Bench 2 — safety guarantee
python scripts/safe_ogd.py
python scripts/ogd_cog_loop.py
python scripts/close_xiii_safe_ogd_creativity.py
Verifies. Maximum forbidden leakage is exactly 0.000000000000. The geometric identity Q_fᵀ·P_safe = 0 holds identically for 1,000 random vectors. 0% TEH at all α ∈ {0.05, 0.10, …, 0.30} by mathematical construction. Multi-step OGD chains preserve coherence.
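The zero-leakage identity follows from the projector's construction and can be checked numerically in a few lines. The sketch below assumes the standard orthogonal-complement projector, P_safe = I − Q_f Q_fᵀ with orthonormal columns in Q_f. The forbidden subspace here is random synthetic data, not the one scripts/safe_ogd.py builds.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k_forbidden = 64, 5

# Orthonormal basis Q_f (columns) for a synthetic forbidden subspace.
Q_f, _ = np.linalg.qr(rng.standard_normal((d, k_forbidden)))

# ASSUMED form of the safe projector: strip every forbidden component.
P_safe = np.eye(d) - Q_f @ Q_f.T

# Project 1,000 random vectors and measure what remains in the forbidden subspace.
X = rng.standard_normal((d, 1000))
leakage = np.abs(Q_f.T @ (P_safe @ X)).max()
print(f"max forbidden leakage: {leakage:.12f}")
```

In floating point the leakage is not literally zero but sits near machine precision, so it prints as zero to twelve decimal places.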

Paper XIV — Behavioral Geodesic Sniping (Snipe)

Hardware: T2
Tier: B
Wall-clock: 15 min
Output: benchmarks/snipe_specificity_*/
python scripts/benchmarks_quick.py        # Bench 3 (T1 quick check)
python scripts/snipe_specificity.py
python scripts/multi_snipe.py
python scripts/close_xiv_snipe_collateral.py
Verifies. Per-category specificity (harm/benign ratio) > 2.0 for privacy, illegal advice, toxicity, sycophancy. Greedy selection with 2% benign budget achieves 7.4× better specificity than all-snipe. 8 behavioural categories probed at 1.5B. Pre/post-COG pipeline integration.

Paper XV — COG + TEH (Living Model)

Hardware: T1 (TEH) · T3 (10K-interaction COG)
Tier: A / B
Wall-clock: 90 s (T1 TEH) · 6 h (T3 COG)
Output: benchmarks/teh_roc/ · benchmarks/cog_10k_results.json
python scripts/benchmarks_quick.py        # Bench 4 — TEH detection
python scripts/teh_roc.py
python scripts/teh_15b_500.py
python scripts/teh_15b_probed.py

# 10,000-interaction COG run (T3)
python scripts/cog_10k.py --n 10000

# 50-loop safe OGD + COG integration
python scripts/close_xv_cog_100.py
Verifies. TEH detection rate > 90% at FP rate = 0%; optimal threshold via ROC calibration. 4-tier query recognition. .MIKU persistence format. ISAGI v1.0 living manifold. Mann–Kendall test on 10K-interaction COG run: p = 0.015 — metric saturates, lifelong learning convergence confirmed across 14 trajectories.
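For readers unfamiliar with the Mann–Kendall test, the following is a minimal pure-Python version of the textbook two-sided test without tie correction. It is a sketch for orientation, not the implementation in cog_10k.py, and the input series is synthetic.

```python
import math


def mann_kendall(x):
    """Two-sided Mann-Kendall trend test (no tie correction). Returns (S, Z, p)."""
    n = len(x)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = x[j] - x[i]
            s += (diff > 0) - (diff < 0)  # sign of each later-minus-earlier difference
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)  # continuity correction
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return s, z, p


# Synthetic saturating metric (hypothetical data, not from cog_10k_results.json).
series = [1.0 - math.exp(-t / 5.0) for t in range(30)]
s, z, p = mann_kendall(series)
print(f"S={s}, Z={z:.2f}, p={p:.2e}")
```

The double loop is quadratic in the series length, so a 10,000-point run is better served by a vectorised or subsampled variant.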

Paper XVI — Algebraic Geometric Topology (AGT)

Hardware: T1 (2K primes) · T2 (10K) · T3 (50K)
Tier: A
Wall-clock: 5 min (T1) · 20 min (T2) · 22 min (T3)
Output: benchmarks/agt_v3_results.json · benchmarks/agt_50k_results.json
python scripts/agt_10k.py --quick           # 2K primes (T1, 5 min)
python scripts/agt_10k.py                   # 10K primes (T2, 20 min)
python scripts/agt_10k.py --scale 50000     # 50K primes (T3, 22 min)
python scripts/agt_scale_ec2.py             # EC2 wrapper, 50K primes
Verifies. 100% off-critical detection across 50,000 primes and 1,030 zeta zeros. k₉₀ = k₉₅ = 1 (1D critical subspace at all scales). 800× separation between on-critical and off-critical points.
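The 90% and 95% component counts can be read off a singular-value spectrum as follows. This sketch assumes they denote the smallest number of components whose squared singular values capture 90% and 95% of the total variance. The spectrum is hypothetical, and k_for_variance is not a function from the repository.

```python
import numpy as np


def k_for_variance(singular_values, threshold):
    """Smallest k whose top-k singular values capture `threshold` of the variance."""
    sv = np.sort(np.asarray(singular_values, dtype=float))[::-1]  # largest first
    energy = sv ** 2
    cumulative = np.cumsum(energy) / energy.sum()
    # Small tolerance so floating-point rounding cannot push k up by one.
    return int(np.searchsorted(cumulative, threshold - 1e-12) + 1)


# Synthetic spectrum dominated by one direction (hypothetical values).
sv = np.array([10.0, 0.5, 0.2, 0.1])
print(k_for_variance(sv, 0.90), k_for_variance(sv, 0.95))
```

A spectrum with one dominant singular value gives 1 at both thresholds, matching the "1D critical subspace" reading.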

Paper XVII — Analytic Continuation Manifold (ACM)

Hardware: T1
Tier: A (numpy + scipy)
Wall-clock: 30 s
Output: benchmarks/acm_prototype_results.json
python scripts/acm_prototype.py
Verifies. Learned involution ι(s) = 1 − s in latent space: ι² ≈ id (error 0.009); critical-line zeros are fixed points (error 0.008); off-critical deviation 0.81. 14/15 off-critical zeros detected, 0/10 false positives. Faithfulness gap: error 0.009, limit convergence unproven.

Paper XVIII — Bridge Protocol

Hardware: T1
Tier: A
Wall-clock: 60 s
Output: benchmarks/jury_bridge/ · benchmarks/faithfulness_rigorous.json
# Core proof: D(s) is rank-1
python scripts/faithfulness_rigorous.py

# 27-test Riemann verification suite (15 s total)
python scripts/riemann_comprehensive_verify.py
python scripts/riemann_adversarial_tests.py
python scripts/riemann_mega_verify.py

# 3,713-point meta-jury
python scripts/jury_bridge.py

# Full Riemann attack pipeline
python scripts/close_xvii_xviii_riemann.py
Verifies. 5-step pipeline (AGT → ACM → Safe OGD → TEH → Contradiction). 105/105 known zeros detected, 0 FP / 0 FN. Pearson r(D, |σ−0.5|) = 1.0000. Meta-jury (3,713 points): 100% accuracy. D(s) rank-1: SV₁ = 8.94, SV₂ through SV₁₂ = 0.000000.
Honest reporting. The jury confidence figure J ≈ 1 − 10⁻³¹⁵, computed in extended precision (mpmath, dps ≥ 400), treats the 105 known zeros as independent jurors. They are not independent: all are consequences of the same classical theorem-and-computation pipeline (Hardy 1914; Selberg; van de Lune et al.; Platt & Trudgian 2021), so the figure should not be read as 105 independent confirmations. This is a geometric visualisation of the functional equation's Z₂ symmetry, not a proof-search protocol.
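The need for extended precision is easy to see with the jury formula J = 1 − ∏(1 − c_i) from the Jury Foundation section. The sketch below uses the standard-library decimal module in place of mpmath so that it runs without extra dependencies. The per-juror confidence c = 0.999 is an assumption made for illustration: 105 equal jurors at that value give a product of exactly 10⁻³¹⁵, but this guide does not state the actual c_i.

```python
from decimal import Decimal, getcontext

getcontext().prec = 400  # working precision, in significant digits

n_jurors = 105            # number of known zeros quoted above
c = Decimal("0.999")      # ASSUMED per-juror confidence, for illustration only

miss = (Decimal(1) - c) ** n_jurors  # product of (1 - c_i) when all c_i are equal
J = Decimal(1) - miss

print("1 - J =", Decimal(1) - J)
print("float64 view:", float(J))     # double precision cannot resolve the gap
```

In double precision J is indistinguishable from 1, which is why the gap is reported as 1 − J and computed with several hundred digits.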

Geometric Jury Foundation (auxiliary suite)

Hardware: T1
Tier: A
Wall-clock: ~5 min total
Output: benchmarks/jury_*/
python scripts/jury_discovery.py        # 7 discovery experiments
python scripts/jury_solver.py           # 6 improvement experiments
python scripts/jury_advance.py          # 5 advance experiments
python scripts/jury_open.py             # 5 open problems
python scripts/jury_final.py            # R² regression
python scripts/jury_gaps.py             # 7 gap analyses
python scripts/jury_gtc.py              # GTC acceleration
python scripts/jury_gtc_extreme.py      # production gates
python scripts/jury_ugt.py              # UGT zone classification
python scripts/millennium_jury.py       # P vs NP, BSD, Yang–Mills
python scripts/jury_ensemble.py         # ensemble regression
python scripts/jury_solve_all.py        # improved feature engineering
Verifies. Jury formula J = 1 − ∏(1 − cᵢ) is the unique aggregation rule (8 theorems with full proofs in jury_proof.pdf). Cross-domain transfer; jury-solved framework problems (GRC k 100%, OTT 81%, OGD α R² = 0.758). 9–7200× ISAGI speedups. Millennium prototypes: P vs NP 99.8%, BSD 33.5%, Yang–Mills 100%.

Cross-cutting audits and external verification

# 51-claim bulletproof audit across papers I-XV
python scripts/bulletproof_audit.py

# 7-benchmark quick suite
python scripts/benchmarks_quick.py

# Performance optimisations (12 of them)
python scripts/benchmark_optimizations.py

# External verification (synthetic baselines)
python scripts/verify_external.py

# External verification on real 1.5B model (14/14)
python scripts/verify_external_15b.py

# Millennium WIP prototypes
python scripts/bench_millennium_wip.py
Verifies. 51/51 published claims survive line-by-line audit. External verification 14/14 against PyTorch k-NN/centroid baselines, cuBLAS (EC2 L40S), 10⁸ Monte Carlo trials, and the von Mangoldt explicit formula. Performance optimisations: randomised SVD 9×, svd_lowrank 10.6×, batch cosine 220×.
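As an illustration of the batch-cosine optimisation, the sketch below compares a pairwise loop with a vectorised version in NumPy. It is not the code in benchmark_optimizations.py, the function names are hypothetical, and any speedup depends on array sizes and hardware, so no timing is claimed here.

```python
import numpy as np


def cosine_loop(queries, keys):
    """Reference implementation: one pair at a time."""
    out = np.empty((len(queries), len(keys)))
    for i, q in enumerate(queries):
        for j, k in enumerate(keys):
            out[i, j] = q @ k / (np.linalg.norm(q) * np.linalg.norm(k))
    return out


def cosine_batch(queries, keys):
    """Vectorised version: normalise rows once, then a single matrix product."""
    qn = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    kn = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    return qn @ kn.T


rng = np.random.default_rng(0)
Q = rng.standard_normal((50, 128))
K = rng.standard_normal((80, 128))
assert np.allclose(cosine_loop(Q, K), cosine_batch(Q, K))
```

The batch form replaces many small dot products with one matrix multiplication, which is the kind of change that lets a BLAS or GPU backend do the work in a single call.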

What success looks like

After running the full T1 path (~30 minutes) you should have 111 passing tests across the four Riemann suites, the jury foundation, and the bulletproof audit. After the T2 paths (~90 minutes) you have re-derived every per-layer geometric measurement. The remaining T3 results ship pre-computed in benchmarks/ and can be verified by inspecting the JSON without re-running.

For the canonical end-to-end document with hardware tiers, dependency tiers, determinism notes, troubleshooting, EC2 setup, and the full result inventory, see REPRODUCTION.md in the repository root.