Reproduce G · FFN Compression

Reproduce Paper G: FFN Down-Projection SVD Compression

William Ken Ohara Stewart (NagusameCS Independent Research)

HyperTensor Project · May 2026 · Paper G (HTML) · repro tree

Scope

This guide reproduces the FFN down-projection SVD compression results. It measures the perplexity (PPL) degradation when the FFN down-projection matrix (W_down) is compressed via truncated SVD at ranks r = 256, 512, 1024, and 2048, and validates the paper's finding that FFN down is the most compressible component of the transformer block (less than 2% PPL increase at r = d/4).
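As an illustration of the operation being swept, the following is a minimal sketch of rank-r truncated SVD in NumPy. It is not the project's implementation: the function names `truncated_svd` and `relative_error` and the toy matrix shape are assumptions made for this example, and a random matrix stands in for a real W_down.

```python
import numpy as np

def truncated_svd(W, r):
    """Return factors (A, B) such that A @ B is the best rank-r
    approximation of W in the Frobenius norm (Eckart-Young)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :r] * s[:r]   # (m, r): singular values folded into the left factor
    B = Vt[:r, :]          # (r, n)
    return A, B

def relative_error(W, A, B):
    """Frobenius-norm reconstruction error relative to the norm of W."""
    return np.linalg.norm(W - A @ B) / np.linalg.norm(W)

# Toy stand-in for W_down (hypothetical shape, random values).
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 256))

for r in (8, 16, 32, 64):
    A, B = truncated_svd(W, r)
    # The two factors store r * (m + n) values instead of m * n.
    storage_ratio = (A.size + B.size) / W.size
    print(f"r={r:3d}  rel_err={relative_error(W, A, B):.4f}  storage={storage_ratio:.2f}x")
```

Note that the factored form only saves memory when r < mn / (m + n); a random matrix like the one above has a flat spectrum, so its error falls slowly with r, unlike the spectra the paper reports for W_down.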

Hardware target

Prerequisites

Step 1: Extract and compress FFN down

python scripts/check_q4_layout.py --model ../models/qwen2.5-7b-q4_k_m.gguf \
    --extract-ffn-down --output ../benchmarks/ffn_down_svd.json

Step 2: Run the SVD compression sweep

python scripts/ffn_svd_sweep.py --input ../benchmarks/ffn_down_svd.json \
    --ranks 256,512,1024,2048 --output ../benchmarks/ffn_svd_results.csv
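The internals of `ffn_svd_sweep.py` are not shown here, so the following is only a sketch of the metric the sweep is described as reporting. It assumes the standard definition of perplexity as the exponential of the mean per-token negative log-likelihood; the helper names `perplexity` and `ppl_increase_pct` are hypothetical and the numbers in the usage lines are made up for illustration.

```python
import math

def perplexity(token_nlls):
    """Perplexity from per-token negative log-likelihoods (natural log)."""
    return math.exp(sum(token_nlls) / len(token_nlls))

def ppl_increase_pct(baseline_ppl, compressed_ppl):
    """Relative PPL degradation of the compressed model, in percent."""
    return 100.0 * (compressed_ppl - baseline_ppl) / baseline_ppl

# Illustrative values only, not results from the paper.
baseline = perplexity([2.0, 2.2, 1.8])
compressed = perplexity([2.0, 2.25, 1.8])
print(f"PPL increase: {ppl_increase_pct(baseline, compressed):.2f}%")
```

Under this definition, the paper's "less than 2% PPL increase" criterion corresponds to `ppl_increase_pct(...) < 2.0` for the rank in question.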

Step 3: Expected output

Validation

Compare the singular-value spectra against Figure 2 of the paper. For every layer, the SVD energy curve should show that the first 25% of singular values capture 90% of the energy.
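The energy check can be sketched as follows. This assumes "energy" means the sum of squared singular values, which is the common convention but is not confirmed by this page; `energy_curve` and `fraction_needed` are hypothetical helper names, and the synthetic spectrum is a placeholder for the per-layer singular values produced by the sweep.

```python
import numpy as np

def energy_curve(singular_values):
    """Cumulative fraction of energy (sum of squared singular values)
    captured by the k largest singular values, for k = 1..n."""
    s = np.sort(np.asarray(singular_values, dtype=np.float64))[::-1]
    energy = s ** 2
    return np.cumsum(energy) / energy.sum()

def fraction_needed(singular_values, target=0.90):
    """Fraction of singular values needed to reach the target energy."""
    curve = energy_curve(singular_values)
    # First index whose cumulative energy is >= target, as a 1-based count.
    k = min(int(np.searchsorted(curve, target)) + 1, len(curve))
    return k / len(curve)

# Synthetic, geometrically decaying spectrum (placeholder data only).
s = 0.99 ** np.arange(1000)
frac = fraction_needed(s, target=0.90)
print(f"{frac:.1%} of singular values capture 90% of the energy")
print("passes 25% check:", frac <= 0.25)
```

Running `fraction_needed` on each layer's spectrum and confirming the result is at most 0.25 gives a numeric counterpart to the visual comparison with Figure 2.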