RMT ε‑Band Acceptance (Edge Risk Score)

Plain language: The RMT guard limits how much the activation edge risk can grow beyond its baseline, ensuring structural shifts trigger a failure while expected noise passes.

Overview

| Aspect | Details |
| --- | --- |
| Purpose | Define the RMT edge-risk acceptance band and the report fields needed to audit it. |
| Audience | RMT guard maintainers, calibration reviewers, and release reviewers checking activation evidence. |
| Contract scope | Baseline-relative activation edge-risk growth, per-family epsilon bands, and report-verifier behavior. |
| Source of truth | src/invarlock/guards/rmt*.py, runtime/tiers.yaml, and RMT assurance tests. |

Claim

The Random Matrix Theory (RMT) guard accepts an edit when the activation edge risk score stays within the calibrated ε‑band for each family.

Let $r_f^{\text{base}}$ be the baseline edge risk score and $r_f^{\text{cur}}$ the current score for family $f$. The guard accepts if:

$$r_f^{\text{cur}} \le (1+\epsilon_f)\, r_f^{\text{base}}$$

with $\epsilon_f$ calibrated from null runs (e.g., the 95th–99th percentile of $r_f^{\text{cur}}/r_f^{\text{base}} - 1$).
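
The acceptance rule can be sketched in a few lines of Python. This is an illustrative sketch, not the guard's implementation: the function name `epsilon_band_violations` and its dictionary inputs are hypothetical, and the fallback to a default ε for families without a calibrated value is an assumption.

```python
from typing import Dict, List


def epsilon_band_violations(
    base: Dict[str, float],
    cur: Dict[str, float],
    epsilon_by_family: Dict[str, float],
    epsilon_default: float = 0.01,  # assumed fallback, not taken from the guard
) -> List[str]:
    """Return the families whose current score exceeds (1 + eps) * baseline."""
    violations = []
    for family, r_base in base.items():
        eps = epsilon_by_family.get(family, epsilon_default)
        # The guard accepts a family when r_cur <= (1 + eps) * r_base.
        if cur[family] > (1.0 + eps) * r_base:
            violations.append(family)
    return violations


# Hypothetical scores: ffn stays inside its band, attn grows by about 2.7%.
base = {"ffn": 1.20, "attn": 1.10}
cur = {"ffn": 1.21, "attn": 1.13}
violations = epsilon_band_violations(base, cur, {"ffn": 0.01, "attn": 0.01})
accepted = not violations
```

Under these made-up numbers only `attn` lands outside its band, so the edit as a whole would be rejected.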

What is the edge risk score?

For a (token×hidden) activation matrix $A$, the guard forms a whitened (centered and standardized) matrix $A'$, estimates its top singular value via a deterministic matvec estimator, and normalizes by the Marchenko–Pastur edge for the same shape $(m, n)$:

$$r = \frac{\hat{\sigma}_{\max}(A')}{\sigma_{\mathrm{MP}}(m,n)}$$

The contract fixes the estimator budget and the activation sampling policy; those knobs are recorded in the report.
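
A minimal sketch of such a score follows. It rests on several assumptions that are not confirmed by the guard's source: whitening is taken to mean per-column centering and scaling to unit variance, the matvec estimator is plain power iteration from a fixed-seed start vector, and the Marchenko–Pastur edge for an unnormalized m×n matrix with unit-variance entries is taken as √m + √n. The function name and parameters are hypothetical.

```python
import numpy as np


def edge_risk_score(acts: np.ndarray, n_iter: int = 50, seed: int = 0) -> float:
    """Top singular value of the standardized matrix, relative to the MP edge."""
    m, n = acts.shape
    # Assumed whitening: center each hidden column and scale it to unit variance.
    centered = acts - acts.mean(axis=0, keepdims=True)
    std = centered.std(axis=0, keepdims=True)
    a = centered / np.where(std > 0, std, 1.0)

    # Power iteration on A^T A using only matrix-vector products.
    # A fixed seed makes the start vector, and so the estimate, deterministic.
    v = np.random.default_rng(seed).standard_normal(n)
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        w = a.T @ (a @ v)
        norm = np.linalg.norm(w)
        if norm == 0.0:
            return 0.0  # degenerate input: every column is constant
        v = w / norm
    sigma_max = np.linalg.norm(a @ v)

    # Assumed MP edge for unit-variance entries without 1/sqrt(m) scaling.
    sigma_mp = np.sqrt(m) + np.sqrt(n)
    return float(sigma_max / sigma_mp)


rng = np.random.default_rng(123)
noise = rng.standard_normal((512, 64))
shared = rng.standard_normal((512, 1))
score_null = edge_risk_score(noise)
# Broadcasting adds the same direction to every column (a rank-one spike).
score_spiked = edge_risk_score(noise + 3.0 * shared)
```

On i.i.d. Gaussian noise the score typically lands near 1, while a shared rank-one component pushes it well above 1. That is the kind of structural concentration the ε‑band is meant to catch.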

This note documents the runtime report contract for the activation edge-risk mode surfaced in reports; it does not catalog the full implementation surface inside src/invarlock/guards/rmt.py.

Derivation (sketch)

  • Edge risk fluctuates under null due to finite‑sample deviations from the Marchenko–Pastur edge and estimator noise.
  • The ε‑band permits expected null drift while flagging structural increases.
  • Large edge risk indicates concentration of activation energy along a small number of directions beyond random‑matrix expectations.

Assumptions & Scope

  • Null calibration must cover each family {ffn, attn, embed, other}; default ε values are exposed whenever data is sparse.
  • Baseline and current scores use identical activation sampling and token‑weighted aggregation.
  • CI/release and activation-required evidence require activation-based scoring; if activation batches are missing in those paths, the RMT guard fails closed.

Calibration (pilot-derived)

  • Balanced tier uses $\epsilon_f = \{0.01, 0.01, 0.01, 0.01\}$ for {ffn, attn, embed, other} respectively (q95–q97 of null deltas).
  • Conservative uses the same per-family ε defaults: $\epsilon_f = \{0.01, 0.01, 0.01, 0.01\}$. Values are recorded in the packaged tiers.yaml (runtime/tiers.yaml) and surfaced in reports. Provide overrides via INVARLOCK_CONFIG_ROOT/runtime/tiers.yaml when needed.

Example: with r_base = 1.20 and ε = 0.01, the guard allows r_cur ≤ (1+0.01) × 1.20 = 1.212.

Recalibration

Calibration values are derived from null-sweep runs and stored in the packaged runtime/tiers.yaml. See the full calibration methodology in 09-tier-v1-calibration.md.

To recalibrate, run null baselines (no edit) and compute per-family deltas Δ(f) = r_cur(f)/r_base(f) − 1 (skip cases with missing or zero baseline). Set ε(f) to the q95–q99 quantile of Δ(f). For small families or tiny sample sizes, use a slightly larger ε to avoid spurious failures.
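
The recalibration recipe can be sketched as follows. The helper `calibrate_epsilon` and its `(family, r_base, r_cur)` input tuples are hypothetical, and the `floor` parameter is an assumed way of keeping ε slightly larger for small families. The quantile uses NumPy's default linear interpolation.

```python
from typing import Dict, Iterable, List, Optional, Tuple

import numpy as np

NullRun = Tuple[str, Optional[float], Optional[float]]  # (family, r_base, r_cur)


def calibrate_epsilon(
    null_runs: Iterable[NullRun],
    quantile: float = 0.95,
    floor: float = 0.0,  # assumed lower bound on epsilon, e.g. for tiny families
) -> Dict[str, float]:
    """Set epsilon per family to a quantile of the null deltas r_cur/r_base - 1."""
    deltas: Dict[str, List[float]] = {}
    for family, r_base, r_cur in null_runs:
        # Skip cases with a missing or zero baseline (or a missing current score).
        if r_base is None or r_cur is None or r_base == 0:
            continue
        deltas.setdefault(family, []).append(r_cur / r_base - 1.0)
    return {
        family: max(floor, float(np.quantile(values, quantile)))
        for family, values in deltas.items()
    }


# Synthetic null runs: ffn deltas spread evenly over [0, 0.1]; attn has no usable baseline.
runs: List[NullRun] = [("ffn", 1.0, 1.0 + 0.001 * i) for i in range(101)]
runs.append(("attn", 0.0, 1.0))
eps = calibrate_epsilon(runs, quantile=0.95)
```

Families with no usable null runs are absent from the result, so a caller would need to fall back to a default ε for them.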

Runtime Contract (report)

  • report records rmt.{mode,edge_risk_by_family_base,edge_risk_by_family,epsilon_default,epsilon_by_family,epsilon_violations,stable,status}.
  • Per-family details for rendering live under rmt.families.*.{edge_base,edge_cur,epsilon,allowed,ratio,delta}.
  • rmt.measurement_contract.kind = "activation_edge_risk" records which RMT measurement path produced the evidence.
  • report lint verifies the inequality and marks violations; validation.rmt_stable reflects the ε‑band gate.
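
A reviewer can re-derive the gate from the per-family fields. The sketch below is hedged: the function `recheck_epsilon_band` is hypothetical, the report fragment is made up, and it assumes that `rmt.stable` is true exactly when no family violates its band. The actual verifier may apply additional checks.

```python
from typing import Any, Dict, List, Tuple


def recheck_epsilon_band(rmt: Dict[str, Any]) -> Tuple[List[str], bool]:
    """Recompute the inequality per family and compare it with rmt.stable."""
    violating = [
        name
        for name, fam in sorted(rmt["families"].items())
        if fam["edge_cur"] > (1.0 + fam["epsilon"]) * fam["edge_base"]
    ]
    # Assumption: stable should be True exactly when no family violates.
    consistent = bool(rmt.get("stable")) == (not violating)
    return violating, consistent


# Made-up report fragment that uses only field names listed in the contract.
rmt_section = {
    "stable": False,
    "families": {
        "ffn": {"edge_base": 1.20, "edge_cur": 1.21, "epsilon": 0.01},
        "attn": {"edge_base": 1.10, "edge_cur": 1.13, "epsilon": 0.01},
    },
}
violating, consistent = recheck_epsilon_band(rmt_section)
```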

Observability

  • rmt.edge_risk_by_family_base.* and rmt.edge_risk_by_family.*.
  • rmt.epsilon_default and rmt.epsilon_by_family.*.
  • rmt.status / rmt.stable and rmt.epsilon_violations for pass/fail context.
  • resolved_policy.rmt.{margin,deadband,epsilon_by_family} — resolved thresholds archived with the report bundle.

Edge cases

  • Small samples: estimator variance dominates; increase activation sample count or widen ε for tiny families.

Background reading

  • Marchenko, V. A., & Pastur, L. A. (1967). “Distribution of eigenvalues for some sets of random matrices.” Mathematics of the USSR-Sbornik, 1(4), 457–483.
  • Bai, Z. D., & Silverstein, J. W. (2010). Spectral Analysis of Large Dimensional Random Matrices (2nd ed.). Springer.
  • Pennington, J., & Worah, P. (2017). “Nonlinear Random Matrix Theory for Deep Learning.” Advances in Neural Information Processing Systems (NeurIPS). https://papers.nips.cc/paper/6857-nonlinear-random-matrix-theory-for-deep-learning
  • Martin, C. H., & Mahoney, M. W. (2021). “Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning.” Journal of Machine Learning Research, 22(165), 1–73. Preprint: https://arxiv.org/abs/1810.01075