Release
Deterministic proof packs and safer perplexity runs
Proof packs gain a deterministic bash test suite and better runtime helpers, window selection becomes stable and fully offline, and perplexity runs get safer around out-of-range token IDs.
Release: InvarLock 0.3.5 - Offline window selection, runtime helpers, and token-ID guards
Highlights
- Proof pack bash test suite + runtime helpers for capturing artifacts during long runs.
- WikiText-2 window stratification switches to a deterministic offline byte-level n-gram scorer.
- Perplexity sanitizes out-of-range token IDs, while B200 defaults reduce queue and cache friction.
0.3.5 is a trust-the-machinery release. Proof packs now have their own bash test suite with deterministic command mocks and optional coverage checks, which is exactly the kind of unglamorous work that prevents subtle breakage later. Runtime helpers and pack build/verify helpers also make it easier to capture the right artifacts during long runs without improvising ad hoc scripts.
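To make the idea of a deterministic command mock concrete, here is a minimal sketch in Python rather than bash. It is not the proof pack test suite itself: the helper names `install_mock` and `run_with_mocks` are hypothetical, and the sketch assumes a POSIX system with `/bin/sh` and an executable temp directory. The technique is to place a stub with fixed output ahead of the real tool on `PATH`, so the code under test sees the same response on every run.

```python
import os
import shutil
import stat
import subprocess
from pathlib import Path


def install_mock(bin_dir: Path, name: str, stdout: str, exit_code: int = 0) -> Path:
    """Write a shell stub that prints fixed text and exits with a fixed code.

    Hypothetical helper for illustration; not part of InvarLock.
    """
    script = bin_dir / name
    # A quoted heredoc delimiter keeps the canned output literal (no expansion).
    script.write_text(
        "#!/bin/sh\n"
        "cat <<'MOCK_EOF'\n"
        f"{stdout}\n"
        "MOCK_EOF\n"
        f"exit {exit_code}\n"
    )
    script.chmod(script.stat().st_mode | stat.S_IXUSR)
    return script


def run_with_mocks(argv: list[str], bin_dir: Path) -> subprocess.CompletedProcess:
    """Run argv with bin_dir first on PATH so mocks shadow the real commands."""
    env = dict(os.environ)
    env["PATH"] = f"{bin_dir}{os.pathsep}{env.get('PATH', '')}"
    # Resolve explicitly against the modified PATH so the lookup is unambiguous.
    exe = shutil.which(argv[0], path=env["PATH"])
    if exe is None:
        raise FileNotFoundError(argv[0])
    return subprocess.run(
        [exe, *argv[1:]], env=env, capture_output=True, text=True, check=False
    )
```

A test would create a temporary directory, install a stub for whatever external command the script calls (a GPU query tool, for example), and assert on the captured output and exit code. Because the stub ignores its arguments and the host hardware, the result does not depend on the machine the test runs on.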
On evaluation stability: window stratification for WikiText-2 moves to a deterministic, offline byte-level n-gram scorer. Because the score depends only on the raw bytes of each window, not on any model's tokenizer, window selection stays consistent across model families and avoids implicit downloads. Both matter a lot when you’re trying to make runs comparable.
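As a rough illustration of what a deterministic, offline byte-level n-gram scorer can look like, here is a minimal sketch. It is not InvarLock's scorer: the scoring rule (share of a window's byte trigrams that do not appear in a reference text), the function names, and the stratified pick are all assumptions made for the example. What it does show is the property described above: the result depends only on the input bytes, so no tokenizer or download is involved and repeated runs agree.

```python
import hashlib
from collections import Counter


def byte_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping byte n-grams of the UTF-8 encoding of text."""
    data = text.encode("utf-8")
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def difficulty_score(window: str, reference: Counter, n: int = 3) -> float:
    """Fraction of the window's n-grams that never occur in the reference.

    Illustrative scoring rule only; returns 0.0 for windows shorter than n bytes.
    """
    grams = byte_ngrams(window, n)
    total = sum(grams.values())
    if total == 0:
        return 0.0
    unseen = sum(count for gram, count in grams.items() if gram not in reference)
    return unseen / total


def select_windows(windows: list[str], reference_text: str, k: int, n: int = 3) -> list[int]:
    """Pick k window indices spread evenly across the score distribution."""
    reference = byte_ngrams(reference_text, n)
    scored = []
    for idx, window in enumerate(windows):
        # A content hash breaks score ties the same way on every run.
        tie_break = hashlib.sha256(window.encode("utf-8")).hexdigest()
        scored.append((difficulty_score(window, reference, n), tie_break, idx))
    scored.sort()
    if k >= len(scored):
        return sorted(idx for _, _, idx in scored)
    # Take the midpoint of each of k equal-width strata of the sorted list.
    positions = [(2 * i + 1) * len(scored) // (2 * k) for i in range(k)]
    return sorted(scored[p][2] for p in positions)
```

There is no random seed and no model-specific state here, which is the point: two runs over the same windows and reference text select the same indices.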
Perplexity evaluation is more defensive too, masking out-of-range token IDs rather than triggering device-side asserts. On B200 workflows, dynamic scheduling becomes the only validation path, dependency promotion is centralized to reduce queue lock contention, generated configs avoid slow CPU spectral calibration by default, and Hugging Face caches move under the work directory to avoid small root partitions. The old INVARLOCK_SCORES_BATCH_SIZE variable is removed because the new scorer no longer batches on device.
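The token-ID guard can be pictured with a small pure-Python sketch. This is not InvarLock's code: the function names, the `pad_id` replacement value, and the list-based representation are assumptions for illustration. The idea is to replace any ID outside `[0, vocab_size)` with a safe placeholder before the model sees it, and to carry a mask so those positions are excluded from the perplexity average instead of crashing an embedding lookup on the device.

```python
import math


def sanitize_token_ids(token_ids: list[int], vocab_size: int, pad_id: int = 0):
    """Return (safe_ids, mask).

    safe_ids contains only valid indices; mask is True where the original
    ID was in range and should contribute to the score.
    """
    safe_ids, mask = [], []
    for tok in token_ids:
        valid = 0 <= tok < vocab_size
        safe_ids.append(tok if valid else pad_id)
        mask.append(valid)
    return safe_ids, mask


def masked_perplexity(token_nlls: list[float], mask: list[bool]) -> float:
    """Perplexity over unmasked positions: exp(mean negative log-likelihood)."""
    kept = [nll for nll, keep in zip(token_nlls, mask) if keep]
    if not kept:
        raise ValueError("no valid tokens to score")
    return math.exp(sum(kept) / len(kept))
```

For example, with a vocabulary of 50,257 entries, the sequence `[5, 70000, -1, 2]` becomes `[5, 0, 0, 2]` with mask `[True, False, False, True]`, and only the first and last positions count toward the mean. Raising when every position is masked is a design choice in this sketch: it makes a fully invalid batch visible rather than reporting a meaningless number.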
For the immutable release record, read the tagged CHANGELOG.md for v0.3.5.