Announcement · Getting Started
Welcome to InvarLock
A quick introduction to InvarLock: evaluate LLM weight edits with statistical guarantees and auditable proof packs.
Post: What InvarLock is, what it checks, and how to try it.
Highlights
- Evaluate edited weights against a baseline with paired metrics and confidence intervals.
- GuardChain checks for “unsafe to compare” measurement mismatches and quality drift.
- Proof packs capture the artifacts you need to verify and share results.
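To make "paired metrics and confidence intervals" concrete, here is a minimal sketch of a paired comparison: the same evaluation examples are scored by both models, and a percentile bootstrap puts an interval around the mean per-example difference. This is an illustration under stated assumptions, not InvarLock's implementation; the function name `paired_bootstrap_ci`, its parameters, and the sample loss values are hypothetical.

```python
import random
import statistics


def paired_bootstrap_ci(baseline, subject, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean paired difference (subject - baseline).

    Hypothetical helper for illustration; not part of InvarLock's API.
    """
    if len(baseline) != len(subject):
        raise ValueError("paired comparison needs the same examples in the same order")
    if not baseline:
        raise ValueError("need at least one paired example")

    # Pairing: each difference compares both models on the same example.
    diffs = [s - b for b, s in zip(baseline, subject)]
    rng = random.Random(seed)
    n = len(diffs)

    # Resample the per-example differences with replacement and record each mean.
    means = sorted(
        statistics.fmean(rng.choices(diffs, k=n)) for _ in range(n_resamples)
    )
    lo_idx = max(0, int((alpha / 2) * n_resamples))
    hi_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples) - 1)
    return statistics.fmean(diffs), means[lo_idx], means[hi_idx]


# Made-up per-example losses, for illustration only.
baseline_loss = [2.31, 1.98, 2.75, 2.10, 2.44, 1.87]
subject_loss = [2.35, 2.01, 2.74, 2.18, 2.47, 1.90]

mean_diff, lo, hi = paired_bootstrap_ci(baseline_loss, subject_loss)
print(f"mean delta = {mean_diff:+.3f}, 95% CI = [{lo:+.3f}, {hi:+.3f}]")
```

Because both models see the same examples, per-example difficulty cancels out of each difference, which usually gives a tighter interval than comparing two independent averages. An interval that excludes zero suggests the edit changed the metric; one that straddles zero means the data cannot tell the two models apart.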
If you edit model weights (quantization, pruning, fine-tuning, merges), you eventually hit the same question: did this change silently break anything that matters? “It loads” isn’t enough, and single-number metrics often miss the failure modes you’ll regret later.
InvarLock is designed for that moment. It produces an evaluation report that is both human-readable and machine-verifiable, so you can make upgrade decisions with evidence—not vibes.
Quickstart
Install InvarLock via pip:
pip install "invarlock[hf]"
Run your first evaluation:
INVARLOCK_ALLOW_NETWORK=1 invarlock evaluate \
--baseline gpt2 \
--subject gpt2 \
--adapter auto \
--profile dev
That produces an evaluation report and (optionally) a proof pack you can archive, verify, and share.
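As a rough illustration of what "verify" can mean for an archived bundle, the sketch below checks every file against a SHA-256 manifest and returns False as soon as anything is missing or altered. It does not describe InvarLock's proof pack format: the `manifest.json` layout (relative path mapped to hex digest), the `verify_bundle` function, and the `report.json` file are assumptions made for this example.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def verify_bundle(bundle_dir, manifest_name="manifest.json"):
    """Return True only if every manifest entry exists and matches its SHA-256.

    Hypothetical layout: the manifest maps relative file paths to hex digests.
    """
    root = Path(bundle_dir)
    manifest_path = root / manifest_name
    if not manifest_path.is_file():
        return False  # no manifest means nothing can be verified
    try:
        manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    except (json.JSONDecodeError, UnicodeDecodeError):
        return False  # an unreadable manifest is treated as a failure
    if not isinstance(manifest, dict) or not manifest:
        return False  # an empty manifest proves nothing
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file():
            return False  # a listed artifact is missing
        actual = hashlib.sha256(target.read_bytes()).hexdigest()
        if actual != expected:
            return False  # contents changed after the manifest was written
    return True


# Build a throwaway bundle, verify it, then tamper with it and verify again.
with tempfile.TemporaryDirectory() as tmp:
    report = Path(tmp) / "report.json"
    report.write_text('{"status": "pass"}', encoding="utf-8")
    digest = hashlib.sha256(report.read_bytes()).hexdigest()
    (Path(tmp) / "manifest.json").write_text(
        json.dumps({"report.json": digest}), encoding="utf-8"
    )
    intact = verify_bundle(tmp)
    report.write_text('{"status": "fail"}', encoding="utf-8")
    tampered = verify_bundle(tmp)

print(intact, tampered)
```

Every uncertain case (missing manifest, unreadable JSON, missing file, digest mismatch) is a rejection, so a verifier built this way can only pass when all the evidence is present and unchanged.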
What’s next
- Design-partner refinement of the private on-prem review workflow
- Broader adapter and framework coverage
- Better “what changed?” analytics over time
To go deeper, start with the docs. For questions and feedback, email [email protected].
If your team wants to help shape the on-prem path, start with design partners.