Welcome to InvarLock
A quick introduction to InvarLock: evaluate LLM weight edits with statistical guarantees and auditable proof packs.
Highlights
- Evaluate edited weights against a baseline with paired metrics and confidence intervals.
- GuardChain checks for “unsafe to compare” measurement mismatches and quality drift.
- Proof packs capture the artifacts you need to verify and share results.
If you edit model weights (quantization, pruning, fine-tuning, merges), you eventually hit the same question: did this change silently break anything that matters? “It loads” isn’t enough, and single-number metrics often miss the failure modes you’ll regret later.
InvarLock is designed for that moment. It produces an evaluation report that is both human-readable and machine-verifiable, so you can make upgrade decisions with evidence—not vibes.
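To make "paired metrics and confidence intervals" concrete, here is a minimal, self-contained sketch of the underlying statistical idea: a paired-difference bootstrap CI over per-example metrics. This is an illustration of the technique only, not InvarLock's internal API; all names and numbers below are invented.

```python
# Generic paired-difference bootstrap CI: the kind of statistic a
# baseline-vs-subject comparison relies on. Illustrative only.
import random

def paired_bootstrap_ci(baseline, subject, n_boot=2000, alpha=0.05, seed=0):
    """CI for mean(subject - baseline) over the same eval examples."""
    rng = random.Random(seed)
    diffs = [s - b for b, s in zip(baseline, subject)]
    means = []
    for _ in range(n_boot):
        resample = [rng.choice(diffs) for _ in diffs]
        means.append(sum(resample) / len(resample))
    means.sort()
    lo = means[int((alpha / 2) * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Made-up per-example losses for a baseline and an edited model.
base = [2.10, 1.95, 2.30, 2.05, 1.88, 2.40, 2.15, 2.00]
edit = [2.12, 1.97, 2.33, 2.04, 1.90, 2.45, 2.18, 2.03]
lo, hi = paired_bootstrap_ci(base, edit)
print(f"95% CI for mean loss delta: [{lo:.3f}, {hi:.3f}]")
```

Because the same examples are scored under both models, pairing removes per-example variance, which is why a paired CI can detect small regressions that a single aggregate number would hide.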
Quickstart
Install InvarLock via pip:
pip install "invarlock[hf]"
Run your first evaluation:
INVARLOCK_ALLOW_NETWORK=1 invarlock evaluate \
  --baseline gpt2 \
  --subject gpt2 \
  --adapter auto \
  --profile dev
That produces an evaluation report and (optionally) a proof pack you can archive, verify, and share.
What’s next
- Evaluation-as-a-Service (hosted runs and reviewable outputs)
- Broader adapter and framework coverage
- Better “what changed?” analytics over time
To go deeper, start with the docs. For questions and feedback, email [email protected].
Want updates as we ship? Join the waitlist.