Proof Packs
Overview
| Aspect | Details |
|---|---|
| Purpose | Hardware-agnostic validation runs that bundle reports into portable evidence artifacts. |
| Audience | CI operators producing validation evidence across GPU topologies. |
| Requires | GPU capable of fitting selected models; HF cache or network for model download. |
| Outputs | Proof pack directory with evaluation reports, summary reports, checksums, and optional GPG signature. |
| Source of truth | scripts/proof_packs/run_suite.sh, scripts/proof_packs/run_pack.sh. |
Proof packs are hardware-agnostic validation runs that bundle InvarLock reports, summary reports, and verification metadata into a portable evidence artifact. They replace the B200-specific validation harness with a suite that can run on any NVIDIA GPU topology that can fit the selected models.
By default, a proof pack is evidence-grade (integrity + report verification). Treat it as proof-grade only when the manifest is signed, the pack is verified in strict verification mode, and the final verdict is PASS.
Operationally, proof packs are a maintainer smoke test that also emits reusable evidence data. The same run should let maintainers catch regressions, let third parties verify reported outcomes, and provide structured outputs for downstream analysis.
Terminology: the proof-pack suite includes a run-scoped Preset Derivation phase (`CALIBRATION_RUN -> GENERATE_PRESET`) that writes `calibrated_preset_<model>.yaml/json` for that suite run. It does not directly modify the global `runtime/tiers.yaml`. For global tier policy tuning, use `invarlock calibrate ...` (see Tier Policy Tuning CLI).
Entrypoint Guide
| Script | Purpose | Output | Use When |
|---|---|---|---|
| `run_pack.sh` | Full proof pack: runs suite + packages artifacts | Proof pack directory with manifest + checksums | Default: distributable validation evidence |
| `run_suite.sh` | Suite execution only | Reports + certs under the run directory | Development/debugging, iterative runs |
| `verify_pack.sh` | Validate an existing proof pack | Verification status | Validating received proof packs |
Quick Start
```shell
# RECOMMENDED: Full proof pack with verification artifacts
PACK_TUNED_EDIT_PARAMS_FILE=./scripts/proof_packs/tuned_edit_params.json \
  ./scripts/proof_packs/run_pack.sh --suite subset --net 1

# Development/debugging only (runs the suite, but does not build a proof pack)
./scripts/proof_packs/run_suite.sh --suite subset --resume

# Verify an existing proof pack
./scripts/proof_packs/verify_pack.sh --pack ./proof_pack_runs/subset_20250101_000000/proof_pack
```
Note: clean edits require tuned preset parameters. Either set
PACK_TUNED_EDIT_PARAMS_FILE or place the file at
scripts/proof_packs/tuned_edit_params.json.
How It Works
This page focuses on running proof packs. For the internal task graph, scheduler flow, and artifacts, see Proof Pack Internals.
Suites
Model suites live in scripts/proof_packs/suites.sh. You can also override individual
models via MODEL_1–MODEL_8.
| Suite | Models | Notes |
|---|---|---|
| `subset` | mistralai/Mistral-7B-v0.1 | Single-GPU friendly |
| `showcase` | 7B–14B ungated models | Multi-GPU recommended; adds guard-focused scenarios |
| `workshop3` | 7B–32B ungated models | Workshop-friendly 3-model suite (architecture diversity) |
| `full` | 7B–72B ungated models | Multi-GPU recommended |
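One way to picture the `MODEL_1`–`MODEL_8` overrides is as environment variables that take precedence over a suite default. The following is a minimal POSIX shell sketch of that idea, not the code in `suites.sh`; the helper name `resolve_model` and the way defaults are passed in are assumptions made for illustration.

```shell
# Hypothetical helper (not part of suites.sh): print the model for a slot.
# Uses MODEL_<slot> when it is set and non-empty, otherwise the given default.
resolve_model() {
  slot="$1"
  default="$2"
  case "$slot" in
    [1-8]) ;;  # only MODEL_1 through MODEL_8 are documented
    *) echo "slot must be between 1 and 8, got: $slot" >&2; return 1 ;;
  esac
  # Indirect lookup of MODEL_<slot>; safe because slot was validated above.
  eval "override=\${MODEL_${slot}:-}"
  printf '%s\n' "${override:-$default}"
}
```

With `MODEL_1` unset, `resolve_model 1 mistralai/Mistral-7B-v0.1` prints the default; with `MODEL_1` exported, it prints the override instead.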
Scenario selection is driven by scripts/proof_packs/scenarios.json. Scenarios can
optionally declare suites: ["subset", "showcase", "full", ...]; during execution the
suite writes the effective (filtered) manifest to OUTPUT_DIR/state/scenarios.json,
and both task generation and final verdict compilation use that state manifest.
Network & Model Revisions
Proof packs require pinned model revisions for reproducibility:
- Use `--net 1` on the first run to preflight and pin revisions in `OUTPUT_DIR/state/model_revisions.json`.
- Offline runs use `--net 0` (default) and error if the cache is missing.
- The `PACK_NET` environment variable is exported as `1` or `0` to gate `HF_*_OFFLINE` settings.
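The gating described above can be sketched as a small function that maps the `--net` value onto offline switches. This is an illustration under assumptions, not the suite's code: the helper name `configure_network` is hypothetical, and the specific variables chosen to stand in for `HF_*_OFFLINE` (`HF_HUB_OFFLINE`, `TRANSFORMERS_OFFLINE`, `HF_DATASETS_OFFLINE`) are the ones recognized by the Hugging Face libraries, which may or may not be the exact set the scripts export.

```shell
# Hypothetical helper: translate a --net value into exported environment settings.
configure_network() {
  net="${1:-0}"   # offline (0) is the documented default
  case "$net" in
    1) offline=0 ;;  # network allowed: do not force cache-only mode
    0) offline=1 ;;  # no network: libraries must resolve models from the cache
    *) echo "net must be 0 or 1, got: $net" >&2; return 1 ;;
  esac
  PACK_NET="$net"
  HF_HUB_OFFLINE="$offline"
  TRANSFORMERS_OFFLINE="$offline"
  HF_DATASETS_OFFLINE="$offline"
  export PACK_NET HF_HUB_OFFLINE TRANSFORMERS_OFFLINE HF_DATASETS_OFFLINE
}
```

Calling `configure_network 1` before the first run would leave downloads enabled so revisions can be pinned; later runs with `configure_network 0` would then fail fast if the cache is incomplete.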
Output Layout
A suite run writes artifacts under OUTPUT_DIR (default: ./proof_pack_runs/<suite>_<timestamp>):
- `reports/final_verdict.txt` + `reports/final_verdict.json`
- `reports/category_summary.json`
- `reports/guard_signal_summary.json`
- `reports/guard_intervention_summary.json` (non-failing remediation signals, e.g. spectral caps + VE probe)
- `reports/scenario_signal_summary.json`
- `analysis/determinism_repeats.json` (when `--repeats` is used)
- `*/reports/**/evaluation.report.json`
run_pack.sh copies curated artifacts into a pack directory (default
OUTPUT_DIR/proof_pack) and organizes them as:
- `results/final_verdict.txt` + `results/final_verdict.json`
- `results/**/category_summary.json`, `results/**/guard_signal_summary.json`, `results/**/guard_intervention_summary.json`, `results/**/scenario_signal_summary.json`
- `results/**/determinism_repeats.json` (if present)
- `certs/<model>/<edit>/<run>/evaluation.report.json`
- `certs/**/rmt_probe.json` (optional sidecar; emitted by some scenarios, e.g. `rmt_norm_noise`)
- `certs/**/ve_probe.json` (optional sidecar; emitted by VE demo scenarios, e.g. `ve_mlp_scale_skew`)
- `certs/**/evaluation.html` + `certs/**/verify.json`
- `README.md`, `manifest.json`, `checksums.sha256`
- `manifest.json.asc` if GPG signing is available
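A quick sanity check of a received pack could confirm that the fixed-name files from the layout above are present before running full verification. The sketch below is illustrative only: `check_pack_layout` is a hypothetical helper, and the list of required files is an assumption drawn from the layout described here rather than from `verify_pack.sh`.

```shell
# Hypothetical helper: report expected fixed-name pack files that are missing.
# Prints one "missing: <path>" line per absent file and returns 1 if any are absent.
check_pack_layout() {
  pack_dir="$1"
  missing=0
  for rel in README.md manifest.json checksums.sha256 \
             results/final_verdict.txt results/final_verdict.json; do
    if [ ! -f "$pack_dir/$rel" ]; then
      echo "missing: $rel"
      missing=1
    fi
  done
  return "$missing"
}
```

This only checks for presence; it says nothing about integrity, which is the job of the checksum and signature verification.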
Edit Provenance Labels
Reports record the edit algorithm used:
| Label | When to Use |
|---|---|
| `noop` | Baseline model with no edit applied |
| `quant_rtn`, `magnitude_prune`, etc. | Using InvarLock's built-in edit functions |
| `custom` | BYOE (Bring-Your-Own-Edit) pre-edited models |
For BYOE workflows, use --edit-label custom or let InvarLock infer from the model path.
Determinism
Use --determinism strict to disable TF32 and cuDNN benchmarks and align with
strict InvarLock presets. --repeats N reruns a single edit N times and records
a drift summary in results/determinism_repeats.json.
Signing & Verification (Evidence vs Proof-Grade)
manifest.json includes checksums_sha256_digest (sha256 of checksums.sha256) so a
signed manifest cryptographically binds the checksums file (and thus all hashed artifacts).
Signed packs also record signing_key_fingerprint for audit trails.
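The binding chain can be illustrated with two checks: the checksums file must hash to the digest recorded in the manifest, and every listed artifact must match its entry. This is a minimal sketch, not the logic of `verify_pack.sh`. It assumes GNU coreutils `sha256sum`, assumes paths in `checksums.sha256` are relative to the pack directory, and takes the recorded digest as an argument instead of parsing `manifest.json`; the function names are hypothetical.

```shell
# Print the hex sha256 of a file (assumes sha256sum is available).
file_sha256() {
  sha256sum "$1" | cut -d ' ' -f 1
}

# Hypothetical helper: verify the two-step chain for a pack directory.
#   $1 = pack directory, $2 = digest recorded as checksums_sha256_digest
verify_checksum_chain() {
  pack_dir="$1"
  recorded_digest="$2"
  actual="$(file_sha256 "$pack_dir/checksums.sha256")" || return 1
  # Step 1: the checksums file itself must match the recorded digest.
  if [ "$actual" != "$recorded_digest" ]; then
    echo "checksums.sha256 does not match the recorded digest" >&2
    return 1
  fi
  # Step 2: every artifact listed in checksums.sha256 must match its hash.
  ( cd "$pack_dir" && sha256sum -c checksums.sha256 >/dev/null )
}
```

Because the signature covers the manifest and the manifest carries the digest, tampering with either an artifact or the checksums file breaks one of the two steps.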
The manifest contract is published at contracts/proof_pack_manifest.schema.json.
verify_pack.sh validates this schema before checksum and signature verification so
malformed proof packs fail deterministically.
Use verify_pack.sh:
- Default: `scripts/proof_packs/verify_pack.sh --pack <dir>`
  - Verifies `checksums_sha256_digest`, validates `checksums.sha256`, and runs `invarlock verify`.
  - Warns (but does not fail) if the pack is unsigned; this is evidence-grade verification.
- Strict (recommended for distributable evidence): `scripts/proof_packs/verify_pack.sh --pack <dir> --strict`
  - Fails if `manifest.json.asc` is missing, `gpg` verification fails, or extra files exist outside `checksums.sha256`.
  - Alternative: set `PACK_STRICT_MODE=1` (e.g., `PACK_STRICT_MODE=1 scripts/proof_packs/verify_pack.sh --pack <dir>`).
For proof-grade attestation, require all three: signed manifest, strict verification, and PASS final verdict.
To skip signing during pack creation, set PACK_GPG_SIGN=0. To require signing, set PACK_STRICT_MODE=1.
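The strict-mode rule about extra files can be sketched as a comparison between the files on disk and the paths listed in `checksums.sha256`. The following is an illustration under assumptions, not the check implemented by `verify_pack.sh`: `find_unlisted_files` is a hypothetical helper, and exempting `checksums.sha256`, `manifest.json`, and `manifest.json.asc` from the listing is a guess based on the fact that they carry or sign the digest rather than being covered by it.

```shell
# Hypothetical helper: print pack files that are not listed in checksums.sha256.
# Empty output means no extra files were found.
find_unlisted_files() {
  pack_dir="$1"
  (
    cd "$pack_dir" || exit 1
    find . -type f | sed 's|^\./||' | sort | while IFS= read -r rel; do
      case "$rel" in
        # Assumed exemptions: the files that carry or sign the digest.
        checksums.sha256|manifest.json|manifest.json.asc) continue ;;
      esac
      # sha256sum lines look like "<hash>  <path>" or "<hash> *<path>";
      # strip the hash prefix and look for an exact path match.
      if ! sed -e 's/^[0-9a-f]* [ *]//' checksums.sha256 | grep -Fxq -- "$rel"; then
        echo "$rel"
      fi
    done
  )
}
```

A strict verifier built on this idea would fail whenever the function prints anything, since an unlisted file is content that the signed manifest does not cover.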