From Sweep Outputs to Tier Policy

Calibration becomes operational when sweep artifacts end in reviewable YAML patches that later appear as resolved runtime policy in reports.

May 18, 2026

5 min read

InvarLock Team

Historical context: This dated note records the InvarLock product model available when it was published. InvarLock v0.13.0 replaced the current operator workflow with a closed request, canonical signed evidence, and independent verifier replay; use the v0.13.0 release note and current documentation for present-day instructions.

A calibration story is strongest when it survives as policy

Highlights

Each calibration sweep emits four outputs: JSON, CSV, Markdown, and a tiers_patch_*.yaml recommendation.
Approved patch keys merge into tiers.yaml; the catalog separates calibration-shaped keys (family caps, VE min-effect) from explicit policy choices (floors, deadbands, caps).
The same key shows up later in the run's resolved_policy.*, giving the sweep → patch → policy → resolved-policy path a traceable identity.

Many calibration workflows stop at interpretation. They produce a chart, a summary paragraph, or a recommendation in prose, then leave the product surface ambiguous.

InvarLock's public calibration surface is stronger when read the other way around.

The interesting output is the patch. The summary helps a human understand the sweep, but the patch is what connects the result to the runtime tier policy that later governs real evaluations.

Why The Patch Is The Real Output

The calibration CLI reference makes this explicit. Both null sweeps and VE sweeps emit JSON, CSV, Markdown, and a tiers_patch_*.yaml recommendation file.

That output hierarchy matters. JSON and CSV preserve machine-readable evidence. Markdown gives a human summary. But the YAML patch is the artifact that can actually change runtime behavior after review.

If calibration stops before that point, it remains analysis. Once it reaches a patch, it becomes a reviewable policy proposal.

How Tier Policy Absorbs Calibrated Keys

The tier-policy catalog is useful here because it draws an important distinction: some values are calibrated from pilot or null runs, while others remain explicit policy choices such as floors, caps, and deadbands.

That distinction is easiest to see with examples. Spectral family caps and VE minimum-effect settings are calibration-shaped. Deadbands, floors, and some caps remain policy choices around those calibrated values.

That prevents a common confusion. A calibration sweep does not "discover the whole tier." It recommends a narrow set of calibrated values for a policy surface that still includes deliberate design choices.

The patch needs review because it is proposing edits to a policy document, not revealing a timeless truth.

Why `resolved_policy.*` Matters Downstream

The guards reference and policy catalog both point to the same operational idea: the policy that actually governed a run should be observable later in the report surface.

That gives the patch path continuity.

The sweep produces a recommendation. The recommendation lands in tier policy after review. Later, a run exposes the resolved values it actually used. That means calibration is not trapped in an offline notebook. It becomes part of the evidence chain for future evaluation and verification.

Without that downstream visibility, the patch would still be better than prose, but much weaker as an audit surface.

That handoff is easier to review when the patch and later report expose the same policy path, for example:

balanced:
  variance_guard:
    min_effect_lognll: 0.016

{
  "resolved_policy": {
    "variance": {
      "min_effect_lognll": 0.016
    }
  }
}

Why Review Still Matters

The calibration docs are careful about this too. They show how to inspect a patch and merge it into tiers.yaml, but they do not present that step as automatic or self-justifying.

That stance is useful. A calibrated recommendation is stronger than intuition, but it is still a recommendation. Operators should be able to inspect the diff, ask whether the sweep assumptions are still local and valid, and decide whether the change deserves adoption.

Review is not friction around calibration. It is part of calibration.

What This Path Still Does Not Prove

The patch path makes a deliberately narrow claim.

A clean patch path does not prove that every calibrated value transfers across families, hardware, or window budgets. It does not prove that every policy knob is empirical. And it does not remove the need to revisit thresholds when guard contracts change.

What it does show is that the public calibration story is operational. It has a stable output, a review surface, and a downstream policy trace.

Claim Map

The practical path is:

run a policy-tuning sweep
inspect JSON, CSV, and Markdown outputs
review the emitted tiers_patch_*.yaml
merge approved keys into tier policy
verify later runs against the resolved_policy.* they actually used

That is a much stronger systems story than a calibration notebook with a conclusion slide.

Limitations

The patch path is the reviewable shape; whether a given recommendation is locally valid is a review decision, not a derivation result.
Sweep-to-report continuity is the claim; individual patch keys still need review.
Calibration and tier-policy references remain authoritative for command syntax, output names, and merge mechanics.

From Sweep Outputs to Tier Policy

Highlights

Why The Patch Is The Real Output

How Tier Policy Absorbs Calibrated Keys

Why `resolved_policy.*` Matters Downstream

Why Review Still Matters

What This Path Still Does Not Prove

Claim Map

Limitations

Sources

More in Research Note

Calibration Is the Product Surface, Not a Side Utility

Variance Enablement Should Be Evidence-Gated

What Belongs in evaluation.report.json

Highlights

Why The Patch Is The Real Output

How Tier Policy Absorbs Calibrated Keys

Why resolved_policy.* Matters Downstream

Why Review Still Matters

What This Path Still Does Not Prove

Claim Map

Limitations

Sources

More in Research Note

Calibration Is the Product Surface, Not a Side Utility

Variance Enablement Should Be Evidence-Gated

What Belongs in evaluation.report.json

Why `resolved_policy.*` Matters Downstream