
Release

Token-weighted paired statistics and stricter release gates


Token-weighted paired bootstrap lands across the pipeline, strictness toggles expand, and CI/release pairing expectations become explicit and enforceable.

InvarLock Team

Release: InvarLock 0.3.3 - Paired bootstrap, strictness toggles, and clearer failures

Highlights

  • Token-weighted paired Δlog-loss bootstrap support (core + primary metric + variance guard).
  • Window pairing enforcement becomes more explicit (overlap/duplicates/mismatch detection).
  • Strictness toggles and report metadata improvements for clearer evaluation outcomes.

0.3.3 tightens the statistical backbone of paired evaluation. The paired Δlog-loss bootstrap is more than a change to the numbers: it makes drift conclusions more faithful to what was actually evaluated, because per-window deltas are token-weighted and paired rather than loosely aggregated.
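To make the idea concrete, here is a minimal sketch of a token-weighted paired bootstrap over per-window Δlog-loss. It is not InvarLock's implementation: the function name `paired_delta_bootstrap`, its parameters, and the percentile-interval choice are all assumptions made for illustration. The key property is that each resample draws window indices once and reuses them for both arms, so the pairing is never broken, and each window counts in proportion to its token count.

```python
import random


def paired_delta_bootstrap(base_loss, edit_loss, token_counts,
                           n_resamples=2000, alpha=0.05, seed=0):
    """Hypothetical helper: token-weighted mean of paired log-loss deltas
    with a percentile bootstrap interval. Inputs are per-window values."""
    n = len(base_loss)
    if n == 0 or len(edit_loss) != n or len(token_counts) != n:
        raise ValueError("inputs must be non-empty and the same length")
    if any(t <= 0 for t in token_counts):
        raise ValueError("token counts must be positive")

    # Pairing: one delta per window, edited minus baseline.
    deltas = [e - b for b, e in zip(base_loss, edit_loss)]

    def weighted_mean(indices):
        # Each window contributes in proportion to its token count.
        total = sum(token_counts[i] for i in indices)
        return sum(token_counts[i] * deltas[i] for i in indices) / total

    point = weighted_mean(range(n))

    # Resample whole windows with replacement; the delta and its weight
    # travel together, so baseline and edited values stay paired.
    rng = random.Random(seed)
    boot = sorted(
        weighted_mean([rng.randrange(n) for _ in range(n)])
        for _ in range(n_resamples)
    )
    lo = boot[int((alpha / 2) * (n_resamples - 1))]
    hi = boot[int((1 - alpha / 2) * (n_resamples - 1))]
    return point, lo, hi
```

With two windows where only the longer one regresses, for example `paired_delta_bootstrap([1.0, 1.0], [2.0, 1.0], [3, 1])`, the point estimate is 0.75 rather than the unweighted 0.5, which is the difference token weighting makes.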

It also makes CI/release expectations explicit: perfect pairing, non-overlapping windows, and coverage floors are no longer best effort; they are enforced. That is a theme in this release: fewer fuzzy edges, more results you can confidently point to.
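As a rough illustration of what enforcing these expectations can look like, the sketch below checks a baseline run against an edited run. Everything in it is hypothetical: the `Window` type, the `check_pairing` function, and the reading of a coverage floor as a minimum number of paired windows are assumptions, not InvarLock's API. It also assumes all windows are token-offset spans over a single stream.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """Hypothetical evaluation window: a token span [start, end)."""
    window_id: str
    start: int
    end: int


def check_pairing(baseline, edited, min_windows):
    """Return a list of human-readable violations (empty means pass)."""
    errors = []
    base_ids = [w.window_id for w in baseline]
    edit_ids = [w.window_id for w in edited]

    # Duplicates: a window id may appear at most once per run.
    if len(set(base_ids)) != len(base_ids) or len(set(edit_ids)) != len(edit_ids):
        errors.append("duplicate window ids")

    # Perfect pairing: same windows, same order, in both runs.
    if base_ids != edit_ids:
        errors.append("pairing mismatch between baseline and edited runs")

    # Overlap: after sorting by start, each window must begin at or
    # after the end of the previous one.
    ordered = sorted(baseline, key=lambda w: (w.start, w.end))
    for prev, cur in zip(ordered, ordered[1:]):
        if cur.start < prev.end:
            errors.append(f"overlap: {prev.window_id} and {cur.window_id}")

    # Coverage floor, interpreted here as a minimum window count.
    if len(base_ids) < min_windows:
        errors.append(f"coverage below floor: {len(base_ids)} < {min_windows}")

    return errors
```

A CI gate built this way would fail the run whenever the returned list is non-empty and print each violation, instead of silently evaluating on whatever windows happened to line up.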

And when things do go wrong, reports carry better context (including evaluation soft-fail metadata), so a failure becomes something you can diagnose rather than something you re-run blindly.

For the immutable release record, read the tagged CHANGELOG.md for v0.3.3.
