System Architecture
Overview
| Aspect | Details |
|---|---|
| Purpose | Edit-agnostic safety evaluation framework for ML model weight modifications. |
| Audience | Developers extending InvarLock, operators debugging pipelines, security reviewers. |
| Core components | CLI shells, Core/runtime policy layer, Guard chain, Reporting/artifact subsystem. |
| Design goals | Torch-independent core, edit-agnostic guards, deterministic evaluation, explicit artifact contracts, full provenance. |
| Source of truth | src/invarlock/core/*.py, src/invarlock/reporting/*.py, src/invarlock/runtime_provenance.py, src/invarlock/runtime_verify.py, src/invarlock/cli/commands/*.py, src/invarlock/cli/run_*.py, src/invarlock/guards/*.py. |
See the Glossary for definitions of terms such as the canonical guard chain, policy digest, and measurement contract.
Contents
- Quick Reference
- High-Level Architecture
- Component Layers
- Pipeline Flow
- Guard Chain Architecture
- Report Generation Flow
- Architecture Guardrails
- Key Design Decisions
- Module Dependencies
- Extension Points
- Related Documentation
Quick Reference
High-Level Architecture
InvarLock follows a layered architecture with clear separation of concerns:
Component Layers
CLI Layer (src/invarlock/cli/)
Typer-based command shells providing user-facing entry points. The command modules should stay thin: parse arguments, call core/reporting owners, render output, and map failures to exit codes.
Shell support modules such as cli/config_execution.py, cli/run_execution.py,
cli/run_config.py, cli/run_pairing.py, cli/run_overhead.py, and
cli/run_artifacts.py belong to this boundary layer as well. They can perform
CLI-facing adaptation and console/event rendering, but they must not become
policy owners.
| Command | Purpose | Primary Output |
|---|---|---|
| evaluate | Compare baseline vs subject with pinned windows | Report JSON + MD |
| verify | Validate report against schema and pairing | Exit code + messages |
| report | Render/compare reports and report outputs | MD/HTML/JSON artifacts |
| doctor | Environment diagnostics | Health check output |
| advanced | Maintenance workflows such as evidence packs, policy packs, plugins, and calibration | Exit code + workflow-specific artifacts |
| version | Emit package and schema version information | Version string |
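The thin-shell rule can be illustrated with a small stdlib-only sketch. It is not InvarLock's code: the real shells are Typer-based, and `VerifyResult`, `verify_report`, `verify_command`, the checked section names, and the exit-code values are all hypothetical names chosen for illustration.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field


@dataclass
class VerifyResult:
    """Hypothetical structured result returned by a core/reporting owner."""
    ok: bool
    messages: list[str] = field(default_factory=list)


def verify_report(report: dict) -> VerifyResult:
    """Stand-in for an owner function: pure policy, no console I/O."""
    missing = [key for key in ("meta", "guards") if key not in report]
    if missing:
        return VerifyResult(False, [f"missing section: {key}" for key in missing])
    return VerifyResult(True, ["report structure ok"])


def verify_command(raw_json: str, echo=print) -> int:
    """Thin shell: parse input, call the owner, render, map to an exit code."""
    try:
        report = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        echo(f"error: invalid JSON ({exc.msg})")
        return 2  # assumed exit code for unusable input
    if not isinstance(report, dict):
        echo("error: report must be a JSON object")
        return 2
    result = verify_report(report)
    for message in result.messages:
        echo(message)
    return 0 if result.ok else 1  # assumed: 0 = pass, 1 = verification failure
```

Because the shell only adapts input and output, the same `verify_report` owner could be called from a non-CLI flow without touching console rendering.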
Core Policy / Contracts (src/invarlock/core/, src/invarlock/reporting/)
Deterministic policy, artifact-contract, and report-verification owners shared by the CLI and non-CLI entrypoints.
| Module | Responsibility |
|---|---|
| evaluate_contract.py | Baseline-report validation and emitted run-artifact contract enforcement for evaluate |
| evaluate_plan.py | Evaluation result policy, degradation classification, and emitted outcome shaping |
| report_inputs.py | Canonical report path resolution and JSON-object validation |
| doctor_findings.py | Structured doctor findings and optional report cross-check analysis |
| verify_contract.py | Structured report-verification service used by verify and evidence-pack flows |
| runtime_manifest_verify.py + runtime_provenance.py | Authoritative runtime-manifest verification and runtime-provenance ownership for report verification |
| run_policy.py | Shared run policy helpers such as split choice, PM thresholds, and overhead policy |
| run_retry_policy.py | Retry-attempt summaries and retry state transitions |
| run_snapshot_contract.py + run_snapshot_policy.py | Snapshot planning, restore behavior, and retry transitions |
| run_guard_overhead_policy.py | Guard-overhead normalization, summary building, and report shaping |
| run_provenance_contract.py + run_report_contract.py | Run provenance and run-report assembly contracts |
| run_report_payload_policy.py | Deterministic payload shaping for context, metrics, guards, and flags |
Runtime Provenance Verification Ownership
Runtime provenance uses a single verifier implementation:
- core/runtime_manifest_verify.py is the authoritative verifier for runtime.manifest.json plus report-digest binding checks.
- runtime_verify.py and cli/runtime_verify.py are the programmatic and CLI entrypoints for that verifier.
- runtime_provenance.py calls the same verifier when invarlock verify enforces runtime provenance on container-backed reports.
- Product behavior does not depend on finding an external verifier binary on PATH; verifier semantics are package-native and deterministic across installs.
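A report-digest binding check of the kind described above might look roughly like the following. This is a sketch under assumptions, not the verifier in core/runtime_manifest_verify.py: the manifest field name `report_sha256`, the canonical JSON encoding, and both function names are hypothetical.

```python
import hashlib
import json


def canonical_digest(payload: dict) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, compact separators)."""
    encoded = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def verify_manifest_binding(manifest: dict, report: dict) -> list:
    """Return a list of problems; an empty list means the binding holds."""
    expected = manifest.get("report_sha256")  # hypothetical field name
    if not isinstance(expected, str):
        return ["manifest has no report digest"]
    if expected != canonical_digest(report):
        return ["report digest does not match manifest"]
    return []
```

Everything here comes from the standard library, which is one way to keep verifier behavior independent of whatever tools happen to be installed on the host.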
Core Runtime (src/invarlock/core/)
Pipeline orchestration without direct torch imports (torch-independent coordination).
| Module | Responsibility |
|---|---|
| runner.py + runner_*.py | Pipeline phases: prepare → guards → edit → eval → finalize |
| api.py | Protocol definitions for ModelAdapter, ModelEdit, Guard |
| bootstrap.py | BCa bootstrap CI computation for paired metrics |
| checkpoint.py | Snapshot/restore primitives for retry loops |
| registry.py | Plugin discovery and registration |
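To make the torch-independent coordination concrete, here is a minimal sketch of a runner that depends only on structural protocols. The protocol names match the table, but every method other than `load` (`apply`, `prepare`, `validate`), the `run_pipeline` function, and the mapping of work to each phase are assumptions made for illustration.

```python
from __future__ import annotations

import math
from typing import Any, Protocol


class ModelAdapter(Protocol):
    def load(self, model_id: str, device: str) -> Any: ...


class ModelEdit(Protocol):
    def apply(self, model: Any) -> Any: ...  # hypothetical method name


class Guard(Protocol):
    name: str

    def prepare(self, model: Any) -> None: ...  # hypothetical method name
    def validate(self, model: Any) -> bool: ...  # hypothetical method name


def run_pipeline(adapter, edit, guards, evaluate, model_id, device="cpu") -> dict:
    """Coordinate the phases without importing any tensor library."""
    phases = []
    model = adapter.load(model_id, device)
    phases.append("prepare")
    for guard in guards:  # guards capture baseline state before the edit
        guard.prepare(model)
    phases.append("guards")
    model = edit.apply(model)
    phases.append("edit")
    metrics = evaluate(model)
    phases.append("eval")
    verdicts = {guard.name: guard.validate(model) for guard in guards}
    phases.append("finalize")
    return {"phases": phases, "metrics": metrics, "guards": verdicts}


# Toy stand-ins: a "model" is just a list of floats here.
class ListAdapter:
    def load(self, model_id, device):
        return [1.0, 2.0, 3.0]


class HalveEdit:
    def apply(self, model):
        return [w * 0.5 for w in model]


class FiniteGuard:
    name = "finite"

    def prepare(self, model):
        self.expected_size = len(model)

    def validate(self, model):
        same_size = len(model) == self.expected_size
        return same_size and all(math.isfinite(w) for w in model)


result = run_pipeline(
    ListAdapter(), HalveEdit(), [FiniteGuard()], lambda m: {"sum": sum(m)}, "toy"
)
```

Since the runner only calls protocol methods, anything framework-specific stays behind the adapter, edit, and guard objects.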
Guard Layer (src/invarlock/guards/)
Four-guard pipeline for edit safety validation.
| Guard | Focus | Key Metric |
|---|---|---|
| invariants | Structural integrity, NaN/Inf checks | validation.invariants_pass |
| spectral | Weight matrix spectral norm stability | κ-threshold violations |
| rmt | Activation edge-risk via Random Matrix Theory | ε-band compliance |
| variance | Variance equalization with A/B gate | Predictive gain |
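As a rough illustration of the first two guards, the sketch below checks for non-finite weights and compares spectral norms before and after an edit. It is pure Python on nested lists and makes several assumptions: the power-iteration estimator, the `max_ratio` cap of 1.5, and the function names are hypothetical and are not InvarLock's κ-threshold logic. The rmt and variance guards are omitted.

```python
import math


def all_finite(matrix) -> bool:
    """Invariants-style check: no NaN or Inf anywhere in the weights."""
    return all(math.isfinite(value) for row in matrix for value in row)


def spectral_norm(matrix, iterations=100) -> float:
    """Estimate the largest singular value by power iteration on A^T A."""
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0] * cols  # assumes this start is not orthogonal to the top vector
    for _ in range(iterations):
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
    u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return math.sqrt(sum(x * x for x in u))


def run_guard_chain(baseline, edited, max_ratio=1.5) -> dict:
    """Run the two sketched guards in order and collect pass/fail results."""
    results = {"invariants": all_finite(edited)}
    if not results["invariants"]:
        return results  # a norm over non-finite weights would be meaningless
    results["spectral"] = spectral_norm(edited) <= max_ratio * spectral_norm(baseline)
    return results
```

Neither check inspects how the weights were changed, only the resulting state, which is what lets the same guards run against quantization, pruning, or a merge.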
Reporting Layer (src/invarlock/reporting/)
Report generation, validation, persistence, and rendering.
| Module | Responsibility |
|---|---|
| report_schema.py | Evaluation report schema and structural validation |
| report_validation.py | Canonical validation-flag computation |
| report_make.py | Public evaluation-report entrypoint that coordinates the split report-making owners |
| report_make_inputs.py | Input normalization, baseline reference building, and build-section extraction |
| report_make_assembly.py | Policy/provenance/guard assembly and report build-context composition |
| report_make_output.py | Final evaluation-report shaping and output payload construction |
| report_bundle.py | Evaluation-bundle persistence, manifest writing, and evidence attachment |
| report_contract.py | Input loading and report-generation planning |
| report_console.py | Console/report validation summary helpers used by CLI/reporting surfaces |
| report_summary.py | Shared executive-summary/view-model derivation for reporting surfaces |
| render.py | Markdown rendering for evaluation reports |
| html.py | HTML export with styling |
| report_files.py | Raw run-report JSON/Markdown/HTML persistence |
| evidence.py | Evidence file normalization and attachment helpers |
| telemetry.py | Performance metrics collection |
Pipeline Flow
Guard Chain Architecture
Report Generation Flow
Architecture Guardrails
The shell/core split is enforced by design and by targeted architecture guard tests. The intended invariants are:
- No lazy exports in package roots such as adapters/__init__.py or guards/__init__.py. Package roots should expose only explicit canonical exports.
- No rmt_legacy references in production source. RMT ownership lives in rmt.py, rmt_analysis.py, rmt_detection.py, and rmt_math.py.
- No dependency-map orchestration in command shells. Public command owners must stay thin and must not rebuild giant deps dictionaries or inject callables to recreate removed indirection.
- No compatibility-only command signatures once a canonical owner contract exists. Example: lens-metric calculation takes a required MetricsConfig instead of deprecated per-call overrides.
- No CLI imports inside owner layers. Modules under src/invarlock/core/ and src/invarlock/reporting/ must stay callable without importing invarlock.cli.
These guardrails keep the CLI as an imperative shell while policy, contracts, and verdict computation remain reusable from non-CLI flows such as evidence-pack verification and programmatic execution.
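One way such an architecture guard test could be written is a static scan of import statements. The sketch below is an assumption about the approach, not the project's actual test: `forbidden_imports` and `scan_package` are hypothetical helpers, and relative imports are deliberately ignored.

```python
import ast
from pathlib import Path


def forbidden_imports(source: str, forbidden_prefix: str) -> list:
    """List absolute imports in `source` that resolve under `forbidden_prefix`."""
    found = []
    for node in ast.walk(ast.parse(source)):  # also visits imports inside functions
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # Cover both "from pkg.cli import x" and "from pkg import cli".
            names = [node.module]
            names += [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue
        for name in names:
            if name == forbidden_prefix or name.startswith(forbidden_prefix + "."):
                found.append(name)
    return found


def scan_package(root: Path, forbidden_prefix: str) -> dict:
    """Map each offending file under `root` to the forbidden imports it contains."""
    violations = {}
    for path in sorted(root.rglob("*.py")):
        hits = forbidden_imports(path.read_text(encoding="utf-8"), forbidden_prefix)
        if hits:
            violations[str(path)] = hits
    return violations
```

A test could then assert that `scan_package(Path("src/invarlock/core"), "invarlock.cli")` returns an empty dict, and likewise for the reporting package.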
Key Design Decisions
| Decision | Rationale | Implementation |
|---|---|---|
| Torch-independent core | runner.py coordinates without importing torch; adapters encapsulate torch-specific logic. | Adapter protocol in core/api.py |
| Edit-agnostic guards | Guards work with any weight modification (quantization, pruning, LoRA merge). | Guard protocol validates model state, not edit type |
| Tier-based policies | Calibrated thresholds in tiers.yaml for balanced/conservative/aggressive safety profiles. | Policy resolution in guards/policies.py |
| Deterministic evaluation | Seed bundle + window pairing schedules ensure reproducible metrics. | meta.seeds, dataset.windows.stats tracking |
| Functional-core / imperative-shell split | Keep policy, artifact contracts, and verdict computation reusable outside the CLI while CLI modules stay thin. | core/*.py + reporting/*.py owners called from cli/commands/*.py |
| Single verifier ownership | Runtime-manifest verification should not vary with host tooling, so it must use one product implementation. | core/runtime_manifest_verify.py, runtime_verify.py, runtime_provenance.py |
| Plugin architecture | Entry points for guards, adapters, edits enable extension without core changes. | importlib.metadata discovery in core/registry.py |
| Log-space primary metrics | Paired ΔlogNLL with BCa bootstrap avoids ratio math bias. | core/bootstrap.py implementation |
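The log-space decision can be sketched as follows: compute per-window differences in log NLL, bootstrap the mean difference with a BCa interval, and exponentiate only at the end to get a ratio. This is a generic textbook BCa sketch under stated assumptions, not the code in core/bootstrap.py; the function name, default resample count, and seed handling are hypothetical.

```python
import math
import random
from statistics import NormalDist


def bca_interval(deltas, n_boot=2000, alpha=0.05, seed=0):
    """BCa bootstrap confidence interval for the mean of paired differences."""
    rng = random.Random(seed)  # fixed seed keeps the interval reproducible
    n = len(deltas)
    theta_hat = sum(deltas) / n
    boots = sorted(sum(rng.choices(deltas, k=n)) / n for _ in range(n_boot))

    norm = NormalDist()
    # Bias correction: where the point estimate sits in the bootstrap distribution.
    below = sum(1 for b in boots if b < theta_hat)
    frac = min(max(below / n_boot, 1 / (n_boot + 1)), n_boot / (n_boot + 1))
    z0 = norm.inv_cdf(frac)

    # Acceleration from leave-one-out (jackknife) means.
    total = sum(deltas)
    jack = [(total - d) / (n - 1) for d in deltas]
    jack_mean = sum(jack) / n
    num = sum((jack_mean - j) ** 3 for j in jack)
    den = 6.0 * sum((jack_mean - j) ** 2 for j in jack) ** 1.5
    accel = num / den if den > 0 else 0.0

    def adjusted_quantile(p):
        z = norm.inv_cdf(p)
        adj = norm.cdf(z0 + (z0 + z) / (1 - accel * (z0 + z)))
        index = min(max(int(adj * n_boot), 0), n_boot - 1)
        return boots[index]

    return adjusted_quantile(alpha / 2), adjusted_quantile(1 - alpha / 2)


# Illustrative paired differences: log NLL(subject) - log NLL(baseline) per window.
deltas = [0.02, 0.05, -0.01, 0.03, 0.04, 0.00, 0.06, 0.01, 0.02, 0.03]
low, high = bca_interval(deltas)
ratio_interval = (math.exp(low), math.exp(high))  # back to ratio space at the end
```

Working on differences of logs keeps the statistic a plain mean of paired values; the ratio is only derived afterwards by exponentiating the interval endpoints.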
Module Dependencies
Extension Points
InvarLock supports extension via entry points without modifying core code.
| Extension Type | Entry Point Group | Example |
|---|---|---|
| Adapters | invarlock.adapters | hf_causal, hf_mlm |
| Guards | invarlock.guards | invariants, spectral, rmt, variance |
| Edits | invarlock.edits | quant_rtn, noop |
Custom Adapter Example
# my_adapter.py
from torch import nn  # needed for the nn.Module annotations below

from invarlock.core.api import ModelAdapter


class MyAdapter(ModelAdapter):
    name = "my_custom_adapter"

    def load(self, model_id: str, device: str) -> nn.Module:
        # Custom loading logic
        ...

    def describe(self, model: nn.Module) -> dict:
        # Return model metadata
        ...

# pyproject.toml
[project.entry-points."invarlock.adapters"]
my_custom_adapter = "my_adapter:MyAdapter"
Troubleshooting
- Import errors in torch-free context: ensure invarlock.core imports stay torch-independent; use adapters for torch operations.
- Guard preparation failures: check tier policy compatibility; use context.run.strict_guard_prepare: false for debugging.
- Report generation errors: verify baseline and subject reports exist and have compatible window structures.
Observability
- Pipeline phases emit timing via print_timing_summary() in CLI.
- Guard results recorded in report.guards[] and report validation.* flags.
- Telemetry fields include memory_mb_peak, latency_ms_*, duration_s.
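For a feel of how fields like duration_s and memory_mb_peak could be captured per phase, here is a stdlib-only sketch. It is an assumption, not the implementation in telemetry.py: `phase_telemetry` is a hypothetical helper, and tracemalloc only tracks Python heap allocations, not accelerator memory.

```python
import time
import tracemalloc
from contextlib import contextmanager


@contextmanager
def phase_telemetry(name: str, sink: dict):
    """Record wall-clock duration and peak traced memory for one phase."""
    started_tracing = not tracemalloc.is_tracing()
    if started_tracing:
        tracemalloc.start()
    started = time.perf_counter()
    try:
        yield
    finally:
        duration = time.perf_counter() - started
        _, peak_bytes = tracemalloc.get_traced_memory()
        if started_tracing:  # leave tracing on if an outer caller enabled it
            tracemalloc.stop()
        sink[name] = {
            "duration_s": duration,
            "memory_mb_peak": peak_bytes / (1024 * 1024),
        }
```

Wrapping a phase as `with phase_telemetry("eval", sink): ...` leaves one entry per phase in `sink`, which a caller could then merge into a report.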
Related Documentation
- CLI Reference — Command usage and options
- Guards Reference — Guard configuration and evidence
- Configuration Schema — YAML config structure
- reports — report schema and verification
- Assurance Case Overview — Assurance claims and evidence