System Architecture

Overview

| Aspect | Details |
| --- | --- |
| Purpose | Edit-agnostic safety evaluation framework for ML model weight modifications. |
| Audience | Developers extending InvarLock, operators debugging pipelines, security reviewers. |
| Core components | CLI shells, Core/runtime policy layer, Guard chain, Reporting/artifact subsystem. |
| Design goals | Torch-independent core, edit-agnostic guards, deterministic evaluation, explicit artifact contracts, full provenance. |
| Source of truth | src/invarlock/core/*.py, src/invarlock/reporting/*.py, src/invarlock/runtime_provenance.py, src/invarlock/runtime_verify.py, src/invarlock/cli/commands/*.py, src/invarlock/cli/run_*.py, src/invarlock/guards/*.py. |

See the Glossary for definitions of terms such as the canonical guard chain, policy digest, and measurement contract.

Contents

  1. Quick Reference
  2. High-Level Architecture
  3. Component Layers
  4. Pipeline Flow
  5. Guard Chain Architecture
  6. Report Generation Flow
  7. Architecture Guardrails
  8. Key Design Decisions
  9. Module Dependencies
  10. Extension Points
  11. Related Documentation

Quick Reference

System overview showing user input, processing stages, and report outputs.

High-Level Architecture

InvarLock follows a layered architecture with clear separation of concerns:

Layered architecture connecting CLI shells, policy contracts, runtime services, guards, and reporting files.

Component Layers

CLI Layer (src/invarlock/cli/)

Typer-based command shells providing user-facing entry points. The command modules should stay thin: parse arguments, call core/reporting owners, render output, and map failures to exit codes.

Shell support modules such as cli/config_execution.py, cli/run_execution.py, cli/run_config.py, cli/run_pairing.py, cli/run_overhead.py, and cli/run_artifacts.py belong to this boundary layer as well. They can perform CLI-facing adaptation and console/event rendering, but they must not become policy owners.
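
The thin-shell rule above can be sketched as follows. This is a minimal illustration rather than InvarLock's actual command code: it uses a plain function instead of Typer so it runs without dependencies, and `ReportInputError`, `load_report`, `verify_command`, and the exit-code values are hypothetical names and choices made for the example.

```python
import json
import sys
from pathlib import Path


class ReportInputError(Exception):
    """Hypothetical owner-layer error for unreadable or malformed reports."""


def load_report(path: Path) -> dict:
    """Stand-in for an owner-layer function: load a report and require a JSON object."""
    try:
        payload = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError) as exc:  # missing file, bad encoding, invalid JSON
        raise ReportInputError(f"cannot read report {path}: {exc}") from exc
    if not isinstance(payload, dict):
        raise ReportInputError(f"report {path} is not a JSON object")
    return payload


def verify_command(argv: list[str]) -> int:
    """Thin shell: parse arguments, call the owner, render output, map failures to exit codes."""
    if len(argv) != 1:
        print("usage: verify REPORT_JSON", file=sys.stderr)
        return 2  # usage error (assumed convention)
    try:
        report = load_report(Path(argv[0]))
    except ReportInputError as exc:
        print(f"error: {exc}", file=sys.stderr)
        return 1  # owner-layer failure (assumed convention)
    print(f"ok: report has {len(report)} top-level keys")
    return 0
```

The shell holds no policy of its own: every decision about what counts as a valid report lives in the owner function, which can be called from non-CLI code unchanged.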

| Command | Purpose | Primary Output |
| --- | --- | --- |
| evaluate | Compare baseline vs subject with pinned windows | report JSON + MD |
| verify | Validate report against schema and pairing | Exit code + messages |
| report | Render/compare reports and report outputs | MD/HTML/JSON artifacts |
| doctor | Environment diagnostics | Health check output |
| advanced | Maintenance workflows such as evidence packs, policy packs, plugins, and calibration | Exit code + workflow-specific artifacts |
| version | Emit package and schema version information | Version string |

Core Policy / Contracts (src/invarlock/core/, src/invarlock/reporting/)

Deterministic policy, artifact-contract, and report-verification owners shared by the CLI and non-CLI entrypoints.

| Module | Responsibility |
| --- | --- |
| evaluate_contract.py | Baseline-report validation and emitted run-artifact contract enforcement for evaluate |
| evaluate_plan.py | Evaluation result policy, degradation classification, and emitted outcome shaping |
| report_inputs.py | Canonical report path resolution and JSON-object validation |
| doctor_findings.py | Structured doctor findings and optional report cross-check analysis |
| verify_contract.py | Structured report-verification service used by verify and evidence-pack flows |
| runtime_manifest_verify.py + runtime_provenance.py | Authoritative runtime-manifest verification and runtime-provenance ownership for report verification |
| run_policy.py | Shared run policy helpers such as split choice, PM thresholds, and overhead policy |
| run_retry_policy.py | Retry-attempt summaries and retry state transitions |
| run_snapshot_contract.py + run_snapshot_policy.py | Snapshot planning, restore behavior, and retry transitions |
| run_guard_overhead_policy.py | Guard-overhead normalization, summary building, and report shaping |
| run_provenance_contract.py + run_report_contract.py | Run provenance and run-report assembly contracts |
| run_report_payload_policy.py | Deterministic payload shaping for context, metrics, guards, and flags |

Runtime Provenance Verification Ownership

Runtime provenance uses a single verifier implementation:

  • core/runtime_manifest_verify.py is the authoritative verifier for runtime.manifest.json plus report-digest binding checks.
  • runtime_verify.py and cli/runtime_verify.py are the programmatic and CLI entrypoints for that verifier.
  • runtime_provenance.py calls the same verifier when invarlock verify enforces runtime provenance on container-backed reports.
  • Product behavior does not depend on finding an external verifier binary on PATH; verifier semantics are package-native and deterministic across installs.
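
A report-digest binding check of the kind described above might look like the sketch below. The manifest field name `report_digest`, the canonical JSON encoding, and both function names are assumptions made for illustration; the actual manifest layout is defined by core/runtime_manifest_verify.py.

```python
import hashlib
import json


def canonical_digest(report: dict) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, compact separators)."""
    encoded = json.dumps(report, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def check_report_binding(manifest: dict, report: dict) -> list[str]:
    """Return a list of problems; an empty list means the binding holds."""
    problems = []
    expected = manifest.get("report_digest")  # hypothetical field name
    if not isinstance(expected, str):
        problems.append("manifest has no report_digest")
    elif expected != canonical_digest(report):
        problems.append("report digest does not match manifest")
    return problems
```

Because the check uses only the standard library, its result cannot vary with tools installed on the host, which is the property the single-verifier rule is meant to protect.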

Core Runtime (src/invarlock/core/)

Pipeline orchestration that never imports torch directly; torch-specific work is delegated to adapters.

| Module | Responsibility |
| --- | --- |
| runner.py + runner_*.py | Pipeline phases: prepare → guards → edit → eval → finalize |
| api.py | Protocol definitions for ModelAdapter, ModelEdit, Guard |
| bootstrap.py | BCa bootstrap CI computation for paired metrics |
| checkpoint.py | Snapshot/restore primitives for retry loops |
| registry.py | Plugin discovery and registration |
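
To illustrate how a runner can stay torch-independent, the sketch below defines structural protocols and coordinates the five phases while treating the model as an opaque object. The method names (`load`, `apply`), the `run` function, and the toy classes are assumptions for this example and are not the signatures in core/api.py.

```python
from typing import Any, Callable, Protocol


class ModelAdapter(Protocol):
    """Structural type: any object with a matching load() qualifies."""

    def load(self, model_id: str, device: str) -> Any: ...


class ModelEdit(Protocol):
    def apply(self, model: Any) -> Any: ...


def run(adapter: ModelAdapter, edit: ModelEdit,
        evaluate: Callable[[Any], float], model_id: str) -> dict:
    """Coordinate the phases without knowing what kind of object the model is."""
    phases = ["prepare"]
    model = adapter.load(model_id, device="cpu")
    phases.append("guards")  # guard preparation would run here
    phases.append("edit")
    model = edit.apply(model)
    phases.append("eval")
    score = evaluate(model)
    phases.append("finalize")
    return {"phases": phases, "score": score}


class ListAdapter:
    """Toy adapter: the 'model' is a list of floats, so no torch is needed."""

    def load(self, model_id: str, device: str) -> list:
        return [1.0, 2.0, 3.0]


class HalveEdit:
    """Toy edit: halve every weight."""

    def apply(self, model: list) -> list:
        return [w / 2 for w in model]


result = run(ListAdapter(), HalveEdit(), evaluate=sum, model_id="toy")
print(result["phases"])  # ['prepare', 'guards', 'edit', 'eval', 'finalize']
print(result["score"])   # 3.0
```

Only the adapter knows how a model is represented, so a torch-backed adapter can be swapped in without changing the coordinating function.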

Guard Layer (src/invarlock/guards/)

Four-guard pipeline for edit safety validation.

| Guard | Focus | Key Metric |
| --- | --- | --- |
| invariants | Structural integrity, NaN/Inf checks | validation.invariants_pass |
| spectral | Weight matrix spectral norm stability | κ-threshold violations |
| rmt | Activation edge-risk via Random Matrix Theory | ε-band compliance |
| variance | Variance equalization with A/B gate | Predictive gain |

Reporting Layer (src/invarlock/reporting/)

Report generation, validation, persistence, and rendering.

| Module | Responsibility |
| --- | --- |
| report_schema.py | Evaluation report schema and structural validation |
| report_validation.py | Canonical validation-flag computation |
| report_make.py | Public evaluation-report entrypoint that coordinates the split report-making owners |
| report_make_inputs.py | Input normalization, baseline reference building, and build-section extraction |
| report_make_assembly.py | Policy/provenance/guard assembly and report build-context composition |
| report_make_output.py | Final evaluation-report shaping and output payload construction |
| report_bundle.py | Evaluation-bundle persistence, manifest writing, and evidence attachment |
| report_contract.py | Input loading and report-generation planning |
| report_console.py | Console/report validation summary helpers used by CLI/reporting surfaces |
| report_summary.py | Shared executive-summary/view-model derivation for reporting surfaces |
| render.py | Markdown rendering for evaluation reports |
| html.py | HTML export with styling |
| report_files.py | Raw run-report JSON/Markdown/HTML persistence |
| evidence.py | Evidence file normalization and attachment helpers |
| telemetry.py | Performance metrics collection |

Pipeline Flow

Evaluation pipeline from baseline and subject runs into normalized metric comparison, policy application, and report rendering.

Guard Chain Architecture

Guard chain execution across pre-edit and post-edit checks.
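
A guard chain that runs the same checks before and after an edit can be sketched as below. The `check` method, `GuardResult`, `run_chain`, and both toy guards are hypothetical; in particular `MaxAbsGuard` is a simple magnitude bound and does not reproduce the spectral guard's metric.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, Protocol, Sequence


@dataclass
class GuardResult:
    name: str
    passed: bool
    details: dict = field(default_factory=dict)


class Guard(Protocol):
    name: str

    def check(self, weights: Sequence[float]) -> GuardResult: ...


class FiniteGuard:
    """Invariants-style check: every weight must be finite (no NaN/Inf)."""

    name = "invariants"

    def check(self, weights: Sequence[float]) -> GuardResult:
        bad = sum(1 for w in weights if not math.isfinite(w))
        return GuardResult(self.name, bad == 0, {"non_finite": bad})


class MaxAbsGuard:
    """Toy magnitude bound, used here only to show a second guard in the chain."""

    name = "max_abs"

    def __init__(self, limit: float) -> None:
        self.limit = limit

    def check(self, weights: Sequence[float]) -> GuardResult:
        peak = max((abs(w) for w in weights), default=0.0)
        return GuardResult(self.name, peak <= self.limit, {"peak": peak})


def run_chain(guards: Sequence[Guard], weights: Sequence[float],
              edit: Callable[[Sequence[float]], Sequence[float]]) -> dict:
    """Run every guard on the pre-edit state, apply the edit, then run every guard again."""
    pre = [g.check(weights) for g in guards]
    edited = edit(weights)
    post = [g.check(edited) for g in guards]
    return {"pre": pre, "post": post, "passed": all(r.passed for r in pre + post)}
```

The guards inspect only the resulting weights and never see which edit produced them, which is what makes the chain edit-agnostic.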

Report Generation Flow

Report generation flow from baseline and subject reports through evidence assembly.

Architecture Guardrails

The shell/core split is enforced by design and by targeted architecture guard tests. The intended invariants are:

  • No lazy exports in package roots such as adapters/__init__.py or guards/__init__.py. Package roots should expose only explicit canonical exports.
  • No rmt_legacy references in production source. RMT ownership lives in rmt.py, rmt_analysis.py, rmt_detection.py, and rmt_math.py.
  • No dependency-map orchestration in command shells. Public command owners must stay thin and must not rebuild giant deps dictionaries or inject callables to recreate removed indirection.
  • No compatibility-only command signatures once a canonical owner contract exists. Example: lens-metric calculation takes a required MetricsConfig instead of deprecated per-call overrides.
  • No CLI imports inside owner layers. Modules under src/invarlock/core/ and src/invarlock/reporting/ must stay callable without importing invarlock.cli.

These guardrails keep the CLI as an imperative shell while policy, contracts, and verdict computation remain reusable from non-CLI flows such as evidence-pack verification and programmatic execution.
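
The "no CLI imports inside owner layers" invariant lends itself to a static check. The sketch below shows one way such an architecture guard test could work, using the standard `ast` module; the function name `cli_imports` is hypothetical and this is not necessarily how the project's own tests are written.

```python
import ast


def cli_imports(source: str, forbidden: str = "invarlock.cli") -> list[str]:
    """Return the sorted absolute import targets in `source` that fall under `forbidden`."""
    hits = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # Cover both `from invarlock.cli import x` and `from invarlock import cli`.
            names = [node.module] + [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue  # relative imports are not resolved in this sketch
        for name in names:
            if name == forbidden or name.startswith(forbidden + "."):
                hits.add(name)
    return sorted(hits)
```

A test could apply this function to every `*.py` file under src/invarlock/core/ and src/invarlock/reporting/ and fail if any file returns a non-empty list.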

Key Design Decisions

| Decision | Rationale | Implementation |
| --- | --- | --- |
| Torch-independent core | runner.py coordinates without importing torch; adapters encapsulate torch-specific logic. | Adapter protocol in core/api.py |
| Edit-agnostic guards | Guards work with any weight modification (quantization, pruning, LoRA merge). | Guard protocol validates model state, not edit type |
| Tier-based policies | Calibrated thresholds in tiers.yaml for balanced/conservative/aggressive safety profiles. | Policy resolution in guards/policies.py |
| Deterministic evaluation | Seed bundle + window pairing schedules ensure reproducible metrics. | meta.seeds, dataset.windows.stats tracking |
| Functional-core / imperative-shell split | Keep policy, artifact contracts, and verdict computation reusable outside the CLI while CLI modules stay thin. | core/*.py + reporting/*.py owners called from cli/commands/*.py |
| Single verifier ownership | Runtime-manifest verification should not vary with host tooling, so it must use one product implementation. | core/runtime_manifest_verify.py, runtime_verify.py, runtime_provenance.py |
| Plugin architecture | Entry points for guards, adapters, edits enable extension without core changes. | importlib.metadata discovery in core/registry.py |
| Log-space primary metrics | Paired ΔlogNLL with BCa bootstrap avoids ratio math bias. | core/bootstrap.py implementation |
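
The log-space, paired design can be illustrated with a small bootstrap over per-window differences. The sketch below deliberately uses a plain percentile bootstrap, which is simpler than the BCa method named in the table, and it assumes the inputs are per-window mean negative log-likelihoods on matched windows; the function name and parameters are hypothetical.

```python
import random


def paired_delta_ci(baseline_nll, subject_nll, n_boot=2000, alpha=0.05, seed=0):
    """Mean paired difference (subject - baseline) with a percentile bootstrap interval."""
    if len(baseline_nll) != len(subject_nll) or not baseline_nll:
        raise ValueError("need equal-length, non-empty paired windows")
    # Pairing: each delta compares the same window under both models.
    deltas = [s - b for b, s in zip(baseline_nll, subject_nll)]
    n = len(deltas)
    rng = random.Random(seed)  # fixed seed keeps the interval reproducible
    means = sorted(sum(rng.choices(deltas, k=n)) / n for _ in range(n_boot))
    lo = means[int((alpha / 2) * (n_boot - 1))]
    hi = means[int((1 - alpha / 2) * (n_boot - 1))]
    return sum(deltas) / n, lo, hi
```

Resampling the differences, and exponentiating the mean only at the end if a ratio is needed for display, keeps the statistic additive; averaging per-window ratios directly would weight windows unevenly.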

Module Dependencies

Module dependency graph linking CLI shells, shared contracts, runtime owners, and extension surfaces.

Extension Points

InvarLock supports extension via entry points without modifying core code.

| Extension Type | Entry Point Group | Example |
| --- | --- | --- |
| Adapters | invarlock.adapters | hf_causal, hf_mlm |
| Guards | invarlock.guards | invariants, spectral, rmt, variance |
| Edits | invarlock.edits | quant_rtn, noop |

Custom Adapter Example

# my_adapter.py
import torch.nn as nn

from invarlock.core.api import ModelAdapter


class MyAdapter(ModelAdapter):
    name = "my_custom_adapter"

    def load(self, model_id: str, device: str) -> nn.Module:
        # Custom loading logic: build or fetch the model and place it on `device`
        ...

    def describe(self, model: nn.Module) -> dict:
        # Return model metadata
        ...

# pyproject.toml
[project.entry-points."invarlock.adapters"]
my_custom_adapter = "my_adapter:MyAdapter"
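
Once a package declares an entry point like the one above, discovery can be done with `importlib.metadata`. The sketch below is an assumption about how such a lookup could be written, not the code in core/registry.py; `discover` and `load_plugin` are hypothetical names.

```python
from importlib import metadata


def discover(group: str) -> dict:
    """Map entry-point names to EntryPoint objects for one group, without importing plugins."""
    eps = metadata.entry_points()
    if hasattr(eps, "select"):  # Python 3.10+ selectable API
        selected = eps.select(group=group)
    else:  # Python 3.8/3.9 returned a plain dict of lists
        selected = eps.get(group, [])
    return {ep.name: ep for ep in selected}


def load_plugin(group: str, name: str):
    """Import and return one plugin object; raises KeyError if it is not registered."""
    return discover(group)[name].load()
```

With the pyproject.toml entry installed, `load_plugin("invarlock.adapters", "my_custom_adapter")` would import `my_adapter` and return the `MyAdapter` class. Deferring `load()` until a plugin is requested keeps discovery cheap and avoids importing torch-backed adapters that are never used.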

Troubleshooting

  • Import errors in torch-free context: ensure invarlock.core imports stay torch-independent; use adapters for torch operations.
  • Guard preparation failures: check tier policy compatibility; use context.run.strict_guard_prepare: false for debugging.
  • Report generation errors: verify baseline and subject reports exist and have compatible window structures.

Observability

  • Pipeline phases emit timing via print_timing_summary() in the CLI.
  • Guard results are recorded in report.guards[] and in the report's validation.* flags.
  • Telemetry fields include memory_mb_peak, latency_ms_*, duration_s.