Getting Started

Overview

Purpose: Install InvarLock and run your first evaluation.
Audience: New users setting up their environment.
Python: 3.12+ recommended (CI uses 3.13).
Install: pip install "invarlock[hf]" for HF adapter workflows.
Next step: Quickstart for hands-on commands.

This guide covers installation, environment setup, and your first evaluation run.

Learning Paths

Choose your path based on your role:

First-time user: Getting Started → Quickstart → Compare & evaluate
Python developer: Getting Started → Primary Metric Smoke → API Guide
Custom data user: Getting Started → Bring Your Own Data → Config Gallery
Plugin developer: Getting Started → Plugins → Guards Reference
Validation engineer: Getting Started → Proof Packs → Proof Packs Internals
Security auditor: Getting Started → Threat Model → Best Practices

Install InvarLock

# Minimal core (no torch; CLI + config/schema tools)
pip install invarlock

# Recommended (HF adapter + evaluation stack for evaluate/run)
pip install "invarlock[hf]"

# Full (all extras)
pip install "invarlock[all]"

# Or install into an isolated environment with pipx
pipx install --python python3.12 "invarlock[hf]"

Initialize Environment

conda create -n invarlock python=3.12 -y
conda activate invarlock
# Core + HF stack in this env
pip install "invarlock[hf]"

Verify Installation

invarlock doctor

Network Access

InvarLock blocks outbound network access by default. When you need to download models or datasets, opt in per run with INVARLOCK_ALLOW_NETWORK=1:

INVARLOCK_ALLOW_NETWORK=1 invarlock run -c configs/presets/causal_lm/wikitext2_512.yaml --profile ci

For offline use, pre-download assets and enforce offline reads with HF_DATASETS_OFFLINE=1. You can also relocate your HF cache via HF_HOME/HF_DATASETS_CACHE.
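The pre-download-then-offline pattern above can be sketched as follows. The cache locations are illustrative assumptions, not defaults; the preset path and env vars come from this guide:

```shell
# One-time, with network enabled: populate a local HF cache
# (the cache paths here are illustrative choices)
export HF_HOME="$PWD/.hf-cache"
export HF_DATASETS_CACHE="$HF_HOME/datasets"
INVARLOCK_ALLOW_NETWORK=1 invarlock run -c configs/presets/causal_lm/wikitext2_512.yaml --profile ci

# Subsequent runs: enforce offline reads from the same cache
export HF_DATASETS_OFFLINE=1
invarlock run -c configs/presets/causal_lm/wikitext2_512.yaml --profile ci
```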

Run the Automation Loop

Use the prebuilt workflow to capture a baseline and execute the edit stack:

make eval-loop

For more hands-on examples, see the Example Reports.

See also: Compare & evaluate (BYOE) for a universal baseline→subject→report workflow when you already have two checkpoints.

Fast Smoke Runs

For quick local/CI checks, enable an approximate capacity pass to shorten dataset prep:

INVARLOCK_CAPACITY_FAST=1 invarlock run -c configs/presets/causal_lm/wikitext2_512.yaml --profile ci

Note: this skips full capacity/dedupe work; don't use it for release evidence.

Compare & evaluate First

INVARLOCK_ALLOW_NETWORK=1 INVARLOCK_DEDUP_TEXTS=1 invarlock evaluate \
  --source gpt2 \
  --edited /path/to/edited \
  --adapter auto \
  --profile ci \
  --preset configs/presets/causal_lm/wikitext2_512.yaml

Notes

  • Prefer Compare & evaluate (BYOE) for production. Use --edit-config overlays for quick smokes.

Device Support

InvarLock defaults to --device auto, probing CUDA → MPS → CPU in that order. All guard calculations and reports are device-agnostic; we continuously exercise CPU paths on Linux and macOS runners, document MPS fallbacks for Apple Silicon, and treat CUDA as optional-but-recommended for release-tier baselines. Native Windows is not supported; use WSL2 or a Linux container if you need to run InvarLock from a Windows host. When in doubt:

  • invarlock doctor reports the detected accelerators.
  • Use --device cpu to force portability runs, or --profile ci_cpu to exercise the reduced-window telemetry preset.
  • Keep INVARLOCK_OMP_THREADS >= 4 for long CPU jobs to avoid multi-hour baselines.
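Putting the points above together, a CPU portability run might look like this. This is a sketch assembled from the flags and env vars documented in this section, not a prescribed recipe:

```shell
# Check which accelerators were detected
invarlock doctor

# Force a portable CPU run; keep OpenMP threads >= 4 for long jobs
INVARLOCK_OMP_THREADS=4 invarlock run \
  -c configs/presets/causal_lm/wikitext2_512.yaml \
  --profile ci --device cpu
```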

Next Steps

Choose your path based on your workflow:

Evaluate my own edited model (BYOE): Compare & evaluate (BYOE)
Understand the CLI commands: Quickstart
Bring my own evaluation dataset: Bring Your Own Data
See example outputs: Example Reports
Understand what's in a report: Reading a report
Use InvarLock programmatically: API Guide
Understand the assurance scope: Assurance Case
Set up secure production deployment: Security Best Practices