Model Family Catalog

Overview

This page is the human-readable rendering of contracts/model_family_catalog.json.

Use it to answer three distinct questions without weakening the public meaning of the support matrix:

  • What is currently supported as a public lane?
  • What families are implemented in code but not publicly supported?
  • What families or capabilities should be added next?

Support Tier vs Coverage State

Term | Meaning | Source of truth
support tier | Public support/assurance posture for a declared lane. Values stay aligned with support_matrix.json. | contracts/support_matrix.json
coverage state | Repo implementation maturity outside the public support matrix, such as profile_first_class, profile_shared_alias, auto_or_loader_only, loader_only, or backlog states. | contracts/model_family_catalog.json

The support matrix remains strict. The model family catalog is broader by design and records code-level visibility, usage-only checkpoints, and recommended additions. Access-gated vendor checkpoints are intentionally kept out of declared support lanes and shipped preset inventory.
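
The split between support tier and coverage state can be illustrated with a small sketch. The record layout (the keys family, support_tier, and coverage_state), the helper names, and the in-memory sample list are assumptions made for illustration only; the actual schema of contracts/model_family_catalog.json is not shown on this page and may differ.

```python
from typing import List, Optional, TypedDict


class FamilyRecord(TypedDict):
    """Hypothetical record shape; not the real catalog schema."""
    family: str
    support_tier: Optional[str]    # set only for declared public lanes
    coverage_state: Optional[str]  # set only for code-level coverage rows


# Tier values that appear on this page. The authoritative set lives in
# contracts/support_matrix.json and may contain others.
SUPPORT_TIERS = {
    "published_basis",
    "supported_experimental",
    "community_experimental",
}

# A few rows copied from the tables on this page, in the assumed layout.
RECORDS: List[FamilyRecord] = [
    {"family": "GPT-2 causal LM", "support_tier": "published_basis", "coverage_state": None},
    {"family": "Mixtral", "support_tier": None, "coverage_state": "profile_first_class"},
    {"family": "Falcon", "support_tier": None, "coverage_state": "auto_or_loader_only"},
    {"family": "Qwen3 causal LM", "support_tier": "supported_experimental", "coverage_state": None},
]


def public_lanes(records: List[FamilyRecord]) -> List[str]:
    """Families that carry a declared support tier."""
    return [r["family"] for r in records if r["support_tier"] in SUPPORT_TIERS]


def implemented_not_public(records: List[FamilyRecord]) -> List[str]:
    """Families visible in code but without a declared support tier."""
    return [
        r["family"]
        for r in records
        if r["support_tier"] is None and r["coverage_state"] is not None
    ]
```

Keeping the two fields separate means a coverage state such as profile_first_class can never be mistaken for a public support claim, which is the point of keeping the support matrix strict.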

Declared Support

Family | State | Representative models | Notes
GPT-2 causal LM | published_basis | openai-community/gpt2 | Public lane derived from gpt2-causal-hf.
BERT / RoBERTa MLM | published_basis | bert-base-uncased, roberta-base | Public lane derived from bert-mlm-hf.
Mistral 7B causal LM | supported_experimental | mistralai/Mistral-7B-v0.1 | Pilot preset and calibration config are shipped.
Qwen2 7B causal LM | supported_experimental | Qwen/Qwen2-7B | Pilot preset and calibration config are shipped.
Qwen3 causal LM | supported_experimental | Qwen/Qwen3-8B | Pilot preset and calibration config are shipped.
QwQ-32B reasoning causal LM | supported_experimental | Qwen/QwQ-32B | Pilot preset and calibration config are shipped, and the current remote evaluate/verify lane closes cleanly on the dense checkpoint.
DeepSeek-R1-Distill-Qwen causal LM | supported_experimental | deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | Pilot preset and calibration config are shipped.
Phi-4 causal LM (text-only eval) | supported_experimental | microsoft/Phi-4-reasoning-plus | Text-only pilot preset and calibration config are shipped, and the current remote evaluate/verify lane closes cleanly.
TinyLlama 1.1B causal LM | supported_experimental | TinyLlama/TinyLlama-1.1B-Chat-v1.0 | Ungated Llama-family pilot lane with shipped preset and calibration config.
OLMo 2 causal LM | supported_experimental | allenai/OLMo-2-1124-7B, allenai/OLMo-2-1124-13B-Instruct | Pilot presets and calibration configs are shipped for both 7B and 13B scale points.
Qwen3.5 causal LM | supported_experimental | Qwen/Qwen3.5-9B | Pilot preset and calibration config are shipped.
Seq2Seq / local pairs | community_experimental | t5-small, facebook/bart-base | Generic seq2seq lane without a published-basis claim.

Implemented Coverage

Family | Coverage state | Representative models | Notes
Mixtral | profile_first_class | mistralai/Mixtral-8x7B-v0.1 | Profile and loader code recognize the family directly.
Llama | profile_first_class | openlm-research/open_llama_7b, TinyLlama/TinyLlama-1.1B-Chat-v1.0 | Generic Llama-family profile handling is first-class. TinyLlama now provides the ungated declared support lane, while access-gated vendor checkpoints remain omitted.
Qwen family aliases (Qwen1.5/Qwen2.5/Qwen3 naming) | profile_first_class | Qwen/Qwen2.5-14B, Qwen/Qwen3.5-9B, Qwen/QwQ-32B | Shared qwen-family heuristics still cover aliases beyond the declared Qwen2, Qwen3, and Qwen3.5 lanes, including the ungated QwQ reasoning branch.
Yi | profile_first_class | 01-ai/Yi-34B | Treated as a RoPE decoder family in profile logic.
Phi family | profile_first_class | microsoft/Phi-3-mini-4k-instruct, microsoft/Phi-4-reasoning-plus | Dedicated phi-family selectors now exist. Phi-4 now has a declared text-only lane, while multimodal Phi-4 remains backlog-only.
OPT / GPT-NeoX / GPT-J | profile_shared_alias | facebook/opt-1.3b, EleutherAI/gpt-neox-20b | Available through shared GPT-style paths.
Falcon | auto_or_loader_only | tiiuae/falcon-7b | Visible through adapter-auto heuristics only.
GLM | auto_or_loader_only | THUDM/glm-4-9b-chat | Visible through adapter-auto heuristics only.
DeepSeek | profile_first_class | deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | DeepSeek distill checkpoints continue to share the qwen-family route. Oversized FP8 checkpoint-specific repo hooks and shipped configs were removed after bring-up showed that they do not fit the supported hardware/runtime path.
Broader BERT-like MLMs (DistilBERT/ALBERT/DeBERTa/ELECTRA) | auto_or_loader_only | distilbert-base-uncased, microsoft/deberta-v3-base | Loader/auto support exceeds the public BERT / RoBERTa lane.
Broader seq2seq families (mBART/PEGASUS/Marian) | auto_or_loader_only | facebook/mbart-large-50, Helsinki-NLP/opus-mt-en-de | Loader support is broader than the generic seq2seq public lane.

Usage Only

Family | State | Representative models | Notes
Qwen2.5 family | usage_only | Qwen/Qwen2.5-7B, Qwen/Qwen2.5-14B, Qwen/Qwen2.5-32B | Used in proof-pack suites and validation defaults.
Qwen1.5 72B | usage_only | Qwen/Qwen1.5-72B | Used concretely in proof-pack suites.
Yi 34B | usage_only | 01-ai/Yi-34B | Used in workshop and full proof-pack suites.
Mixtral 8x7B | usage_only | mistralai/Mixtral-8x7B-v0.1 | Used in proof-pack flows without a public support lane.

Recommended Additions

Priority | Family | Planned support mode | Representative models | Notes
P2 | Full multimodal evaluation pipeline | full_multimodal_eval | microsoft/Phi-4-vision-reasoning-15B | Deferred capability backlog item beyond text-only evaluation for ungated multimodal checkpoints.

Promotion Criteria

A family moves into support_matrix.json only after all of the following are present:

  1. explicit adapter/profile recognition
  2. a shipped preset
  3. a shipped calibration config
  4. targeted tests
  5. CLI smoke evidence
  6. approved calibration/evaluation evidence
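
The six criteria above amount to an all-or-nothing gate, which can be sketched as a simple checklist. The class name, field names, and helper functions below are hypothetical and chosen only to mirror the list; the repository may track this evidence in a different form.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class PromotionEvidence:
    """Hypothetical checklist; one flag per promotion criterion, in list order."""
    adapter_profile_recognition: bool = False
    shipped_preset: bool = False
    shipped_calibration_config: bool = False
    targeted_tests: bool = False
    cli_smoke_evidence: bool = False
    approved_calibration_evaluation_evidence: bool = False


def missing_criteria(evidence: PromotionEvidence) -> List[str]:
    """Return the names of criteria that are not yet satisfied, in list order."""
    return [f.name for f in fields(evidence) if not getattr(evidence, f.name)]


def ready_for_support_matrix(evidence: PromotionEvidence) -> bool:
    """A family qualifies only when every criterion is present."""
    return not missing_criteria(evidence)


# Illustrative candidate with only a preset and calibration config in place.
candidate = PromotionEvidence(shipped_preset=True, shipped_calibration_config=True)
print(ready_for_support_matrix(candidate))
print(missing_criteria(candidate))
```

Returning the list of missing criteria, not just a boolean, makes it easier to see what still blocks a family that is implemented in code from becoming a declared lane.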