Overview
cairn is an open-source, self-hosted execution layer for the work behind AI. You declare an outcome; cairn drives the tools underneath and records exactly what ran. The engine is domain-neutral: everything that knows about a domain lives in a pack, and everything that touches a tool is an operator. Today it covers fine-tuning a model, gating a dataset, evaluating a model against a baseline, promoting it behind a human approval, and running an incident RCA, each as a single governed command.
The engine — template runner, dispatcher, deterministic guards, action executor, LLM gateway, run store — knows nothing about fine-tuning. Swap the pack and the same machinery serves dataset checks, model evaluation, incident response, or infra changes.
The thesis — governed, not magic
This is what separates cairn from a bare agent loop or a black-box tool: the work runs behind deterministic guardrails you can inspect afterward.
- Hard budget — every run accrues cost in a scope; over the cap it raises BudgetExceeded, so a fine-tune can’t quietly overspend.
- Policy gating + human approval — proposed actions pass through policy rules before any executor runs; high-risk ones (promote a model, destroy a resource) pause for a human (HITL) and resume durably.
- Reproducible runs — templates are content-addressed, so a run can be replayed exactly. “It worked last week” stops being a mystery.
- Tamper-evident audit — every model call, tool call, and action lands in a hash-chained, append-only log: what ran, what it cost, who approved it.
- Citation validation — for synthesized answers, every identifier the model cites is exact-string-matched against the context it was given; unmatched citations are stripped.
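
Three of the guardrails above can be sketched in a few lines of Python. This is an illustrative sketch, not cairn's implementation: only the exception name BudgetExceeded comes from the text, while BudgetScope, AuditLog, strip_unmatched_citations, the USD unit, and the [[identifier]] citation syntax are assumptions made for the example.

```python
import hashlib
import json
import re


class BudgetExceeded(RuntimeError):
    """Raised when a scope's accrued cost goes over its cap."""


class BudgetScope:
    """Hypothetical scope that accrues cost and enforces a hard cap."""

    def __init__(self, cap_usd: float) -> None:
        self.cap_usd = cap_usd
        self.spent_usd = 0.0

    def charge(self, amount_usd: float) -> None:
        self.spent_usd += amount_usd
        if self.spent_usd > self.cap_usd:
            raise BudgetExceeded(
                f"spent {self.spent_usd:.2f} of {self.cap_usd:.2f}"
            )


class AuditLog:
    """Hypothetical append-only log; each hash covers the previous hash."""

    GENESIS = "0" * 64

    def __init__(self) -> None:
        self.entries = []

    @staticmethod
    def _digest(prev: str, event: dict) -> str:
        # Canonical JSON so the same event always hashes the same way.
        payload = json.dumps(event, sort_keys=True)
        return hashlib.sha256((prev + payload).encode("utf-8")).hexdigest()

    def append(self, event: dict) -> str:
        prev = self.entries[-1]["hash"] if self.entries else self.GENESIS
        digest = self._digest(prev, event)
        self.entries.append({"prev": prev, "event": event, "hash": digest})
        return digest

    def verify(self) -> bool:
        # Recompute the chain; any edited entry breaks every later link.
        prev = self.GENESIS
        for entry in self.entries:
            if entry["prev"] != prev:
                return False
            if entry["hash"] != self._digest(prev, entry["event"]):
                return False
            prev = entry["hash"]
        return True


def strip_unmatched_citations(answer: str, context: str) -> str:
    """Drop [[id]] citations whose id does not appear verbatim in context."""

    def keep_or_drop(match: re.Match) -> str:
        return match.group(0) if match.group(1) in context else ""

    return re.sub(r"\[\[([^\]]+)\]\]", keep_or_drop, answer)
```

In this sketch, tampering is detectable because verify() recomputes every hash from the first entry forward, so changing one recorded event invalidates the chain from that point on.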
How packs and operators are discovered
Packs and operators are discovered through Python entry-point groups, not
filesystem scanning. At startup the engine walks the groups, loads each Pack
object, and registers it; it then reads everything it needs (templates,
operators, backends, settings, doctor checks) from the registry. The engine
never imports a pack module. That one-way dependency is what keeps it reusable
across domains — and lets anyone publish an operator that inherits the budget,
gate, and audit for free.
Installation — set up a dev environment with uv.
Architecture — the engine, packs, profiles, and the request lifecycle.
CLI reference — the cairn command surface.