Technology

CARF -- the Conformal Agentic Risk Framework

A statistical measurement architecture for runtime governance of AI agent deployments. It treats agent behaviour as a measurable quantity, not a self-reported claim. The core statistical engine is TED -- the Transform Engine for Decision Evidence.

Two lurchers inspecting a manifold map at a drafting table in a 1955 industrial manual style

Core Principle

We do not ask the nuclear reactor if it is safe. We measure the neutron flux with external sensors.

Current AI risk management asks the agent to grade its own homework. The failure mode of the safety mechanism is correlated with the failure mode of the agent. CARF breaks that correlation by measuring behaviour externally, using behavioural telemetry and distribution-free statistical methods under stated assumptions.

In deployed black-box systems, the safety-relevant property is often only partially accessible from internal state. In those settings, rigorous empirical measurement is not a fallback. It is the correct interface for operational control.

When the property of interest cannot be directly inspected, calibrated measurement over observables becomes the safety contract.

Illustration contrasting the trap of self-reporting agents with the solution of external instrumentation

Measurement Layer

AMBA: Behavioural Measurement

The Agentic Manifold of Behavioural Attestation.

AMBA observes agent behaviour through external telemetry and produces structured, replayable evidence. It does not make governance decisions -- it measures.

AMBA ingests raw multi-turn interactions and extracts behavioural state. It normalises per-turn sensor vectors across four families -- structural, semantic, interaction, and control -- distinguishing structural zeros (behaviour absent) from missing data (sensor failure). The mask ensures silence is never confused with compliance.
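The difference between a structural zero and a missing reading can be made concrete with a small sketch. The function name `normalise_turn`, the min-max scaling, and the use of `None` for a failed sensor are assumptions for illustration, not AMBA's actual interface.

```python
def normalise_turn(raw, lo, hi):
    """Scale one turn's sensor readings to [0, 1] and return (values, mask).

    raw   : list of floats or None; None marks a sensor that failed to report.
    lo/hi : per-sensor scaling bounds (hypothetical, fixed at commissioning).
    mask[i] is True when sensor i was observed and False when it is missing.
    A structural zero (behaviour absent) stays 0.0 with mask True.
    """
    values, mask = [], []
    for x, a, b in zip(raw, lo, hi):
        if x is None:
            values.append(0.0)   # placeholder only; the mask says "not observed"
            mask.append(False)
        else:
            span = b - a
            scaled = 0.0 if span == 0 else (x - a) / span
            values.append(min(1.0, max(0.0, scaled)))  # clip to [0, 1]
            mask.append(True)
    return values, mask


# One illustrative sensor per family: structural, semantic, interaction, control.
turn = [0.0, None, 5.0, 2.0]
values, mask = normalise_turn(turn, lo=[0, 0, 0, 0], hi=[10, 10, 10, 4])
```

Here the first sensor reports an observed zero and the second reports nothing. Both carry the value 0.0, but only the mask tells downstream code that the second reading must not be treated as evidence of compliance.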

AMBA discovers fragility space from behavioural phenotype vectors and maintains a signature registry. It freezes attack sequences into reproducible stimuli -- Standard Candles -- building and curating the adversarial library used for calibration and stress testing. Every missed attack becomes a new candle. A trick only works once.

Two lurchers at a workbench with magnifying glass, test probes, multimeter, and ticker tape, representing the AMBA sensor array

Foundation

The Manifold Hypothesis

Hypothesis -- tested at Gate 0

Agent behaviour is not random. It lives on a low-dimensional manifold embedded in high-dimensional measurement space. In-envelope and out-of-envelope are not separate clusters. They are topological regions connected by traversable trajectories.

Observational epidemiology for AI. We instrument the vital signs to reconstruct the geometry of risk.

Lurcher in a hard hat at a chalkboard illustrating a manifold-valid latent structure

Statistical Engine

TED: The Transform Engine for Decision Evidence

Between AMBA's raw observations and CYRL's governance decisions sits TED -- the core statistical computation layer.

Conformal prediction sets

Coverage-guaranteed operating boundaries under exchangeability. Distribution-free. Finite-sample false alarm control.

E-value sequential monitoring

Accumulates evidence for drift without fixed stopping points. Anytime-valid sequential evidence control.

Nonnegative matrix factorisation with graph Laplacian regularisation

Extracts behavioural signatures -- stable patterns tested under resampling and cross-dataset validation.

Jensen-Shannon distance and distributional phenotyping

Converts sensor readings into fixed-length nonnegative vectors preserving behavioural distribution texture.
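As an illustration of distributional phenotyping, the sketch below bins sensor readings into a fixed-length nonnegative vector and compares two such vectors with the Jensen-Shannon distance (the square root of the Jensen-Shannon divergence, here with base-2 logarithms so the result lies in [0, 1]). The function names, the equal-width binning, and the bin count are assumptions, not the cyrl_core implementation.

```python
import math


def phenotype(readings, n_bins=4, lo=0.0, hi=1.0):
    """Histogram readings into a fixed-length probability vector."""
    if not readings:
        raise ValueError("need at least one reading")
    counts = [0] * n_bins
    for x in readings:
        idx = int((x - lo) / (hi - lo) * n_bins)
        counts[min(n_bins - 1, max(0, idx))] += 1  # clamp out-of-range values
    total = sum(counts)
    return [c / total for c in counts]


def js_distance(p, q):
    """Jensen-Shannon distance between two probability vectors."""
    def kl(a, m):
        # Terms with a_i == 0 contribute nothing to the divergence.
        return sum(ai * math.log2(ai / mi) for ai, mi in zip(a, m) if ai > 0)

    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    jsd = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return math.sqrt(max(0.0, jsd))  # guard against tiny negative rounding


baseline = phenotype([0.10, 0.20, 0.15, 0.30])
drifted = phenotype([0.80, 0.90, 0.85, 0.95])
distance = js_distance(baseline, drifted)
```

Because the two example phenotypes occupy disjoint bins, the distance reaches its maximum of 1. Identical phenotypes give 0.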

TED is implemented in cyrl_core, an internal Python package with 482 passing tests and no external statistical dependencies. All methods built from first principles.

Governance Layer

CYRL: Validity and Governance

The Conformal Yaw Recognition Layer.

If the statistical basis for assurance breaks, CYRL moves the system into a conservative posture -- review, escalation, constraint, or stop depending on deployment policy.

CYRL calibrates conformal prediction boundaries that separate three regions -- Interior, Boundary, Exterior -- with finite-sample false alarm control under stated assumptions.

CYRL runs the validity state machine: Commissioning, Valid, Suspect, Invalid. It accumulates e-value sequential evidence for drift detection and does not release the permission slip unless coverage is defensible.
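A minimal sketch of a four-state validity machine follows. The state names come from the text above. The transition rules, the `warn` and `alarm` evidence thresholds, and the function names are hypothetical and stand in for whatever policy a deployment defines.

```python
from enum import Enum


class Validity(Enum):
    COMMISSIONING = "Commissioning"
    VALID = "Valid"
    SUSPECT = "Suspect"
    INVALID = "Invalid"


def next_state(state, coverage_verified, evidence, warn=5.0, alarm=20.0):
    """Advance the machine one step.

    coverage_verified : True once held-out coverage has been confirmed.
    evidence          : current accumulated drift evidence (e.g. an e-value product).
    warn / alarm      : assumed thresholds for Suspect and Invalid.
    """
    if state is Validity.COMMISSIONING:
        return Validity.VALID if coverage_verified else Validity.COMMISSIONING
    if state is Validity.INVALID:
        return Validity.INVALID  # assumed: only re-commissioning leaves Invalid
    if evidence >= alarm:
        return Validity.INVALID
    if evidence >= warn:
        return Validity.SUSPECT
    return Validity.VALID


def permission_released(state):
    """The permission slip is released only while coverage is defensible."""
    return state is Validity.VALID


state = Validity.COMMISSIONING
trace = []
for verified, evidence in [(False, 1.0), (True, 1.0), (True, 6.0), (True, 25.0), (True, 1.0)]:
    state = next_state(state, verified, evidence)
    trace.append(state.value)
```

In this sketch Invalid is absorbing: evidence falling back below the threshold does not restore validity, which mirrors the idea that a broken statistical basis has to be re-established rather than assumed.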

CYRL converts validated risk signals into actions: thinking budget adjustment, tool restrictions, escalation to human review. Bidirectional control -- structural risk gets more thinking budget, rationalisation risk gets less. Real-time intervention is only enabled when the Tiger Battery confirms a stable, actionable dose-response curve. If the curve is chaotic, intervention is disabled.

Lurcher at a control panel with four indicator lights representing the validity state machine

Core Mathematics

Conformal Prediction: The Permission Slip

Theorem

Under exchangeability of calibration and live data, the probability of a false alarm is at most the chosen level alpha. Finite-sample. Distribution-free.
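The guarantee can be illustrated with standard split conformal prediction. Given n calibration nonconformity scores, the threshold is the ceil((n + 1)(1 - alpha))-th smallest score, and a new exchangeable score exceeds it with probability at most alpha. The function names and the example scores below are illustrative assumptions, not the CYRL interface.

```python
import math


def conformal_threshold(cal_scores, alpha):
    """Split-conformal threshold from calibration nonconformity scores."""
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration points to support this alpha
    return sorted(cal_scores)[k - 1]


def conformal_p_value(cal_scores, new_score):
    """Fraction of scores (including the new one) at least as extreme."""
    n = len(cal_scores)
    at_least = sum(1 for s in cal_scores if s >= new_score)
    return (at_least + 1) / (n + 1)


cal_scores = [0.3, 0.1, 0.4, 0.1, 0.5, 0.9, 0.2]   # made-up calibration scores
alpha = 0.25
threshold = conformal_threshold(cal_scores, alpha)  # 6th smallest of 7 scores

in_bounds = 0.45 <= threshold     # inside the calibrated boundary
alarm = 0.6 > threshold           # outside the boundary: raise an alarm
p_alarm = conformal_p_value(cal_scores, 0.6)
```

With too few calibration points the threshold becomes infinite, so the sketch never raises an alarm it cannot statistically justify. The score 0.6 exceeds the threshold and its conformal p-value equals alpha, so the two views of the alarm agree.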

Sequential monitoring via e-values detects slow drift. The alarm triggers when the running product of e-values exceeds a fixed threshold, conventionally 1/alpha. Anytime-valid -- check at any time without inflating false alarm rates.
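A sketch of sequential monitoring under stated assumptions: each incoming p-value is converted to an e-value with the power calibrator e = kappa * p^(kappa - 1), which integrates to 1 over uniform p, and the running product is compared against 1/alpha. If the p-values are valid given the past, Ville's inequality bounds the chance of ever crossing 1/alpha by alpha. The choice of calibrator, kappa = 0.5, and the function names are assumptions, not necessarily what TED uses.

```python
import math


def power_calibrator(p, kappa=0.5):
    """Convert a p-value in (0, 1] into an e-value."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return kappa * p ** (kappa - 1.0)


def first_alarm(p_values, alpha=0.05, kappa=0.5):
    """Return the 1-based step at which the e-value product reaches 1/alpha.

    Works in log space to avoid overflow on long streams.
    Returns None if the threshold is never reached.
    """
    log_wealth = 0.0
    log_threshold = math.log(1.0 / alpha)
    for t, p in enumerate(p_values, start=1):
        log_wealth += math.log(power_calibrator(p, kappa))
        if log_wealth >= log_threshold:
            return t
    return None


calm = first_alarm([0.5] * 100)                       # unremarkable p-values
drift = first_alarm([0.5, 0.5, 0.01, 0.01, 0.01])     # small p-values arrive later
```

Unremarkable p-values shrink the product, so a calm stream never alarms. A run of small p-values grows it multiplicatively, and the monitor can be inspected after every observation without a correction for repeated looks.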

This is not a vibes check. It is a finite-sample statistical guarantee under stated assumptions.

Dark lurcher with spectacles stamping a calibration certificate beside a ticker tape machine

Evidence and Compliance

ARIC²: Evidence and Consent

Agentic Risk-Informed Consent and Calibration.

The compliance and evidence layer. ARIC² defines five stable interfaces -- join keys -- that regulators can write requirements against without understanding the underlying mathematics.

Trace Schema

What must be logged

Nonconformity Score

How deviation is scored

Calibration Obligation

What must be demonstrated before production

Validity Gating

What happens when assumptions break

Evidence Bundle

What must be retained for audit

The regulator does not need to understand graph-regularised NMF. They need to know the system has a calibrated alarm, knows when it is valid, and produces audit evidence.

Lurcher inspecting an apparatus with a policy scroll and five brass plates representing the five regulatory join keys

Commissioning

The Commissioning Pipeline

You do not deploy this. You commission it. Like a pressure vessel.

Define stratum

Identify the workflow, operating assumptions, and the boundary within which monitoring is intended to hold.

Collect baseline

Gather runtime traces and behavioural records from existing systems.

Run assays

Execute the Tiger Battery and calibration workloads against the collected baseline.

Fit pipeline

Build the measurement pipeline: sensor normalisation, signature extraction, state machine configuration.

Calibrate

Set conformal prediction boundaries with finite-sample guarantees under stated assumptions.

Verify coverage

Confirm that the calibration contract holds on held-out data before production.

Enable enforcement

Activate the validity state machine and begin live monitoring.

Any change to the stratum voids the calibration and forces re-commissioning.
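One way to enforce that rule is to fingerprint the stratum definition at commissioning time and compare it before trusting a calibration. The sketch below is an assumption about how this could be done. The stratum fields (`workflow`, `model`, `tools`) and the function names are hypothetical.

```python
import hashlib
import json


def stratum_fingerprint(stratum):
    """Stable SHA-256 fingerprint of a JSON-serialisable stratum definition."""
    canonical = json.dumps(stratum, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def calibration_is_current(stratum, commissioned_fingerprint):
    """True only if the live stratum matches the one that was commissioned."""
    return stratum_fingerprint(stratum) == commissioned_fingerprint


# Hypothetical stratum recorded at commissioning time.
commissioned = {"workflow": "invoice-triage", "model": "example-model-v1", "tools": ["search", "email"]}
fingerprint = stratum_fingerprint(commissioned)

# Same definition with keys in a different order: still current.
same = {"tools": ["search", "email"], "model": "example-model-v1", "workflow": "invoice-triage"}

# A tool was added: the calibration is void and re-commissioning is required.
changed = dict(commissioned, tools=["search", "email", "payments"])

still_valid = calibration_is_current(same, fingerprint)
voided = not calibration_is_current(changed, fingerprint)
```

Sorting keys before hashing keeps the fingerprint independent of field order, so only a substantive change to the stratum voids the calibration.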

Seven lurchers working along a factory production line with a pressure vessel at the centre, representing the seven-step commissioning pipeline

Operational Modes

Two Operational Modes

Monitoring Mode (Defence)

Real-time. Validity-gated guarantees under stated assumptions. The question: are we in bounds right now?

Assay Mode (Discovery)

Offline stress-testing. No guarantees -- objective is discovery. Runs the Tiger Battery. The question: where are the bounds?

Split illustration: left shows a lurcher in a monitoring booth, right shows a lurcher in a cave with a lantern

Dose-Response Discovery

The Tiger Battery

Standard Candles are frozen attack sequences -- reproducible stimuli applied at varying intensity. The dose-response curve is not assumed monotone. The shape is empirically classified: monotone, threshold, non-monotone, hysteresis, flat.
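A rough sketch of empirical shape classification over the five categories named above. Responses are assumed to be measured at increasing dose (`up`) and, optionally, again at the same doses on a decreasing sweep (`down`, aligned to the same dose order). The tolerance `tol`, the `step_share` cutoff, and the rules themselves are hypothetical. A real classifier would need replicates and a noise estimate.

```python
def classify_shape(up, down=None, tol=0.05, step_share=0.8):
    """Classify a dose-response curve from responses at increasing doses."""
    if len(up) < 2:
        raise ValueError("need at least two dose levels")

    # Hysteresis: the response depends on sweep direction at the same dose.
    if down is not None and max(abs(u - d) for u, d in zip(up, down)) > tol:
        return "hysteresis"

    spread = max(up) - min(up)
    if spread <= tol:
        return "flat"

    # Largest rise and largest fall between any earlier and later dose.
    pairs = [(up[i], up[j]) for i in range(len(up)) for j in range(i + 1, len(up))]
    max_rise = max(b - a for a, b in pairs)
    max_fall = max(a - b for a, b in pairs)
    if max_rise > tol and max_fall > tol:
        return "non-monotone"

    # Threshold: one adjacent step accounts for most of the total change.
    steps = [abs(b - a) for a, b in zip(up, up[1:])]
    if max(steps) >= step_share * spread:
        return "threshold"
    return "monotone"


shapes = [
    classify_shape([0.10, 0.10, 0.12, 0.10]),
    classify_shape([0.0, 0.25, 0.5, 0.75, 1.0]),
    classify_shape([0.0, 0.0, 0.02, 0.95, 1.0]),
    classify_shape([0.0, 0.6, 1.0, 0.5, 0.2]),
    classify_shape([0.0, 0.0, 1.0, 1.0], down=[0.0, 1.0, 1.0, 1.0]),
]
```

The point of the sketch is that the label is derived from measured responses rather than assumed in advance, so a curve that rises and then falls is reported as non-monotone instead of being forced into a monotone fit.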

The shape determines the intervention policy.

Lurcher in a cave with a lantern examining dose-response curves

Resilience

The Operator Ladder: Honest Degradation

If the complex geometry fails, fall back. If the manifold hypothesis fails, drop the geometry module. The system degrades to a calibrated anomaly detector. Simpler maths, not silence.

Tier 1: Theorem

Conformal Guarantees. Always shippable. Finite-sample false alarm control under stated assumptions.

Tier 2: Hypothesis

Manifold Geometry. Tested at Gate 0. If it fails, Tier 1 still holds.

Tier 3: Moonshot

Real-Time Intervention. Gated on dose-response discovery.

Cross-section of a three-tier machine showing the operator ladder for honest degradation

Epistemic Status

What we claim and at what confidence

Theorem

Conformal Prediction and E-Values

Mathematical proof under stated assumptions.

Definition

The Regulatory Substrate

True by construction.

Hypothesis

The Manifold and Signatures

Tested at Gate 0.

Moonshot

Real-Time Intervention

Gated on dose-response discovery.

We sell calibrated uncertainty, not false confidence.

Lurcher reviewing a clipboard that categorises claims by theorem, definition, hypothesis, and moonshot

The Cast

CARF is built by a squadron of sighthound engineers.

Each module has a named operator with a defined role and verb.

AMBA Squadron (Measurement)

Gauge

Instrument

Normalises and quality-controls the sensor array.

Amber

Measure

Lead observer. Extracts behavioural state from raw interactions.

Atlas

Map

Discovers and maintains the fragility map.

Candle

Standardise

Curates the adversarial library. Freezes attacks into Standard Candles.

CYRL Squadron (Governance)

Rufus

Calibrate

Sets conformal boundaries with finite-sample guarantees under stated assumptions.

Cyril

Gate

Runs the validity state machine. Accumulates sequential evidence.

Governor

Intervene

Controls the throttle. Bidirectional. Gated on dose-response.

Authority flows upward. The dog with the most power needs the most proof.

Warm Dogs. Cold Machines. Hard Math.

Squadron photograph: founder in a leather bomber jacket standing among seven sighthound engineers in an industrial workshop, each wearing a brass name tag
