Your customers can already collect traces, score outputs, and generate compliance records. For higher-accountability deployments, they are also being asked to define operating scope, detect deviation under stated assumptions, and produce structured evidence for review when those assumptions begin to fail.
Overdog adds a commissioned measurement layer above existing platform telemetry. That layer is CARF -- the Conformal Agentic Risk Framework.
Discuss a reference integrationCARF is designed to use the telemetry and runtime records vendors already produce. The vendor remains the platform of record. CARF adds commissioning logic, validity monitoring, and structured evidence above that layer.
The gap is not collection. It is calibration, error control, and honest degradation when the assumptions behind monitoring stop holding.
Sets operating boundaries with finite-sample statistical guarantees under stated assumptions. The probability of false alarm is bounded -- not estimated, bounded.
Accumulates evidence for drift without requiring fixed stopping points. The system can check for deviation at any time without inflating false alarm rates.
CARF measures how an agent behaves -- tool-call patterns, interaction dynamics, structural coherence -- rather than relying on another model to judge the output.
These require a different discipline from dashboards and workflow tooling. The right relationship is integration: the vendor provides the telemetry, CARF provides calibrated evidence, and the customer gets both.
The vendor's telemetry becomes more valuable. The traces, metrics, and runtime records the vendor already collects become inputs to a measurement pipeline that produces calibrated validity signals and structured evidence. The same data, made more governable.
The vendor's customers get a harder answer. When a customer asks how operating scope is defined, how deviation is detected with known error properties, and what evidence exists for review, the vendor can point to a measurement layer that addresses those questions without having to build that capability internally.
The vendor's platform stays central. The vendor remains the platform of record for execution and telemetry. The measurement layer sits above it and emits outputs that feed the customer's review, escalation, and governance workflows.
A documented boundary within which monitoring assumptions are intended to hold.
Four states: Commissioning, Valid, Suspect, Invalid. The state reflects whether the statistical basis for monitoring currently holds. State transitions are logged.
A record linking state changes, policy actions, and runtime observations.
When the validity state moves to Suspect or Invalid, the system does not silently continue. It emits evidence for review and can trigger escalation, constraint, or autonomy reduction according to the commissioned policy.
The vendor keeps the execution and telemetry layer. CARF adds the measurement layer that makes those records more governable.
One workflow. One declared operating scope. One telemetry stream. One review pathway.
The goal of a reference integration is not broad rollout. It is to test whether the measurement layer produces usable control signals and structured evidence for review in a real deployment.
And where current monitoring cannot answer questions about formal evidence, known error properties, or structured evidence for review.
Overdog is focused on selected vendor conversations around reference integrations and joint pilot pathways for higher-accountability AI deployments.
Discuss a reference integration