This is the single public writing hub for Overdog. Position pieces, research notes, white papers, and shorter blog posts now live in one place.
Use the section links below to jump directly to the material you need. Individual paper URLs remain available for citation and sharing.
These are the recurring arguments behind the framework: why trust-based governance fails, why external measurement matters, and why regulated deployments need calibrated evidence rather than dashboards alone.
AI governance should be rebuilt around measurable runtime risk. Trust is prospective. Safety is retrospective. Governance needs instrumentation that can observe, test, and bound behaviour while the system is running.
When the safety mechanism shares failure modes with the agent, both can fail together. External measurement breaks that correlation by moving the observation layer outside the system under test.
Guardrails, filters, and self-reporting tools solve real problems, but they were not designed to provide independent measurement, sequential validity, or structured evidence for regulated review.
Thresholds and heuristics are not enough. Conformal prediction, e-values, and calibration logic matter because they make the monitoring layer explicit about its assumptions, its false-alarm rate, and the point at which the basis for assurance no longer holds.
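To make the contrast with ad hoc thresholds concrete, here is a minimal sketch of how a conformal threshold and an e-value monitor can fit together. It is an illustration under stated assumptions, not the Overdog or CARF implementation: every function name, the power calibrator, and the defaults `kappa=0.5` and `delta=0.01` are hypothetical choices made for this example. The alarm guarantee relies on the p-values being valid and independent while the system behaves as it did during calibration, which a fixed calibration set only approximates.

```python
import math


def conformal_threshold(cal_scores, alpha):
    """Split-conformal threshold: the ceil((n + 1) * (1 - alpha))-th smallest
    calibration score. Under exchangeability, a new score exceeds it with
    probability at most alpha."""
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points to support this alpha.
        return math.inf
    return sorted(cal_scores)[k - 1]


def conformal_p_value(cal_scores, score):
    """Conformal p-value: the rank of the new score among the calibration
    scores, counting the new score itself."""
    n = len(cal_scores)
    at_least = sum(1 for s in cal_scores if s >= score)
    return (1 + at_least) / (n + 1)


def power_e_value(p, kappa=0.5):
    """Power calibrator (assumed choice): maps a p-value to an e-value.
    kappa * p**(kappa - 1) integrates to 1 over (0, 1] for 0 < kappa < 1."""
    return kappa * p ** (kappa - 1)


def monitor(cal_scores, stream, delta=0.01, kappa=0.5):
    """Multiply e-values over time and alarm once the product reaches
    1 / delta. By Ville's inequality, the false-alarm probability over the
    whole run is at most delta when the assumptions above hold.
    Returns the 1-based alarm step, or None if no alarm is raised."""
    wealth = 1.0
    for t, score in enumerate(stream, start=1):
        p = conformal_p_value(cal_scores, score)
        wealth *= power_e_value(p, kappa)
        if wealth >= 1 / delta:
            return t
    return None


# Hypothetical data: 99 calibration scores, then two monitored streams.
calibration = list(range(1, 100))
typical_stream = [50] * 20   # scores that look like the calibration data
shifted_stream = [100] * 20  # scores larger than anything seen in calibration

print(conformal_threshold(calibration, alpha=0.5))
print(monitor(calibration, typical_stream))
print(monitor(calibration, shifted_stream))
```

The design point is that `alpha` and `delta` are declared error budgets rather than tuned cut-offs, so a reviewer can read off what the monitor assumes and how often it is allowed to raise a false alarm.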
This section collects the research programme context behind CARF: what is already specified, what is being formalised, and what is still in preparation.
ARIC2 defines five regulatory join keys: Trace Schema, Nonconformity Score, Calibration Obligation, Validity Gating, and Evidence Bundle. These are the stable interfaces regulators and controlled deployments can write requirements against.
The Atom 0 methodology paper is in preparation. It covers conformal prediction, e-value sequential monitoring, the Standard Candle perturbation protocol, and the validity state model for runtime monitoring of agentic systems.
The research programme is grounded in conformal prediction, sequential hypothesis testing, and calibrated uncertainty quantification, together with fifteen years of evidence-pipeline engineering in clinical genomics. The founder's publication record and academic links remain on the About page.
Longer-form papers live here. They remain individually addressable so they can be cited, shared, and indexed as standalone documents.
AI governance should be rebuilt around calibrated, independent measurement infrastructure. This paper maps six architectural failures from aviation, oil and gas, nuclear, and drug safety onto current AI governance and proposes measurement-based alternatives.
First posts are in preparation. Shorter essays, technical explainers, implementation notes, and commentary on agentic risk measurement will appear here as they are published.
Long-form pieces are published on Medium and cross-posted to Substack. Shorter commentary and threads appear on X and LinkedIn. Links will be added here as accounts go live.