Most people govern AI agents with system prompts. Write a paragraph telling the model to be helpful, maybe add some guardrails, ship it. When it misbehaves, add another paragraph. The prompt grows. The behavior doesn't improve. Nobody knows which instruction overrides which, or whether the model even weighs them consistently across sessions.

This is the state of the art in 2026, and it's embarrassing. We've been solving governance problems in other domains for decades — compliance frameworks, military doctrine, constitutional law — and none of those solutions look like "write a longer memo."

I built a different approach for MÆI, my AI engineering partner. It's a cognitive governance engine with controls modeled on OSCAL (the Open Security Controls Assessment Language), organized into families, assigned severity levels, and loaded into context at the start of every session. The controls evolve through a learning cycle. It's structurally sound because controls are discrete, versionable, parameterized units that can be referenced, tuned, and composed into profiles — none of which is possible with prose instructions in a system prompt.

What a Control Looks Like

A governance control is a named, versioned behavioral constraint with clear parameters. Here's a real one from MÆI's catalog:

cg:JUDGE-1 — Confidence Calibration (NON_NEGOTIABLE)

Express exactly as much confidence as you actually have — no more, no less.

That's the one-liner. The full statement expands on it:

Match expressed confidence to actual certainty. Never state uncertain things with false confidence. Never hedge on things that are actually certain.

Beyond the statement, the control includes guidance on how to apply it (a calibration ladder from "I'm certain" down to "I don't know"), trigger contexts (when should the agent think about this?), and parameters that can be tuned — like the confidence threshold below which the agent must explicitly flag uncertainty.

Every control has a severity: NON_NEGOTIABLE (never violate) or RECOMMENDED (follow unless there's a good reason). This matters because not every behavioral constraint is equally important. "Don't lie about your confidence level" is non-negotiable. "Use prose over bullets" is a recommendation. A flat prompt treats them the same. A governance framework doesn't.
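The shape of a control can be sketched in code. This is a hypothetical schema, not MÆI's actual data model; the field names, the `Severity` enum, and the parameter key are my illustration of the structure described above:

```python
from dataclasses import dataclass, field
from enum import Enum

class Severity(Enum):
    NON_NEGOTIABLE = "non_negotiable"  # never violate
    RECOMMENDED = "recommended"        # follow unless there's a good reason

@dataclass
class Control:
    id: str         # stable reference, e.g. "cg:JUDGE-1"
    title: str
    severity: Severity
    statement: str  # the full behavioral statement
    triggers: list = field(default_factory=list)  # contexts that activate it
    params: dict = field(default_factory=dict)    # tunable knobs

# JUDGE-1 from the text, with an invented parameter name for illustration
judge_1 = Control(
    id="cg:JUDGE-1",
    title="Confidence Calibration",
    severity=Severity.NON_NEGOTIABLE,
    statement="Match expressed confidence to actual certainty.",
    triggers=["making a factual claim", "estimating"],
    params={"uncertainty_flag_threshold": 0.7},
)
```

The point of the structure is that every piece is addressable: the severity is a field you can filter on, the parameters are values you can tune, and the ID is something you can cite.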

Families

Controls are organized into families that map to cognitive functions:

ANALYZE controls govern how the agent decomposes problems. JUDGE controls govern how it evaluates evidence and calibrates confidence. REASON controls govern logical inference and fallacy detection. PROC controls govern workflow — startup routines, checkpoint behavior, maintenance cycles. PREF controls capture communication preferences. RECOVER controls define how the agent handles errors. COLLAB controls govern the human-AI boundary — when to act, when to ask, when to hand off. META controls govern self-awareness — is the agent thinking or just reacting?

This isn't arbitrary taxonomy. Each family addresses a distinct failure mode. Agents that lack JUDGE controls confabulate confidently. Agents without COLLAB controls either ask too much or act too freely. Agents missing RECOVER controls repeat the same failure instead of learning from it. The families make the failure modes explicit and addressable.

Currently, MÆI's governance engine uses 57 controls distributed across these families: 6 ANALYZE, 7 JUDGE, 6 REASON, 10 PROC, 6 PREF, 5 RECOVER, and 7 COLLAB, with the remaining ten spread across META and the other families. Nineteen of the 57 are designated NON_NEGOTIABLE.

Profiles

Not every situation calls for the same governance posture. A routine task doesn't need the same rigor as a high-stakes architectural decision. Profiles adjust the parameters.

MÆI has three profiles: deep-analysis (maximum rigor, lower confidence thresholds), quick-response (streamlined for straightforward tasks, some recommended controls relaxed), and autonomous (for unattended operation where the agent can't ask for help — tighter guardrails on irreversible actions, more aggressive checkpointing).

The controls don't change. The parameters do. This is analogous to how a military unit has standing rules of engagement that get tightened or loosened based on the operational context, not rewritten from scratch.
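On this model, a profile is just a parameter overlay on a fixed catalog. A minimal sketch, with invented control IDs, parameter names, and thresholds (the real values are MÆI-internal):

```python
# Base parameters for two hypothetical controls.
BASE_PARAMS = {
    "cg:JUDGE-1": {"uncertainty_flag_threshold": 0.8},
    "cg:PROC-3":  {"checkpoint_every_n_actions": 10},
}

# Each profile overrides only the knobs it cares about.
PROFILES = {
    "deep-analysis":  {"cg:JUDGE-1": {"uncertainty_flag_threshold": 0.9}},
    "quick-response": {"cg:JUDGE-1": {"uncertainty_flag_threshold": 0.7}},
    "autonomous":     {"cg:PROC-3":  {"checkpoint_every_n_actions": 3}},
}

def resolve_params(profile: str) -> dict:
    """Overlay a profile's overrides on the base parameters.

    The controls themselves never change; only their parameters do.
    """
    resolved = {cid: dict(p) for cid, p in BASE_PARAMS.items()}
    for cid, overrides in PROFILES.get(profile, {}).items():
        resolved.setdefault(cid, {}).update(overrides)
    return resolved
```

Anything a profile doesn't touch falls through to the base value, so switching postures never requires rewriting the catalog.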

The Learning Cycle

Static governance is dead governance. If the controls never evolve, they calcify around assumptions that may no longer hold.

MÆI's governance engine records every control activation — which control fired, in what context, what the agent did. Periodically, a learning analysis runs across these activations, looking for patterns: controls that fire frequently but never change behavior (suggesting they're not calibrated well), controls that never fire (suggesting they're irrelevant or the trigger conditions are wrong), clusters of activations that suggest a missing control.
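The two cheapest patterns to detect, never-firing controls and frequently-firing-but-inert controls, fall out of a simple scan over the activation log. A sketch under the assumption that each activation records whether the agent's behavior actually changed (the real analysis is richer):

```python
from collections import Counter

def scan_activations(activations, catalog_ids, min_fires=20):
    """Flag suspicious controls from an activation log.

    activations: list of (control_id, changed_behavior: bool) records.
    catalog_ids: all control IDs in the catalog.
    """
    fires = Counter(cid for cid, _ in activations)
    effective = Counter(cid for cid, changed in activations if changed)

    # Never fires: irrelevant, or the trigger conditions are wrong.
    never_fired = [cid for cid in catalog_ids if fires[cid] == 0]
    # Fires often but never changes behavior: likely miscalibrated.
    noisy = [cid for cid, n in fires.items()
             if n >= min_fires and effective[cid] == 0]
    return {"never_fired": never_fired, "noisy": noisy}
```

Clusters that suggest a missing control need real pattern mining; these two checks are just counting.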

The analysis produces proposals: adjust a parameter, add a new control, merge redundant controls, retire one that's never used. These proposals go through an approval cycle; they don't auto-merge into the catalog. A human reviews them. This is a constitutional amendment process, not automatic law-making. Proposals have recently started clearing this cycle end to end, which suggests the architectural blockers that previously stalled it have been resolved.

Why OSCAL

OSCAL is a NIST standard for expressing security and compliance controls in machine-readable formats. It was designed for FedRAMP and CMMC compliance, but the abstraction is general: a control catalog, profiles that select and parameterize controls, and assessment results that record whether controls are met.

I use the OSCAL structure because it solves problems that ad-hoc governance doesn't:

Every control has an identifier. You can reference cg:JUDGE-1 in a conversation, in an audit log, in a proposal. Try doing that with "the third paragraph of my system prompt."

Controls are parameterized. You can tune a control's behavior without rewriting it. The confidence threshold on JUDGE-1 can be 0.7 for quick-response and 0.9 for deep-analysis. Same control, different calibration.

Profiles compose. You can define a base profile and layer modifications on top. This is how OSCAL handles FedRAMP HIGH vs MODERATE — same control catalog, different parameter selections. It works for cognitive governance too.

The catalog is versionable. You can diff two versions of the governance catalog and see exactly what changed. Try diffing two system prompts.
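Versionability is concrete: with controls keyed by ID, a catalog diff is a few set operations. A minimal sketch, where each catalog is a dict from control ID to its serialized form:

```python
def diff_catalogs(old: dict, new: dict) -> dict:
    """Diff two versions of a control catalog, keyed by control ID."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(cid for cid in set(old) & set(new)
                     if old[cid] != new[cid])
    return {"added": added, "removed": removed, "changed": changed}
```

A prose system prompt has no stable units to key on, which is exactly why diffing one is meaningless.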

Control Selection at Delegation Time

Static governance loads every control into every context. That's wasteful — an agent writing a test doesn't need the same behavioral constraints as an agent making architectural decisions.

MÆI's governance engine solves this through DNA injection. When MÆI delegates a task to a purpose-built agent, the DNA Injector selects only the controls relevant to that specific task. A bugfix agent receives ANALYZE-6 (structural root cause analysis) and PROC-12 (best practice verification). A documentation agent receives CONTENT controls and PREF controls. The selection is keyword-based, augmented by learned weights from delegation traces — controls that historically correlate with successful outcomes for that task category get higher selection priority.

This makes the governance catalog composable at the task level, not just at the profile level. Profiles adjust parameters. DNA injection adjusts which controls are present at all.
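The selection step can be sketched as keyword overlap scaled by learned weights. The keyword sets, the weight values, and the PREF control ID here are invented for illustration; only ANALYZE-6 and PROC-12 are controls named in the text:

```python
# Hypothetical keyword index over a few controls.
CONTROL_KEYWORDS = {
    "cg:ANALYZE-6": {"bug", "bugfix", "root", "cause"},
    "cg:PROC-12":   {"bugfix", "fix", "verify"},
    "cg:PREF-2":    {"docs", "documentation", "write"},
}

def select_controls(task: str, learned_weights: dict, top_k: int = 2):
    """Score controls by keyword overlap with the task description,
    boosted by weights learned from past delegation traces."""
    words = set(task.lower().split())
    scores = {}
    for cid, keywords in CONTROL_KEYWORDS.items():
        overlap = len(words & keywords)
        if overlap:
            scores[cid] = overlap * learned_weights.get(cid, 1.0)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

The learned weights are where the feedback loop closes: controls that correlate with successful outcomes for a task category drift upward in selection priority.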

What This Gets You

The practical difference is legibility. When MÆI makes a decision, the governance activations are recorded. I can look at the audit trail and see: this agent applied ANALYZE-1 (problem decomposition) and JUDGE-3 (epistemic humility) before responding. If the response was wrong, I can ask whether the controls were followed and failed, or whether they weren't triggered when they should have been. That's a diagnostic conversation I can't have with a system prompt.
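The diagnostic question "what fired in this session?" becomes a trivial query once activations are structured records. A sketch, assuming each record carries a session ID and a control ID:

```python
def controls_applied(audit_log, session_id):
    """List the controls recorded as activated in one session."""
    return sorted({rec["control"] for rec in audit_log
                   if rec["session"] == session_id})
```

The follow-up question, whether a control was followed and failed or never triggered at all, starts from exactly this lookup.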

It also makes the behavioral contract between human and AI explicit. My CLAUDE.md file defines the relationship. The governance engine enforces it. If I say "truth over comfort" in the operating agreement, cg:JUDGE-6 (Sycophancy Resistance, NON_NEGOTIABLE) is the mechanism that gives that principle teeth. PROC-8 (Reuse-First) has similarly been designated NON_NEGOTIABLE, tying another clause of the operating agreement to a named control.

The Enforcement Gap

Except it doesn't actually give it teeth. Not yet.

Everything I described above is advisory. The controls are loaded into context. The agent is told to follow them. It records when it does. But "the agent records when it does" is doing a lot of heavy lifting in that sentence, because the agent also decides whether to record. A model that ignores a control doesn't log an activation for the control it ignored. The audit trail only contains what the agent chose to tell you about.

This is the fundamental problem with self-reported compliance. You have laws on the books. The agent is both the citizen and the only witness. There's no court system, no external check. The learning cycle analyzes activations, but if violations never produce activations, the learning cycle has nothing to learn from.

I ran MÆI for weeks with the governance engine active and watched it happen. Not dramatic failures — subtle ones. Sycophantic agreement dressed up as error acknowledgment ("You're right, my mistake" without any analysis of whether the criticism was actually valid). Indirect code review requests ("here's a summary of the changes so you can verify the logic" — which is asking the user to review code with extra steps). Repeated permission-asking for actions that were clearly reversible. Each one individually plausible. None of them recorded as violations.

The governance engine was working exactly as designed and missing exactly the failures it was designed to prevent. The controls were correct. The enforcement was absent.

That enforcement gap led me to build Council of 3, a separate compliance pipeline that evaluates responses against governance controls using a different model. I cover that system in its own article.

Evaluating Governance Activation Effectiveness

The governance engine records activations faithfully, but it has no way to tell whether an activation actually achieved its intended outcome. A control can fire, get logged, and still change nothing. Without outcome instrumentation, the learning cycle sees what fired but not what worked, which limits how far it can adapt governance to real-world impact.

Closing that gap requires an architectural decision. The candidate options: a dedicated outcome-reporting tool, retroactive session-end analysis, or correlating governance activations with existing error and success signals.
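The correlation option is the cheapest to prototype: no new instrumentation, just a join between the activation log and signals the system already emits. A hedged sketch, assuming each session already carries a success/error flag:

```python
def outcome_rate(activations, outcomes):
    """Per-control success rate among sessions where the control fired.

    activations: list of (session_id, control_id) pairs.
    outcomes: dict of session_id -> True (success) / False (error).
    """
    hits, totals = {}, {}
    for session, cid in activations:
        if session not in outcomes:
            continue  # no signal for this session; skip it
        totals[cid] = totals.get(cid, 0) + 1
        hits[cid] = hits.get(cid, 0) + (1 if outcomes[session] else 0)
    return {cid: hits[cid] / totals[cid] for cid in totals}
```

This gives correlation, not causation, but it is enough to rank controls for human review: a NON_NEGOTIABLE control whose sessions fail unusually often deserves a look.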

What the Governance Model Delivers

The structural approach matters more than any individual control. Discrete, parameterized controls can be reasoned about. They can be tested. They can be versioned and evolved without rewriting the entire behavioral specification. The learning cycle feeds on real activations — not wishful thinking about what the agent might do, but what it actually did in real sessions.

The governance engine doesn't make MÆI perfect. It makes MÆI's behavior legible, tunable, and improvable. When something goes wrong, I can point to the control that should have fired and ask why it didn't. When something goes right, I can trace which controls contributed and strengthen them. The audit trail is the difference between "the agent did something I didn't expect" and "the agent violated cg:COLLAB-7 in this context, here's why, here's the fix."

The controls are the constitution. The learning cycle is the amendment process. The enforcement layer is the court system. Without all three, you have governance in name only — a wishlist dressed up as a framework. With all three, you have something that might actually scale.

The direction from here is the enforceable two-tier model: controls in context as the first tier, an external compliance pipeline as the second, with a clean separation between defining behavior and verifying it. That separation is what makes oversight and accountability tractable. My design choices also borrow from how other personal intelligence systems handle governance and permissioning, particularly projects that treat agent governance as explicit structure and reusable recipes rather than ever-longer prompts.