Delegation Capacity
Paper reference: Section 1, "Delegation capacity"; Theorem 1
Before optimizing review allocation or tuning a corrector, ask the most basic question: can this pipeline hit the target quality at all?
The operational ceiling
The delegation capacity \(C_\text{op}\) is the best possible output quality the pipeline can achieve, optimized over all task distributions. If the quality target \(p_\text{min}\) exceeds \(C_\text{op}\), no governance policy — no review allocation, no routing rule, no amount of corrector effort — can rescue the design.
```python
from minimal_oversight.capacity import check_feasibility

report = check_feasibility(pipeline, p_min=0.80)
print(report.explanation)
# INFEASIBLE: Quality target p_min=0.800 exceeds pipeline
# capacity C_op=0.725. No governance policy can rescue this design.
```
When the AMO has no solution
If \(p_\text{min} > C_\text{op}\), the theory prescribes no delegation — not "more authority." The task must be performed by a more capable agent, decomposed into subtasks, or the topology must change.
For a single node
A single node with observation rate \(\eta\) and decay rate \(\delta\) achieves maximum capacity when \(\sigma_\text{skill} = 1\):

\[
C = \sigma_\text{skill} \cdot \frac{\eta}{\eta + \delta}
\]

With \(\eta = 10\), \(\delta = 2\): \(C = 0.833\). This is the ceiling even for a perfect agent: decay (stale evidence, environment shift) prevents reaching 1.0.
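As a minimal sketch, assuming the single-node capacity takes the form \(C = \sigma_\text{skill}\,\eta/(\eta+\delta)\) implied by the worked numbers above:

```python
def single_node_capacity(skill: float, eta: float, delta: float) -> float:
    """Capacity of one node: skill discounted by the observation/decay ratio.

    Assumes the form C = skill * eta / (eta + delta), inferred from the
    worked example in the text (eta=10, delta=2 gives C = 0.833 at skill=1).
    """
    return skill * eta / (eta + delta)

# A perfect agent (skill = 1) still hits a ceiling below 1.0:
print(round(single_node_capacity(1.0, eta=10, delta=2), 3))  # 0.833
```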
For a chain
Each layer degrades the signal. The recursive formula accounts for the fact that each corrector stabilizes quality before passing it downstream:

\[
C^{(d)} = R\!\left(C^{(d-1)}\right),
\]

where \(R(\cdot)\) is the Return Operator evaluated at its fixed point (Equation 11).
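The fixed-point structure can be sketched generically. The map below is a hypothetical stand-in, not the paper's Equation 11; it only illustrates iterating a stabilizing operator to convergence:

```python
def fixed_point(f, x0: float, tol: float = 1e-9, max_iter: int = 10_000) -> float:
    """Iterate x <- f(x) until successive values agree within tol."""
    x = x0
    for _ in range(max_iter):
        x_next = f(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x

# Hypothetical contraction standing in for the Return Operator:
# each correction pass keeps half the current quality and restores 0.4.
stable_quality = fixed_point(lambda p: 0.5 * p + 0.4, x0=0.0)
print(round(stable_quality, 6))  # 0.8
```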
Verify the math
The theory-observation gap is less than 0.002 across all 28 conditions tested in the paper (Experiment 6).
Correction model: theory vs simulation
The closed-form equations (Eq. 5-6) use raw \(c\) as the catch rate. The paper's simulator uses \(c \times K/N\) as the effective catch rate, where \(K/N\) is the fraction of outputs reviewed. This means:
- Theoretical \(M^* = 1.83\) (with \(c = 0.70\); Eq. 6)
- Simulated \(M^* \approx 1.4\) (with \(c \times K/N = 0.70 \times 0.50 = 0.35\))
If your observed \(M^*\) is lower than the theoretical prediction, check your effective review coverage.
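A quick check of that discrepancy, assuming the effective catch rate is simply the raw rate scaled by review coverage as described above:

```python
def effective_catch_rate(c: float, reviewed: int, total: int) -> float:
    """Catch rate actually realized when only reviewed/total outputs are checked."""
    return c * reviewed / total

# Raw catch rate 0.70, but only half the outputs get reviewed:
c_eff = effective_catch_rate(0.70, reviewed=50, total=100)
print(round(c_eff, 2))  # 0.35
```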
Critical depth
There's a maximum useful depth \(D_\text{max}\) beyond which adding layers hurts quality; the product-formula approximation gives a closed-form estimate of it.
This is a conservative lower bound (the recursive formula, Eq. 11, is more precise). For \(\sigma_\text{skill} = 0.55\), \(c = 0.65\), \(p_\text{min} = 0.50\): \(D_\text{max} \approx 3\text{–}4\).
Better correctors extend the useful depth significantly. At \(c = 0.90\): \(D_\text{max} \approx 12\).
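One common product-form assumption is a per-layer retained-quality factor \(r = 1 - (1-\sigma_\text{skill})(1-c)\): an error survives a layer only if the agent introduces it and the corrector misses it. Under that assumption (which may differ from the paper's exact expression), the maximum depth is the largest \(D\) with \(r^D \ge p_\text{min}\):

```python
import math

def critical_depth(skill: float, c: float, p_min: float) -> int:
    """Largest depth D with r**D >= p_min, where r = 1 - (1-skill)*(1-c).

    The per-layer factor r is an assumed product-form model, not the
    paper's recursive formula; treat the output as a rough estimate.
    """
    r = 1.0 - (1.0 - skill) * (1.0 - c)
    return math.floor(math.log(p_min) / math.log(r))

print(critical_depth(0.55, c=0.65, p_min=0.50))  # 4, within the 3-4 range above
```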
The effective autonomy buffer
The buffer combines capacity, quality target, and workflow complexity into one number:

\[
B_\text{eff} = C_\text{op} - p_\text{min} - \lambda H(W)
\]
- \(B_\text{eff} > 0\): delegated autonomy is feasible
- \(B_\text{eff} = 0\): at the autonomy cliff
- \(B_\text{eff} < 0\): no governance policy can maintain quality
The complexity tax \(\lambda H(W)\) captures how routing entropy, tool-call variability, and timing uncertainty consume the quality margin. Each additional bit of process entropy costs approximately \(\lambda \approx 0.02\) in quality (Experiment 7).
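As a numeric sketch, taking the buffer as capacity minus target minus complexity tax, \(B_\text{eff} = C_\text{op} - p_\text{min} - \lambda H(W)\) (an assumed decomposition consistent with the bullets above):

```python
def effective_autonomy_buffer(c_op: float, p_min: float,
                              entropy_bits: float, lam: float = 0.02) -> float:
    """Quality margin left after the target and the complexity tax.

    Assumes B_eff = C_op - p_min - lam * H(W); lam of about 0.02 per bit
    is the figure quoted from Experiment 7.
    """
    return c_op - p_min - lam * entropy_bits

# Single perfect node (C = 0.833), a 0.75 target, and 2 bits of
# routing/tool-call entropy leave a thin but positive buffer:
b = effective_autonomy_buffer(0.833, p_min=0.75, entropy_bits=2.0)
print(round(b, 3))  # 0.043
```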