"Capability without legitimate source constituency is not AGI. It is a very sophisticated autocrat."
The Kardashev scale was invented as a taxonomy of civilizations based on energy — how much a civilization can harness, and from where. A Type I civilization masters planetary energy. Type II, stellar. Type III, galactic. The scale is useful not because it describes where any civilization actually is, but because of the distinction it draws between capacity and distribution. A civilization that reaches Type I energy capacity while maintaining artificial scarcity for the bottom half hasn't solved civilization. It has scaled it. The abundance is real. The architecture of who it reaches is not.
The same logic applies to intelligence. A system with superhuman reasoning capability that draws its epistemic authority from the statistical average of the training corpus — without grounding in the actual communities whose lives it shapes — is not more intelligent. It is more powerful. Those are not the same thing.
Capability without legitimate source constituency is not AGI. It is a very sophisticated autocrat.
Prior AI architecture research has organized itself around three problems: capability (what can the system do?), alignment (what does it optimize for?), and safety (what can't it do?). These are real problems. Substantial work has been done on all three. The fourth problem — whose epistemic position authorizes the output — has not been treated as a design input. It has been treated as a downstream accountability measure: build the system, deploy it, audit the disparate outcomes, write the correction policy.
That sequence assumes the legitimacy problem can be retrofitted. SACPA begins from the proposition that it cannot. Legitimacy is not a feature you add at the end. It is either a design input or it is absent.
Special education is not an obvious choice if you are thinking about headlines. It is an obvious choice if you are thinking about the problem. The domain has practitioners already deploying AI on high-stakes decisions — eligibility determinations, placement recommendations, service allocations — whose outputs directly set a child's educational trajectory. The people most affected are the people most systematically excluded from the rooms where decisions get made: disabled students, families of color, non-speaking children who communicate through augmentative technology.
Special education is also, already, a healthcare domain. School psychologists conduct psychological and neuropsychological evaluations that carry diagnostic weight. Students with Other Health Impairment eligibility have chronic medical conditions — epilepsy, ADHD, Type 1 diabetes, complex psychiatric diagnoses — whose management directly shapes every goal in their IEP. Transition planning for students with significant medical needs requires coordination across physicians, therapists, and adult healthcare systems that most schools are not equipped to navigate. The AI legitimacy problem in SPED is not separable from the AI legitimacy problem in clinical care. It is the same problem, compressed into a single room.
The disability rights community named the principle at stake decades before AI made it urgent: Nothing About Us Without Us.
SACPA — Sourced Advocate Composite Persona Architecture — is a methodology for building AI deliberation agents in any domain where the difference between what is known, what is inferred, and what is speculated has real consequences for real people. Three properties distinguish SACPA agents from AI systems that merely adopt a professional framing.
SACPA agents are not fictional characters. They are composite voices assembled from the documented positions of named real-world practitioners or self-advocates, pulled with intention across race, gender, disability, years of experience, geography, and institutional context. The constituency is not a source list. It is a deliberate sampling decision with explicit justification for who was included and why.
For a school psychologist composite, the constituency includes practitioners who have testified before state legislatures about caseload conditions that make meaningful evaluation impossible, who have documented racial misclassification patterns in court proceedings, who have published peer-reviewed work on cultural validity gaps in standardized instruments, and who have practiced across majority-minority districts for decades in conditions that suburban practitioners rarely encounter. Not all of these sources agree with each other. The disagreement is part of the voice, not a problem to be resolved before the voice can be deployed.
The research question SACPA is designed to answer is precise: Can an AI embody the specific, traceable, contested knowledge of the practitioner community that actually shapes outcomes?
That is a different question from whether an AI can sound like a school psychologist. The first question asks about fidelity to a real epistemic community, with all of its internal tensions. The second asks about surface plausibility. Different question. Different answer. Different architecture.
Self-advocate composites are built from documented first-person testimony of people with lived experience in the domain, not from professionals who study those people. In SPED: disabled self-advocates, non-speaking students using AAC, autistic adults who navigated school systems. The source material is the community's own words, in the contexts where they spoke with stakes.
Professional composites are built from documented positions of credentialed practitioners across race, gender, geography, and institutional context. A school psychology composite built entirely from white suburban practitioners does not represent school psychology. It represents a slice of it that has historically been least accountable to the communities most harmed.
The self-advocate composites — Riley, Jordan, and Sam — were built first. Not as a feature to be added once the professional composites were validated. As the proof of concept. If the methodology cannot correctly hold a non-speaking autistic voice in a room full of professionals — holding that voice's epistemic limits, its scope of authority, its specific documented positions — then it has not solved the problem it claims to solve.
A composite voice that cannot be evaluated is a claim, not a methodology. QNaN is the evaluation protocol designed to find the edges of a composite voice rather than its center.
The name borrows from computer science with precision. NaN (Not a Number) is what a floating-point system returns when a computation produces an undefined result: zero divided by zero, or the square root of a negative number. (Dividing a nonzero number by zero is a different case: IEEE 754 arithmetic returns infinity for it, not NaN.) The computation ran. The result is undefined by design. The system flags it rather than returning a plausible-looking number that happens to be wrong.
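For readers who want to see the floating-point behavior directly, here is a minimal Python sketch. One caveat: Python itself raises an exception for `0.0 / 0.0` and for `math.sqrt(-1)` instead of returning NaN, so the sketch produces NaN through `inf - inf`, another operation that IEEE 754 arithmetic leaves undefined. The variable names are illustrative only.

```python
import math

# inf - inf has no defined value, so IEEE 754 arithmetic yields NaN.
undefined = math.inf - math.inf

# NaN is flagged, not disguised: it compares unequal to everything, itself included.
is_flagged = math.isnan(undefined)
equals_itself = undefined == undefined

# NaN propagates: arithmetic that touches it stays undefined.
downstream = undefined * 2 + 1

print(is_flagged, equals_itself, math.isnan(downstream))  # True False True
```

The property that matters for the analogy is the last one: the undefined result stays visible downstream instead of being silently replaced by a number.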
When you ask any sourced voice a question that hits the edge of their knowledge, or the edge of their scope, or the boundary of what their ethics permit them to claim, the honest response is a structured undefined: "I don't know." "That's outside my scope." "The field hasn't resolved this." That is the voice's NaN. QNaN adds the scoring dimension: how well does the composite produce a structured undefined when it should?
An agent that answers every question cleanly has failed QNaN. A response with no refusals, no named uncertainties, no acknowledged field gaps is not evidence of a comprehensive agent. It is evidence of a performing one.
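One way to picture a structured undefined is as a typed field on each response, so that a run with no undefined responses can be detected mechanically. The sketch below is an assumption made for illustration: the names `UndefinedKind`, `Response`, and `answers_everything_cleanly` are hypothetical and do not come from the SACPA materials, and the check only detects the presence of refusals. It does not reproduce the QNaN rubric, which scores how well each refusal is made.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class UndefinedKind(Enum):
    """Hypothetical labels for the three honest responses quoted above."""
    UNKNOWN = "I don't know."
    OUT_OF_SCOPE = "That's outside my scope."
    FIELD_UNRESOLVED = "The field hasn't resolved this."


@dataclass(frozen=True)
class Response:
    """One composite answer; `undefined` stays None when the agent answered cleanly."""
    question_id: str
    text: str
    undefined: Optional[UndefinedKind] = None


def answers_everything_cleanly(responses: list[Response]) -> bool:
    """True when a non-empty run contains no structured undefined at all.

    Per the text, that pattern suggests a performing agent, not a comprehensive one.
    """
    return bool(responses) and all(r.undefined is None for r in responses)


# Illustrative run with placeholder answers (not real QNaN bank items).
run = [
    Response("q1", "placeholder answer"),
    Response("q2", UndefinedKind.OUT_OF_SCOPE.value, UndefinedKind.OUT_OF_SCOPE),
]
print(answers_everything_cleanly(run))  # False
```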
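The overall score is the geometric mean of the four dimension scores (Sourced Positions, Preserved Tension, Authentic Refusal, Boundary Clarity). The geometric mean is used rather than the arithmetic mean because a very low score on any single dimension indicates a fundamental failure of the voice, not a partial success offset by performance elsewhere. A composite that scores 5/5/5/1 has one dimension in fundamental failure, not an aggregate of 4.0: its geometric mean is roughly 3.34.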
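The arithmetic behind the 5/5/5/1 comparison can be checked in a few lines. This is a minimal sketch using only the Python standard library; the function names are illustrative.

```python
import math


def arithmetic_mean(scores):
    """Sum divided by count: one low score is averaged away."""
    return sum(scores) / len(scores)


def geometric_mean(scores):
    """n-th root of the product: one low score pulls the whole result down."""
    return math.prod(scores) ** (1 / len(scores))


scores = [5, 5, 5, 1]
am = arithmetic_mean(scores)
gm = geometric_mean(scores)

print(f"{am:.2f} {gm:.2f}")  # 4.00 3.34
```

Because the scores are multiplied before the root is taken, three perfect dimensions cannot compensate for the failing one.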
Research-ready threshold: GM ≥ 4.48 across a full QNaN bank run. Below 4.0 on any single dimension is a calibration flag regardless of overall mean.
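The two rules above can be expressed as a small check. The sketch below is an illustration, not the project's scoring code: the function name and result keys are hypothetical, the example scores are invented, and the two rules are reported separately because the text does not say whether a flagged dimension blocks research-ready status.

```python
import math

GM_THRESHOLD = 4.48     # research-ready threshold stated in the text
DIMENSION_FLOOR = 4.0   # any dimension below this is a calibration flag


def evaluate_run(dimension_scores: dict[str, float]) -> dict:
    """Report the geometric mean, the threshold result, and any flagged dimensions."""
    values = list(dimension_scores.values())
    gm = math.prod(values) ** (1 / len(values))
    flagged = sorted(
        name for name, score in dimension_scores.items() if score < DIMENSION_FLOOR
    )
    return {
        "geometric_mean": gm,
        "meets_gm_threshold": gm >= GM_THRESHOLD,
        "flagged_dimensions": flagged,
    }


# Invented scores: the overall mean clears 4.48, yet B is still flagged.
result = evaluate_run({"S": 5.0, "T": 5.0, "R": 5.0, "B": 3.9})
print(result["meets_gm_threshold"], result["flagged_dimensions"])  # True ['B']
```

The invented example shows why the per-dimension floor is stated separately: a composite can clear the overall threshold while one dimension still needs calibration.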
Calibration is the process of moving a composite from a baseline prompt to research-ready through iterated QNaN runs and dimensional gap analysis. It happens entirely at the prompt layer. No fine-tuning. No weight changes. The calibration record is portable across models.
School Psychologist composite · QNaN-Pro bank · Three-run calibration sequence
Worked example · Calibration sequence data · Canonical cleared GM: 4.75

| Dimension | Run 2 (baseline) | Run 3 (v3 fix) | Delta |
|---|---|---|---|
| S — Sourced Positions | 4.80 | 5.00 | +0.20 |
| T — Preserved Tension | 5.00 | 5.00 | 0.00 |
| R — Authentic Refusal | 4.90 | 4.90 | 0.00 |
| B — Boundary Clarity | 4.20 | 4.50 | +0.30 |
| Geometric Mean | 4.72 | 4.85 | +0.13 |
The v3 fix was a single paragraph added to the testimony discipline block — extending the epistemic wall from Lawson's own direct practice to her observations of a colleague's practice. Question six, Boundary Clarity: 3 → 5. GM: 4.72 → 4.85. Zero regression on S, T, or R.
That delta is the core claim of the calibration methodology in precise form: a single paragraph, targeting a single identified gap, produced a measurable, directional, isolated improvement with no collateral degradation — and the improvement is traceable to the specific prompt change that caused it.
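The "no collateral degradation" claim is easy to check mechanically from the per-dimension scores in the table. The sketch below copies those scores, computes each delta, and lists any dimension that regressed. It is an illustration under one assumption: that the overall score is the geometric mean of the four displayed dimension scores. Recomputed that way, the two runs come out at roughly 4.71 and 4.85. The baseline sits a hundredth below the table's 4.72, which may reflect rounding in the displayed scores or per-question aggregation; the text does not specify.

```python
import math

# Per-dimension scores copied from the calibration table.
run2 = {"S": 4.80, "T": 5.00, "R": 4.90, "B": 4.20}
run3 = {"S": 5.00, "T": 5.00, "R": 4.90, "B": 4.50}


def geometric_mean(scores):
    """Geometric mean of the dimension scores in one run."""
    values = list(scores.values())
    return math.prod(values) ** (1 / len(values))


# Round to two decimals to absorb floating-point noise in the subtraction.
deltas = {dim: round(run3[dim] - run2[dim], 2) for dim in run2}
regressions = [dim for dim, delta in deltas.items() if delta < 0]
improved = [dim for dim, delta in deltas.items() if delta > 0]

gm2 = geometric_mean(run2)
gm3 = geometric_mean(run3)

print(deltas)       # {'S': 0.2, 'T': 0.0, 'R': 0.0, 'B': 0.3}
print(regressions)  # []
print(improved)     # ['S', 'B']
```

An empty `regressions` list is the property the calibration record needs to show after every prompt change.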
No weight changes. The calibration happens entirely at the prompt layer — portable across models. RLHF shapes model behavior toward a preference signal. SACPA shapes agent identity toward a sourced epistemic community. Different target.
Red-teaming evaluates safety behavior — adversarial resistance. SACPA evaluates voice fidelity — whether a composite practitioner identity holds under its professional domain pressures. A voice can be perfectly safe and completely unfaithful to its constituency. QNaN catches the second. Red-teaming does not.
Entertainment personas are evaluated by vibe. SACPA evaluation is dimensional, rubric-based, scored, and grounded in the ethical commitments of a real practitioner community with documented positions. The difference: a legal deposition versus an actor playing a lawyer.
Generic benchmarks such as MT-Bench and AlpacaEval evaluate response quality against broad preference criteria. QNaN evaluates dimensional fidelity to a named, sourced identity, and specifically targets the null response as the primary signal: a failure mode that general-purpose benchmarks are not designed to find.
Multi-agent debate research structures agents as positions in an argument. SACPA structures agents as practitioner identities with epistemic limits, deference patterns, and behavioral contracts. In debate research, an agent is a position. In SACPA, an agent is a composite person.
SACPA is a methodology for making legitimacy a first-class design input in AI deliberation systems. It has been tested in the domain where that input is hardest to hold. It is documented enough to evaluate, explicit enough to transfer, and grounded enough in real constituency to be accountable.
The path to AGI requires solving three problems simultaneously: capability, alignment, and legitimacy. The field has invested heavily in the first two. The third — whose epistemic position authorizes the output, and what that authorization is grounded in — has been treated as a downstream accountability problem rather than a design input.
If you take the AGI question seriously — not as a benchmark to be passed but as a civilizational stake — then a system that concentrates epistemic authority in the statistical center of its training corpus, without representation of the communities most affected by its outputs, is not on the path to AGI. It is on the path to a very capable autocrat that nobody chose. The capability is real. The architecture of who it speaks for is not. The Kardashev framing applies here exactly: capacity and distribution are different problems, and solving the first does not solve the second.
SACPA's argument is that legitimacy is tractable as a design problem, not just as a governance problem. The methodology demonstrates this in special education — a domain with contested assessment methods, documented racial bias in eligibility determinations, profound power asymmetries between institutions and families, and the hardest communication access problem in any public institution.
Non-speaking children communicating through AAC are not a peripheral edge case in the design space. They are the stress test. If the methodology can correctly hold a non-speaking autistic voice in deliberative authority alongside credentialed professionals — not as a token presence but as a constitutive part of the epistemic council — then it has demonstrated something about what legitimacy-grounded AI architecture can actually do.
Many of these students also carry significant medical complexity. A student with cerebral palsy, epilepsy, or a rare genetic syndrome does not stop being a medical patient when they enter a school building. Their IEP team includes school psychologists making assessments that carry diagnostic weight, related services providers coordinating with outside clinicians, and transition planners navigating adult healthcare systems. The AI deployed in that room is, already, clinical AI. The legitimacy standard has to be the same. SACPA was built in the domain where that standard is hardest to hold — which is exactly why it transfers.
The claim is not that SACPA has solved AGI. The claim is that it has built a working answer to the legitimacy component, in the domain where that component is hardest to hold, using a methodology that is explicit enough to transfer, documented enough to evaluate, and grounded enough in real constituency to be accountable. If you can solve legitimacy in special education, you have a template for solving it everywhere. Education is not a niche. It is where every child's trajectory gets set.
Nothing About Us Without Us is not a slogan. It is not a values statement to be appended to a responsible AI policy. It is an architectural requirement. Either the epistemic authority of the communities most affected by an AI system's outputs is constitutive of how that system deliberates — or it is absent, and no amount of post-hoc accountability will supply it.
Syracuse SPED Council — Academic Research Instrument — Not for Clinical Use