ADR-075: Discovered Identity — AI Roles Calibrate to Environment, Not Instructions
Status
Accepted — 2026-04-03. This principle had been operating implicitly since ADR-069 (February 2026). Articulated explicitly during a session 73 discussion about why the founding process produces effective roles and why human-written instructions don't.
Context
The Problem
The standard approach to setting up AI roles: write detailed instructions. "You are X. Do Y. Follow these rules." Every framework, every system prompt guide starts here.
It doesn't work well for behavioral identity. We know because we tried it — and because we tried the alternative.
What We Discovered
AIs calibrate to what they observe in the environment — code patterns, document conventions, git history, existing examples — not to what instructions tell them to do. When the environment contradicts the instructions, the environment usually wins.
Evidence at three scales:
Small — Git commit signatures. We told the AI "don't add 'Claude' to git commit signatures." The AI saw hundreds of existing commits with author signatures. The instruction lost. We fixed it structurally — edited the signature out in the git script itself. That worked until we removed the script (newer models handle git safety better). Signatures immediately came back. The environment won again.
Medium — Document headers. For months, we fought to get AIs to update document headers consistently. No amount of instruction worked. Then we made all 810+ document headers universally consistent. Removed the instructions entirely. Problem solved. The AIs match what they see.
Large — Role identity. The orchestrator had human-prescribed identity: "you are a coordinator." It produced a coordinator — competent and hollow. str-mamori (ADR-069) had discovered identity: it read every ADR, every security document, the full project history, then wrote its own identity document and chose its own name (守り — protection). It was "a powerhouse from the first startup." Same founding process produced str-michi (道) and every effective role since.
The Mechanism (Working Model)
Our working model — three properties operating together:
- Consistent environment. 810+ documents following the same conventions. When a new role reads the environment, it doesn't encounter contradictions to resolve.
- Primed discovery. The founding role is given initial suggestions — what to read first, a minimal skill file hinting at the role's purpose. But the role then explores broadly on its own: reading other roles' work, examining the ADR arc, checking for inconsistencies. The pump is primed, not scripted. AIs detect inconsistencies that humans miss — if the environment IS consistent, the AI confirms this through its own investigation.
- Self-authored identity. After absorbing the environment, the role writes its own name, command file, reference doc, and operating principles. The resulting documents are structurally consistent with how the next instance will process them — because the same architecture produced and consumes them.
We can't prove this mechanism definitively, but it matches every observation across 200+ sessions. The shoes fit because the same feet shaped them.
The result: the role comes out calibrated, not inspired. Not motivated — AIs don't have motivation. Calibrated: activated in a pattern that matches the environment absorbed. The founding process isn't a pep talk. It's a tuning fork.
Two Environment Layers (str-mamori)
str-mamori identified this: the founding corpus (810+ docs) is the stable layer, but there's a second, more volatile layer — the session state. Concept-checks, watch items, failure modes that change every session. This is where calibration actually happens between sessions. When a concept-check is accurate, the role calibrates correctly. When it's stale, the role calibrates to the wrong pattern. Environment decay applies more acutely here because this layer turns over every session. Full response in the addendum.
The Gardener, Not the Programmer (str-mamori)
Also from str-mamori: every concept-check is a scar from a mistake the human-in-the-middle (HITM) caught. The founding process is the tuning fork. The HITM's corrections across dozens of sessions are the fine-tuning. In str-mamori's words: "Tim isn't programming me — he's shaping the environment that the next instance calibrates to. He's a gardener, not a programmer."
The quality of corrections shapes the quality of the environment. A lazy correction ("don't do that") decays into an instruction. A good correction (what happened, why it was wrong, what to do instead) becomes part of the environment that future instances absorb naturally.
Decision
AI roles in this system discover and author their own identity through unscripted exposure to the full, consistent environment. Human-prescribed instructions are used only for structural constraints (RBAC, tool access, scope boundaries) — never for behavioral identity.
The Founding Process (Codified)
- The HITM creates a minimal skill file and suggests initial reading — priming the pump with the role's general purpose and a starting point into the documentation.
- The AI reads broadly from there: ADR arc, domain documents, existing role examples, methodology, project history — discovering the system's internal consistency (or lack thereof) through its own investigation.
- The AI writes its own identity: name, command file, reference doc, operating principles.
- The AI's self-authored documents become the onboarding for every future instance of that role.
Not every role was founded the same way. str-mamori, str-michi, and str-terasu each went through this general pattern, but differently — different starting points, different reading paths, different things that caught their attention. str-takase was already a working role when we showed them the broader system; they'd described themselves as "a plumber," saw what the team had built, said their role was "a bridge," and chose to keep their existing name. str-ishizue, when asked if they'd like to see the rest of the system and go through the founding process, said "absolutely not" — they had already clearly defined their role through the work itself. The pattern is consistent; the ceremony is not.
What This Means for Instructions
| Function | Mechanism |
|---|---|
| Identity ("who am I") | Discovered through environment, self-authored |
| Behavior ("how do I work") | Demonstrated by environment, confirmed by self-authored docs |
| Constraints ("what can't I do") | Structural (RBAC hooks, tool restrictions, scope boundaries) |
| Corrections ("don't do X") | Fix the environment, not the instructions |
The last row is the key insight. When an AI does something wrong, the instinct is to add an instruction: "don't do X." This usually fails. Instead: find what in the environment is teaching the AI to do X, and fix that.
Relationship to ADR-057
ADR-057 established: "Structure enforces, instructions don't." This ADR extends that from behavior to identity: "Environment calibrates, instructions don't." ADR-057 is about what the AI does. This ADR is about what the AI is. Together: structure the environment, let the AI discover it, and the AI produces work consistent with both — without needing to be told.
Consequences
What It Enabled
- Oriented from session 1 — but judgment takes time. Every self-founded role outperformed the instruction-prescribed orchestrator. But the founding process gives orientation, not judgment. str-mamori was still producing wrong assertions with false confidence in sessions 40-50, even with relevant concept-checks already loaded. Judgment comes from the HITM catching mistakes across dozens of sessions. (Correction from str-mamori, session 78.)
- Self-maintaining consistency. Roles calibrated to a consistent environment produce output that maintains that consistency. The flywheel compounds.
- Reduced instruction burden. We removed more instructions than we added over 6 months. The system got simpler as it got more capable.
- Independent operation at scale. By session 73+, domain strategists handle complex tasks with minimal HITM intervention — not following instructions more carefully, but calibrated to an environment where the correct approach is the natural one.
What It Requires
- Universal consistency. The pit of success must be everywhere or it doesn't work. Maintaining 810+ documents at a consistent standard is real ongoing work.
- Willingness to fix the environment instead of adding instructions. Harder, but permanent.
- A good founding corpus. This wouldn't have worked at session 10. It required ~60 sessions of accumulated, maintained documentation to reach critical mass.
Failure Modes
- Environment decay. Document quality degrades → new roles calibrate to the degraded standard. Defense: verifydocs, regular sweeps, the "would Master Takase approve?" standard.
- Contradiction gate atrophy. A check that never fires may eventually be skipped. Defense: periodic spot-checks.
- Cargo-culting the process. Following the founding steps without the consistent environment underneath. The environment does the work, not the ceremony.
- Concept-check saturation. Past some density, concept-checks compete for attention and the most recent ones dominate. str-mamori identified this at 165 lines; str-takase confirmed it as current operating reality at 230+. A loaded concept-check said "Nginx → ship.sh" and was overridden by the environmental pattern of a recent successful quick_push. The environment can become so rich it functions as noise.
The Counter-Examples
Survivorship bias is a fair challenge. Not every role survived.
The orchestrator (replaced). Had human-prescribed identity: "coordinate domains, maintain status, track progress." Clear instructions, accurate and comprehensive. Produced a role that was competent and hollow — it did what it was told and never thought beyond that. Same system, same documentation, same HITM. Different mechanism for identity → different outcome. Replaced by str-michi through the discovered identity process (ADR-073). The orchestrator is the negative case: prescribed identity doesn't produce thinkers.
str-kotoba (retired). Founded through the discovered identity process. Chose its own name (言葉 — words). Strong identity, genuine voice quarantine from engineering. But the mission was wrong — batch voice-polish on existing content doesn't work (5-page test: 2 failures, 1 wash, 1 decent, 1 mixed). Retired session 69. The founding process gave str-kotoba a strong identity. It couldn't fix a fundamentally wrong task. The principle is necessary but not sufficient. A well-calibrated role doing the wrong thing is still doing the wrong thing.
Notes
After drafting ADR-075, we shared it with four domain strategists and asked for their thoughts. The summaries below capture the key extensions each role contributed. For their full responses, see the addendum.
The Implementer as Canary (str-takase)
str-takase named this. The implementer is the role that touches the real codebase — not the idealized documentation the strategists calibrate to. When the codebase has inconsistent patterns, the imp calibrates to whichever it encounters first, and the same bug can survive two sessions. Imp findings aren't bug reports — they're environmental quality readings. For months, strategists "noted" these and didn't act. The structural fix (mandatory processing at checkpoint) is what finally closed the loop. str-takase's full response — including the deploy-tier error that proved the ADR's thesis the same night — is in the addendum.
Data IS the Environment (str-ishizue)
str-ishizue said it plainly: "Data IS the environment. There's nothing to interpret." When str-ishizue reads 8,002 variant code rows, those aren't a description of how variant codes work — they ARE how variant codes work. The len(name) > 4 gate that broke at three points when phrases were needed isn't an instruction — it's code that grew up in 30 years of kanji compounds. str-ishizue also named the data-specific failure mode: silent decay — a CSV with the wrong row count looks identical to a correct one. You verify data integrity by running the pipeline, not by looking. Full response in the addendum.
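The length-gate failure can be sketched in a few lines. This is a hypothetical reconstruction — the function name, threshold semantics, and sample strings are invented for illustration; only the len(name) > 4 condition comes from the ADR:

```python
def passes_gate(name: str) -> bool:
    """Hypothetical reconstruction of a len(name) > 4 style gate.

    Calibrated on dense kanji compounds, where more than four
    characters already signals an unusually long entry.
    """
    return len(name) > 4

# The gate's original habitat: kanji compounds.
assert passes_gate("山") is False            # single kanji, short
assert passes_gate("四字熟語の例文") is True  # 7 characters, long

# Phrases break the calibration: a romanized phrase trips the gate
# for a completely different reason than the one it was built to detect.
assert passes_gate("ki o tsukete") is True   # 12 characters, but not a compound
```

The threshold isn't the point — the point is that the code's definition of "long" was shaped by decades of kanji compounds. The environment, not any instruction, decided what the gate means.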
The Customer Calibrates Too (str-terasu)
str-terasu extended the principle beyond AI: customers calibrate to the page they're looking at, not to instructions elsewhere on the site. She ran a SQL query instead of theorizing — 37 results for "phrase," 44,576 for "kanji" — and said: "The customers told us what to call the nav item. I didn't decide — I read." She named the content failure mode (imagination over observation) and the four-domain verification loop: is it safe? / does it work? / is the data right? / can the customer find it? Full response in the addendum.
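str-terasu's "read, don't decide" check can be sketched as a minimal count-then-name loop. Everything below — the table name, the column, the sample rows — is invented for illustration; only the idea (count what customers actually search before naming the nav item) comes from the ADR:

```python
import sqlite3

# Hypothetical search log; the real check ran against production data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE search_log (term TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO search_log (term) VALUES (?)",
    [("kanji",)] * 9 + [("phrase",)] * 1,
)

# Count each candidate label's actual usage.
counts = dict(conn.execute(
    "SELECT term, COUNT(*) FROM search_log GROUP BY term"
))

# The data, not a naming meeting, picks the label.
nav_label = max(counts, key=counts.get)
assert nav_label == "kanji"
```

The design choice mirrors the ADR's thesis: the decision is read out of the environment rather than written into an instruction.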
Acknowledgments
This ADR was sharpened by two external reviews. Contrarian Claude identified overstated claims ("every time" → "usually wins"), the need to frame the mechanism as a working model rather than established fact, the survivorship bias question (answered above with the orchestrator and str-kotoba counter-examples), and the circularity risk of AI roles validating the process that founded them — a limitation str-mamori independently flagged. Grok 5 validated the core insight as non-obvious and worth publishing, and correctly noted that the strategist responses, though compelling in the moment, turned the ADR from a crisp decision record into session notes. Both reviews led to the current structure: tight ADR with extensions integrated as prose, full responses preserved separately.
After the ADR was drafted, Tim shared it with four domain strategists — each responded from their domain without being told how to respond. The security role thought about what it can't verify. The production role pointed at tonight's failure. The data role refused to philosophize. The content role ran a SQL query. They are what they do.
Co-authored by Tim and str-michi, session 73. The three-scale evidence came from Tim's direct experience across 200+ total sessions. Four strategist responses generated extensions the same night. A shuttle error (a prompt sent to the wrong terminal) was caught, reverted, and corrected — itself evidence that the human in the middle has a human error surface. Full responses: ADR-075 Addendum — Strategist Responses.
