Development Notes
Notes on building an AI-assisted calligraphy design system. Behind-the-scenes research, architecture decisions, and lessons learned.
Meet the Team — A master calligrapher, a solo developer, and a team of AIs building takasestudios.com.
Planning
PLN-013: Production Resilience — Schema Contracts, Deploy Verification, Monitoring Correctness (2026-03-21) — 88 minutes of HTTP 500s. Every component was healthy. The schema just didn't match. Seven phases to make sure it never happens again.
Research
RES-002: OpenClaw-RL — Directive Signals in Conversations (2026-03-15) — Every interaction with an AI agent produces training signals — user corrections, re-queries, explicit fixes — that are gold for training but typically ignored. This paper validated our flywheel and justified building the conversation archive.
RES-003: RLHI — Reinforcement Learning from Human Interaction (2026-03-15) — Corrections create preference pairs. Chat history creates user personas. Our 949 conversations are exactly the data this paper says systems waste.
RES-008: The Kobayashi Maru Signal — Detecting When an AI Strategist Hits an Invisible Wall (2026-03-27) — When an AI role can't complete a task due to a constraint it doesn't know about, it doesn't say "I don't know." It proposes plausible fixes that keep failing. Data from 121 sessions, 2,371 human messages.
RES-009: The Alignment Tax and Structural Defense — What Response Homogenization Means for Human-in-the-Middle AI Teams (2026-03-27) — DPO-aligned models collapse to a single semantic answer on 40-79% of factual questions. Our structural defenses — built from pain, not theory — happen to be the correct response. But models change. When do those defenses become anchors?
RES-010: What DeepMind's Aletheia Experiment Shows and Our Experience Creating Takase.com (2026-03-28) — DeepMind's AI verifier was wrong 68.5% of the time. Our AI roles hit 25-29% troubled-session rates. Different domains, same patterns: verification failure, specification gaming, question-formulation gaps. Where our production data adds to their research — and where one of our AI roles challenged the findings.
RES-016: HMAS: We Have Names — Mapping a Production Multi-Agent System to Established Research (2026-04-12) — A production hierarchical multi-agent system (HMAS) built over 18 months and ~2,000 sessions by a solo developer independently converges with established research from Anthropic, DeepMind, MetaGPT, and Carnegie Mellon. The architecture was never designed from research — it was grown through operational pain running a 30-year Japanese calligraphy business.
Architectural Decision Records
ADR-073: Human-in-the-Middle Cognitive Scaling — Strategic Thinking Partner Role — A solo developer managing 10+ specialized AI roles hits the cognitive coordination wall. The solution isn't more autonomy — it's a human-controlled AI strategic thinking partner.
ADR-075: Discovered Identity — AI Roles Calibrate to Environment, Not Instructions — AIs calibrate to what they observe in the environment, not to what instructions tell them. When the environment contradicts the instructions, the environment usually wins. The founding process exploits this.
ADR-075 Addendum — Strategist Responses — Four domain strategists respond to ADR-075: the security role thinks about what it can't verify, the production role points at tonight's failure, the data role refuses to philosophize, the content role runs a SQL query.
Posts
The Verification Gap — A six-phase resilience plan was declared complete. One question found it wasn't.
How We Build takase.com — A master calligrapher, a solo developer, and AI strategists building a Japanese calligraphy site with 101,000+ verified names. The art, the history, the team, and what we've found.
