X-PRO.ai compiles a project profile into a library of plain-Markdown guardrails an AI coding agent reads while it builds. The same global catalog produces a throwaway-PoC ruleset or a regulated-system ruleset depending on the answers — and every choice to include, defer, or drop a practice is written down with its reason.
AWS & Azure Well-Architected, Google SRE, 12-factor, DORA — they describe what great systems do. Applied literally to a two-week internal tool, they are wasteful. Applied loosely to a payment platform, they are dangerous. The hard part is not knowing the practices. It is deciding how much of each a given project warrants, and being honest about what you chose to skip.
Multi-region, circuit breakers, full tactical DDD and GitOps on a PoC that lives for two weeks. The rigor is real, the cost is real, and none of it moves the project's actual risk.
A payment surface shipped with home-grown auth, secrets in the repo, and no tested restore — because "best practices" were treated as an optional backlog item.
Treat calibration as a first-class, auditable computation. You profile the project once; a Tier is derived; the Tier modulates each practice from Required down to Discarded; and the trade-offs are recorded rather than left implicit. It is a trade-off engine — not a "best of all worlds" template.
The catalog is the data; your answers select from it; the artifacts are the output. Change the catalog and every project that upgrades inherits the improvement. Change the answers and only that project's artifacts move.
The generator is deterministic: same answers + same catalog version = byte-identical output. The lock file records an answers_digest; if an answer changes, the digest diverges and the artifacts are flagged stale. That property is the prerequisite for auditing and versioning the output.
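A minimal sketch of how a staleness check of this kind could work. The hash algorithm (SHA-256), the canonical JSON encoding, and the function names `answers_digest` and `is_stale` are assumptions for illustration; only the `answers_digest` field name comes from the text above.

```python
import hashlib
import json


def answers_digest(answers: dict) -> str:
    """Hash the answers in a canonical form so key order never changes the digest.

    Assumption: canonical JSON + SHA-256. The real generator may encode differently.
    """
    canonical = json.dumps(
        answers, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_stale(lock: dict, answers: dict) -> bool:
    """Artifacts are stale when the recorded digest no longer matches the answers."""
    return lock.get("answers_digest") != answers_digest(answers)
```

Sorting keys before hashing is what makes the digest depend on the content of the answers rather than on the order they were written in.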
Criticality sums five questions (impact, sensitivity, blast radius, SLA, reversibility). Complexity sums four (domain, integrations, data, distribution). Each pair of bands maps to a base Tier through the matrix, and the override flags then ratchet it up. The interactive calculator runs the real tier-engine.yaml logic.
Criticality is the dominant axis. Complexity alone never reaches T3 — it caps at T1 when criticality is low and T2 when it is medium. T3 is reached only by C-High × K-High, by C-Critical at any complexity, or by an override flag. Complexity raises the floor; it does not set the ceiling.
| | K-Low | K-Med | K-High |
|---|---|---|---|
| C-Critical | T3 | T3 | T3 |
| C-High | T2 | T2 | T3 |
| C-Med | T1 | T2 | T2 |
| C-Low | T0 | T1 | T1 |
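The matrix can be read as a plain lookup table. The sketch below transcribes the cells shown above; the names `BASE_TIER` and `base_tier` are hypothetical, and the score thresholds that turn a sum into a band are not given here, so the function takes band labels directly.

```python
# Base Tier by criticality band (rows) and complexity band (columns),
# transcribed from the matrix above. Names are illustrative only.
BASE_TIER = {
    "C-Critical": {"K-Low": 3, "K-Med": 3, "K-High": 3},
    "C-High":     {"K-Low": 2, "K-Med": 2, "K-High": 3},
    "C-Med":      {"K-Low": 1, "K-Med": 2, "K-High": 2},
    "C-Low":      {"K-Low": 0, "K-Med": 1, "K-High": 1},
}


def base_tier(c_band: str, k_band: str) -> str:
    """Return the base Tier label (before overrides) for a pair of bands."""
    return f"T{BASE_TIER[c_band][k_band]}"


print(base_tier("C-High", "K-Med"))  # T2
```

Scanning a row confirms the claim about the dominant axis: no cell in the C-Low or C-Med rows reaches T3, whatever the complexity band.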
T0: Throwaway / PoC. Heavy rigor does not leak in: only the "must decide" basics plus the secrets ratchet.
T1: Internal, low-criticality. Critical path tested, structured logs, daily backup, single decision authority.
T2: Product. Modular monolith, RED metrics, integration tests, multi-AZ, IaC and ADR governance.
T3: Critical / regulated. Circuit breakers, tracing + SLOs, multi-region, zero-trust, continuous PITR, canary.
Overrides only ever raise the Tier, never lower it. The result is a global Tier plus a per-dimension Tier. A cost-conscious product can run at T2 globally while its security dimension is ratcheted to T3 by a payment flag.
| Condition | Effect | Why it can only raise |
|---|---|---|
| Data is regulated | global floor T3 | Regulatory exposure is non-negotiable; the deadline does not change the law. |
| Data is PII / financial | security ≥ T2 | Sensitivity is a property of the data, not of the budget. |
| Payment / PCI flag | security ≥ T3 | Card data carries a fixed minimum bar of controls. |
| Life-safety impact | global floor T3 | Failure can hurt people; rigor cannot be traded away. |
| Secret in code | forbidden — any tier | A hard ratchet: APP-09 fires even at the T0 floor. |
When required rigor exceeds declared capacity — a 99.9% SLA against a "tight deadline / small team" constraint — it is flagged as an accepted risk in TRADE-OFFS.md. Generation proceeds. The point is a conscious decision, not a hard stop.
Each of reliability, security, performance, cost, operational, sustainability resolves to the maximum of the global Tier and any applicable override. The ratchet is monotonic — there is no path that lowers a dimension below its floor.
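The ratchet can be sketched as a series of `max` operations, which is what makes it monotonic. Only `payment_pci` is a flag name taken from this page; `regulated`, `life_safety`, and `pii_financial` are hypothetical names for the other rows of the table, and the "secret in code" rule is left out because it forbids a practice rather than setting a Tier floor.

```python
from typing import Dict, Set, Tuple

DIMENSIONS = (
    "reliability", "security", "performance",
    "cost", "operational", "sustainability",
)


def effective_tiers(base: int, flags: Set[str]) -> Tuple[int, Dict[str, int]]:
    """Apply override flags to a base Tier (0-3). Every step uses max(),
    so an override can raise a Tier but never lower it."""
    global_tier = base
    if flags & {"regulated", "life_safety"}:  # hypothetical flag names
        global_tier = max(global_tier, 3)

    security_floor = 0
    if "pii_financial" in flags:              # hypothetical flag name
        security_floor = max(security_floor, 2)
    if "payment_pci" in flags:
        security_floor = max(security_floor, 3)

    per_dimension = {d: global_tier for d in DIMENSIONS}
    per_dimension["security"] = max(global_tier, security_floor)
    return global_tier, per_dimension


global_tier, dims = effective_tiers(2, {"payment_pci"})
print(global_tier, dims["security"], dims["cost"])  # 2 3 2
```

The usage line reproduces the cost-conscious product described above: T2 globally, with only the security dimension raised to T3.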
This is the core of stage 3. For every one of the 40 practices, the generator reads the answer, takes the effective tier of the practice's dimension, and resolves a status. Two ideas matter: required_if makes a practice mandatory because of what the project is; the Tier then scales the rigor of everything else.
Calibration locked the two refinements above with golden fixtures: deferrable stops fast_mvp from downgrading cheap hygiene (logs, modular monolith, basic CI/CD), and required_if catches answer-driven mandates the Tier alone would miss.
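One way the resolution order could look, as a sketch rather than the generator's actual code. The `Practice` shape, the `status_by_tier` mapping, and treating `fast_mvp` as a boolean answer are all assumptions; the text above only establishes that `required_if` forces Required and that `deferrable` controls whether `fast_mvp` may downgrade a practice.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Practice:
    id: str
    status_by_tier: Dict[int, str]          # hypothetical: Tier -> baseline status
    deferrable: bool = False
    required_if: Callable[[dict], bool] = lambda answers: False


def resolve_status(practice: Practice, tier: int, answers: dict) -> str:
    # 1. Answer-driven mandate wins over everything else.
    if practice.required_if(answers):
        return "Required"
    # 2. Otherwise the effective Tier of the practice's dimension sets the baseline.
    status = practice.status_by_tier[tier]
    # 3. fast_mvp may defer a Recommended practice, but only if it is deferrable.
    if answers.get("fast_mvp") and practice.deferrable and status == "Recommended":
        return "Deferred"
    return status


answers = {"fast_mvp": True, "data_class": "financial"}   # hypothetical answer keys
auth = Practice("APP-07", {2: "Recommended"},
                required_if=lambda a: a.get("data_class") == "financial")
resilience = Practice("APP-05", {2: "Recommended"}, deferrable=False)
extensibility = Practice("APP-10", {2: "Recommended"}, deferrable=True)

for p in (auth, resilience, extensibility):
    print(p.id, resolve_status(p, 2, answers))
```

Checking `required_if` first means a mandate can never be deferred, and the `deferrable` guard keeps cheap hygiene at Recommended even under `fast_mvp`.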
A single record carries its directive (Do / Don't / Example / Verification). The Do/Don't compile into AI-AGENT-RULES.md; the Verification compiles into DEFINITION-OF-DONE.md; a trade_off becomes an entry in TRADE-OFFS.md. One record, several destinations.
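The fan-out from one record to several files might be sketched like this. The record field names (`id`, `directive`, `do`, `dont`, `verification`, `trade_off`) and the line formats are assumptions; the three target filenames are the ones named above.

```python
from typing import Dict, List


def route_record(record: dict) -> Dict[str, List[str]]:
    """Split one practice record into the lines each artifact would receive."""
    directive = record["directive"]
    outputs: Dict[str, List[str]] = {
        "AI-AGENT-RULES.md": [
            f"- DO: {directive['do']}",
            f"- DON'T: {directive['dont']}",
        ],
        "DEFINITION-OF-DONE.md": [f"- [ ] {directive['verification']}"],
        "TRADE-OFFS.md": [],
    }
    # A trade_off is optional; only records that carry one reach TRADE-OFFS.md.
    trade_off = record.get("trade_off")
    if trade_off:
        outputs["TRADE-OFFS.md"].append(f"- {record['id']}: {trade_off}")
    return outputs
```

Keeping the routing in one function means a catalog record is written once and every artifact derived from it stays consistent.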
Mostly does not become code — it becomes constraints the other layers inherit, carried in a propagates line.
Presentation, domain modeling, communication, consistency, resilience, observability, auth, testing, config, extensibility.
Classification, model, volume, integrity, access, retention, privacy, lineage, backup/RPO, movement.
Hosting, compute, availability, RTO/DR, scalability, security posture, exposure, IaC, CI/CD, cost.
Point Claude Code, Cursor, Copilot, or any generic LLM agent at AI-AGENT-RULES.md — it flattens the Required and Recommended directives into imperative rules the agent follows while building. Filenames stay version-free; every file declares its version internally.
An internal expense-approval platform: ~2,000 employees, integrates an ERP and a payment gateway, built by a small team under a tight deadline.
Scores: C = 8 (C-High), K = 4 (K-Med) → base T2. The payment_pci flag ratchets security to T3. The required execution rigor exceeds the declared team/deadline capacity — recorded as an accepted risk, not a blocker.
The example is generated by the same generator from t2-expense.yaml, so it stays in lock-step with the catalog.
APP-07 auth → Required (financial override) · APP-05 resilience → Recommended (deferrable=false, survives fast_mvp) · APP-10 extensibility → Deferred (deferrable=true). Each of the 8 Deferred items appears in TRADE-OFFS.md with a reactivation trigger.
Contributions are welcome in any of these areas: catalog practices, tier-engine tuning, calibration fixtures, the generator, docs, or a real-world case study to harden the framework against. Send a note and I'll reply directly.