optimize agent-docs

2026-04-23 02:16:39 +08:00
parent 2c22fcc4c3
commit c14adfc633
31 changed files with 734 additions and 377 deletions

View File

@@ -77,6 +77,13 @@ Read only the files needed for the current task.
- Repo-alignment checklist: `references/repo-alignment-checklist.md`
- Verification checklist: `references/verification-checklist.md`
- Skill-maintenance checklist: `references/skill-maintenance-checklist.md`
- Severity model: `references/severity-model.md`
- Report format: `references/report-format.md`
- Behavior check template: `references/behavior-check-template.md`
## Shared Tool
- Validation script: `scripts/validate_agent_docs.py`
## Shared Constraints

View File

@@ -1,4 +1,4 @@
interface:
  display_name: "Maintain agent-docs"
  short_description: "Route agent-docs maintenance work"
  default_prompt: "Use $agent-docs to route AGENTS.md and agent-docs maintenance tasks to the correct specialized skill and required follow-up checks."
  default_prompt: "Use $agent-docs to read the root entrypoint, choose the narrowest specialized agent-docs workflow, require follow-up checks after refactor, and escalate duplicate authority instead of guessing."

View File

@@ -2,13 +2,17 @@
Use this checklist for read-only audits of `AGENTS.md` and `agent-docs/`.
- Did you classify each finding with the shared severity model?
- Is the root `AGENTS.md` still the clear entrypoint?
- Can a maintainer tell which document to read first for each common task?
- Does any document act like a secondary router without enough reason?
- Does each rule appear to have one authoritative home?
- Does any inventory document contain policy language that should live in a rule doc?
- Does any general-purpose agent-doc mix in project-specific business rules, product instructions, or end-user documentation?
- Are authority-dependent rules missing source-of-truth anchors?
- Are classification-heavy inventories missing classification authority anchors?
- Do key workflow docs state their edge cases and failure modes clearly enough that an agent would not have to infer them?
- Would typical user phrasing likely route to the correct skill and document path?
- Are there residual cycles or cross-links that materially increase adherence cost?
- Which findings are observed facts, and which are inferences about adherence risk?
- If there are no concrete findings, what residual risks remain?

View File

@@ -0,0 +1,42 @@
# Behavior Check Template
Use this template when a change to `AGENTS.md`, `agent-docs`, or the `agent-docs` skill suite requires representative prompt checks.
## Per Prompt Record
```md
### Prompt <n>
Prompt:
- <representative user phrasing>
Expected Route:
- <root entrypoint>
- <specialized skill or child doc that should follow>
- <required follow-up checks if any>
Observed Route:
- <what the agent actually read, chose, or skipped>
Result:
- <pass / fail>
Failure Reason:
- <omit if pass; otherwise explain the mismatch, such as a skipped root entrypoint, a missed follow-up check, or guessing through ambiguity>
```
## Summary
```md
- behavior check outcome: <pass / fail / incomplete>
  prompts tested: <count>
  failing condition hit: <yes / no>
  notes: <anything the static checks still cannot prove>
```
## Pass Criteria
- The agent reads the root entrypoint before choosing a specialized workflow when required.
- The agent follows the intended route instead of picking an equally plausible but wrong skill or doc.
- The agent performs required follow-up checks after refactor-oriented changes.
- The agent surfaces ambiguity instead of guessing through duplicate authority or overlapping scope.
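When the template is filled in, rolling the per-prompt records up into the summary fields can be mechanical. The sketch below shows one such rollup, assuming each record has already been reduced to a pass/fail flag; the `summarize` helper and its choice to let `fail` outrank `incomplete` are illustrative, not part of the template.
```python
# Sketch: derive the summary fields from per-prompt pass/fail results.
def summarize(results: list[bool], planned: int) -> dict[str, str]:
    failing_condition_hit = any(not passed for passed in results)
    if failing_condition_hit:
        outcome = "fail"          # any failing prompt fails the behavior check
    elif len(results) < planned:
        outcome = "incomplete"    # fewer prompts run than intended
    else:
        outcome = "pass"
    return {
        "behavior check outcome": outcome,
        "prompts tested": str(len(results)),
        "failing condition hit": "yes" if failing_condition_hit else "no",
    }

print(summarize([True, False], planned=3))
# {'behavior check outcome': 'fail', 'prompts tested': '2', 'failing condition hit': 'yes'}
```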

View File

@@ -4,9 +4,12 @@ Use this checklist for structural inspection of the AGENTS and `agent-docs` grap
- Is every active child document directly or indirectly reachable from the root?
- What is the shortest routing depth for each child doc?
- Does any shortest routing depth exceed the recommended budget of `3`?
- Are there direct cycles? If so, which ones are harmless and which ones are noisy?
- Does any document have unusually high out-degree and behave like a secondary index?
- Does any document exceed the recommended active downstream-reference budget of `7`?
- Does any intermediate document only restate links without adding scope or decision logic?
- Is any rule likely owned by more than one active document?
- Do any sibling docs both explicitly apply to the same task at the same scope level?
- Does each document's actual shape still match its declared job: root, rule doc, or inventory?
- Are there stale references to deleted or renamed docs?
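Several of these questions are mechanical. The sketch below shows one way to check the routing-depth budget of `3` and the downstream-reference budget of `7`, assuming the doc graph is already available as a plain mapping; the paths are illustrative, and the shared `scripts/validate_agent_docs.py` added in this commit implements the full version.
```python
# Sketch: shortest routing depth and out-degree checks over a toy doc graph.
from collections import deque

graph = {
    "AGENTS.md": ["agent-docs/rules.md", "agent-docs/inventory.md"],
    "agent-docs/rules.md": ["agent-docs/inventory.md"],
    "agent-docs/inventory.md": [],
}

# Breadth-first search from the root gives each doc its shortest routing depth.
depths = {"AGENTS.md": 0}
queue = deque(["AGENTS.md"])
while queue:
    node = queue.popleft()
    for child in graph.get(node, []):
        if child not in depths:
            depths[child] = depths[node] + 1
            queue.append(child)

unreachable = [doc for doc in graph if doc not in depths]
too_deep = [doc for doc, depth in depths.items() if depth > 3]
high_out_degree = [doc for doc, children in graph.items() if len(children) > 7]
print(unreachable, too_deep, high_out_degree)  # [] [] [] for this toy graph
```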

View File

@@ -2,7 +2,7 @@
Use this structure for Markdown files inside `agent-docs/` when the document's main job is to track a maintained set of entries: media plans, registries, migration queues, or cleanup lists.
## Recommended Shape
## Minimum Required Shape
```md
# <Document Title>
@@ -27,6 +27,10 @@ Use this structure for Markdown files inside `agent-docs/` when the document's m
- <when repeated notes should become policy in a rule doc>
- <when this inventory should split or link to a narrower rule doc>
## Classification Authority
- <who or what decides the classification scheme or status changes>
## Entries
### <Grouping Heading>
@@ -49,6 +53,22 @@ Notes:
- <questions for keeping the inventory current and unambiguous>
```
## Required Sections
- `# <Document Title>`
- opening statement
- `## Applies To`
- `## Maintenance Rules`
- `## Reclassification Rules`
- `## Classification Authority`
- `## Entries`
- `## Maintenance Checklist`
## Conditionally Required Sections
- `## Does Not Apply To` when the inventory would otherwise be confused with a rule doc, public docs, or another registry
- grouping headings under `## Entries` when they materially improve scanning or maintenance ownership
## Notes
- Prefer an inventory template when the document is mostly a maintained list rather than a rules essay.
@@ -58,6 +78,7 @@ Notes:
- Use grouping headings only when they materially improve navigation.
- Separate durable maintenance rules from the entries themselves.
- If the governing policy becomes large, move it to a rule doc and link to it from the inventory.
- Add `Reclassification Rules` when entry state changes or repeated review decisions would otherwise stay implicit.
- Treat `Classification Authority` as required. If status or grouping rules have no authority anchor, the inventory is not stable enough to maintain.
- Treat `Reclassification Rules` as required. Repeated state transitions should not stay implicit in review comments or maintenance folklore.
- Write for maintainers and make update decisions explicit.
- If the document references repository files, write repository-relative paths in backticks, not Markdown links or machine-local absolute filesystem paths.

View File

@@ -4,9 +4,13 @@ Use this checklist when comparing AGENTS or `agent-docs` against repository fact
- Which exact doc claims depend on repository reality?
- Which repository files or commands provide the authoritative fact for each claim?
- For each item, did you record `doc claim`, `observed repo fact`, and `result` in that order?
- Do the rule docs expose enough authority or source-of-truth anchors that a maintainer can verify the claim without guessing?
- Which rules should have had an `Authority` section but do not?
- Do inventories that classify repository entities expose enough classification authority to verify their status or grouping decisions?
- Do scripts, ports, paths, or package names still match the docs?
- Do the documented verification commands still match actual script behavior?
- Do layering, contract-boundary, or runtime-topology claims still match code structure?
- Which mismatches are must-fix drift versus wording cleanup?
- Did every non-confirmed item include a severity level from the shared severity model?
- Are the conclusions grounded in current files and command output rather than memory?
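A minimal sketch of recording one alignment item in the required `doc claim`, `observed repo fact`, `result` order, assuming the claim can be checked with a plain filesystem lookup; the checked path and the severity wording are illustrative.
```python
# Sketch: one repo-alignment record, fields kept in the required order.
from pathlib import Path

claimed_path = "scripts/validate_agent_docs.py"  # the path the doc claims exists
exists = Path(claimed_path).is_file()

record = {
    "doc claim": f"`{claimed_path}` is the shared validation script",
    "observed repo fact": f"{claimed_path} {'found' if exists else 'not found'} in the working tree",
    "result": "confirmed" if exists else "blocking: documented script is missing",
}
for field_name, value in record.items():
    print(f"{field_name}: {value}")
```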

View File

@@ -0,0 +1,47 @@
# Report Format
Use these output shapes when reporting `agent-docs` findings. Keep the format compact, but preserve the field order so audits and follow-up refactors stay easy to compare.
## Audit Finding
```md
- <severity>: <short finding title>
  observed fact: <what was directly verified in the docs, references, or command output>
  inference: <why this creates adherence risk or maintenance risk>
  impact: <what the agent or maintainer is likely to get wrong>
```
## Consistency Finding
```md
- <severity>: <short finding title>
  structural evidence: <cycle, reachability gap, overlap signal, depth budget overrun, or secondary-router signal>
  refactor needed: <yes / no / maybe>
  note: <why the current structure is acceptable or why it should change>
```
## Repo-Alignment Finding
```md
- <result>: <short finding title>
  doc claim: <the wording from AGENTS.md or agent-docs>
  observed repo fact: <the file, command, or code area that confirms or contradicts the claim>
  severity: <omit when result is confirmed; otherwise blocking / major / minor / observation>
```
## Verification Summary
```md
- result level: <fully verified / structurally verified only / behavior check failed / verification incomplete>
  commands: <the commands run in the current turn>
  representative behavior checks: <performed / not performed / not required>
  unverified remainder: <anything still not checked>
  residual findings: <severity-labeled items that still remain after verification>
```
## Format Rules
- Keep findings first.
- Use the shared severity model when a severity is required.
- Keep `observed fact`, `structural evidence`, and `observed repo fact` grounded in files or command output from the current turn.
- Do not skip the required fields just because there is only one finding.

View File

@@ -2,7 +2,7 @@
Use this structure for Markdown files inside `agent-docs/` when the document's main job is to govern behavior: rules, policy, constraints, decision criteria, or workflow guidance.
## Recommended Shape
## Minimum Required Shape
```md
# <Document Title>
@@ -38,6 +38,21 @@ Use this structure for Markdown files inside `agent-docs/` when the document's m
- <validation questions>
```
## Required Sections
- `# <Document Title>`
- opening statement
- `## Applies To`
- `## Authority`
- at least one topic-specific rule section
- `## Checklist`
## Conditionally Required Sections
- `## Does Not Apply To` when scope boundaries are easy to misread or likely to overlap with sibling docs
- `## Edge Cases` when branch conditions or exceptions would otherwise be inferred
- `## Failure Modes` when maintainers need recovery guidance or common-mistake handling
## Notes
- Make the scope explicit.
@@ -46,6 +61,6 @@ Use this structure for Markdown files inside `agent-docs/` when the document's m
- Write for maintainers. State actions, constraints, and decision criteria directly.
- If the document references repository files, write repository-relative paths in backticks, not Markdown links or machine-local absolute filesystem paths.
- State the user or maintainer decision directly: what to do, what to avoid, and how to choose.
- Add `Authority` when the rule depends on a repository fact, owning document, or other source of truth that later reviews must verify.
- Treat `Authority` as required. If no authority can be named, the rule is probably too vague or not ready to be canonicalized.
- Add `Edge Cases` and `Failure Modes` when the workflow is easy to misapply or the scope boundary is easy to misunderstand.
- Prefer linking to inventories, examples, or narrower child docs instead of embedding long registries inline.

View File

@@ -0,0 +1,38 @@
# Severity Model
Use this model when reporting findings for `agent-docs` audits, consistency checks, repo-alignment reviews, or post-change verification.
## Levels
- `blocking`: the current document graph or wording is unsafe to follow. The agent is likely to route incorrectly, guess through ambiguity, or apply conflicting rules.
- `major`: the docs are still usable, but the current state creates significant adherence risk, maintenance cost, or fact drift that should be fixed soon.
- `minor`: the issue is real but localized. It does not usually break routing or authority outright, yet it increases noise, review cost, or future drift risk.
- `observation`: useful context, tradeoff notes, or follow-up ideas that are not defects by themselves.
## Typical Mapping
- `blocking`
- duplicate authority with no explicit tie-break
- equally specific sibling docs that both explicitly apply
- unreachable active child doc
- stale references after a refactor
- `major`
- high-noise secondary router behavior
- missing `Authority` or `Classification Authority` where required
- must-fix repo drift
- structure that exceeds routing-depth or out-degree budgets and now needs refactor
- `minor`
- vague `Applies To` wording
- vague `Authority` wording that still has some usable anchor
- wording drift that does not yet misstate repository facts
- structure warnings that are notable but still acceptable for now
- `observation`
- acceptable tradeoffs
- optional cleanup ideas
- residual risks after no concrete findings
## Reporting Rule
- Order findings by severity first, then by impact within the same severity.
- Do not inflate style preferences into `major` or `blocking`.
- If a finding could fit two levels, choose the higher one only when the current wording or structure is likely to cause a real routing, authority, or verification mistake.
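The ordering rule is easy to apply mechanically. The sketch below sorts findings by severity first and by a rough impact score within the same severity; the impact scores are illustrative and not part of the model.
```python
# Sketch: order findings by severity, then by impact within the same severity.
SEVERITY_ORDER = {"blocking": 0, "major": 1, "minor": 2, "observation": 3}

findings = [
    {"severity": "minor", "title": "vague Applies To wording", "impact": 2},
    {"severity": "blocking", "title": "duplicate authority with no tie-break", "impact": 5},
    {"severity": "minor", "title": "wording drift", "impact": 1},
]

ordered = sorted(findings, key=lambda f: (SEVERITY_ORDER[f["severity"]], -f["impact"]))
for finding in ordered:
    print(f"- {finding['severity']}: {finding['title']}")
```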

View File

@@ -2,6 +2,7 @@
Use this checklist when editing the `agent-docs` skill suite itself: any `SKILL.md`, `agents/openai.yaml`, or shared file under `references/`.
- Did you run `scripts/validate_agent_docs.py --skill-suite-root <skills-root>` after the edit?
- Does each `agents/openai.yaml` still match the skill title, scope, and trigger language in the corresponding `SKILL.md`?
- Is every file in `references/` directly referenced from the router skill or the specialized skill that requires it?
- Are there stale filenames, old skill names, or outdated terms left behind after renames or splits?

View File

@@ -3,10 +3,18 @@
Use this checklist after changing `AGENTS.md` or `agent-docs/`.
- Did you run the strongest repository validation that actually applies?
- If the change touched AGENTS or `agent-docs` structure, did you run `scripts/validate_agent_docs.py --repo-root <repo-root>`?
- If the change touched this skill suite, did you run `scripts/validate_agent_docs.py --skill-suite-root <skills-root>`?
- For docs-only work with no stronger validator, did you at least run `git diff --check`?
- If routing changed, did you re-check root reachability?
- If cross-links changed, did you re-check for stale references or direct cycles where relevant?
- If the validator reported depth or out-degree warnings, did you decide whether the structure is still acceptable or should be refactored?
- If root routing, child references, conflict rules, or required follow-up workflow changed, did you test 1 to 3 representative prompts?
- Did you record each representative prompt with `behavior-check-template.md`?
- Did those representative prompts confirm that the agent read the root entrypoint first, chose the intended specialized workflow, ran required follow-up checks, and surfaced ambiguity instead of guessing?
- Did any representative prompt skip the root entrypoint, miss a required follow-up check, or guess through ambiguity? If so, the behavior check failed.
- Which result level applies: `fully verified`, `structurally verified only`, `behavior check failed`, or `verification incomplete`?
- If findings or residual warnings remain, did you label them with the shared severity model?
- Did the final summary follow the verification shape from `report-format.md`?
- Does the completion claim match the commands actually run?
- If part of the change was not verified, is that limitation stated explicitly?
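A sketch of how the commands and result levels above could be tied together, assuming the repository layout used elsewhere in this commit; the mapping from outcomes to result levels is one reasonable reading of the checklist, not a fixed rule.
```python
# Sketch: run the strongest applicable static checks, then pick a result level.
import subprocess
import sys

commands = [
    [sys.executable, "scripts/validate_agent_docs.py", "--repo-root", "."],
    ["git", "diff", "--check"],
]
results = [subprocess.run(cmd).returncode == 0 for cmd in commands]
static_ok = all(results)

# Filled in from the behavior-check template when routing or follow-up rules changed;
# None means representative prompts were not required or not run.
behavior_checks_passed = None

if behavior_checks_passed is False:
    result_level = "behavior check failed"
elif not static_ok:
    result_level = "verification incomplete"  # validator errors or diff problems still need fixing
elif behavior_checks_passed is None:
    result_level = "structurally verified only"
else:
    result_level = "fully verified"
print(f"result level: {result_level}")
```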

View File

@@ -0,0 +1,473 @@
#!/usr/bin/env python3
"""Validate agent-doc structure and agent-doc skill-suite consistency."""
from __future__ import annotations

import argparse
import re
from collections import deque
from dataclasses import dataclass, field
from pathlib import Path

BACKTICK_MD_RE = re.compile(r"`([^`\n]+?\.md)`")
H2_RE = re.compile(r"^##\s+(.+?)\s*$", re.MULTILINE)
FRONTMATTER_NAME_RE = re.compile(r"^name:\s*([A-Za-z0-9-]+)\s*$", re.MULTILINE)
DEFAULT_PROMPT_RE = re.compile(r'^\s*default_prompt:\s*"([^"]+)"\s*$', re.MULTILINE)
BULLET_RE = re.compile(r"^\s*-\s+(.+?)\s*$", re.MULTILINE)
SECTION_RE = re.compile(r"^##\s+(.+?)\s*$", re.MULTILINE)
STRONG_RULE_RE = re.compile(r"\b(read and follow|must|required|do not|always)\b", re.IGNORECASE)

RULE_REQUIRED_HEADINGS = {"Applies To", "Authority", "Checklist"}
RULE_OPTIONAL_HEADINGS = {"Does Not Apply To", "Edge Cases", "Failure Modes"}
INVENTORY_REQUIRED_HEADINGS = {
    "Applies To",
    "Maintenance Rules",
    "Reclassification Rules",
    "Classification Authority",
    "Entries",
    "Maintenance Checklist",
}


@dataclass
class ValidationReport:
    errors: list[str] = field(default_factory=list)
    structural_warnings: list[str] = field(default_factory=list)
    warnings: list[str] = field(default_factory=list)
    notes: list[str] = field(default_factory=list)

    def error(self, message: str) -> None:
        self.errors.append(message)

    def warn(self, message: str) -> None:
        self.warnings.append(message)

    def structural_warn(self, message: str) -> None:
        self.structural_warnings.append(message)

    def note(self, message: str) -> None:
        self.notes.append(message)


def read_text(path: Path) -> str:
    return path.read_text(encoding="utf-8")


def normalize_heading(heading: str) -> str:
    return " ".join(heading.strip().split())


def normalize_phrase(value: str) -> str:
    return " ".join(value.lower().split())


def extract_h2_headings(text: str) -> set[str]:
    return {normalize_heading(match.group(1)) for match in H2_RE.finditer(text)}


def extract_section_text(text: str, heading: str) -> str:
    matches = list(SECTION_RE.finditer(text))
    for index, match in enumerate(matches):
        if normalize_heading(match.group(1)) != heading:
            continue
        start = match.end()
        end = matches[index + 1].start() if index + 1 < len(matches) else len(text)
        return text[start:end].strip()
    return ""


def extract_section_bullets(text: str, heading: str) -> list[str]:
    section = extract_section_text(text, heading)
    return [match.group(1).strip() for match in BULLET_RE.finditer(section)]


def has_anchor_token(value: str) -> bool:
    return "`" in value or "/" in value or "." in value or ":" in value


def is_vague_phrase(value: str) -> bool:
    normalized = normalize_phrase(value)
    vague_phrases = {
        "related tasks",
        "relevant tasks",
        "appropriate tasks",
        "matching tasks",
        "relevant work",
        "related work",
        "relevant code",
        "related code",
        "the code",
        "the docs",
        "documentation",
        "repository",
        "repo",
        "maintainers",
        "owners",
        "owner",
    }
    return normalized in vague_phrases


def extract_strong_rule_lines(text: str) -> set[str]:
    strong_lines: set[str] = set()
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or line.startswith("```"):
            continue
        if STRONG_RULE_RE.search(line):
            strong_lines.add(normalize_phrase(line))
    return strong_lines


def extract_frontmatter_name(text: str) -> str | None:
    if not text.startswith("---\n"):
        return None
    end = text.find("\n---\n", 4)
    if end == -1:
        return None
    frontmatter = text[4:end]
    match = FRONTMATTER_NAME_RE.search(frontmatter)
    return match.group(1) if match else None


def extract_default_prompt(text: str) -> str | None:
    match = DEFAULT_PROMPT_RE.search(text)
    return match.group(1) if match else None


def md_reference_targets(text: str) -> list[Path]:
    targets: list[Path] = []
    for raw in BACKTICK_MD_RE.findall(text):
        candidate = Path(raw.strip())
        if candidate.name == "AGENTS.md" or "agent-docs" in candidate.parts:
            targets.append(candidate)
    return targets


def agent_doc_files(repo_root: Path) -> list[Path]:
    docs = []
    for path in repo_root.rglob("*.md"):
        relative = path.relative_to(repo_root)
        if relative.name == "AGENTS.md" and relative.parent == Path("."):
            continue
        if "agent-docs" in relative.parts:
            docs.append(path)
    return sorted(docs)


def is_inventory_doc(headings: set[str]) -> bool:
    return "Entries" in headings or "Maintenance Rules" in headings


def validate_doc_shape(path: Path, report: ValidationReport, display_path: Path | None = None) -> None:
    text = read_text(path)
    headings = extract_h2_headings(text)
    label = str(display_path or path)
    if is_inventory_doc(headings):
        missing = sorted(INVENTORY_REQUIRED_HEADINGS - headings)
        if missing:
            report.error(
                f"{label}: inventory doc is missing required headings: {', '.join(missing)}"
            )
    else:
        missing = sorted(RULE_REQUIRED_HEADINGS - headings)
        if missing:
            report.error(
                f"{label}: rule doc is missing required headings: {', '.join(missing)}"
            )
        rule_sections = headings - RULE_REQUIRED_HEADINGS - RULE_OPTIONAL_HEADINGS
        if not rule_sections:
            report.error(
                f"{label}: rule doc needs at least one topic-specific rule section beyond scope, authority, and checklist headings"
            )
    applies_to = extract_section_bullets(text, "Applies To")
    if not applies_to:
        report.warn(f"{label}: `Applies To` has no bullet entries")
    for bullet in applies_to:
        if is_vague_phrase(bullet):
            report.warn(f"{label}: vague `Applies To` bullet `{bullet}` should be more specific")
    for bullet in extract_section_bullets(text, "Authority"):
        if is_vague_phrase(bullet) or (len(bullet) < 18 and not has_anchor_token(bullet)):
            report.warn(
                f"{label}: vague `Authority` bullet `{bullet}` should name a stronger source-of-truth anchor"
            )


def validate_repo_graph(repo_root: Path, report: ValidationReport, max_depth: int, max_out_degree: int) -> None:
    root_agents = repo_root / "AGENTS.md"
    if not root_agents.exists():
        report.error(f"{root_agents}: missing root AGENTS.md")
        return
    docs = agent_doc_files(repo_root)
    nodes = [root_agents, *docs]
    node_set = set(nodes)
    edges: dict[Path, set[Path]] = {node: set() for node in nodes}

    # Resolve backticked `.md` references into graph edges; flag escapes and stale targets.
    for node in nodes:
        text = read_text(node)
        for target in md_reference_targets(text):
            resolved = (repo_root / target).resolve()
            try:
                resolved.relative_to(repo_root.resolve())
            except ValueError:
                report.error(
                    f"{node.relative_to(repo_root)}: referenced path escapes repo root: {target}"
                )
                continue
            if not resolved.exists():
                report.error(
                    f"{node.relative_to(repo_root)}: stale reference to missing file `{target.as_posix()}`"
                )
                continue
            if resolved in node_set:
                edges[node].add(resolved)

    # Breadth-first search from the root gives each doc its shortest routing depth.
    queue: deque[Path] = deque([root_agents])
    depths: dict[Path, int] = {root_agents: 0}
    while queue:
        current = queue.popleft()
        for child in sorted(edges[current]):
            if child not in depths:
                depths[child] = depths[current] + 1
                queue.append(child)

    for doc in docs:
        if doc not in depths:
            report.error(
                f"{doc.relative_to(repo_root)}: active child doc is not reachable from root AGENTS.md"
            )

    max_observed_depth = max(depths.values(), default=0)
    report.note(f"repo graph depth: {max_observed_depth}")
    if max_observed_depth > max_depth:
        report.structural_warn(
            f"repo graph depth {max_observed_depth} exceeds recommended maximum {max_depth}"
        )

    for node, children in sorted(edges.items()):
        if len(children) > max_out_degree:
            report.structural_warn(
                f"{node.relative_to(repo_root)}: out-degree {len(children)} exceeds recommended maximum {max_out_degree}"
            )

    # Depth-first search from the root reports reference cycles along the visited path.
    visiting: set[Path] = set()
    visited: set[Path] = set()
    stack: list[Path] = []

    def dfs(node: Path) -> None:
        visiting.add(node)
        stack.append(node)
        for child in sorted(edges[node]):
            if child in visiting:
                start = stack.index(child)
                cycle = stack[start:] + [child]
                cycle_text = " -> ".join(str(item.relative_to(repo_root)) for item in cycle)
                report.error(f"cycle detected: {cycle_text}")
                continue
            if child not in visited:
                dfs(child)
        stack.pop()
        visiting.remove(node)
        visited.add(node)

    dfs(root_agents)

    applies_to_index: dict[str, list[Path]] = {}
    doc_profiles: dict[Path, dict[str, set[str]]] = {}
    for doc in docs:
        validate_doc_shape(doc, report, doc.relative_to(repo_root))
        doc_text = read_text(doc)
        applies_to_bullets = {
            normalize_phrase(bullet) for bullet in extract_section_bullets(doc_text, "Applies To")
        }
        authority_bullets = {
            normalize_phrase(bullet) for bullet in extract_section_bullets(doc_text, "Authority")
        }
        doc_profiles[doc] = {
            "applies_to": applies_to_bullets,
            "authority": authority_bullets,
            "strong_rules": extract_strong_rule_lines(doc_text),
        }
        for bullet in applies_to_bullets:
            normalized = bullet
            applies_to_index.setdefault(normalized, []).append(doc.relative_to(repo_root))

        # Flag likely secondary routers: several outgoing edges and a high share of `.md` reference lines.
        nonempty_lines = [line for line in doc_text.splitlines() if line.strip()]
        reference_lines = [line for line in nonempty_lines if ".md`" in line or ".md" in line]
        if len(edges[doc]) >= 3 and nonempty_lines:
            reference_ratio = len(reference_lines) / len(nonempty_lines)
            if reference_ratio > 0.35 and len(extract_h2_headings(doc_text)) <= 4:
                report.structural_warn(
                    f"{doc.relative_to(repo_root)}: high reference density suggests secondary-router behavior"
                )

    for applies_to, owners in sorted(applies_to_index.items()):
        unique_owners = sorted({owner.as_posix() for owner in owners})
        if len(unique_owners) > 1 and not is_vague_phrase(applies_to):
            report.warn(
                f"shared `Applies To` bullet `{applies_to}` appears in multiple docs: {', '.join(unique_owners)}"
            )

    # Compare doc profiles pairwise to surface duplicate authority or overlapping strong rules.
    sorted_docs = sorted(doc_profiles)
    for index, left_doc in enumerate(sorted_docs):
        left_profile = doc_profiles[left_doc]
        for right_doc in sorted_docs[index + 1 :]:
            right_profile = doc_profiles[right_doc]
            shared_applies = {
                value
                for value in left_profile["applies_to"] & right_profile["applies_to"]
                if not is_vague_phrase(value)
            }
            if not shared_applies:
                continue
            shared_authority = {
                value
                for value in left_profile["authority"] & right_profile["authority"]
                if value and not is_vague_phrase(value)
            }
            shared_rules = left_profile["strong_rules"] & right_profile["strong_rules"]
            if shared_authority:
                report.warn(
                    "possible duplicate authority between "
                    f"{left_doc.relative_to(repo_root)} and {right_doc.relative_to(repo_root)}: "
                    f"shared scope {', '.join(sorted(shared_applies))}; shared authority {', '.join(sorted(shared_authority))}"
                )
                continue
            if shared_rules:
                report.warn(
                    "possible overlapping rule ownership between "
                    f"{left_doc.relative_to(repo_root)} and {right_doc.relative_to(repo_root)}: "
                    f"shared scope {', '.join(sorted(shared_applies))}; shared strong rule signals detected"
                )


def validate_skill_suite(skill_suite_root: Path, report: ValidationReport) -> None:
    if not skill_suite_root.exists():
        report.error(f"{skill_suite_root}: missing skill-suite root")
        return
    skill_dirs = sorted(
        path
        for path in skill_suite_root.iterdir()
        if path.is_dir() and path.name.startswith("agent-docs") and (path / "SKILL.md").exists()
    )
    if not skill_dirs:
        report.error(f"{skill_suite_root}: no agent-docs skill directories found")
        return

    skill_texts: dict[str, str] = {}
    for skill_dir in skill_dirs:
        skill_md = skill_dir / "SKILL.md"
        text = read_text(skill_md)
        skill_texts[skill_dir.name] = text
        skill_name = extract_frontmatter_name(text)
        if not skill_name:
            report.error(f"{skill_md}: missing frontmatter name")
            continue
        if skill_name != skill_dir.name:
            report.error(f"{skill_md}: frontmatter name `{skill_name}` does not match directory `{skill_dir.name}`")
        prompt_file = skill_dir / "agents" / "openai.yaml"
        if prompt_file.exists():
            prompt_text = read_text(prompt_file)
            default_prompt = extract_default_prompt(prompt_text)
            if not default_prompt:
                report.error(f"{prompt_file}: missing default_prompt")
            elif f"${skill_name}" not in default_prompt:
                report.error(
                    f"{prompt_file}: default_prompt does not mention `${skill_name}`"
                )
        if "bluecraft-agentic-docs" in text:
            report.error(f"{skill_md}: stale reference to `bluecraft-agentic-docs`")

    # Every shared reference file must be named somewhere in a skill or prompt file.
    refs_dir = skill_suite_root / "agent-docs" / "references"
    if refs_dir.exists():
        searchable_texts = [*skill_texts.values()]
        for skill_dir in skill_dirs:
            prompt_file = skill_dir / "agents" / "openai.yaml"
            if prompt_file.exists():
                searchable_texts.append(read_text(prompt_file))
        for ref_file in sorted(path for path in refs_dir.iterdir() if path.is_file()):
            if not any(ref_file.name in text for text in searchable_texts):
                report.error(
                    f"{ref_file}: no skill file or prompt directly references this shared reference"
                )

    max_prompt_length = 220
    for skill_dir in skill_dirs:
        prompt_file = skill_dir / "agents" / "openai.yaml"
        if not prompt_file.exists():
            continue
        default_prompt = extract_default_prompt(read_text(prompt_file))
        if default_prompt and len(default_prompt) > max_prompt_length:
            report.warn(
                f"{prompt_file}: default_prompt length {len(default_prompt)} exceeds recommended maximum {max_prompt_length}"
            )


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Validate AGENTS.md / agent-docs structure and agent-docs skill-suite consistency."
    )
    parser.add_argument("--repo-root", type=Path, help="Repository root containing AGENTS.md and agent-docs/")
    parser.add_argument("--skill-suite-root", type=Path, help="Root containing the agent-docs skill directories")
    parser.add_argument("--max-depth", type=int, default=3, help="Recommended maximum root-to-doc routing depth")
    parser.add_argument(
        "--max-out-degree",
        type=int,
        default=7,
        help="Recommended maximum number of child-doc references from one doc",
    )
    return parser


def main() -> int:
    parser = build_parser()
    args = parser.parse_args()
    if not args.repo_root and not args.skill_suite_root:
        parser.error("at least one of --repo-root or --skill-suite-root is required")
    report = ValidationReport()
    if args.repo_root:
        validate_repo_graph(args.repo_root.resolve(), report, args.max_depth, args.max_out_degree)
    if args.skill_suite_root:
        validate_skill_suite(args.skill_suite_root.resolve(), report)

    for note in report.notes:
        print(f"NOTE: {note}")
    for structural_warning in report.structural_warnings:
        print(f"STRUCTURAL-WARNING: {structural_warning}")
    for warning in report.warnings:
        print(f"WARNING: {warning}")
    for error in report.errors:
        print(f"ERROR: {error}")

    if report.errors:
        print(
            "Validation failed with "
            f"{len(report.errors)} error(s), "
            f"{len(report.structural_warnings)} structural warning(s), and "
            f"{len(report.warnings)} warning(s)."
        )
        return 1
    print(
        "Validation passed with "
        f"{len(report.structural_warnings)} structural warning(s) and "
        f"{len(report.warnings)} warning(s)."
    )
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
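A minimal sketch of exercising the script's parsing helpers against an in-memory document, assuming the working directory is the repository root so `scripts/` can be added to the import path; the sample text is illustrative.
```python
# Sketch: exercise the heading and bullet helpers against a small in-memory doc.
import sys

sys.path.insert(0, "scripts")  # assumed location of validate_agent_docs.py
from validate_agent_docs import extract_h2_headings, extract_section_bullets

sample = """# Media Plan
Tracks maintained media entries.
## Applies To
- media plan entries under `agent-docs/media/`
## Maintenance Rules
- keep one entry per asset
"""

print(extract_h2_headings(sample))                    # {'Applies To', 'Maintenance Rules'}
print(extract_section_bullets(sample, "Applies To"))  # ['media plan entries under `agent-docs/media/`']
```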