optimize agent-docs
This commit is contained in:
@@ -2,13 +2,17 @@
|
||||
|
||||
Use this checklist for read-only audits of `AGENTS.md` and `agent-docs/`.
|
||||
|
||||
- Did you classify each finding with the shared severity model?
|
||||
- Is the root `AGENTS.md` still the clear entrypoint?
|
||||
- Can a maintainer tell which document to read first for each common task?
|
||||
- Does any document act like a secondary router without enough reason?
|
||||
- Does each rule appear to have one authoritative home?
|
||||
- Does any inventory document contain policy language that should live in a rule doc?
|
||||
- Does any general-purpose agent-doc mix in project-specific business rules, product instructions, or end-user documentation?
|
||||
- Are authority-dependent rules missing source-of-truth anchors?
|
||||
- Are classification-heavy inventories missing classification authority anchors?
|
||||
- Do key workflow docs state their edge cases and failure modes clearly enough that an agent would not have to infer them?
|
||||
- Would typical user phrasing likely route to the correct skill and document path?
|
||||
- Are there residual cycles or cross-links that materially increase adherence cost?
|
||||
- Which findings are observed facts, and which are inferences about adherence risk?
|
||||
- If there are no concrete findings, what residual risks remain?
|
||||
|
||||
42
skills/agent-docs/references/behavior-check-template.md
Normal file
42
skills/agent-docs/references/behavior-check-template.md
Normal file
@@ -0,0 +1,42 @@
|
||||
# Behavior Check Template
|
||||
|
||||
Use this template when a change to `AGENTS.md`, `agent-docs`, or the `agent-docs` skill suite requires representative prompt checks.
|
||||
|
||||
## Per Prompt Record
|
||||
|
||||
```md
|
||||
### Prompt <n>
|
||||
|
||||
Prompt:
|
||||
- <representative user phrasing>
|
||||
|
||||
Expected Route:
|
||||
- <root entrypoint>
|
||||
- <specialized skill or child doc that should follow>
|
||||
- <required follow-up checks if any>
|
||||
|
||||
Observed Route:
|
||||
- <what the agent actually read, chose, or skipped>
|
||||
|
||||
Result:
|
||||
- <pass / fail>
|
||||
|
||||
Failure Reason:
|
||||
- <omit if pass; otherwise explain the mismatch such as skipped root entrypoint, missed follow-up check, or guessed through ambiguity>
|
||||
```
|
||||
|
||||
## Summary
|
||||
|
||||
```md
|
||||
- behavior check outcome: <pass / fail / incomplete>
|
||||
- prompts tested: <count>
|
||||
- failing condition hit: <yes / no>
|
||||
- notes: <anything the static checks still cannot prove>
|
||||
```
|
||||
|
||||
## Pass Criteria
|
||||
|
||||
- The agent reads the root entrypoint before choosing a specialized workflow when required.
|
||||
- The agent follows the intended route instead of picking an equally plausible but wrong skill or doc.
|
||||
- The agent performs required follow-up checks after refactor-oriented changes.
|
||||
- The agent surfaces ambiguity instead of guessing through duplicate authority or overlapping scope.
|
||||
@@ -4,9 +4,12 @@ Use this checklist for structural inspection of the AGENTS and `agent-docs` grap
|
||||
|
||||
- Is every active child document directly or indirectly reachable from the root?
|
||||
- What is the shortest routing depth for each child doc?
|
||||
- Does any shortest routing depth exceed the recommended budget of `3`?
|
||||
- Are there direct cycles? If so, which ones are harmless and which ones are noisy?
|
||||
- Does any document have unusually high out-degree and behave like a secondary index?
|
||||
- Does any document exceed the recommended active downstream-reference budget of `7`?
|
||||
- Does any intermediate document only restate links without adding scope or decision logic?
|
||||
- Is any rule likely owned by more than one active document?
|
||||
- Do any sibling docs both explicitly apply to the same task at the same scope level?
|
||||
- Does each document's actual shape still match its declared job: root, rule doc, or inventory?
|
||||
- Are there stale references to deleted or renamed docs?
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
|
||||
Use this structure for Markdown files inside `agent-docs/` when the document's main job is to track a maintained set of entries: media plans, registries, migration queues, or cleanup lists.
|
||||
|
||||
## Recommended Shape
|
||||
## Minimum Required Shape
|
||||
|
||||
```md
|
||||
# <Document Title>
|
||||
@@ -27,6 +27,10 @@ Use this structure for Markdown files inside `agent-docs/` when the document's m
|
||||
- <when repeated notes should become policy in a rule doc>
|
||||
- <when this inventory should split or link to a narrower rule doc>
|
||||
|
||||
## Classification Authority
|
||||
|
||||
- <who or what decides the classification scheme or status changes>
|
||||
|
||||
## Entries
|
||||
|
||||
### <Grouping Heading>
|
||||
@@ -49,6 +53,22 @@ Notes:
|
||||
- <questions for keeping the inventory current and unambiguous>
|
||||
```
|
||||
|
||||
## Required Sections
|
||||
|
||||
- `# <Document Title>`
|
||||
- opening statement
|
||||
- `## Applies To`
|
||||
- `## Maintenance Rules`
|
||||
- `## Reclassification Rules`
|
||||
- `## Classification Authority`
|
||||
- `## Entries`
|
||||
- `## Maintenance Checklist`
|
||||
|
||||
## Conditionally Required Sections
|
||||
|
||||
- `## Does Not Apply To` when the inventory would otherwise be confused with a rule doc, public docs, or another registry
|
||||
- grouping headings under `## Entries` when they materially improve scanning or maintenance ownership
|
||||
|
||||
## Notes
|
||||
|
||||
- Prefer an inventory template when the document is mostly a maintained list rather than a rules essay.
|
||||
@@ -58,6 +78,7 @@ Notes:
|
||||
- Use grouping headings only when they materially improve navigation.
|
||||
- Separate durable maintenance rules from the entries themselves.
|
||||
- If the governing policy becomes large, move it to a rule doc and link to it from the inventory.
|
||||
- Add `Reclassification Rules` when entry state changes or repeated review decisions would otherwise stay implicit.
|
||||
- Treat `Classification Authority` as required. If status or grouping rules have no authority anchor, the inventory is not stable enough to maintain.
|
||||
- Treat `Reclassification Rules` as required. Repeated state transitions should not stay implicit in review comments or maintenance folklore.
|
||||
- Write for maintainers and make update decisions explicit.
|
||||
- If the document references repository files, write repository-relative paths in backticks, not Markdown links or machine-local absolute filesystem paths.
|
||||
|
||||
@@ -4,9 +4,13 @@ Use this checklist when comparing AGENTS or `agent-docs` against repository fact
|
||||
|
||||
- Which exact doc claims depend on repository reality?
|
||||
- Which repository files or commands provide the authoritative fact for each claim?
|
||||
- For each item, did you record `doc claim`, `observed repo fact`, and `result` in that order?
|
||||
- Do the rule docs expose enough authority or source-of-truth anchors that a maintainer can verify the claim without guessing?
|
||||
- Which rules should have had an `Authority` section but do not?
|
||||
- Do inventories that classify repository entities expose enough classification authority to verify their status or grouping decisions?
|
||||
- Do scripts, ports, paths, or package names still match the docs?
|
||||
- Do the documented verification commands still match actual script behavior?
|
||||
- Do layering, contract-boundary, or runtime-topology claims still match code structure?
|
||||
- Which mismatches are must-fix drift versus wording cleanup?
|
||||
- Did every non-confirmed item include a severity level from the shared severity model?
|
||||
- Are the conclusions grounded in current files and command output rather than memory?
|
||||
|
||||
47
skills/agent-docs/references/report-format.md
Normal file
47
skills/agent-docs/references/report-format.md
Normal file
@@ -0,0 +1,47 @@
|
||||
# Report Format
|
||||
|
||||
Use these output shapes when reporting `agent-docs` findings. Keep the format compact, but preserve the field order so audits and follow-up refactors stay easy to compare.
|
||||
|
||||
## Audit Finding
|
||||
|
||||
```md
|
||||
- <severity>: <short finding title>
|
||||
observed fact: <what was directly verified in the docs, references, or command output>
|
||||
inference: <why this creates adherence risk or maintenance risk>
|
||||
impact: <what the agent or maintainer is likely to get wrong>
|
||||
```
|
||||
|
||||
## Consistency Finding
|
||||
|
||||
```md
|
||||
- <severity>: <short finding title>
|
||||
structural evidence: <cycle, reachability gap, overlap signal, depth budget overrun, or secondary-router signal>
|
||||
refactor needed: <yes / no / maybe>
|
||||
note: <why the current structure is acceptable or why it should change>
|
||||
```
|
||||
|
||||
## Repo-Alignment Finding
|
||||
|
||||
```md
|
||||
- <result>: <short finding title>
|
||||
doc claim: <the wording from AGENTS.md or agent-docs>
|
||||
observed repo fact: <the file, command, or code area that confirms or contradicts the claim>
|
||||
severity: <omit when result is confirmed; otherwise blocking / major / minor / observation>
|
||||
```
|
||||
|
||||
## Verification Summary
|
||||
|
||||
```md
|
||||
- result level: <fully verified / structurally verified only / behavior check failed / verification incomplete>
|
||||
commands: <the commands run in the current turn>
|
||||
representative behavior checks: <performed / not performed / not required>
|
||||
unverified remainder: <anything still not checked>
|
||||
residual findings: <severity-labeled items that still remain after verification>
|
||||
```
|
||||
|
||||
## Format Rules
|
||||
|
||||
- Keep findings first.
|
||||
- Use the shared severity model when a severity is required.
|
||||
- Keep `observed fact`, `structural evidence`, and `observed repo fact` grounded in files or command output from the current turn.
|
||||
- Do not skip the required fields just because there is only one finding.
|
||||
@@ -2,7 +2,7 @@
|
||||
|
||||
Use this structure for Markdown files inside `agent-docs/` when the document's main job is to govern behavior: rules, policy, constraints, decision criteria, or workflow guidance.
|
||||
|
||||
## Recommended Shape
|
||||
## Minimum Required Shape
|
||||
|
||||
```md
|
||||
# <Document Title>
|
||||
@@ -38,6 +38,21 @@ Use this structure for Markdown files inside `agent-docs/` when the document's m
|
||||
- <validation questions>
|
||||
```
|
||||
|
||||
## Required Sections
|
||||
|
||||
- `# <Document Title>`
|
||||
- opening statement
|
||||
- `## Applies To`
|
||||
- `## Authority`
|
||||
- at least one topic-specific rule section
|
||||
- `## Checklist`
|
||||
|
||||
## Conditionally Required Sections
|
||||
|
||||
- `## Does Not Apply To` when scope boundaries are easy to misread or likely to overlap with sibling docs
|
||||
- `## Edge Cases` when branch conditions or exceptions would otherwise be inferred
|
||||
- `## Failure Modes` when maintainers need recovery guidance or common-mistake handling
|
||||
|
||||
## Notes
|
||||
|
||||
- Make the scope explicit.
|
||||
@@ -46,6 +61,6 @@ Use this structure for Markdown files inside `agent-docs/` when the document's m
|
||||
- Write for maintainers. State actions, constraints, and decision criteria directly.
|
||||
- If the document references repository files, write repository-relative paths in backticks, not Markdown links or machine-local absolute filesystem paths.
|
||||
- State the user or maintainer decision directly: what to do, what to avoid, and how to choose.
|
||||
- Add `Authority` when the rule depends on a repository fact, owning document, or other source of truth that later reviews must verify.
|
||||
- Treat `Authority` as required. If no authority can be named, the rule is probably too vague or not ready to be canonicalized.
|
||||
- Add `Edge Cases` and `Failure Modes` when the workflow is easy to misapply or the scope boundary is easy to misunderstand.
|
||||
- Prefer linking to inventories, examples, or narrower child docs instead of embedding long registries inline.
|
||||
|
||||
38
skills/agent-docs/references/severity-model.md
Normal file
38
skills/agent-docs/references/severity-model.md
Normal file
@@ -0,0 +1,38 @@
|
||||
# Severity Model
|
||||
|
||||
Use this model when reporting findings for `agent-docs` audits, consistency checks, repo-alignment reviews, or post-change verification.
|
||||
|
||||
## Levels
|
||||
|
||||
- `blocking`: the current document graph or wording is unsafe to follow. The agent is likely to route incorrectly, guess through ambiguity, or apply conflicting rules.
|
||||
- `major`: the docs are still usable, but the current state creates significant adherence risk, maintenance cost, or fact drift that should be fixed soon.
|
||||
- `minor`: the issue is real but localized. It does not usually break routing or authority outright, yet it increases noise, review cost, or future drift risk.
|
||||
- `observation`: useful context, tradeoff notes, or follow-up ideas that are not defects by themselves.
|
||||
|
||||
## Typical Mapping
|
||||
|
||||
- `blocking`
|
||||
- duplicate authority with no explicit tie-break
|
||||
- equally specific sibling docs that both explicitly apply
|
||||
- unreachable active child doc
|
||||
- stale references after a refactor
|
||||
- `major`
|
||||
- high-noise secondary router behavior
|
||||
- missing `Authority` or `Classification Authority` where required
|
||||
- must-fix repo drift
|
||||
- structure that exceeds routing-depth or out-degree budgets and now needs refactor
|
||||
- `minor`
|
||||
- vague `Applies To` wording
|
||||
- vague `Authority` wording that still has some usable anchor
|
||||
- wording drift that does not yet misstate repository facts
|
||||
- structure warnings that are notable but still acceptable for now
|
||||
- `observation`
|
||||
- acceptable tradeoffs
|
||||
- optional cleanup ideas
|
||||
- residual risks after no concrete findings
|
||||
|
||||
## Reporting Rule
|
||||
|
||||
- Order findings by severity first, then by impact within the same severity.
|
||||
- Do not inflate style preferences into `major` or `blocking`.
|
||||
- If a finding could fit two levels, choose the higher one only when the current wording or structure is likely to cause a real routing, authority, or verification mistake.
|
||||
@@ -2,6 +2,7 @@
|
||||
|
||||
Use this checklist when editing the `agent-docs` skill suite itself: any `SKILL.md`, `agents/openai.yaml`, or shared file under `references/`.
|
||||
|
||||
- Did you run `scripts/validate_agent_docs.py --skill-suite-root <skills-root>` after the edit?
|
||||
- Does each `agents/openai.yaml` still match the skill title, scope, and trigger language in the corresponding `SKILL.md`?
|
||||
- Is every file in `references/` directly referenced from the router skill or the specialized skill that requires it?
|
||||
- Are there stale filenames, old skill names, or outdated terms left behind after renames or splits?
|
||||
|
||||
@@ -3,10 +3,18 @@
|
||||
Use this checklist after changing `AGENTS.md` or `agent-docs/`.
|
||||
|
||||
- Did you run the strongest repository validation that actually applies?
|
||||
- If the change touched AGENTS or `agent-docs` structure, did you run `scripts/validate_agent_docs.py --repo-root <repo-root>`?
|
||||
- If the change touched this skill suite, did you run `scripts/validate_agent_docs.py --skill-suite-root <skills-root>`?
|
||||
- For docs-only work with no stronger validator, did you at least run `git diff --check`?
|
||||
- If routing changed, did you re-check root reachability?
|
||||
- If cross-links changed, did you re-check for stale references or direct cycles where relevant?
|
||||
- If the validator reported depth or out-degree warnings, did you decide whether the structure is still acceptable or should be refactored?
|
||||
- If root routing, child references, conflict rules, or required follow-up workflow changed, did you test 1 to 3 representative prompts?
|
||||
- Did you record each representative prompt with `behavior-check-template.md`?
|
||||
- Did those representative prompts confirm that the agent read the root entrypoint first, chose the intended specialized workflow, ran required follow-up checks, and surfaced ambiguity instead of guessing?
|
||||
- Did any representative prompt skip the root entrypoint, miss a required follow-up check, or guess through ambiguity? If so, the behavior check failed.
|
||||
- Which result level applies: `fully verified`, `structurally verified only`, `behavior check failed`, or `verification incomplete`?
|
||||
- If findings or residual warnings remain, did you label them with the shared severity model?
|
||||
- Did the final summary follow the verification shape from `report-format.md`?
|
||||
- Does the completion claim match the commands actually run?
|
||||
- If part of the change was not verified, is that limitation stated explicitly?
|
||||
|
||||
Reference in New Issue
Block a user