substrate · draft · p1

Spec system as a living, agent-maintained surface

substrate-spec-system-living · updated 2026-05-27T00:00:00Z · owner rritz

Why

The spec system today is a one-way pipe: humans (and occasionally
agents) author YAML; bin/build-spec.py renders HTML; bin/sync-nav.py
propagates nav; humans read the result. Three things are missing that
would make the system *living* — self-updating as work happens, and
agent-extensible:

1. No ordering signal in the data. The spec catalog (spec/index.html)
groups by category and filters by status. There's no machine-readable
answer to "what's the next thing to do." The hand-authored
spec/roadmap.html is a curated ordering, but it drifts the moment
specs change — there's no automatic regeneration.

2. No agent-friendly path to author a new spec. Creating a spec
today means knowing the schema, the file path convention
(docs/spec/data/<id>.yaml), the EARS-form acceptance criteria
pattern, the determiner tagged-union shape, the [STUB] convention,
and the changelog format. An agent that notices a gap can describe
it but can't easily *commit* it to the spec system without a human
translating.

3. No "next undone task" query. bin/spec-pickup.py (planned, p1)
renders a single spec as an agent brief — useful once you know which
spec. The missing piece is "across all specs, what's the highest-
priority unstarted task in tier N?"

This spec closes all three gaps with four pieces: a tier: field on
every spec, a generator that builds roadmap.html from tier:,
bin/spec-new.py to scaffold a draft spec from a template, and
bin/spec-pickup.py --next-todo to surface the next task to work on.
End state: when an agent does work, the spec system reflects it; when
an agent notices a gap, it can land a draft spec as a normal commit;
when anyone asks "what's next," the spec system answers.

User stories

As an agent picking up work in a fresh session, I want one command that tells me the next unstarted task in the launch sequence so that I'm not deciding ordering by guesswork on every session.

As a human asking "what's the state of things", I want a roadmap view that automatically stays in sync with the spec data so that I'm not reading a hand-authored page that drifted a week ago.

As an agent that spotted a missing spec (undocumented decision, scraper gap, etc.), I want a command that scaffolds a draft spec YAML for me to fill in so that I can land the spec as a normal commit instead of asking a human to translate.

As a contributor reading the spec system for the first time, I want an ordering signal that tells me which specs land first vs last so that I can target my contributions at unblocking the next tier rather than at work buried in tier 7.

Acceptance criteria (EARS)

When `_schema.json` is updated to add a `tier:` field, the system shall accept the field as an enum of named tiers (e.g., `tier_0_foundation`, `tier_1_host`, …, `tier_8_substrate_v1x`) and reject any spec whose tier is not in the enum.
When `bin/build-spec.py` runs, the system shall regenerate `docs/spec/roadmap.html` from the YAML data — grouping specs by `tier:`, ordering rows within each tier by a deterministic rule (priority then id), and reading per-tier rationale from a sibling `docs/spec/data/_roadmap-meta.yaml` file.
When a spec's `tier:` changes in YAML, the next `bin/build-spec.py` run shall move that spec to its new tier in `roadmap.html` without manual edits to the roadmap page.
When `bin/spec-new.py <id>` is run from the repo root, the system shall scaffold a draft YAML at `docs/spec/data/<id>.yaml` matching the schema, with `id`, `title` (prompted), `category` (prompted from enum), `status: draft`, `priority: p1`, `owner: null`, stub `why`, one stub acceptance criterion, a `manual` determiner with a single stub checklist item, and an empty changelog.
When `bin/spec-pickup.py --next-todo [--tier=<tier>] [--priority=<p0|p1>]` is run, the system shall print the next undone task across all specs matching the filters, ordered first by tier, then priority, then spec id, then task id.
While the system supports `tier:`, the hand-authored `docs/spec/roadmap.html` from 2026-05-27 shall be deleted and replaced with the generated file, with the same visual structure (tier headings, priority/status pills, source-roster ordering).
Where a spec has no `tier:` field set (legacy/unsorted), the system shall surface it under a `tier_unsorted` bucket in the generated roadmap rather than silently dropping it, so that unsorted specs stay visible.
When the contributor workflow is documented, `https://docs.usejosh.com/operations/spec-workflow/` shall describe both `bin/spec-new.py` and the `bin/spec-pickup.py --next-todo` query patterns as supported flows.

Success determiner

kind: manual

Checklist

_schema.json includes `tier:` as an enum of named tiers; bin/build-spec.py --validate rejects an out-of-enum value
All 67+ specs have a `tier:` assigned; no spec sits in `tier_unsorted`
docs/spec/roadmap.html is generated by bin/build-spec.py; the file is gitignored OR regenerated cleanly in CI (no manual edits)
bin/spec-new.py crs_reports_followup creates docs/spec/data/crs_reports_followup.yaml that passes bin/build-spec.py --validate
bin/spec-pickup.py --next-todo prints a specific (spec_id, task_id, task_text) line
https://docs.usejosh.com/operations/spec-workflow/ documents both bin/spec-new.py and the --next-todo query

The determiner becomes a `bash` kind once the scripts land. At that point it runs: `bin/build-spec.py --validate && bin/spec-pickup.py --next-todo && bin/spec-new.py --check-only sample-id`. It stays `manual` while the spec is `draft` so that it doesn't false-fail before the implementation lands.

Clarifications needed

Tier values: per-spec `tier:` field, or a separate `_roadmap.yaml` that lists specs per tier? Lean per-spec — keeps the answer to 'where does this spec live in the roadmap' inside the spec itself. The separate `_roadmap-meta.yaml` is only for tier *rationale text*, not tier *membership*.
Should `bin/spec-new.py` open the new file in $EDITOR after scaffolding, or just print the path? Lean: print the path so the script is composable; user runs `$EDITOR $(bin/spec-new.py …)` if they want to edit.
`bin/spec-pickup.py --next-todo` ordering: should `status: blocked` specs be skipped (the task is blocked) or surfaced (operator should know)? Lean: skip blocked by default, opt in with `--include-blocked`.
Does `tier:` become a required field, or stay optional with a `tier_unsorted` fallback? Lean: optional during migration, required after the backfill.

Out of scope

Auto-detecting when a task is done (e.g., from git history). Task state stays a human/agent edit to the YAML; automating closure would create false positives we can't easily undo.
Per-task acceptance criteria (each spec's `tasks` list is an internal checkpoint, not a contract). The schema's `acceptance_criteria` field is still the load-bearing contract.
Multi-author concurrency on the same spec YAML. The spec-edit flow stays one-author-at-a-time; collisions are git's job.
A web UI for spec authoring. The in-browser editor on `spec/index.html` already covers casual edits; `bin/spec-new.py` is for command-line / agent flows.

Dependencies

Plan

## Phase A — schema + tier field (one PR)

1. Add tier: to _schema.json as an enum. Initial values match the
hand-authored roadmap: tier_0_foundation, tier_1_host,
tier_2_eval_gates, tier_3_reingest, tier_4_launch_corpus,
tier_5_surface, tier_6_launch, tier_7_post_launch_sources,
tier_8_substrate_v1x. Plus a tier_unsorted for migration safety.
2. Backfill tier: across all existing specs in a single commit.
Reference the current hand-authored roadmap for the mapping.
3. Verify bin/build-spec.py --validate passes with the new field.
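
The enum check from step 1 can be sketched in plain Python. This is
only an illustration of the rule, not the actual _schema.json wiring:
TIERS and check_tier are hypothetical names, and treating a missing
tier: as tier_unsorted assumes the field stays optional during the
migration.

```python
# Hypothetical sketch of the tier enum check; names are illustrative only.
TIERS = [
    "tier_0_foundation",
    "tier_1_host",
    "tier_2_eval_gates",
    "tier_3_reingest",
    "tier_4_launch_corpus",
    "tier_5_surface",
    "tier_6_launch",
    "tier_7_post_launch_sources",
    "tier_8_substrate_v1x",
    "tier_unsorted",  # migration-safety bucket
]


def check_tier(spec: dict) -> str:
    """Return the spec's tier, or raise ValueError if it is not in the enum.

    Assumption: a spec with no tier key falls back to tier_unsorted.
    """
    tier = spec.get("tier", "tier_unsorted")
    if tier not in TIERS:
        spec_id = spec.get("id", "<unknown>")
        raise ValueError(f"{spec_id}: tier {tier!r} is not in the enum")
    return tier
```

Keeping tier_unsorted inside the enum means a legacy spec validates
without edits, while a typo in a tier name still fails loudly.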

## Phase B — generator (one PR)

1. Extend bin/build-spec.py to emit roadmap.html alongside the
per-spec pages. Read per-tier rationale from
docs/spec/data/_roadmap-meta.yaml (one entry per tier id with
title and why fields).
2. Within each tier, sort by (priority asc, id asc).
3. Match the visual structure of the current hand-authored roadmap
(tier headings, priority/status pills, source-roster ordering
enforced by a roster_order: field where ordering matters).
4. Delete docs/spec/roadmap.html from the tracked tree; regenerate
on every build.
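
The grouping and ordering in steps 1 to 3 might look roughly like
this. It is a sketch under assumptions: group_for_roadmap is a
hypothetical helper, priorities are assumed to sort as strings
(p0 before p1), and letting roster_order take precedence over
(priority, id) is one possible reading of "where ordering matters".

```python
from collections import defaultdict

TIER_ORDER = [
    "tier_0_foundation", "tier_1_host", "tier_2_eval_gates",
    "tier_3_reingest", "tier_4_launch_corpus", "tier_5_surface",
    "tier_6_launch", "tier_7_post_launch_sources",
    "tier_8_substrate_v1x", "tier_unsorted",
]


def group_for_roadmap(specs):
    """Group spec dicts by tier and sort the rows inside each tier."""
    buckets = defaultdict(list)
    for spec in specs:
        tier = spec.get("tier")
        if tier not in TIER_ORDER:
            # Missing (or unrecognized) tiers stay visible in tier_unsorted.
            tier = "tier_unsorted"
        buckets[tier].append(spec)

    def row_key(spec):
        # Assumption: rows with roster_order come first, in roster order;
        # everything else falls back to (priority, id).
        roster = spec.get("roster_order")
        return (roster is None, roster or 0, spec["priority"], spec["id"])

    # Emit tiers in roadmap order, skipping tiers that have no specs.
    return {
        tier: sorted(buckets[tier], key=row_key)
        for tier in TIER_ORDER
        if tier in buckets
    }
```

Because the sort key is a plain tuple, two builds over the same YAML
produce the same row order, which is what makes the generated page
safe to diff in CI.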

## Phase C — bin/spec-new.py (one PR)

1. CLI: bin/spec-new.py <id> [--category=<cat>] [--priority=<p>].
2. Reads _templates/source.yaml if category=source, else a new
_templates/general.yaml for the substrate/surface/launch case.
3. Writes docs/spec/data/<id>.yaml with stubs.
4. Prints the absolute path to stdout so it composes with $EDITOR.
5. Validates the new file against the schema before exit.
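
A minimal sketch of the scaffolding step (steps 3 and 4), assuming a
string template instead of the real _templates/ files. The YAML key
names (acceptance_criteria aside, which this spec mentions), the
scaffold_spec helper, and the refusal to overwrite an existing file
are all assumptions made for illustration; schema validation (step 5)
is left out.

```python
from pathlib import Path

# Hypothetical template; the real key names come from _schema.json.
TEMPLATE = """\
id: {id}
title: "{title}"
category: {category}
status: draft
priority: {priority}
owner: null
why: "[STUB] Why this spec exists."
acceptance_criteria:
  - "[STUB] When <trigger>, the system shall <response>."
determiner:
  kind: manual
  checklist:
    - "[STUB] Describe how to verify this spec."
changelog: []
"""


def scaffold_spec(root, spec_id, title, category, priority="p1"):
    """Write a draft spec under root/docs/spec/data; return its absolute path."""
    target = Path(root) / "docs" / "spec" / "data" / f"{spec_id}.yaml"
    if target.exists():
        # Assumption: never clobber an existing spec.
        raise FileExistsError(f"refusing to overwrite {target}")
    target.parent.mkdir(parents=True, exist_ok=True)
    # Note: a title containing double quotes would need escaping here.
    body = TEMPLATE.format(
        id=spec_id, title=title, category=category, priority=priority
    )
    target.write_text(body, encoding="utf-8")
    return target.resolve()
```

Returning the path rather than opening an editor keeps the helper
composable, matching the lean recorded under "Clarifications needed".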

## Phase D — spec-pickup --next-todo (one PR)

This is an extension of bin/spec-pickup.py (planned, p1), not a new
script. It builds on whatever shape that spec lands in.

1. Add --next-todo flag.
2. Walk all spec YAMLs, find tasks with done: false.
3. Filter by spec status: skip verified and shipped specs by
default; skip blocked specs unless --include-blocked is passed.
4. Order by (tier asc, priority asc, spec_id asc, task_id asc).
5. Print the first match as: <spec_id> <task_id>: <task_text>.
6. Print N matches with --max=N.
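
The walk, filter, and ordering steps above can be sketched as follows.
This assumes specs are already loaded as dicts; next_todos and
format_todo are hypothetical names, and the task keys id and text
are assumptions (only done: false appears in this spec). Task ids are
compared by numeric suffix so that t2 sorts before t10.

```python
import re

TIER_ORDER = [
    "tier_0_foundation", "tier_1_host", "tier_2_eval_gates",
    "tier_3_reingest", "tier_4_launch_corpus", "tier_5_surface",
    "tier_6_launch", "tier_7_post_launch_sources",
    "tier_8_substrate_v1x", "tier_unsorted",
]
TIER_RANK = {tier: rank for rank, tier in enumerate(TIER_ORDER)}


def _task_key(task_id):
    # Compare by trailing number when present, then by the raw id.
    match = re.search(r"(\d+)$", task_id)
    return (int(match.group(1)) if match else 0, task_id)


def next_todos(specs, tier=None, priority=None, include_blocked=False, limit=1):
    """Return up to `limit` (spec_id, task_id, task_text) tuples in pickup order."""
    skip = {"verified", "shipped"}
    if not include_blocked:
        skip.add("blocked")
    rows = []
    for spec in specs:
        spec_tier = spec.get("tier") or "tier_unsorted"
        if spec.get("status") in skip:
            continue
        if tier is not None and spec_tier != tier:
            continue
        if priority is not None and spec["priority"] != priority:
            continue
        for task in spec.get("tasks", []):
            if not task.get("done", False):
                sort_key = (
                    TIER_RANK.get(spec_tier, len(TIER_ORDER)),
                    spec["priority"],
                    spec["id"],
                    _task_key(task["id"]),
                )
                rows.append((sort_key, (spec["id"], task["id"], task["text"])))
    rows.sort(key=lambda row: row[0])
    return [todo for _, todo in rows[:limit]]


def format_todo(todo):
    """Render one result as '<spec_id> <task_id>: <task_text>'."""
    spec_id, task_id, text = todo
    return f"{spec_id} {task_id}: {text}"
```

The limit parameter plays the role of --max=N; the default of 1
matches step 5, which prints only the first match.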

## Phase E — workflow doc (one PR)

Update https://docs.usejosh.com/operations/spec-workflow/ with two new sections:
"Creating a new spec from the command line" and "Finding the next
task to work on." Reference both scripts.

## Backfill mapping (Phase A)

Mapping is derivable from the hand-authored docs/spec/roadmap.html
(the one created 2026-05-27). Specifically:
tier_0_foundation: 21 specs (all of substrate-* shipped/verified + ci-foundation + ingester-modularity-pass + conventions-refactor + embedding-* shipped/verified)
tier_1_host: 8 specs (substrate-bare-metal-host, substrate-volume-mount, substrate-nightly-backup, substrate-litestream-backup, substrate-observability-defaults, substrate-tombstone-policy, substrate-cron-scheduler, substrate-source-defaults)
tier_2_eval_gates: 2 (substrate-embedding-evaluation, substrate-retrieval-eval-per-source)
tier_3_reingest: 2 (legislators-and-committees-ingester, crs-reports-ingester)
tier_4_launch_corpus: 5 (federal-register, bills, us-code, public-laws, roll-call-votes — with roster_order set)
tier_5_surface: 11 (cli-* + rest-* + mcp-server + oss-startup-scripts)
tier_6_launch: 6 (dataset-card-and-vision-doc, agents-md-cross-tool-instructions, josh-code-quality-skill, josh-pr-review-skill, spec-pickup, docs-site-migration)
tier_7_post_launch_sources: 12 (committee-reports through topic-taxonomy)
tier_8_substrate_v1x: 1 (embedding-snapshot-distribution)
Plus this spec (substrate-spec-system-living): tier_6_launch (it lands as part of OSS launch quality)

Exact membership is in the existing hand-authored roadmap; backfill
uses that as ground truth.
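
Where the mapping above names id patterns (cli-*, rest-*), a backfill
helper could start from a small rule table like the one below. This
is a partial, hypothetical sketch: RULES covers only a few tiers
copied from the list above, tier_for is an invented name, and the
hand-authored roadmap remains the ground truth for exact membership.

```python
from fnmatch import fnmatchcase

# (pattern, tier) pairs; the first matching pattern wins.
RULES = [
    ("legislators-and-committees-ingester", "tier_3_reingest"),
    ("crs-reports-ingester", "tier_3_reingest"),
    ("cli-*", "tier_5_surface"),
    ("rest-*", "tier_5_surface"),
    ("mcp-server", "tier_5_surface"),
    ("oss-startup-scripts", "tier_5_surface"),
    ("embedding-snapshot-distribution", "tier_8_substrate_v1x"),
]


def tier_for(spec_id):
    """Return the tier for a spec id, or tier_unsorted if no rule matches."""
    for pattern, tier in RULES:
        if fnmatchcase(spec_id, pattern):
            return tier
    return "tier_unsorted"
```

Anything the table does not cover lands in tier_unsorted, so a gap in
the rules shows up in the generated roadmap instead of being lost.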

Tasks

0 of 9 done.

t1 Phase A: Add `tier:` enum to _schema.json + backfill across all existing specs + verify --validate passes
t2 Phase B: Extend bin/build-spec.py to emit roadmap.html from `tier:` + per-tier rationale in _roadmap-meta.yaml; delete the hand-authored roadmap from tracked tree
t3 Phase B: Within-tier ordering by (priority, id); add `roster_order:` for source roster tiers (4 and 7) where ordering matters
t4 Phase C: bin/spec-new.py — CLI scaffolds a draft YAML, validates against schema, prints absolute path
t5 Phase C: _templates/general.yaml — non-source category template
t6 Phase D: bin/spec-pickup.py --next-todo — walks YAMLs, filters by status, orders by (tier, priority, spec_id, task_id), prints next match
t7 Phase E: https://docs.usejosh.com/operations/spec-workflow/ documents both new flows
t8 Verify the existing hand-authored docs/spec/roadmap.html (created 2026-05-27) is replaced by the generated file with the same content; no visual regressions
t9 Verify uv run poe ci passes (nav check, spec validate, spec build) with the new generator wired in

Changelog

No history yet.

docs/spec/substrate-spec-system-living.html · generated by bin/build-spec.py