Josh is an open-source substrate for U.S. federal policy data. It ingests public sources — bills, the Federal Register, hearings, CRS reports, and more — normalizes them into a single queryable store, and exposes them over a REST API and an MCP server so agents and humans can search and cite primary federal data.
This site is the project's documentation: how the substrate is built, how each data source is ingested, and the living spec catalog that drives the work.
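As a rough illustration of the ingest, normalize, and query flow described above, here is a minimal sketch using Python's built-in `sqlite3` module. Everything in it is an assumption made for illustration and not taken from Josh itself: the `documents` table, its columns, the citation ID formats, the `normalize` and `search` helpers, and the placeholder records.

```python
import sqlite3

# Hypothetical schema: one table shared by every source, keyed by a citation ID.
SCHEMA = """
CREATE TABLE documents (
    citation_id TEXT PRIMARY KEY,
    source      TEXT NOT NULL,
    title       TEXT NOT NULL,
    body        TEXT NOT NULL
)
"""


def normalize(source, raw):
    """Map a source-specific record onto the shared (hypothetical) columns."""
    if source == "bills":
        citation_id = f"bill:{raw['congress']}-{raw['number']}"
        return (citation_id, source, raw["short_title"], raw["text"])
    if source == "federal_register":
        citation_id = f"fr:{raw['document_number']}"
        return (citation_id, source, raw["title"], raw["abstract"])
    raise ValueError(f"unknown source: {source}")


def search(conn, term):
    """Return (citation_id, title) pairs whose body contains the term."""
    rows = conn.execute(
        "SELECT citation_id, title FROM documents "
        "WHERE body LIKE ? ORDER BY citation_id",
        (f"%{term}%",),
    )
    return rows.fetchall()


# Placeholder records, invented for this sketch; not real federal data.
records = [
    ("bills", {"congress": 0, "number": "hr0000",
               "short_title": "Example Broadband Act",
               "text": "A bill to expand rural broadband access."}),
    ("federal_register", {"document_number": "0000-00000",
                          "title": "Example broadband rule",
                          "abstract": "Proposed rule on broadband spectrum."}),
]

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO documents VALUES (?, ?, ?, ?)",
    [normalize(source, raw) for source, raw in records],
)
conn.commit()

hits = search(conn, "broadband")
```

The point of the sketch is that once records from different sources share one table and one citation ID column, a single query can search across all of them and return IDs a caller can cite.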
Architecture
The substrate that everything hangs off — server, volume, containers, and the SQLite-end-to-end storage stack. Read it.
Data sources
Per-source ingestion specs — schema, chunker, citation IDs, status, and probe findings for every federal source. Browse sources.
Spec catalog
Every unit of work is a spec (YAML → HTML). Browse what Josh v1 ships. Open the catalog.
Operations
Runbooks for building and running the substrate — ingestion, embedding, chunking, migrations, backups. See operations.
The Spec section is served verbatim from the existing YAML → HTML generator (bin/build-spec.py). Everything else is Markdown: edit a single file, push it as a PR, and the navigation, breadcrumbs, and styling regenerate.