MCP server · Josh spec


Why

MCP is the destination shape for Josh's agent-facing surface. Anthropic
"Connectors" (Claude.ai + Cowork), ChatGPT "Apps", Claude Code, Cursor —
all install a remote MCP server by URL. Locking in that one server is
what lets Josh be installable in every major agent host with a single
codebase, no per-vendor adapter.

This spec wraps the REST surface (rest-api-search,
rest-api-resource-endpoints) in MCP tools, hosted in the same
josh-core process. The REST API stays the canonical, versioned contract.
MCP is a thin presentation layer that calls the REST handlers internally,
so business logic, auth, rate-limiting, and citation construction are
not duplicated. Pattern matches Linear / GitHub / Notion / Stripe.

The shape forces three discipline points on the REST specs:
(1) every search hit must carry an id that round-trips through a
fetch-by-id call; (2) ID grammar (rest-api-conventions §5) must be
unambiguous enough that one fetch tool can route any ID to the right
resource handler; (3) tool-name compatibility — search and fetch are
mandatory tool names for ChatGPT's retrieval pipeline.

User stories

As a Cowork admin, I want to install Josh as a remote MCP connector by giving Claude one HTTPS URL so that my analysts get federal-policy lookups in their existing Claude workspace with zero glue code.

As a ChatGPT user (Plus or above), I want to add Josh as a custom App and have it plug into "company knowledge" so that ChatGPT can search and cite federal sources without me copying URLs into the chat.

As a Claude Code user, I want to add Josh to `.mcp.json` with one line so that the editor agent can look up bills, members, and committee rosters during tasks.

As an agent author building a custom client against MCP, I want typed tools beyond `search`/`fetch` (`get_bill`, `list_committee_members`, `list_cosponsors`) so that the LLM picks a deterministic structured lookup when the question has an identifier and only falls back to semantic search when it doesn't.

As an OSS self-hoster, I want the MCP server to ship in the same container as the REST API so that I run one process, one bind-mount, one TLS cert.

Acceptance criteria (EARS)

When a client connects to `<base>/mcp` over HTTPS, the system shall speak MCP over Streamable HTTP per spec revision 2025-11-25 (no stdio, no deprecated SSE transport).
When a client issues an `initialize` request, the server shall advertise `protocolVersion: '2025-11-25'`, server name `josh`, and version string read from the `josh-core` package metadata.
Where the MCP server is co-located with the REST API in `josh-core`, the two shall share the same FastAPI process, request-id middleware, and `JOSH_AUTH` posture.
When the client calls `tools/list`, the response shall include tools named exactly `search` and `fetch` (lowercase, no prefix), in addition to typed convenience tools.
When `search` is called with `{query, source?, since?, until?, limit?}`, the system shall return the same payload shape as `GET /v1/search` — `{results: [...], total, took_ms}` — with no MCP-specific reshaping.
When `fetch` is called with `{id}`, the system shall route the ID by its type prefix (per `rest-api-conventions` §5) to the appropriate resource handler and return the full record.
When a search result is returned from `search`, its `id` field shall be a valid input to `fetch` (round-trip property — enforced by contract test).
Where a citation-style query has an identifier (bill ID, bioguide ID, committee ID), the typed tool shall route to the SQL resource handler without invoking the FTS5 or vector path.
When `tools/list` is returned, it shall include at minimum: `search`, `fetch`, `get_bill`, `get_legislator`, `list_committee_members`, `list_cosponsors`, `lexical_search`, `semantic_search`.
When `semantic_search` is called with `{query, source?, filters?}`, filters shall be applied as pre-filters to the candidate set before vector scoring, never as post-filters on the top-K.
When `lexical_search` is called with `{query, source?, filters?}`, only the FTS5 BM25 path shall execute (no vector retrieval).
Where `semantic_search`, `lexical_search`, or `search` is called with one or more `source` values that are registry-shape (per the matrix in `https://docs.usejosh.com/operations/query-flows/`), the system shall return an MCP tool result with `isError: true` whose content names `source_not_searchable` and whose `_meta.hint.use_tool` is `resolve_entity`.
Where `semantic_search` is called with a short-body source (has body FTS5 but no vectors), the system shall return `isError: true` with `source_not_searchable_semantically` and `_meta.hint.valid_sources`.
Where the `semantic_search` tool description is rendered for `tools/list`, it shall include the current list of body-bearing source IDs derived from the live substrate schema — not a hard-coded list.
When `resolve_entity` is called with `{query: <text>, entity_type: <one_of_registry_sources>, filters?: {<field>: <value>}}`, the system shall route to `GET /v1/<entity_type>?q=<query>&<filters>` per `rest-api-entity-resolution` and return the ranked list of matching records as an MCP tool result.
Where the `resolve_entity` tool description is rendered for `tools/list`, the `entity_type` enum shall be the live list of registry sources from the substrate schema.
When the `count`, `sum`, or `time_series` tool is called, the system shall route to `GET /v1/<resource>?aggregate=<op>&...` per `rest-api-aggregations` and return the aggregate response (`{value, query_cost}` or `{buckets, total_buckets, query_cost}`) verbatim in the MCP envelope.
Where an aggregation tool description is rendered, the `group_by` and `sum_field` enums shall be the per-resource AggregationPlan-derived eligible fields.
Where an aggregate query exceeds the row-touch budget or timeout (per `rest-api-aggregations`), the MCP tool shall return `isError: true` with `aggregate_too_broad` or `aggregate_timeout` and surface `_meta.hint.suggest_narrowing`.
When `get_bill_dossier(bill_id)`, `get_legislator_dossier(bioguide_id, congress?)`, `get_committee_dossier(committee_id, congress?)`, or `get_public_law_dossier(id)` is called, the system shall route to `GET /v1/<resource>/{id}/dossier` per `rest-api-dossiers` and return the full envelope (all sections) in one MCP tool result.
Where a dossier tool's response is truncated in any nested section, the truncation surface (`more_url`) shall be preserved in the MCP envelope so the agent can chain to the canonical `rest-api-resource-endpoints` sub-resource list.
When a REST handler raises an error, the MCP server shall translate the error envelope (`rest-api-conventions` §3) into an MCP `isError: true` tool result whose `content[0].text` includes `error.code`, `error.message`, and `error.request_id`.
When the input to a tool fails schema validation, the server shall return MCP error code `-32602 (Invalid params)` with a message naming the offending field.
Where the deployment runs with `JOSH_AUTH=disabled` (OSS self-host, public data), the MCP server shall accept unauthenticated tool calls and shall serve a static `/.well-known/oauth-protected-resource` document with an empty `authorization_servers` array, signaling no OAuth requirement.
Where the deployment runs with `JOSH_AUTH=oauth`, the MCP server shall implement OAuth 2.1 Resource Server discovery per RFC 9728 and shall require a valid bearer token (`Authorization: Bearer ...`) matching the `api_keys` table on every tool call.
When a tool call completes (success or error), the server shall emit a structured log line with `tool_name`, `request_id`, `duration_ms`, `result_status` (`ok` | `error`), and `caller_kind` (`anonymous` | `api_key:<id>`).
Where the MCP server serves the manifest endpoint, it shall advertise the project name `io.github.<org>/josh` (matching the official MCP Registry namespace), a one-line description, the documented tool list, and a link to public install docs.
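
The tool-call audit-log criterion above can be sketched as a small helper that builds the structured line. The function name `audit_record` and the JSON encoding are assumptions for illustration; the spec only fixes the field names and their allowed values.

```python
import json
import time
from typing import Optional


def audit_record(tool_name: str, request_id: str, started: float,
                 ok: bool, api_key_id: Optional[str] = None) -> str:
    """Build one structured log line for a completed tool call (hypothetical helper)."""
    record = {
        "tool_name": tool_name,
        "request_id": request_id,
        # Elapsed time in whole milliseconds since `started` (a time.monotonic() value).
        "duration_ms": int((time.monotonic() - started) * 1000),
        "result_status": "ok" if ok else "error",
        "caller_kind": f"api_key:{api_key_id}" if api_key_id else "anonymous",
    }
    return json.dumps(record, sort_keys=True)


# Example: an anonymous, successful get_bill call.
line = audit_record("get_bill", "req-123", time.monotonic(), ok=True)
```

Emitting the line from one place for both success and error paths keeps the five required fields from drifting between tools.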

Success determiner

Kind

test_file

Path

josh-core/tests/test_mcp_server.py

Runner

uv run pytest josh-core/tests/test_mcp_server.py -v

Contract test that boots the FastAPI app in-process and drives the `/mcp` endpoint via an MCP client (the official `mcp` Python SDK's `Client` class over Streamable HTTP, talking to the same ASGI app via httpx).

Each acceptance criterion has at least one test:

- `initialize` handshake shape, protocol version, server name.
- `tools/list` includes the mandatory `search` and `fetch` plus every typed tool by exact name.
- Round-trip: `search → fetch(result.id)` returns a record.
- Typed lookup path: `get_bill('hr:119:1')` does not touch FTS5 or vector tables (assert via SQL trace).
- `semantic_search` pre-filter behavior: filters applied before scoring, not after.
- Error mapping: REST 404 → MCP `isError: true` with `record_not_found`.
- Schema validation: malformed input → `-32602` with field name.
- Auth: `JOSH_AUTH=disabled` and `JOSH_AUTH=oauth` modes both covered.
- `/.well-known/oauth-protected-resource` shape in both modes.
- Tool-call audit log line emitted with all required fields.

Determiner currently fails because: (a) the test file does not yet exist, and (b) `josh-core/josh_core/mcp/` is not yet created. Flips to passing once `josh-core` mounts FastMCP at `/mcp` and the REST handlers it wraps are themselves live (chained dependency on `rest-api-search` and `rest-api-resource-endpoints`).
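
The round-trip check can be sketched independently of the transport. The version below is a minimal illustration using in-memory stand-ins; `RECORDS`, `fake_search`, `fake_fetch`, and `assert_round_trip` are hypothetical names, and the real determiner would drive the `/mcp` endpoint instead.

```python
from typing import Callable, Dict, List

# In-memory stand-in for the substrate (invented sample records).
RECORDS: Dict[str, dict] = {
    "hr:119:1": {"id": "hr:119:1", "title": "Example bill"},
    "S000033": {"id": "S000033", "name": "Example legislator"},
}


def fake_search(query: str) -> dict:
    """Return hits in the {results, total, took_ms} shape named in the criteria."""
    hits = [{"id": rid} for rid, rec in RECORDS.items()
            if query.lower() in str(rec).lower()]
    return {"results": hits, "total": len(hits), "took_ms": 0}


def fake_fetch(record_id: str) -> dict:
    return RECORDS[record_id]


def assert_round_trip(search: Callable[[str], dict],
                      fetch: Callable[[str], dict], query: str) -> List[str]:
    """Every id returned by search must be fetchable and come back under the same id."""
    checked = []
    for hit in search(query)["results"]:
        record = fetch(hit["id"])
        assert record["id"] == hit["id"]
        checked.append(hit["id"])
    return checked


checked_ids = assert_round_trip(fake_search, fake_fetch, "example")
```

Passing `search` and `fetch` as callables lets the same assertion run against the stand-ins here and against real MCP tool calls in the contract test.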

Clarifications needed

Framework choice: FastMCP (Python) is the leading candidate — built-in Streamable HTTP, OAuth 2.1 support, ASGI-mountable. Alternative: hand-rolled JSON-RPC over a FastAPI router. Lean FastMCP unless a contract test reveals a blocker.
Should `fetch` accept any substrate ID (`hr:119:1`, `S000033`, `R47892`, `2026-08558`) and dispatch by prefix, or should we split into `fetch_bill` / `fetch_legislator` / etc.? Single `fetch` is the ChatGPT contract; typed tools are the Claude-friendly redundancy.
Tool descriptions need to be tight enough that the LLM picks the structured tool for identifier-bearing queries and `semantic_search` only for genuine free-text questions. Need an offline routing-eval (small Q&A set) before locking the description text.
Pagination through MCP tool calls — pass `cursor` / `offset` through transparently, or surface a `search_next` follow-up tool? Lean transparent for v1.

Out of scope

MCP server's own resource-endpoint behavior (response shape, IDs, pagination) — inherited verbatim from `rest-api-conventions`, defined in `rest-api-resource-endpoints` and `rest-api-search`.
OAuth Authorization Server implementation — Josh is a Resource Server only. Managed deployments delegate to their own AS; OSS self-host runs authless.
Cowork plugin packaging (`.plugin` bundle with skills + slash commands wrapping this connector) — separate spec when the connector is verified.
ChatGPT Apps SDK registration (TOS, listing) — operational task, not a substrate spec.
Telemetry/billing per tool call — managed-deployment concern.
stdio transport — explicitly not supported; Josh is always a remote MCP server.

Dependencies

Plan

Locked decisions.

## 1. Where it runs

Mounted at /mcp inside the existing josh-core FastAPI process.
One container, one TLS cert, one auth posture, one rate-limit pool.
No separate josh-mcp service.

Rationale: every MCP tool is a thin call into a REST handler; running
them as separate processes would mean either an HTTP hop per tool call
(latency penalty) or copying business logic across two codebases
(drift penalty). Co-location wins.

## 2. Framework

FastMCP (Python) mounted as a sub-app under FastAPI. Provides:
Streamable HTTP transport, tool/resource/prompt decorators, OAuth 2.1
Resource Server scaffolding, and SDK-compatible session handling.
Tools live in josh-core/josh_core/mcp/tools/.

Decorator style:
```python
@mcp.tool()
async def get_bill(bill_id: str) -> Bill:
    # Same handler the REST router uses.
    return await rest.bills.get(bill_id)
```

## 3. Tool surface

Six classes of tools. The first three are listed in order of LLM
preference for citation-style questions:

Class A — typed structured lookups (no FTS5, no vector):

| Tool | Wraps | Returns |
|----------------------------|---------------------------------------------|---------|
| get_bill(bill_id) | GET /v1/bills/{bill_id} | Bill record + citation |
| get_legislator(bioguide) | GET /v1/legislators/{bioguide_id} | Legislator + citation |
| get_committee(id) | GET /v1/committees/{id} | Committee + citation |
| list_committee_members(id, congress?) | GET /v1/committees/{id}/members?congress= | List of memberships |
| list_cosponsors(bill_id) | GET /v1/bills/{bill_id}/cosponsors | List of legislators |
| get_bill_text(bill_id) | GET /v1/bills/{bill_id}/body | Bill body (text + meta) |
| list_member_votes(bioguide, congress?) | GET /v1/legislators/{bioguide}/votes | Votes (paginated) |

Class B — text search (FTS5 + vector, separately addressable):

| Tool | Wraps | Path |
|-------------------------------------|----------------------------------------------|------|
| lexical_search(query, source?, filters?) | GET /v1/search?mode=lexical | BM25 only |
| semantic_search(query, source?, filters?) | GET /v1/search?mode=semantic | Vector only (pre-filtered) |
| search(query, source?, ...) | GET /v1/search (default mode) | Hybrid (BM25 + vec, RRF) — ChatGPT contract |
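
The "pre-filtered" note on `semantic_search` matters because filtering after taking the top-K can silently drop every eligible hit. The toy example below illustrates the difference with pure-Python cosine similarity; the documents, vectors, and function names are invented for illustration and say nothing about the actual vector store.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def matches(doc: dict, filters: Dict[str, str]) -> bool:
    return all(doc.get(field) == value for field, value in filters.items())


def pre_filtered_top_k(query_vec, docs: List[dict], filters, k: int) -> List[str]:
    """Restrict the candidate set first, then score and take the top-K."""
    candidates = [d for d in docs if matches(d, filters)]
    ranked = sorted(candidates, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return [d["id"] for d in ranked[:k]]


def post_filtered_top_k(query_vec, docs: List[dict], filters, k: int) -> List[str]:
    """Anti-pattern, shown for contrast: take the top-K first, then filter."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)[:k]
    return [d["id"] for d in ranked if matches(d, filters)]


DOCS = [
    {"id": "a", "source": "crs", "vec": [1.0, 0.0]},
    {"id": "b", "source": "crs", "vec": [0.9, 0.1]},
    {"id": "c", "source": "bills", "vec": [0.5, 0.5]},
]
query = [1.0, 0.0]
pre = pre_filtered_top_k(query, DOCS, {"source": "bills"}, k=2)
post = post_filtered_top_k(query, DOCS, {"source": "bills"}, k=2)
```

Here the pre-filtered call returns the one matching document, while the post-filtered call returns nothing because both top-2 slots were taken by documents from the other source.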

Source eligibility is the load-bearing constraint on Class B
(matrix in https://docs.usejosh.com/operations/query-flows/):

- Body-bearing sources (11 at v1) — accept all three Class B
tools.
- Short-body (SAPs) — accept lexical_search and search
(hybrid downgrades to lexical with degraded block); reject
semantic_search.
- Registry / lookup (9 sources) — REJECTED outright by all three
Class B tools with isError: true, source_not_searchable, and
_meta.hint.use_tool: resolve_entity. Class B is surface-disjoint
from Class D.

semantic_search tool description (rendered at tools/list time)
enumerates the eligible source IDs from the live schema, so an agent
reading the tool catalog can see which sources are valid arguments
without trial-and-error.
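
The eligibility rules above can be sketched as a single guard that runs before any Class B tool reaches the REST handler. The `SOURCE_SHAPE` map, its source IDs, and the function name are hypothetical; the spec derives the classification from the live substrate schema rather than from a hard-coded table.

```python
from typing import Dict, List, Optional

# Hypothetical snapshot for illustration; the real list comes from the live schema.
SOURCE_SHAPE: Dict[str, str] = {
    "bills": "body",
    "crs-reports": "body",
    "saps": "short_body",
    "legislators": "registry",
    "committees": "registry",
}


def check_sources(tool: str, sources: List[str]) -> Optional[dict]:
    """Return an MCP-style error result if a source is ineligible for `tool`, else None.

    Unknown source IDs fall through (None) and are left to the REST handler.
    """
    for src in sources:
        shape = SOURCE_SHAPE.get(src)
        if shape == "registry":
            # Registry sources are rejected by all three Class B tools.
            return {
                "isError": True,
                "content": [{"type": "text", "text": f"source_not_searchable: {src}"}],
                "_meta": {"hint": {"use_tool": "resolve_entity"}},
            }
        if shape == "short_body" and tool == "semantic_search":
            # Short-body sources have FTS5 bodies but no vectors.
            valid = sorted(s for s, kind in SOURCE_SHAPE.items() if kind == "body")
            return {
                "isError": True,
                "content": [{"type": "text",
                             "text": f"source_not_searchable_semantically: {src}"}],
                "_meta": {"hint": {"valid_sources": valid}},
            }
    return None
```

The same `valid` list could feed the `semantic_search` tool description, so the catalog and the runtime check cannot disagree.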

Class C — fetch by ID (universal):

| Tool | Wraps | Notes |
|---------------|----------------------------------------|-------|
| fetch(id) | dispatches to the right GET /v1/<resource>/{id} by ID prefix | ChatGPT contract; redundant with Class A for Claude |

Class D — entity resolution (fuzzy name → canonical ID):

| Tool | Wraps | Notes |
|--------------------------------------------|--------------------------------------------------------|-------|
| resolve_entity(query, entity_type, filters?) | GET /v1/<entity_type>?q=<query>&<filters> per rest-api-entity-resolution | Only over registry sources. Returns full records, not search cards. Surface-disjoint from Class B. |

entity_type enum is the live list of registry sources (legislators,
committees, staff-directories, lda-filings, roll-call-votes, hearings,
regulations-dot-gov-dockets, topic-taxonomy).

Class E — aggregations (counts, top-N, time-series):

| Tool | Wraps | Notes |
|---------------------------------------------------|-----------------------------------------------------------------------|-------|
| count(resource, group_by?, filters?) | GET /v1/<resource>?aggregate=count&group_by=<>&<filters> | Scalar count or bucketed |
| sum(resource, sum_field, group_by?, filters?) | GET /v1/<resource>?aggregate=sum&sum_field=<>&group_by=<>&<filters> | Per-resource summable fields per AggregationPlan |
| time_series(resource, time_field, interval, group_by?, filters?) | GET /v1/<resource>?bucket_by_time=<>&time_interval=<>&group_by=<> | day/week/month/quarter/year |

Each tool description enumerates the eligible group_by and
sum_field values from the per-resource AggregationPlan.
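
As a rough illustration of how thin the Class E wrappers are, the sketch below turns `count`/`sum` tool arguments into the wrapped REST path. The helper name and the sample field names (`registrant`, `amount`, `year`) are assumptions; the eligible fields are defined by each resource's AggregationPlan, not by this code.

```python
from typing import Dict, Optional
from urllib.parse import urlencode


def aggregate_path(resource: str, op: str, group_by: Optional[str] = None,
                   sum_field: Optional[str] = None,
                   filters: Optional[Dict[str, str]] = None) -> str:
    """Translate count/sum tool arguments into the wrapped REST path (sketch)."""
    if op not in ("count", "sum"):
        raise ValueError(f"unsupported aggregate op: {op}")
    if op == "sum" and not sum_field:
        raise ValueError("sum requires sum_field")
    params = [("aggregate", op)]
    if sum_field:
        params.append(("sum_field", sum_field))
    if group_by:
        params.append(("group_by", group_by))
    # Filters are appended in sorted order so the path is deterministic.
    params.extend(sorted((filters or {}).items()))
    return f"/v1/{resource}?{urlencode(params)}"


path = aggregate_path("lda-filings", "sum", group_by="registrant",
                      sum_field="amount", filters={"year": "2025"})
```

In the co-located design the tool would call the REST handler in-process rather than issue an HTTP request; the path is shown only to make the argument mapping concrete.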

Class F — dossiers (cross-source fan-out):

| Tool | Wraps | Returns |
|-----------------------------------------------|------------------------------------------------|---------|
| get_bill_dossier(bill_id) | GET /v1/bills/{id}/dossier | Bill + cosponsors + actions + CRS + CBO + SAP + reports + hearings + votes + amendments |
| get_legislator_dossier(bioguide_id, congress?) | GET /v1/legislators/{id}/dossier | Legislator + terms + committees + bill counts + leadership + offices + speeches |
| get_committee_dossier(committee_id, congress?) | GET /v1/committees/{id}/dossier | Committee + members + subcommittees + hearings + bills + reports |
| get_public_law_dossier(id) | GET /v1/public-laws/{id}/dossier | Public law + originating bill + CFR sections + implementing rules |

Dossier tools collapse what would otherwise be 5-12 separate tool
calls into one envelope. Truncated sections expose a more_url so
the agent can chain to the canonical resource-endpoint sub-resource
list when needed.
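
A client-side view of the truncation surface might look like the sketch below, which collects the `more_url` of every truncated section so an agent can decide which follow-up calls to make. The envelope layout (sections as dicts with an `items` list) is an assumption for illustration; only the `more_url` field comes from the spec.

```python
from typing import Dict


def truncated_sections(dossier: Dict[str, object]) -> Dict[str, str]:
    """Map section name -> more_url for every section that carries a truncation marker."""
    follow_ups = {}
    for name, section in dossier.items():
        if isinstance(section, dict) and section.get("more_url"):
            follow_ups[name] = section["more_url"]
    return follow_ups


# Invented sample envelope: only the cosponsors section is truncated.
dossier = {
    "bill": {"id": "hr:119:1"},
    "cosponsors": {"items": [{"id": "S000033"}],
                   "more_url": "/v1/bills/hr:119:1/cosponsors"},
    "actions": {"items": []},
}
follow_ups = truncated_sections(dossier)
```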

## 3a. Tool-class routing matrix

Which class an agent picks by query shape:

| Query shape | Class | Example tool |
|------------------------------------------------|-------|--------------------------------|
| "Get HR103" (exact ID) | A / C | get_bill, fetch |
| "Find Sen. Markey from MA" (noisy name) | D | resolve_entity |
| "Bills mentioning child tax credit" (keyword) | B | lexical_search |
| "Bills similar to HR103" (semantic) | B | semantic_search |
| "EPA PFAS regs in 2026" (mixed) | B | search (hybrid) |
| "Top 20 lobbyist spenders in 2025" | E | sum |
| "Monthly bill volume on AI 2018+" | E | time_series |
| "Everything about HR1" | F | get_bill_dossier |
| "Senator X's profile in 119th" | F | get_legislator_dossier |
| "Cross-source: lobbyists + sponsors + bills" | A + E | orchestrate |

The agent picks tools by description, not by a learned classifier. Tool
descriptions are tuned in josh-core/josh_core/mcp/descriptions.py and
pinned by an offline routing-eval fixture
(josh-core/tests/fixtures/mcp_routing_eval.jsonl) before the spec moves
to verified. The eval fixture covers all six classes.
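
A minimal scoring loop for the routing eval could look like the following. The JSONL field names (`question`, `expected_tool`) and the `toy_router` stand-in are assumptions; a real run would ask the model to choose a tool given the rendered tool catalog.

```python
import json
from typing import Callable, Iterable


def routing_accuracy(lines: Iterable[str], pick_tool: Callable[[str], str]) -> float:
    """Fraction of fixture questions for which pick_tool returns the expected tool."""
    rows = [json.loads(line) for line in lines if line.strip()]
    hits = sum(1 for row in rows if pick_tool(row["question"]) == row["expected_tool"])
    return hits / len(rows) if rows else 0.0


# Three sample rows in the assumed fixture format.
FIXTURE = [
    '{"question": "Get HR103", "expected_tool": "get_bill"}',
    '{"question": "Find Sen. Markey from MA", "expected_tool": "resolve_entity"}',
    '{"question": "Everything about HR1", "expected_tool": "get_bill_dossier"}',
]


def toy_router(question: str) -> str:
    # Stand-in for the LLM's tool choice; deliberately naive.
    if question.startswith("Everything about"):
        return "get_bill_dossier"
    if question.startswith("Get "):
        return "get_bill"
    return "search"


score = routing_accuracy(FIXTURE, toy_router)
```

Pinning descriptions to a score threshold on this fixture gives a regression signal whenever the description text changes.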

## 4. ID routing for fetch

Single dispatch table keyed on the type prefix from rest-api-conventions
§5:

```
hr:|s:|hjres:|sjres:|hres:|sres:|hconres:|sconres:  → bills
house:|senate:                                      → roll_call_votes
crec:                                               → congressional_record
pl:                                                 → public_laws
^[A-Z]\d{6}$                                        → legislators (bioguide)
^R\d{5}$                                            → crs_reports
^\d{4}-\d{5}$                                       → federal_register
```

Unknown prefix → MCP isError: true with error.code='unrecognized_id_format'
and a hint.valid_prefixes list.
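
The dispatch table can be sketched as prefix checks followed by the three regex patterns. Function and constant names here are hypothetical, and the error result is a simplified illustration of the `unrecognized_id_format` case rather than the final envelope.

```python
import re
from typing import Optional

BILL_PREFIXES = ("hr:", "s:", "hjres:", "sjres:", "hres:", "sres:", "hconres:", "sconres:")
PREFIX_ROUTES = (
    (BILL_PREFIXES, "bills"),
    (("house:", "senate:"), "roll_call_votes"),
    (("crec:",), "congressional_record"),
    (("pl:",), "public_laws"),
)
PATTERN_ROUTES = (
    (re.compile(r"^[A-Z]\d{6}$"), "legislators"),      # bioguide ID
    (re.compile(r"^R\d{5}$"), "crs_reports"),
    (re.compile(r"^\d{4}-\d{5}$"), "federal_register"),
)


def route_id(record_id: str) -> Optional[str]:
    """Return the resource name for an ID, or None if no rule matches."""
    for prefixes, resource in PREFIX_ROUTES:
        if record_id.startswith(prefixes):
            return resource
    for pattern, resource in PATTERN_ROUTES:
        if pattern.match(record_id):
            return resource
    return None


def route_or_error(record_id: str) -> dict:
    """Resolve the route, or build a simplified unrecognized-ID error result."""
    resource = route_id(record_id)
    if resource is not None:
        return {"resource": resource}
    valid = [p for prefixes, _ in PREFIX_ROUTES for p in prefixes]
    return {
        "isError": True,
        "_meta": {"error_code": "unrecognized_id_format",
                  "hint": {"valid_prefixes": valid}},
    }
```

Because each prefix includes its trailing colon, `senate:` does not collide with `s:` and `hres:` does not collide with `hr:`.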

## 5. Auth posture

Two modes, picked at boot via JOSH_AUTH:

- disabled (OSS default): authless. Static
/.well-known/oauth-protected-resource returns
{authorization_servers: []}. MCP clients (Claude.ai, ChatGPT) see
a public connector. Suitable because Josh data is public-domain
federal data.
- oauth (managed deployments): Resource Server only. RFC 9728
discovery document points at the managed deployment's chosen
Authorization Server (e.g., Stytch, WorkOS, Auth0). Bearer tokens
validated per request; sub claim mapped to api_keys row.

No OAuth client implementation in v1 — Anthropic explicitly supports
authless remote MCP for public-data servers.
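
The two modes differ only in what the discovery document advertises. A minimal sketch, assuming a hypothetical builder function and placeholder URLs: the `resource` field is included because RFC 9728 metadata identifies the protected resource, while the spec text above only names `authorization_servers`.

```python
from typing import List, Optional


def protected_resource_metadata(auth_mode: str, resource_url: str,
                                authorization_servers: Optional[List[str]] = None) -> dict:
    """Build the /.well-known/oauth-protected-resource body for a JOSH_AUTH mode (sketch)."""
    if auth_mode == "disabled":
        # Authless: an empty list signals that no OAuth flow is required.
        return {"resource": resource_url, "authorization_servers": []}
    if auth_mode == "oauth":
        if not authorization_servers:
            raise ValueError("oauth mode needs at least one authorization server")
        return {"resource": resource_url,
                "authorization_servers": list(authorization_servers)}
    raise ValueError(f"unknown JOSH_AUTH mode: {auth_mode}")


# Placeholder URLs for illustration only.
open_doc = protected_resource_metadata("disabled", "https://josh.example.com/mcp")
managed_doc = protected_resource_metadata(
    "oauth", "https://josh.example.com/mcp", ["https://auth.example.com"])
```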

## 6. Wire protocol

Streamable HTTP per [MCP spec 2025-11-25](https://modelcontextprotocol.io/specification/2025-11-25/basic/transports).
No stdio (would require packaging a binary). No SSE (deprecated).

Single endpoint: POST /mcp accepts a JSON-RPC envelope; long-running
responses stream as chunked HTTP. GET /mcp returns a tiny landing
page pointing at install docs (so curling the URL doesn't look broken).
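
For orientation, the first message a client POSTs to `/mcp` is a JSON-RPC `initialize` request. The sketch below builds that envelope and the headers a Streamable HTTP client sends with it; the client name and version are placeholders, and no network call is made.

```python
import json

PROTOCOL_VERSION = "2025-11-25"


def initialize_request(request_id: int, client_name: str, client_version: str) -> dict:
    """Build the JSON-RPC envelope for the MCP initialize handshake."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": PROTOCOL_VERSION,
            "capabilities": {},
            "clientInfo": {"name": client_name, "version": client_version},
        },
    }


# The client must accept both a plain JSON response and an event stream.
HEADERS = {
    "Content-Type": "application/json",
    "Accept": "application/json, text/event-stream",
}
body = json.dumps(initialize_request(1, "example-client", "0.0.1"))
```

The server's reply is expected to echo a `protocolVersion` and carry the `serverInfo` named in the acceptance criteria (`josh` plus the package version).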

## 7. Error mapping

REST error envelope from rest-api-conventions §3 maps to MCP:

```
REST 4xx/5xx with {error: {type, code, message, request_id, hint?}}
  → MCP tool result with isError: true
      content[0].text: <message>
      content[1].text: <code> + request_id
      _meta: {error_code, error_type, request_id, hint?}
```

Validation errors at the JSON-RPC boundary (before reaching a REST
handler) use MCP's native -32602 Invalid params instead.
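
The mapping above is mechanical enough to sketch as one pure function. The function name and the exact formatting of `content[1].text` are assumptions; the field layout follows the mapping shown in this section.

```python
from typing import Any, Dict


def rest_error_to_mcp(envelope: Dict[str, Any]) -> Dict[str, Any]:
    """Translate a REST error envelope into an MCP tool result with isError: true."""
    err = envelope["error"]
    meta = {
        "error_code": err["code"],
        "error_type": err["type"],
        "request_id": err["request_id"],
    }
    if "hint" in err:
        # The hint is optional and passed through untouched.
        meta["hint"] = err["hint"]
    return {
        "isError": True,
        "content": [
            {"type": "text", "text": err["message"]},
            {"type": "text", "text": f"{err['code']} (request_id={err['request_id']})"},
        ],
        "_meta": meta,
    }


# Invented sample envelope for a REST 404.
result = rest_error_to_mcp({"error": {
    "type": "not_found",
    "code": "record_not_found",
    "message": "No bill with id hr:119:99999",
    "request_id": "req-123",
}})
```

Keeping the translation free of I/O makes it easy to unit-test separately from the contract test that exercises the full `/mcp` path.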

## 8. Implementation surface

```
josh-core/
  josh_core/
    mcp/
      __init__.py             # FastMCP() instance, mounted in app.main
      tools/
        search.py             # search, lexical_search, semantic_search
        fetch.py              # fetch (universal dispatcher)
        bills.py              # get_bill, list_cosponsors, get_bill_text
        legislators.py        # get_legislator, list_member_votes
        committees.py         # get_committee, list_committee_members
      descriptions.py         # tool descriptions, tuned for routing
      error_map.py            # REST envelope → MCP result
      well_known.py           # /.well-known/oauth-protected-resource
  tests/
    test_mcp_server.py        # the success determiner
    fixtures/
      mcp_routing_eval.jsonl  # 30 Q&A pairs, expected tool name per Q
```

## 9. Registry & distribution

Once the determiner runs green, publish to the official MCP Registry
(registry.modelcontextprotocol.io) under
io.github.<org>/josh. Cowork distribution wraps this connector in
a .plugin bundle (separate spec).

Tasks

0 of 18 done.

Changelog

2026-05-13T12:00:00Z planned→planned Spec authored after researching the 2026 connector landscape (Anthropic Connectors / Cowork plugins, ChatGPT Apps, Claude Code, MCP spec rev 2025-11-25). Key decisions: co-locate in josh-core (not a separate service), use FastMCP, mandate `search` + `fetch` tool names for ChatGPT compatibility, expose typed Class A tools to bias the agent toward structured lookups for citation-style questions (HR103, Ways and Means roster), authless by default with OAuth 2.1 Resource Server as the managed-deployment path. Hard dependency on rest-api-search and rest-api-resource-endpoints — the MCP server adds no new behavior, only a new transport.
2026-05-13T14:00:00Z planned→planned Source eligibility added. The MCP server inherits the source-shape classification (body-bearing / short-body / registry) from rest-api-search and surfaces it in the tool catalog: `semantic_search` tool description lists eligible source IDs at `tools/list` time; calling it against an ineligible source returns `isError: true` with `source_not_searchable_semantically`. `search` (hybrid default) auto-downgrades to lexical for registry sources and passes the `degraded` block through. Class A typed tools are unaffected — they route to the structured path regardless of source shape.
2026-05-13T15:00:00Z planned→planned Tool surface expanded after the 64-query coverage analysis. Added:
- **Class D — resolve_entity** for fuzzy registry lookup (Postgres `pg_trgm` analog; wraps `rest-api-entity-resolution`). Class B (text search) now rejects registry sources outright with a `use_tool: resolve_entity` hint — surface-disjoint from Class D, intent-disjoint from Class A.
- **Class E — count, sum, time_series** for analytical queries (wraps `rest-api-aggregations`); closes the 17% aggregation gap surfaced in the coverage analysis.
- **Class F — get_*_dossier** tools for cross-source fan-out (wraps `rest-api-dossiers`); collapses 5-12 tool calls into one envelope for the most common bill/legislator/committee/public-law questions.
The Class B auto-downgrade for registry sources is removed — those now error out cleanly with a redirect hint to Class D. Short-body sources (SAPs) retain auto-downgrade for `mode=hybrid` since they're genuine body sources without vectors. Tool-class routing matrix added to the plan to make agent intent mapping explicit. Routing eval fixture now covers all six classes, not just A/B/C.