/health/embedding observability endpoint
Why
Operators need a single cheap endpoint that reports queue depth by
status, the active query-time provider's identity, and a provider
health probe. Without this, "is the embedding pipeline keeping up?"
is a manual SQL query against the queue table — fine for triage,
bad for dashboards.
User stories
As an operator, I want one HTTP endpoint that reports queue depth + provider state so that a dashboard alert fires when pending grows or failed spikes.
As a monitoring system, I want structured JSON I can scrape on a 30s interval so that I get backlog / health trends without querying the SQLite database directly.
Acceptance criteria (EARS)
- When `GET /health/embedding` is called, the system shall return HTTP 200 with a JSON body containing `provider` (model_id, model_version, dim, healthy) and `queue` (pending, running, done, failed counts).
- Where the queue is empty for the configured (model_id, model_version), all counts shall be 0 (not absent).
- When the provider's `health_check` returns False, the response shall report `provider.healthy = false` but still return HTTP 200 (this is observability, not gating).
- Where the configured query-time model differs from the worker's bulk-embed model, the queue counts shall reflect ONLY the query-time model (consistent with what the user-facing path sees).
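The criteria above could translate into a response builder along these lines. This is a minimal, framework-agnostic sketch, not the actual handler: `FakeProvider` and `build_response` are hypothetical names, and the provider is assumed to expose `model_id`, `model_version`, `dim`, and `health_check()`. The spec does not say what happens if `health_check` raises, so treating an exception as unhealthy is an assumption.

```python
import json
from dataclasses import dataclass

STATUSES = ("pending", "running", "done", "failed")


@dataclass
class FakeProvider:
    """Hypothetical stand-in for the query-time provider."""
    model_id: str
    model_version: str
    dim: int
    is_up: bool = True

    def health_check(self) -> bool:
        return self.is_up


def build_response(provider, counts):
    """Return (http_status, body) for GET /health/embedding.

    `counts` maps status -> row count for the query-time model only.
    """
    try:
        healthy = bool(provider.health_check())
    except Exception:
        # Assumption: a probe that raises is reported as unhealthy.
        healthy = False
    body = {
        "provider": {
            "model_id": provider.model_id,
            "model_version": provider.model_version,
            "dim": provider.dim,
            "healthy": healthy,
        },
        # Zero-fill so every status key is present even on an empty queue.
        "queue": {s: int(counts.get(s, 0)) for s in STATUSES},
    }
    # Always 200: this endpoint is observability, not gating.
    return 200, body


status, body = build_response(
    FakeProvider("example-model", "1", 384, is_up=False), {"pending": 3}
)
print(status, json.dumps(body))
```

An unhealthy provider still yields status 200 here, so a dashboard would alert on `provider.healthy` in the body rather than on the HTTP status code.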
Success determiner
Path
Runner
Clarifications needed
None.
Out of scope
None.
Dependencies
Plan
josh-core/app/routes/health_embedding.py. Inline SQL (no dependency on josh-embedder's queue helpers; the substrate stays the contract).
Two tests: empty-queue case + pending-counted case.
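The inline SQL and the two test cases might look like the following sketch. The table name `embedding_queue` and its columns (`status`, `model_id`, `model_version`) are assumptions for illustration only; the real schema comes from the migration, which is not shown here. `queue_counts` is likewise a hypothetical helper name.

```python
import sqlite3

STATUSES = ("pending", "running", "done", "failed")


def queue_counts(conn, model_id, model_version):
    """Count queue rows per status for one (model_id, model_version)."""
    rows = conn.execute(
        # Hypothetical table and column names.
        "SELECT status, COUNT(*) FROM embedding_queue "
        "WHERE model_id = ? AND model_version = ? GROUP BY status",
        (model_id, model_version),
    ).fetchall()
    # Start from zeros so statuses with no rows are reported as 0, not absent.
    counts = dict.fromkeys(STATUSES, 0)
    for status, n in rows:
        if status in counts:
            counts[status] = n
    return counts


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE embedding_queue ("
    "id INTEGER PRIMARY KEY, status TEXT, model_id TEXT, model_version TEXT)"
)

# Case 1: empty queue, so every count is 0.
empty = queue_counts(conn, "query-model", "1")

# Case 2: pending rows are counted; rows for another model are ignored.
conn.executemany(
    "INSERT INTO embedding_queue (status, model_id, model_version) "
    "VALUES (?, ?, ?)",
    [
        ("pending", "query-model", "1"),
        ("pending", "query-model", "1"),
        ("failed", "query-model", "1"),
        ("pending", "bulk-model", "1"),
    ],
)
filled = queue_counts(conn, "query-model", "1")
print(empty, filled)
```

Filtering in the `WHERE` clause keeps the counts scoped to the query-time model, which matches the criterion that a differing bulk-embed model must not affect the reported queue depth.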
Tasks
3 of 3 done.
- t1 GET /health/embedding endpoint with JSON response shape
- t2 Always-200 semantics (failures show in body, not status)
- t3 Tests against real SQLite (migration applied) + injected fake provider
Changelog
- 2026-05-10T11:00:00Z planned→verified: Endpoint wired into josh-core; 2 tests pass.