josh-embedder service — drains the queue into vec0 tables
Why
A new top-level service peer to josh-core and josh-ingester,
responsible for one job: read pending rows from chunk_embedding_jobs,
embed them via the configured provider, write vectors into the
per-source vec0 virtual table, mark done. Decoupled from ingestion
so a failing GPU backend doesn't take ingestion down with it; runs
in its own Kamal app with its own scaling envelope.
User stories
As an ingestion pipeline, I want the chunks I write to the queue to eventually become vectors so that I don't have to know about embedding to do my job.
As an operator, I want a worker I can throttle independently of ingestion so that GPU spend doesn't ride my source-scraping cadence.
Acceptance criteria (EARS)
- When the worker polls the queue, the system shall claim up to `batch_size` jobs in a single `UPDATE ... RETURNING` statement, atomically transitioning them to `running`.
- Where the chunk row referenced by a claimed job no longer exists, the system shall mark that job `failed` with `last_error='source chunk row missing at claim time'`.
- When the provider raises `ProviderConfigError`, the system shall mark every claimed job `failed` (terminal) and continue serving subsequent batches.
- When the provider raises `ProviderUnavailableError`, the system shall revert claimed jobs to `pending` without consuming attempt budget.
- When the provider raises `ProviderTransientError`, the system shall revert to `pending` if `attempts < max_attempts`; otherwise mark `failed`.
- When the worker successfully embeds a batch, the system shall `INSERT OR REPLACE` the vectors into `<chunk_table>_vec0`, stamp the chunk row's `embedded_model_*` columns, and mark the job `done`, all in one transaction.
- Where `chunk_table` contains characters outside `[A-Za-z0-9_]`, the system shall raise `ValueError` (defends against SQL-injection-shaped queue rows).
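The three provider-error criteria above can be read as a small decision function. The following is a minimal sketch, not the service's actual code: the exception class names come from the criteria, but `Outcome`, `resolve_failure`, and the convention that `attempts` counts tries consumed before the current one are assumptions made for illustration.

```python
from dataclasses import dataclass


class ProviderConfigError(Exception):
    """Misconfiguration: retrying cannot help."""


class ProviderUnavailableError(Exception):
    """Backend is down: not the job's fault."""


class ProviderTransientError(Exception):
    """Temporary failure: retry within the attempt budget."""


@dataclass(frozen=True)
class Outcome:  # hypothetical helper type
    status: str    # next job status: "pending" or "failed"
    attempts: int  # attempts value to store on the job row


def resolve_failure(exc: Exception, attempts: int, max_attempts: int) -> Outcome:
    """Map a provider exception to the job's next state.

    Assumption: `attempts` is the number of tries consumed BEFORE this one.
    """
    if isinstance(exc, ProviderConfigError):
        # Terminal: the job fails regardless of remaining budget.
        return Outcome("failed", attempts + 1)
    if isinstance(exc, ProviderUnavailableError):
        # Back to the queue; the attempt counter is left untouched.
        return Outcome("pending", attempts)
    if isinstance(exc, ProviderTransientError):
        used = attempts + 1  # this try consumes budget
        return Outcome("pending" if used < max_attempts else "failed", used)
    raise exc  # unknown errors are not classified here
```

Under this convention, `max_attempts=3` allows three transient failures in total before the job is marked `failed`, while any number of `ProviderUnavailableError` outages leaves the counter unchanged.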
Success determiner
Path
Runner
Clarifications needed
None.
Out of scope
None.
Dependencies
Plan
Service lives at josh-embedder/. Module split:
- settings.py (frozen dataclass, env-driven)
- factory.py (provider switch)
- jobs.py (queue I/O: claim, mark_done, mark_failed, queue_depth)
- worker.py (poll loop, batch processing, error semantics)
- cli.py (typer entrypoints: daemon, drain-once, status)
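As an illustration of the "frozen dataclass, env-driven" shape described for settings.py, a sketch might look like the following. Only `batch_size` and `max_attempts` appear in this spec; the `poll_interval_s` field, the `EMBEDDER_*` variable names, and all default values are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    batch_size: int = 32          # default is an assumption
    max_attempts: int = 3         # default is an assumption
    poll_interval_s: float = 5.0  # hypothetical field

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "Settings":
        """Build settings from environment variables (names are hypothetical)."""
        return cls(
            batch_size=int(env.get("EMBEDDER_BATCH_SIZE", cls.batch_size)),
            max_attempts=int(env.get("EMBEDDER_MAX_ATTEMPTS", cls.max_attempts)),
            poll_interval_s=float(
                env.get("EMBEDDER_POLL_INTERVAL_S", cls.poll_interval_s)
            ),
        )
```

Freezing the dataclass means a running worker cannot mutate its configuration by accident; passing a plain dict as `env` keeps tests independent of the real process environment.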
vec0 writes use vec_quantize_binary(?) for the bit[] companion to
avoid tagged-binary-format coupling.
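The claim step that jobs.py is described as owning could be sketched as below with Python's built-in `sqlite3` module. The table name `chunk_embedding_jobs`, the `status` values, and the `chunk_table`, `attempts`, and `last_error` columns come from this spec; the `id` and `chunk_id` columns and the `claim` signature are assumptions. `RETURNING` requires SQLite 3.35 or newer.

```python
import sqlite3

# Assumed minimal schema; the real table may carry more columns.
SCHEMA = """
CREATE TABLE chunk_embedding_jobs (
    id INTEGER PRIMARY KEY,
    chunk_table TEXT NOT NULL,
    chunk_id INTEGER NOT NULL,
    status TEXT NOT NULL DEFAULT 'pending',
    attempts INTEGER NOT NULL DEFAULT 0,
    last_error TEXT
);
"""

# One statement both selects the batch and flips it to 'running', so two
# workers polling at the same time cannot claim the same job.
CLAIM_SQL = """UPDATE chunk_embedding_jobs
SET status = 'running'
WHERE id IN (
    SELECT id FROM chunk_embedding_jobs
    WHERE status = 'pending'
    ORDER BY id
    LIMIT ?
)
RETURNING id, chunk_table, chunk_id"""


def claim(conn: sqlite3.Connection, batch_size: int) -> list[tuple]:
    """Claim up to batch_size pending jobs and return their rows."""
    with conn:  # commits on success, rolls back on exception
        # Fetch every RETURNING row before the transaction commits.
        return conn.execute(CLAIM_SQL, (batch_size,)).fetchall()
```

SQLite does not guarantee the order of rows emitted by `RETURNING`, so a caller that needs a stable order should sort the result.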
Tasks
7 of 7 done.
- t1 Service skeleton with pyproject.toml, src/, tests/
- t2 Atomic claim via UPDATE...RETURNING with subquery
- t3 vec0 + chunk attribution + queue update in one transaction
- t4 Three error classes mapped to terminal/non-terminal/budget-aware retries
- t5 CLI with daemon, drain-once, status commands
- t6 8 worker tests covering happy path, idempotency, resumability, all error paths
- t7 chunk_table name validation (rejects injection-shaped values)
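The name check in t7 exists because a table name cannot be bound as a SQL parameter and has to be interpolated into the statement text. A minimal sketch of such a guard follows; the helper name `vec_table_for` is hypothetical, and rejecting the empty string is an extra assumption beyond the stated character rule.

```python
import re

# Only ASCII letters, digits, and underscore are allowed.
_SAFE_TABLE = re.compile(r"[A-Za-z0-9_]+")


def vec_table_for(chunk_table: str) -> str:
    """Return the companion vec0 table name, or raise ValueError."""
    # fullmatch (rather than match with a trailing "$") also rejects a
    # trailing newline, which "$" would silently accept.
    if not _SAFE_TABLE.fullmatch(chunk_table):
        raise ValueError(f"unsafe chunk_table name: {chunk_table!r}")
    return f"{chunk_table}_vec0"
```

With this guard, a queue row carrying `docs_chunks` maps to `docs_chunks_vec0`, while a value such as `docs; DROP TABLE x` raises before any SQL is built.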
Changelog
- 2026-05-10T11:00:00Z planned→verified: Worker + CLI + 8 tests all pass.