EmbeddingProvider protocol + uniform error hierarchy
Header
Use the pencil to edit title, status, priority, and owner. Changing status auto-prepends a changelog entry.
Why
Worker, query-time, and adapter code all need a single shape to talk
about embeddings. The Protocol (inshared/josh_substrate/embedding/protocol.py) defines three callables
and four required attributes; the error hierarchy (in errors.py)
gives the worker uniform retry/terminate semantics regardless of which
backend failed. Without this layer every adapter is an island.
User stories
As an embedding-worker author, I want one protocol every adapter satisfies so that switching providers is a config flag, not a code change.
As an adapter author, I want a typed contract enforced by mypy so that drift is caught at PR time, not in production.
Acceptance criteria (EARS)
- Where an adapter implements `EmbeddingProvider`, it shall expose `model_id`, `model_version`, `dim`, `max_batch` as attributes and `embed_documents`, `embed_query`, `health_check` as async methods.
- When a backend call fails, the adapter shall raise a subclass of `EmbeddingError` (`ProviderConfigError`, `ProviderTransientError`, or `ProviderUnavailableError`); raw backend exceptions shall not cross the provider boundary.
- When `embed_documents` is called with `[]`, the adapter shall return `EmbeddingResult(vectors=[], ...)` without contacting the backend.
- When `embed_documents` returns, the result's `model_id` and `model_version` shall match `provider.model_id`/`model_version`.
Success determiner
Path
Runner
Clarifications needed
None.
Out of scope
None.
Dependencies
Plan
Pure-typing layer. protocol.py defines EmbeddingProvider (Protocol,
runtime_checkable) and EmbeddingResult (dataclass). errors.py is a
4-class hierarchy with a shared base. Both export fromjosh_substrate.embedding. New code in this package opts into strict
mypy via [[tool.mypy.overrides]].
Tasks
4 of 4 done.
- t1 Protocol + EmbeddingResult + Embedding type alias
- t2 EmbeddingError + 3 subclasses with documented retry semantics
- t3 Strict-mypy override for josh_substrate.embedding.*
- t4 Public surface re-exported from package __init__
Changelog
-
2026-05-10T11:00:00Z
planned→verifiedProtocol + errors written; mypy strict checks pass on the package.