HttpProvider — generic OpenAI-/TEI-compatible HTTP adapter
Why
Covers OpenAI's /v1/embeddings, Hugging Face TEI, vLLM with --task embed, and any self-hosted server speaking either format.
This is the escape hatch for deployments that already pay for an
embedding API or run their own embedding service: no need to touch
the substrate, just point JOSH_HTTP_EMBED_URL at it.
User stories
As an OpenAI-API customer, I want to use the embedding API I already pay for so that I don't run a parallel Modal account just for Josh.
As a self-hosted-TEI operator, I want to point Josh at my internal embedding server so that my data never leaves my network.
Acceptance criteria (EARS)
- When `JOSH_HTTP_FORMAT=openai`, the adapter shall send `{"input": [...], "model": ...}` and parse `{"data": [{"embedding": [...]}, ...]}`.
- When `JOSH_HTTP_FORMAT=tei`, the adapter shall send `{"inputs": [...]}` and parse a bare list of lists.
- When the endpoint returns 401/403, the adapter shall raise `ProviderConfigError`; 429/5xx → `ProviderTransientError`; network failure → `ProviderUnavailableError`.
- Where `JOSH_HTTP_API_KEY` is set, the adapter shall send it as `Authorization: Bearer <key>`; otherwise no auth header.
- When the response shape is unparseable for the configured format, the adapter shall raise `ProviderTransientError` with the format name and parse-failure detail.
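The format and auth criteria above can be sketched as a handful of pure functions. This is a minimal illustration, not the shipped adapter: `build_payload`, `build_headers`, and `parse_response` are hypothetical names, and `ProviderTransientError` is redefined locally as a stand-in for the substrate's real class.

```python
from __future__ import annotations

from typing import Any


class ProviderTransientError(Exception):
    """Local stand-in for the substrate's error class (assumption)."""


def build_payload(fmt: str, texts: list[str], model: str | None = None) -> dict[str, Any]:
    """Build the request body for the configured JOSH_HTTP_FORMAT."""
    if fmt == "openai":
        return {"input": texts, "model": model}
    if fmt == "tei":
        return {"inputs": texts}
    raise ValueError(f"unsupported format: {fmt!r}")


def build_headers(api_key: str | None) -> dict[str, str]:
    """Send a Bearer token only when a key is configured."""
    if api_key:
        return {"Authorization": f"Bearer {api_key}"}
    return {}


def _parse_openai(body: Any) -> list[list[float]]:
    # Expected shape: {"data": [{"embedding": [...]}, ...]}
    return [item["embedding"] for item in body["data"]]


def _parse_tei(body: Any) -> list[list[float]]:
    # Expected shape: a bare list of lists.
    if not isinstance(body, list) or not all(isinstance(v, list) for v in body):
        raise TypeError("expected a bare list of lists")
    return body


_PARSERS = {"openai": _parse_openai, "tei": _parse_tei}


def parse_response(fmt: str, body: Any) -> list[list[float]]:
    """Parse a decoded JSON body; report the format name and detail on failure."""
    parser = _PARSERS[fmt]
    try:
        return parser(body)
    except (KeyError, TypeError, IndexError) as exc:
        raise ProviderTransientError(f"unparseable {fmt} response: {exc!r}") from exc
```

Keeping these functions free of any HTTP client makes each criterion checkable without a live endpoint.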
Success determiner
Path
Runner
Skipped by default. Set JOSH_HTTP_EMBED_URL to a reachable OpenAI-compatible or TEI endpoint to run live. When the variable is unset, the default factory calls pytest.skip, preserving the contract that adapters degrade gracefully in unconfigured environments.
Clarifications needed
None.
Out of scope
None.
Dependencies
Plan
Module: shared/josh_substrate/embedding/providers/http_provider.py. httpx
is provided by the josh-substrate[http] extra. Two payload/parse functions
(_parse_openai, _parse_tei) are selected at construction. Vector
shape is verified after each response.
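One way to read "selected at construction" and "shape verified" is sketched below. The class name `HttpProviderSketch`, the `handle` method, and the exact shape rules (one vector per input, all vectors the same length) are assumptions for illustration. The plan does not say which error a shape mismatch raises, so the sketch uses `ValueError`.

```python
from __future__ import annotations

from typing import Any, Callable


def _parse_openai(body: Any) -> list[list[float]]:
    return [item["embedding"] for item in body["data"]]


def _parse_tei(body: Any) -> list[list[float]]:
    return list(body)


_PARSERS: dict[str, Callable[[Any], list[list[float]]]] = {
    "openai": _parse_openai,
    "tei": _parse_tei,
}


class HttpProviderSketch:
    """Hypothetical skeleton: the parser is chosen once, in __init__."""

    def __init__(self, fmt: str) -> None:
        if fmt not in _PARSERS:
            raise ValueError(f"unsupported format: {fmt!r}")
        self._parse = _PARSERS[fmt]

    def handle(self, texts: list[str], body: Any) -> list[list[float]]:
        """Parse a decoded response body and verify its shape."""
        vectors = self._parse(body)
        self._verify_shape(vectors, len(texts))
        return vectors

    @staticmethod
    def _verify_shape(vectors: list[list[float]], n_inputs: int) -> None:
        # Assumed rule 1: exactly one vector per input text.
        if len(vectors) != n_inputs:
            raise ValueError(f"expected {n_inputs} vectors, got {len(vectors)}")
        # Assumed rule 2: every vector has the same dimensionality.
        if len({len(v) for v in vectors}) > 1:
            raise ValueError("vectors differ in length")
```

Choosing the parser in the constructor means an unsupported format fails at startup instead of on the first request.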
Tasks
4 of 4 done.
- t1 HttpProvider class implementing the Protocol
- t2 openai + tei format support, env-driven selection
- t3 Status-code mapping to the three EmbeddingError flavours
- t4 health_check via HEAD on the embed URL
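Tasks t3 and t4 might look roughly like the sketch below. Several things here are assumptions: the error classes are local stand-ins, assumed to subclass a common `EmbeddingError`; `raise_for_status` is a hypothetical helper name; and the sketch uses the standard library's `urllib` rather than httpx so that it runs without extras. Treating only 2xx as healthy is also a guess, since the task list does not define it.

```python
from __future__ import annotations

import urllib.error
import urllib.request
from typing import Any, Callable


class EmbeddingError(Exception):
    """Assumed common base for the three error flavours."""


class ProviderConfigError(EmbeddingError):
    pass


class ProviderTransientError(EmbeddingError):
    pass


class ProviderUnavailableError(EmbeddingError):
    pass


def raise_for_status(status: int) -> None:
    """Map an HTTP status code to the error flavour named in the criteria."""
    if status in (401, 403):
        raise ProviderConfigError(f"endpoint rejected credentials (HTTP {status})")
    if status == 429 or 500 <= status <= 599:
        raise ProviderTransientError(f"endpoint temporarily failing (HTTP {status})")
    # Other statuses are not covered by the criteria; the caller decides.


def health_check(
    url: str,
    timeout: float = 5.0,
    opener: Callable[..., Any] = urllib.request.urlopen,
) -> bool:
    """Issue a HEAD request against the embed URL; True on a 2xx answer."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError and HTTPError both subclass OSError.
        return False
```

The `opener` parameter exists only so the check can be exercised with a fake transport in tests.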
Changelog
- 2026-05-10T11:00:00Z planned→verified: Adapter shipped; live test gated on JOSH_HTTP_EMBED_URL env.