Provider contract test suite (parameterized over adapters)
Header
Use the pencil to edit title, status, priority, and owner. Changing status auto-prepends a changelog entry.
Why
An adapter is "correct" if and only if it satisfies the Protocol's
contract. We make that assertable: a single pytest module
parameterized over every registered ProviderFactory runs the same
~12 tests against each adapter. Drop in a new adapter, append its
factory, run the suite — pass = production-ready. Without this suite,
every adapter has its own ad-hoc tests and "works" means whatever
each author thought to verify.
User stories
As a new-adapter author, I want one test module that proves my adapter conforms so that I don't reinvent the testing wheel.
As a deploy gatekeeper, I want live adapters tested against real backends before deploy so that provider drift surfaces locally, not in production.
Acceptance criteria (EARS)
- When `uv run poe test-contract` runs, the system shall execute the contract suite against every locally-runnable adapter (currently `local_st`).
- Where an adapter requires credentials or remote services (Modal, HTTP), the factory shall `pytest.skip` rather than error when the env isn't configured.
- When `uv run poe test-live` runs, the system shall execute the same contract suite against real Modal/HTTP backends if their env vars are set.
- Where the contract suite finds a failing adapter, the failure message shall identify which adapter and which contract clause failed.
Success determiner
Path
Runner
Clarifications needed
None.
Out of scope
None.
Dependencies
Plan
conftest.py defines ALL_PROVIDER_FACTORIES and a session-scopedprovider fixture; test_provider_contract.py holds the parameterized
tests. Twelve test functions cover shape (dim, dtype, query shape),
batch behavior (empty, order, repeat-determinism), robustness (long
text, concurrency, health check), semantic sanity (similar pair
outranks dissimilar), and provenance (result echoes model identity).
Tasks
4 of 4 done.
- t1 Contract test module with 12 parameterized tests
- t2 Factory registry pattern (ALL_PROVIDER_FACTORIES) with skip-on-missing-creds
- t3 Anchor pairs (similar/dissimilar) for semantic sanity test
- t4 live + contract markers wired into pytest config
Changelog
-
2026-05-10T11:00:00Z
planned→verifiedContract suite written; 12 tests pass against LocalSTProvider.