substrateverifiedp0

Provider contract test suite (parameterized over adapters)

embedding-contract-tests · updated — · owner rritz

Use the pencil to edit title, status, priority, and owner. Changing status auto-prepends a changelog entry.

An adapter is "correct" if and only if it satisfies the Protocol's
contract. We make that assertable: a single pytest module
parameterized over every registered ProviderFactory runs the same
~12 tests against each adapter. Drop in a new adapter, append its
factory, run the suite — pass = production-ready. Without this suite,
every adapter has its own ad-hoc tests and "works" means whatever
each author thought to verify.

As a new-adapter author, I want one test module that proves my adapter conforms so that I don't reinvent the testing wheel.

As a deploy gatekeeper, I want live adapters tested against real backends before deploy so that provider drift surfaces locally, not in production.

  1. When `uv run poe test-contract` runs, the system shall execute the contract suite against every locally-runnable adapter (currently `local_st`).
  2. Where an adapter requires credentials or remote services (Modal, HTTP), the factory shall `pytest.skip` rather than error when the env isn't configured.
  3. When `uv run poe test-live` runs, the system shall execute the same contract suite against real Modal/HTTP backends if their env vars are set.
  4. Where the contract suite finds a failing adapter, the failure message shall identify which adapter and which contract clause failed.
kindtest_file

Path

shared/josh_substrate/tests/embedding/test_provider_contract.py

Runner

uv run pytest shared/josh_substrate/tests/embedding -m 'contract and not live' -v

None.

None.

conftest.py defines ALL_PROVIDER_FACTORIES and a session-scoped
provider fixture; test_provider_contract.py holds the parameterized
tests. Twelve test functions cover shape (dim, dtype, query shape),
batch behavior (empty, order, repeat-determinism), robustness (long
text, concurrency, health check), semantic sanity (similar pair
outranks dissimilar), and provenance (result echoes model identity).

4 of 4 done.

  • t1 Contract test module with 12 parameterized tests
  • t2 Factory registry pattern (ALL_PROVIDER_FACTORIES) with skip-on-missing-creds
  • t3 Anchor pairs (similar/dissimilar) for semantic sanity test
  • t4 live + contract markers wired into pytest config
  • 2026-05-10T11:00:00Z plannedverified Contract suite written; 12 tests pass against LocalSTProvider.

docs/spec/embedding-contract-tests.html · generated by bin/build-spec.py