Volume-as-data-host bind mount
Why
Every byte of substrate state — the SQLite file, raw payloads, normalized
Markdown, advisory locks, backup staging — lives at /data on the host
and is bind-mounted into containers as /data:/data. The host-side /data lives on a durable, redundant device (a block-storage volume on
cloud, a RAID array on bare metal); the container's root filesystem is
treated as throwaway. This is what makes the substrate portable across
compute and what makes Kamal redeploys safe — recreate the container
freely, the substrate is unaffected.
This spec was previously satisfied by a DigitalOcean 500 GB block-storage
volume bind-mounted into the droplet; that droplet was destroyed
2026-05-10. The architectural commitment carries forward unchanged onto
the OVHcloud bare-metal host (see substrate-bare-metal-host); the only
mechanical change is that the durable /data device is now a Soft RAID
10 NVMe array rather than a detachable block-storage volume.
User stories
As an operator redeploying or recreating a container, I want substrate state to survive container swap/recreate so that I never lose substrate data to a routine `kamal deploy` or `docker rm`.
As a Kamal-deployed container (josh-core, josh-ingester, josh-embedder), I want to read and write substrate state at /data inside the container so that my writes land on the host's durable device without leaning on Docker named volumes.
As an OSS self-hoster picking a host shape, I want a clear "everything durable goes under /data on a redundant device" so that I can pick any host (cloud volume, bare-metal RAID, NAS) that satisfies that contract.
Acceptance criteria (EARS)
- The host shall expose `/data` as a path backed by a redundant durable device (block-storage volume mounted via UUID in `/etc/fstab`, OR a RAID array, OR an equivalent — the contract is 'durable + redundant', not the specific layout).
- While the substrate is running, the SQLite file `/data/josh.db` and its `-wal` and `-shm` companions shall live on `/data`, not on a non-durable device.
- When `kamal deploy` runs for josh-core, josh-ingester, or josh-embedder, the resulting container's `/data` shall be a Docker bind mount of the host's `/data` (not a named volume).
- Where a service writes to a non-/data path (e.g. /tmp), that data shall be treated as throwaway.
- If the device underlying /data uses fstab (e.g. a separate partition), then fstab shall reference it by UUID, not device name, so that a device re-enumeration on reboot doesn't break the mount.
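The UUID criterion can be checked mechanically. The following is a minimal sketch, assuming a standard whitespace-separated fstab; the function name `data_fstab_ref` is hypothetical and not part of the substrate tooling.

```shell
# Hypothetical helper (not part of the substrate tooling): reads fstab
# content on stdin and reports how the /data entry names its device.
# Prints "uuid", "device-name", or "no-entry". Anything other than a
# UUID= reference (including LABEL=) is reported as "device-name".
data_fstab_ref() {
  awk '
    /^[[:space:]]*#/ { next }            # skip comment lines
    $2 == "/data" {
      found = 1
      kind = ($1 ~ /^UUID=/) ? "uuid" : "device-name"
      print kind
    }
    END { if (!found) print "no-entry" }
  '
}
```

On the host this would be run as `data_fstab_ref < /etc/fstab`. A result of `no-entry` is acceptable when /data is a plain directory on the root partition, since the criterion only applies when fstab is involved.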
Success determiner
Command
set -euo pipefail
# /data is a real directory on the host
ssh josh 'test -d /data && test -f /data/josh.db && echo OK'
# Each substrate container bind-mounts /data
for svc in josh-core josh-ingester josh-embedder; do
  ssh josh "docker inspect \$(docker ps -qf name=$svc) \
    --format '{{ range .Mounts }}{{ .Source }} -> {{ .Destination }} ({{ .Type }}){{ println }}{{ end }}'" \
    | grep "^/data -> /data (bind)$"
done
Expect
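A passing run prints `OK` followed by one `/data -> /data (bind)` line per service. The grep'd lines are the load-bearing ones: if any inspect shows `volume` instead of `bind`, grep finds no match, the script exits non-zero, and the deploy is misconfigured per CLAUDE.md.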
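When the check fails, it helps to know why. The sketch below classifies a single line of the inspect output produced by the format string in the command; `classify_mount` is a hypothetical name used for illustration only.

```shell
# Hypothetical helper: classify one line of the inspect output, in the
# "<source> -> <destination> (<type>)" shape used by the command above.
classify_mount() {
  case "$1" in
    "/data -> /data (bind)")   echo "ok" ;;
    *" -> /data (volume)")     echo "named-volume" ;;
    *" -> /data ("*")")        echo "wrong-source-or-type" ;;
    *)                         echo "unrelated" ;;
  esac
}
```

A named volume shows up with a source under /var/lib/docker/volumes/ and type `volume`, which is the misconfiguration this spec forbids. Mounts for other destinations are reported as `unrelated` and can be ignored.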
Clarifications needed
None.
Out of scope
- Off-volume backup — separate spec (`substrate-nightly-backup`).
- Multi-region replication — Step 2 territory.
- The specific durable-device choice (DO volume vs OVHcloud RAID vs other) — covered by `substrate-bare-metal-host` for the v1 host shape.
Dependencies
None.
Plan
## The contract
- Host: /data exists, is writable by the docker daemon's effective user,
  and lives on a durable + redundant device (cloud block-storage volume
  OR RAID array OR equivalent).
- Container: each substrate service mounts /data:/data as a Docker bind
  (never a named volume). Env vars SUBSTRATE_DB_PATH=/data/josh.db and
  CORPUS_DIR=/data/corpus resolve into that path.
- State layout under /data:
  - /data/josh.db (+ -wal, -shm): SQLite substrate
  - /data/corpus/<source>/bodies/{raw,markdown}/...: bodies
  - /data/locks/ingest-<source>.lock: flock files
  - /data/backups/: local snapshot staging if used (the nightly backup
    script writes to a mktemp -d under PrivateTmp by default)
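The "resolve into that path" requirement can be sanity-checked with a plain prefix test. This is a sketch only: `under_data` is a hypothetical helper, and the defaults are filled in from the contract so the snippet runs standalone.

```shell
# Hypothetical check: succeed only if a configured path sits under /data.
# Plain prefix match; it does not resolve symlinks or ".." components.
under_data() {
  case "$1" in
    /data/*) return 0 ;;
    *)       return 1 ;;
  esac
}

# Values from the contract, with defaults so the sketch runs standalone.
SUBSTRATE_DB_PATH="${SUBSTRATE_DB_PATH:-/data/josh.db}"
CORPUS_DIR="${CORPUS_DIR:-/data/corpus}"

for p in "$SUBSTRATE_DB_PATH" "$CORPUS_DIR"; do
  if under_data "$p"; then
    echo "ok: $p"
  else
    echo "outside /data: $p"
  fi
done
```

Run inside a container, this would flag an env var that points at a throwaway location such as /tmp.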
## Why bind-mount, not named volume
Docker named volumes live under /var/lib/docker/volumes/ on the host's
root filesystem. That defeats the portability goal: the substrate would
be tied to the specific droplet/server's lifecycle, recoverable only by
copying out of the daemon's storage area. A bind mount is just "this
host directory is that container directory" — the substrate lives where
we put it, the container is the throwaway part.
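Docker decides between the two mount kinds from the source half of a `source:destination` spec: an absolute host path is a bind mount, a bare name is a named volume. A small sketch of that rule, with `mount_kind` as a hypothetical helper name:

```shell
# Hypothetical helper: report how Docker would interpret a volume spec
# passed to --volume, based on the text before the first colon.
mount_kind() {
  local src
  case "$1" in
    *:*) ;;                                  # has a source part, keep going
    *)   echo "anonymous-volume"; return ;;  # destination only, no source
  esac
  src="${1%%:*}"
  case "$src" in
    /*) echo "bind" ;;    # absolute host path
    *)  echo "volume" ;;  # bare name, stored under /var/lib/docker/volumes/
  esac
}
```

For example, `mount_kind /data:/data` reports `bind`, while a spec such as `josh-data:/data` (an invented name) would report `volume` and violate this spec.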
## How this maps onto the v1 host
The v1 host is OVHcloud Advance-1 2024 with Soft RAID 10 across 4 × 960
GB NVMe drives (see substrate-bare-metal-host). /data is either a
separate ext4 partition on the array (mounted via fstab UUID with defaults,noatime,discard,nofail) or a directory on the root partition
if the OVHcloud installer carved a single partition. Either satisfies
the durability contract.
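For the separate-partition case, the fstab entry could look like the output of the sketch below. The helper name `data_fstab_line` and the UUID are placeholders, and the trailing `0 2` fields (no dump, fsck after root) are an assumption rather than something this spec states.

```shell
# Hypothetical helper: render an fstab entry for /data from a filesystem UUID.
# The mount options come from the text above; the trailing "0 2" is an
# assumption, not part of the spec.
data_fstab_line() {
  printf 'UUID=%s /data ext4 defaults,noatime,discard,nofail 0 2\n' "$1"
}

# Placeholder UUID for illustration only.
data_fstab_line "00000000-0000-0000-0000-000000000000"
```

On the host, the real UUID would come from something like `blkid -s UUID -o value <device>`, where the device name depends on how the installer laid out the array.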
Each service's config/deploy.yml carries volumes: ["/data:/data"].
Env vars SUBSTRATE_DB_PATH=/data/josh.db and CORPUS_DIR=/data/corpus
are set in the container env. There are no Docker named volumes anywhere.
See CLAUDE.md "Where data lives — volume-as-data-host" and
"Kamal / Docker mount pattern" for the canonical writeup.
Tasks
0 of 6 done.
- t1 Host /data path provisioned on the OVHcloud RAID 10 array (separate partition with UUID fstab entry OR directory on root — either fine per the contract)
- t2 josh-core deploy.yml volume mount + env vars (`SUBSTRATE_DB_PATH=/data/josh.db`, `CORPUS_DIR=/data/corpus`)
- t3 josh-ingester deploy.yml volume mount + env vars
- t4 josh-embedder deploy.yml volume mount + env vars
- t5 Verification command in success_determiner run post-deploy on each service
- t6 CLAUDE.md sections updated to reflect bare-metal RAID 10 backing instead of DO block-storage volume
Changelog
- 2026-05-10T18:00:00Z shipped→planned: DigitalOcean droplet (which carried the prior bind-mount setup) was destroyed 2026-05-10 ahead of the OVHcloud move. Spec rewritten to drop DO-specific framing; the architectural commitment (durable `/data` bind-mounted into containers) is unchanged, and the underlying device flips from "DO 500 GB block-storage volume" to "OVHcloud Soft RAID 10 NVMe array." Tasks reset to false; will land via `substrate-bare-metal-host` provision sequence.
- 2026-05-09T08:00:00Z verified→shipped (a414587): Production-stable for ~24h on the prior DO droplet. CLAUDE.md sections polished and merged. CRS smoke backfill ran cleanly against /data/josh.db with no issues.
- 2026-05-08T20:30:00Z in_progress→verified (a184f09): josh-core + josh-ingester deployed on the prior DO droplet; bind-mount confirmed on both via docker inspect.
- 2026-05-08T18:00:00Z (new)→in_progress (26b3b56): Provisioned DO droplet + 500 GB volume; mounted at /data via fstab/UUID.