Database migrations
How Josh runs schema migrations against the production SQLite file under Kamal.
Storage backend: SQLite + FTS5 + sqlite-vec. Alembic still runs the same way as it would against Postgres; only the driver and target change.
Use a Kamal pre-deploy hook that runs Alembic against the new image before the new container boots. The exact pattern:
    #!/bin/bash -e
    kamal app exec -p -q -d "$KAMAL_DESTINATION" --version "$KAMAL_VERSION" "alembic upgrade head"

This is the canonical Kamal pattern: the Rails community standardized it first (gist by bibstha), and Alembic + FastAPI + Kamal projects use it today. The pattern is identical for SQLite, because Alembic doesn't care about the target backend at the orchestration level.
What's different from a Postgres flow
The substrate is a single SQLite file at /data/josh.db (with WAL companions), not a Postgres accessory. So:
- No kamal accessory commands. No accessory exists.
- The DB target is a path, not a host:port. Alembic env reads SUBSTRATE_DB_PATH=/data/josh.db from env (or a SQLAlchemy URL like sqlite+aiosqlite:////data/josh.db).
- josh-core and josh-ingester both bind-mount /data from the host so the same file is visible from each container.
- Migrations run from josh-core (the canonical image with Alembic + the shared package), and the file is shared via volume mount with the ingester.
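As a rough illustration of the "path, not host:port" point, the sketch below turns SUBSTRATE_DB_PATH into the SQLAlchemy URL form quoted above. The function name and the fallback default are assumptions made for this example, not taken from the real env.py.

```python
import os

# Path named in this document; using it as a fallback is an assumption.
DEFAULT_DB_PATH = "/data/josh.db"


def sqlite_url_from_env(environ=None):
    """Build a SQLAlchemy URL for the aiosqlite driver from SUBSTRATE_DB_PATH."""
    environ = os.environ if environ is None else environ
    db_path = environ.get("SUBSTRATE_DB_PATH", DEFAULT_DB_PATH)
    # "sqlite+aiosqlite:///" followed by an absolute path yields four slashes.
    return "sqlite+aiosqlite:///" + db_path


print(sqlite_url_from_env({"SUBSTRATE_DB_PATH": "/tmp/josh-dev.db"}))
```

The four slashes are three from the URL scheme plus the leading slash of the absolute path, which is easy to get wrong when hand-writing the URL.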
Why pre-deploy, not post-deploy
Pre-deploy (recommended): the hook runs after the new image is built and pulled, but before the new app container replaces the old one. If migration fails, the deploy aborts: the old container keeps serving traffic against the old schema, with no inconsistency.
Post-deploy (alternative): runs against the already-swapped new container with --reuse. Risk: brief window where new code runs against old schema. Fine for additive migrations, dangerous for breaking ones.
We standardize on pre-deploy.
Why --version $KAMAL_VERSION, not --reuse
| Flag | What it does | When to use |
|---|---|---|
| --version $KAMAL_VERSION | Spins up an ephemeral container from the new image (the one Kamal just built and pulled) | Pre-deploy hooks: the new image has the new migrations |
| --reuse | Runs in the currently running app container | Manual ad-hoc commands, post-deploy hooks |
In a pre-deploy hook the old container is still running. It has the old migration files. We want the new migrations from the new image. So --version $KAMAL_VERSION.
Flag breakdown
    kamal app exec -p -q -d "$KAMAL_DESTINATION" --version "$KAMAL_VERSION" "alembic upgrade head"

| Flag | Purpose |
|---|---|
| -p (alias --primary) | Run on primary server only. Without this the migration runs on every host, racing for the lock and producing duplicate "already applied" errors. |
| -q (alias --quiet) | Suppress verbose Kamal output; still prints command output. |
| -d "$KAMAL_DESTINATION" | Destination env (e.g. production). Optional today (single destination) but future-proof. |
| --version "$KAMAL_VERSION" | Pin to the deploy's image. Set automatically by Kamal in the hook env. |
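If the hook were ever driven from Python rather than bash, the same flags could be assembled as an argument list. This is a hypothetical helper for illustration only; it uses just the flags listed in the table, and the "production" default mirrors the fallback in the committed hook script.

```python
import shlex


def migration_command(version, destination="production"):
    """Return the argv list for the pre-deploy migration command."""
    return [
        "kamal", "app", "exec",
        "-p",                   # primary server only
        "-q",                   # quiet Kamal output
        "-d", destination,      # destination env
        "--version", version,   # ephemeral container from the new image
        "alembic upgrade head",
    ]


argv = migration_command("abc123")  # "abc123" is a placeholder version
print(shlex.join(argv))
```

Passing a list keeps "alembic upgrade head" as a single argument, so no extra layer of quoting is needed on the calling side.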
What the hook does in deploy order
    1. Local: kamal deploy
    2. Local: build image on ritz remote builder, push to Docker Hub
    3. SSH:   pull image on josh server
    4. Local: → run .kamal/hooks/pre-deploy ←
         ↳ kamal app exec --version <new> "alembic upgrade head"
         ↳ The ephemeral migration container mounts /data the same way the running app does
         ↳ Schema in /data/josh.db now matches the version the new image expects
         ↳ Old josh-core container still serving requests against the same file
           (SQLite serializes the writer; the running app's reads are fine in WAL)
         ↳ If migration fails, hook returns nonzero, deploy aborts
    5. SSH:   start new josh-core container (will see the new schema)
    6. Local: kamal-proxy waits for new container's /health, then swaps traffic
    7. SSH:   stop old container

Setup status
Pre-deploy hook is wired and live as of 2026-05-09. Two SQLite migrations are applied in production: 0001_state_tables (ingestion state) and 0002_crs (CRS reports + FTS5 + vec0). Future deploys auto-apply migration #3 onward via the hook.
Migrations live in the shared/josh_substrate/ package — every service installs this package and has access to the same migrations. See repo-structure for the rationale.
Files in place (or to be re-authored):
- shared/josh_substrate/pyproject.toml: needs deps swap: aiosqlite (driver) + sqlalchemy[asyncio] + alembic + pydantic + sqlite-vec (PyPI wheel that bundles the extension binary). Drop psycopg[binary].
- shared/josh_substrate/alembic.ini: keep script_location = josh_substrate:migrations.
- shared/josh_substrate/src/josh_substrate/migrations/env.py: async migration runner reading SUBSTRATE_DB_PATH from env, using the sqlite+aiosqlite driver. Sets PRAGMA foreign_keys = ON and PRAGMA journal_mode = WAL on connection.
- shared/josh_substrate/src/josh_substrate/migrations/versions/0001_state_tables.py: first migration, state tables in SQLite-flavored DDL.
- Project-root Dockerfile: copies shared/josh_substrate and pip-installs it; copies alembic.ini to /app/alembic.ini. Every role-container ships from this image, so any role has access to alembic + migrations.
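The two PRAGMAs that env.py sets can be tried in isolation with the standard-library sqlite3 module. This is only a sketch: the real runner is async and uses sqlite+aiosqlite, and connect_with_pragmas is a name invented for this example.

```python
import os
import sqlite3
import tempfile


def connect_with_pragmas(db_path):
    """Open a SQLite connection and apply the two PRAGMAs named above."""
    conn = sqlite3.connect(db_path)
    # Foreign-key enforcement is off by default and is a per-connection setting.
    conn.execute("PRAGMA foreign_keys = ON")
    # WAL persists in the file once set; setting it again is harmless.
    conn.execute("PRAGMA journal_mode = WAL")
    return conn


# Throwaway file: WAL needs a real file, not an in-memory database.
db_path = os.path.join(tempfile.mkdtemp(), "josh-dev.db")
conn = connect_with_pragmas(db_path)
foreign_keys = conn.execute("PRAGMA foreign_keys").fetchone()[0]
journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
print(foreign_keys, journal_mode)
conn.close()
```

Because foreign_keys is per connection, every service that opens the file needs to set it as well; the migration runner setting it does not carry over to other connections.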
In the running container, alembic upgrade head (run from /app) reads /app/alembic.ini, follows josh_substrate:migrations, and applies migrations from inside the installed package against the SQLite file at SUBSTRATE_DB_PATH.
Pre-deploy hook lives at .kamal/hooks/pre-deploy (project root). The hook targets --roles web so alembic runs once across the role-containers (not once per role). Web is the conceptual schema owner; ingester (and embedder, when activated) pick up the new schema implicitly via the shared /data/josh.db bind mount.
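How a second process "picks up the new schema implicitly" can be seen with two plain sqlite3 connections to one file. This is a minimal sketch under assumptions: the table and column names are invented, and the two connections stand in for the migration container and a long-running service.

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "josh-dev.db")

# "migrator" stands in for the ephemeral migration container,
# "app" for a running service with the same file mounted.
migrator = sqlite3.connect(db_path, isolation_level=None)
migrator.execute("PRAGMA journal_mode = WAL")
migrator.execute("CREATE TABLE state (id INTEGER PRIMARY KEY, note TEXT)")
migrator.execute("INSERT INTO state (note) VALUES ('before')")

app = sqlite3.connect(db_path, isolation_level=None)

# The migrator opens a write transaction and changes the schema...
migrator.execute("BEGIN IMMEDIATE")
migrator.execute("ALTER TABLE state ADD COLUMN source TEXT")

# ...while the app still reads the last committed state without blocking.
rows_during = app.execute("SELECT id, note FROM state").fetchall()

migrator.execute("COMMIT")

# After the commit, the app's next query can use the new column.
rows_after = app.execute("SELECT id, note, source FROM state").fetchall()

print(rows_during, rows_after)
migrator.close()
app.close()
```

No reconnect is needed on the reading side: SQLite notices the schema change on the next statement and reloads it.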
First deploy on a fresh host
The pre-deploy hook breaks kamal setup on a fresh host. Reason: the hook's kamal app exec --version <new> "alembic upgrade head" internally does docker run --env-file .kamal/apps/josh/env/roles/web.env, but that env file isn't pushed to the host until kamal starts the first container, which only happens after the pre-deploy hook. Chicken and egg.
The deploy fails with:
    docker: open .kamal/apps/josh/env/roles/web.env: no such file or directory
    ERROR (Kamal::Cli::HookError): Hook `pre-deploy` failed

This does not affect subsequent deploys (env files persist on the host once written), but it blocks every fresh-host bootstrap. The workaround:
    # Run from project root.

    # 1. Skip the hook on first setup: env files get pushed during this step,
    #    container starts, healthcheck passes, but /data/josh.db is empty.
    env LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 kamal setup --skip-hooks

    # 2. Apply migrations manually against the now-running web container.
    env LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 kamal app exec --reuse --roles web "alembic upgrade head"

    # 3. Confirm.
    env LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 kamal app exec --reuse --roles web "alembic current"
    # Expected: e.g. "0005 (head)"

After this one-time bootstrap, all subsequent kamal deploy runs apply migrations cleanly via the pre-deploy hook. The hook's logic is correct; only the first-deploy ordering is broken.
Why not fix the hook? The hook would need to ssh to each host, check for the env file, and conditionally run the migration — which adds a non-trivial probe step on every deploy to handle a once-per-host edge case. Documenting the bootstrap workaround is the cheaper solution. Revisit if this trips up multiple operators.
Manual / ad-hoc migration runs
For now, applying migrations is a manual step after each deploy that contains schema changes. Run from project root (kamal looks for config/deploy.yml relative to cwd). Pass --roles web so the command runs against the web role's container, the conceptual schema owner.
    # Run from project root.
    #
    # CRITICAL: Always wrap kamal commands with locale env vars on macOS. kamal v2.9.0
    # hits a Ruby US-ASCII encoding error without them. Skip and you get:
    #   ERROR (ArgumentError): Exception while executing on host: invalid byte sequence in US-ASCII
    env LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 kamal app exec --reuse --roles web "alembic upgrade head"

    # Inspect current schema version
    env LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 kamal app exec --reuse --roles web "alembic current"

    # Author a new migration (use script.py.mako template; edit the generated file
    # under shared/josh_substrate/src/josh_substrate/migrations/versions/)
    env LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 kamal app exec --reuse --roles web "alembic revision -m 'add fr_documents'"

    # Roll back one revision (rarely needed; usually fix-forward)
    env LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 kamal app exec --reuse --roles web "alembic downgrade -1"

Gotcha 1: chained commands inside the quoted string are not run by a shell inside the container. Don't write "alembic current && alembic upgrade head": the && is evaluated outside the container, so the second alembic runs somewhere alembic isn't on the PATH. Run two separate kamal app exec calls, or wrap in sh -c '...'.

Gotcha 2: alembic revision writes the new migration into the installed package inside the container. That file lives in the container's filesystem, not in your repo. You'll need to copy it out (docker cp) or, far better, author migrations by hand in shared/josh_substrate/src/josh_substrate/migrations/versions/ and rebuild the image. We may add a small developer script for this when it becomes friction.
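Both the locale wrapper and the sh -c workaround from Gotcha 1 are easy to get wrong by hand, so a small helper could build them. The sketch below is hypothetical (kamal_env and chained are invented names) and assumes the sh -c form is passed through to the container intact, as the gotcha suggests.

```python
import os
import shlex


def kamal_env(base=None):
    """Copy the environment and force the UTF-8 locale the commands above set."""
    env = dict(os.environ if base is None else base)
    env["LANG"] = "en_US.UTF-8"
    env["LC_ALL"] = "en_US.UTF-8"
    return env


def chained(*commands):
    """Join commands with && and wrap them in sh -c so one shell runs them all."""
    return "sh -c " + shlex.quote(" && ".join(commands))


argv = ["kamal", "app", "exec", "--reuse", "--roles", "web",
        chained("alembic current", "alembic upgrade head")]
print(argv[-1])
# A caller could then run: subprocess.run(argv, env=kamal_env(), check=True)
```

shlex.quote handles the single-quoting, so commands that themselves contain quotes (such as a revision message) do not need manual escaping.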
The wired pre-deploy hook
The hook script (committed at .kamal/hooks/pre-deploy):

    #!/bin/bash
    set -e

    service_dir=$(basename "$PWD")

    if [ "$service_dir" != "josh-core" ]; then
      echo "[pre-deploy] cwd=$service_dir — skipping alembic (only josh-core owns migrations)"
      exit 0
    fi

    echo "[pre-deploy] [$KAMAL_VERSION] running alembic upgrade head against new image…"
    kamal app exec -p -q -d "${KAMAL_DESTINATION:-production}" --version "$KAMAL_VERSION" "alembic upgrade head"
    echo "[pre-deploy] [$KAMAL_VERSION] migration complete"

Why the cwd gate: .kamal is shared via symlink. kamal deploy from josh-ingester/ would also find this hook, but josh-ingester's Dockerfile doesn't COPY alembic.ini, so running alembic there would fail with "no config file". The check on basename "$PWD" keeps josh-ingester deploys clean.
For local iteration on a migration without a full deploy, run Alembic against a local SQLite file:
    cd shared/josh_substrate
    SUBSTRATE_DB_PATH=/tmp/josh-dev.db alembic upgrade head
    SUBSTRATE_DB_PATH=/tmp/josh-dev.db alembic revision --autogenerate -m "msg"

The SUBSTRATE_DB_PATH env var points the Alembic env at a throwaway DB so you can iterate on migration shape before pushing to a real environment.
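When iterating against a throwaway file, it can help to check which revision the file is at without going through Alembic. Alembic records the applied revision in a table named alembic_version with a version_num column; the sketch below reads it with the standard library. The helper name and the "0002" revision id are placeholders, not values from the real migrations.

```python
import os
import sqlite3
import tempfile


def current_revision(db_path):
    """Return the revision stored in alembic_version, or None if unmigrated."""
    conn = sqlite3.connect(db_path)
    try:
        has_table = conn.execute(
            "SELECT 1 FROM sqlite_master "
            "WHERE type = 'table' AND name = 'alembic_version'"
        ).fetchone()
        if has_table is None:
            return None
        row = conn.execute("SELECT version_num FROM alembic_version").fetchone()
        return row[0] if row else None
    finally:
        conn.close()


# Demo against a throwaway file; the table is created by hand here
# only to imitate what Alembic would have written.
db_path = os.path.join(tempfile.mkdtemp(), "josh-dev.db")
before = current_revision(db_path)

conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE alembic_version (version_num VARCHAR(32) NOT NULL)")
conn.execute("INSERT INTO alembic_version VALUES ('0002')")
conn.commit()
conn.close()

after = current_revision(db_path)
print(before, after)
```

Note that sqlite3.connect creates an empty file if the path does not exist, which is fine for a scratch database but worth keeping in mind before pointing this at a real path.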
Failure modes
| Failure | Effect | Recovery |
|---|---|---|
| Migration fails (SQL error) | Pre-deploy hook exits nonzero, deploy aborts before swap. Old container still serving against old schema. | Fix migration, redeploy. |
| Hook script not executable | Kamal silently skips it. Migration doesn't run. | chmod +x .kamal/hooks/pre-deploy. |
| Hook runs but no -p flag | Migration fires on every host, races. With single-host deploy this is harmless; with multi-host it produces duplicate-application errors. | Add -p. |
| --reuse used in pre-deploy | Old container's old code/migrations run. New schema not applied. New container will fail healthcheck or run against wrong schema. | Use --version $KAMAL_VERSION in pre-deploy. |
| Migration runs while ingester is mid-write | SQLite serializes writers: the migration's BEGIN IMMEDIATE will wait for the ingester's transaction to commit, then run. With WAL it's a small pause, not a failure. | Schedule migrations during ingester quiet windows if any single migration is slow. |
| Bad migration leaves SQLite in mixed state | SQLite has DDL transactions for most statements; partial application is rare. If a migration hits a non-transactional statement and fails, schema may be partly applied. | Hand-write a recovery migration to align the schema with what alembic_version says. |
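The claim in the last row, that SQLite DDL is transactional, can be checked directly. The sketch below simulates a migration that fails halfway and shows that the first CREATE TABLE is undone by the rollback. Table names are illustrative, and this demonstrates SQLite's behaviour only; whether Alembic wraps a given migration in one transaction depends on how env.py is configured.

```python
import sqlite3

# isolation_level=None: the module opens no implicit transactions,
# so BEGIN / COMMIT / ROLLBACK below are the only transaction control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE state (id INTEGER PRIMARY KEY)")


def table_names(connection):
    """List user tables currently present in the schema."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


conn.execute("BEGIN IMMEDIATE")
try:
    conn.execute("CREATE TABLE crs_reports (id INTEGER PRIMARY KEY)")
    # Simulated mid-migration failure: this table already exists.
    conn.execute("CREATE TABLE state (id INTEGER PRIMARY KEY)")
    conn.execute("COMMIT")
except sqlite3.OperationalError:
    conn.execute("ROLLBACK")

tables_after_failure = table_names(conn)
print(tables_after_failure)
conn.close()
```

If the same two statements ran outside a transaction, crs_reports would survive the failure, which is the "mixed state" the table warns about.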
v2 considerations
- Multi-environment support. When we add staging, the -d "$KAMAL_DESTINATION" already targets the right env. Each destination would have its own deploy.yml.
- Backups before risky migrations. Take a Litestream snapshot before any destructive migration. With Litestream, this is automatic: every migration is implicitly captured by the streaming WAL.
- Long migrations. SQLite doesn't have CREATE INDEX CONCURRENTLY. Build large indexes during low-traffic windows; for very large tables, build the index on a side copy and swap the file (rare; not a v1 concern).
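Where Litestream is not available (a local scratch database, for instance), a one-off copy before a destructive migration can be taken with the standard library's online backup API. This is a stand-in technique and not how Litestream works; the snapshot function, file names, and table are hypothetical.

```python
import os
import sqlite3
import tempfile


def snapshot(db_path, snapshot_path):
    """Copy a SQLite database to snapshot_path using the online backup API."""
    source = sqlite3.connect(db_path)
    target = sqlite3.connect(snapshot_path)
    try:
        source.backup(target)  # consistent copy of the committed state
    finally:
        target.close()
        source.close()


workdir = tempfile.mkdtemp()
db_path = os.path.join(workdir, "josh-dev.db")
snapshot_path = os.path.join(workdir, "josh-dev.pre-migration.db")

live = sqlite3.connect(db_path)
live.execute("CREATE TABLE state (note TEXT)")
live.execute("INSERT INTO state VALUES ('keep me')")
live.commit()
live.close()

snapshot(db_path, snapshot_path)

# Destructive change on the live file; the snapshot is unaffected.
live = sqlite3.connect(db_path)
live.execute("DROP TABLE state")
live.commit()
live.close()

check = sqlite3.connect(snapshot_path)
restored_rows = check.execute("SELECT note FROM state").fetchall()
check.close()
print(restored_rows)
```

Unlike copying the file with cp, the backup API does not require handling the WAL companion files separately.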