Architecture

The substrate that everything else hangs off: the server, the volume, the containers, and the storage stack. This is the load-bearing design of Josh Foundation v1.

Containers

Step 1 ships two Docker containers (deployed via Kamal). A future stage will add a third (josh-web for the agent UI). SQLite is embedded — no separate database container.

Container	Role
`josh-core`	FastAPI. Substrate REST API + MCP server. Owns the SQLite file at `/data/josh.db` (bind-mounted from the host). What external agents (Cowork, Cursor, ChatGPT desktop, custom) call with a token.
`josh-ingester`	Headless ETL workers. Pulls public federal data on a schedule, parses, normalizes, and writes into the same SQLite file (bind-mounted from the host). Run state lives in SQLite tables (`ingestion_runs`, `ingestion_logs`, etc.). No UI of its own.
`josh-web` (planned)	Next.js. Future agent UI layered on the substrate. Not in v1.

Naming choice: josh-core, not josh-api — half its job is non-AI substrate access. The name reflects that.

The CLI (josh) is a separate binary, not a container — it's a client of josh-core's REST API.

Concurrent access to one SQLite file from two containers is supported via WAL mode plus filesystem-level locking: both containers share /data/josh.db through the same host bind mount, so they see the same lock state and the same -wal and -shm files. WAL lets readers proceed while a write is in flight, and SQLite serializes writers (busy_timeout=10000 ms, BEGIN IMMEDIATE for write transactions). Our ingester is batch-shaped, so writer contention is a non-issue at v1 scale.
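A minimal sketch of the writer settings described above, using Python's standard sqlite3 module. The helper names (connect, write_batch) and the demo_items table are hypothetical and exist only for illustration; the real services may configure their connections differently (for example through aiosqlite).

```python
import sqlite3


def connect(db_path: str) -> sqlite3.Connection:
    # isolation_level=None puts the sqlite3 module in autocommit mode, so the
    # explicit BEGIN IMMEDIATE below is the only transaction control in play.
    conn = sqlite3.connect(db_path, isolation_level=None)
    conn.execute("PRAGMA journal_mode=WAL")      # persists in the database file
    conn.execute("PRAGMA busy_timeout=10000")    # wait up to 10 s for a lock
    return conn


def write_batch(conn: sqlite3.Connection, rows) -> None:
    # BEGIN IMMEDIATE takes the write lock up front, so a competing writer
    # waits in busy_timeout here instead of failing halfway through the batch.
    conn.execute("BEGIN IMMEDIATE")
    try:
        # demo_items is a hypothetical table used only for this sketch.
        conn.executemany(
            "INSERT INTO demo_items (id, payload) VALUES (?, ?)", rows
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

Note that busy_timeout is a per-connection setting, so each process has to set it on every connection it opens; journal_mode=WAL is stored in the file and survives reconnects.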

Server

OVHcloud Advance-1 2024 dedicated bare-metal server, Ubuntu 24.04 LTS, Vint Hill VA datacenter. Single node for v1. Ordered 2026-05-10, provisioned 2026-05-11 after the prior DigitalOcean droplet was destroyed; the rationale (cost-per-TB at our projected backfill volumes, runway under conservative + aggressive scenarios, drive-failure workflow) lives in substrate-bare-metal-host.

Resource	Size
CPU	AMD EPYC 4244P (6c / 12t, 3.8–5.1 GHz boost)
RAM	32 GB DDR5 ECC at 5200 MHz (upgradable to 192 GB later, ~hours of downtime)
Swap	none (add if memory pressure shows)
Storage	4 × 960 GB NVMe SSD Enterprise in Soft RAID 10 (mdadm) → ~1.92 TB usable. `/` and `/data` both ride the same array; `/data` is a directory inside the root filesystem.
Network	3 Gbps public unmetered + 25 Gbps private unmetered, anti-DDoS included
Cost	~$165/mo (free install fee on 2024 chipset)

SQLite has a much smaller memory footprint than Postgres would, so the 32 GB ceiling is generous: most of it is page cache for the substrate file and headroom for the embedder when query-time embeddings load. The RAID 10 array is the data host (see below); root and data are not split across separate physical devices because the array is the only storage device on the host.

SSH

ssh josh                        # interactive
ssh josh 'command'              # one-off command

The josh host alias is defined in ~/.ssh/config with user root and key ~/.ssh/id_rsa.

Where data lives

Architectural commitment: every byte of durable substrate state lives under /data; everything else on the host is treated as reproducible from kamal deploy. On bare metal both / and /data sit on the same RAID 10 array, so the boundary is now a convention rather than a physically separate device, but the convention is what keeps the substrate portable. To migrate to a successor host: stop the containers on the old host so josh.db is quiescent, run rsync -avzP josh-old:/data/ /data/ on the new host (rsync cannot copy between two remote endpoints, so one side must be local), repoint Kamal, and restart the containers. Nothing outside /data needs to come along.

Concretely:

Path on host	What it is
`/data/josh.db` (+ `josh.db-wal`, `josh.db-shm`)	The substrate SQLite file in WAL mode
`/data/corpus/<source>/bodies/{raw,markdown}/...`	Raw fetched payloads + normalized Markdown bodies (per ingestion architecture)
`/data/locks/ingest-<source>.lock`	Per-source flock advisory locks for the ingester
`/data/backups/`	Local snapshot staging before `restic` ships to S3-compatible target
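The per-source lock files in the table above could be taken roughly as follows. This is a sketch under assumptions: the source_lock helper is hypothetical, and the choice of a non-blocking lock (a second run fails fast rather than queueing) is a guess at the ingester's behavior, not a documented one.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def source_lock(lock_dir: str, source: str):
    """Hold an exclusive advisory flock for one source while the body runs."""
    os.makedirs(lock_dir, exist_ok=True)
    path = os.path.join(lock_dir, f"ingest-{source}.lock")
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        # LOCK_NB (an assumption) makes a concurrent run raise BlockingIOError
        # immediately instead of waiting for the first run to finish.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        yield path
    finally:
        # Closing the descriptor releases the lock, including after a crash.
        os.close(fd)
```

Because flock is advisory and tied to the open file description, the lock disappears automatically when the holding process exits, so a killed ingester run cannot leave a stale lock behind.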

Redundancy comes from Soft RAID 10 (mdadm) across all four NVMe drives: striped mirrors that survive any single drive failure, with roughly double the IOPS of a single RAID 1 pair. The drive-failure workflow is: file a ticket with OVHcloud, they swap the failed disk in the chassis, then mdadm --add /dev/md0 /dev/<new> rebuilds the array. Expect a few hours degraded, then back to healthy. cat /proc/mdstat is the canonical health check.
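If the /proc/mdstat check is ever automated, a small parser along these lines could flag a degraded array. This is an illustrative sketch, not an existing tool in the repo: the degraded_arrays function is hypothetical, and it relies on the "[expected/active] [UUUU]" status pair that the md driver prints for each array.

```python
import re

# Matches the "[4/4] [UUUU]" status pair on an array's detail line.
_STATUS = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")
# Matches an array header line such as "md0 : active raid10 ...".
_HEADER = re.compile(r"^(md\d+)\s*:")


def degraded_arrays(mdstat_text: str) -> list[str]:
    """Return the names of md arrays with at least one missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = _HEADER.match(line)
        if header:
            current = header.group(1)
            continue
        status = _STATUS.search(line)
        if status and current:
            expected, active, flags = status.groups()
            # "_" marks a missing or failed member slot.
            if int(active) < int(expected) or "_" in flags:
                degraded.append(current)
            current = None
    return degraded
```

On the host this would be fed the contents of /proc/mdstat; an empty list means every array reports all members present.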

Anything not under /data is throwaway — OS files, Docker images, the cloned repo, and runtime caches all rebuild from a fresh kamal setup. The OS partition uses ~5% of the array, so this is a small ask in practice. The previous DigitalOcean shape physically enforced this boundary with a separable block-storage volume; on bare metal we trade that physical guarantee for substantially more runway and IOPS, and rely on the convention plus the nightly backup as the durability story.

Kamal / Docker mount pattern

Containers must bind-mount the host's /data directory into the container at /data so writes go to the redundant array, not into the container's writable layer or a Docker named volume under /var/lib/docker/volumes/.

In the project-root config/deploy.yml (one Kamal service josh with multiple roles built from the same image):

volumes:
  - "/data:/data"

env:
  clear:
    SUBSTRATE_DB_PATH: /data/josh.db
    CORPUS_DIR: /data/corpus
    LOCK_DIR: /data/locks
    HF_HOME: /data/cache/huggingface

What NOT to do:

❌ Named Docker volumes (postgres_data:/var/lib/postgresql/data). Docker stores these under /var/lib/docker/volumes/, which doesn't carry across hosts on rsync and breaks the "everything durable lives in /data" convention.
❌ In-container writes to non-mounted paths. A container that writes to /app/data/josh.db puts the file in the container's writable layer, which is destroyed on container swap/recreate. Always write to a bind-mounted path.
❌ Bind-mounting subdirectories instead of /data. Mounting /data/josh.db:/app/josh.db works but fragments the contract. Mount the whole /data and let the container access the substrate, corpus, and locks through it.
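One way to catch the second mistake early is a startup guard that refuses paths outside /data. The sketch below is hypothetical (no such check is known to exist in the codebase); it only uses the environment variable names from the deploy.yml excerpt above.

```python
import posixpath
from pathlib import PurePosixPath

DATA_ROOT = PurePosixPath("/data")
# Variable names taken from the env block in config/deploy.yml.
PATH_VARS = ("SUBSTRATE_DB_PATH", "CORPUS_DIR", "LOCK_DIR", "HF_HOME")


def paths_outside_data(env: dict[str, str]) -> list[str]:
    """Return the names of configured path variables that are not under /data."""
    bad = []
    for name in PATH_VARS:
        value = env.get(name)
        if value is None:
            continue
        # normpath collapses ".." segments so "/data/../app" is not accepted.
        path = PurePosixPath(posixpath.normpath(value))
        if path != DATA_ROOT and DATA_ROOT not in path.parents:
            bad.append(name)
    return bad
```

A service could call paths_outside_data(dict(os.environ)) at startup and exit if the list is non-empty. This checks only the configured strings; it does not prove the path is actually bind-mounted, which is what the docker inspect verification covers.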

The earlier josh-postgres/ container (now removed) used a named volume — one of the reasons it was dropped alongside the SQLite swap.

Verification after each Kamal deploy:

ssh josh 'docker inspect <container> --format "{{ range .Mounts }}{{ .Source }} -> {{ .Destination }} ({{ .Type }}){{ println }}{{ end }}"'

Expected output includes /data -> /data (bind). If it shows volume instead of bind, the deploy is misconfigured.
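The same check can be scripted rather than read by eye. The sketch below assumes the mount list is captured as JSON with docker inspect <container> --format '{{json .Mounts}}'; the data_mount_is_bind function is a hypothetical helper, not part of the repo.

```python
import json


def data_mount_is_bind(mounts_json: str) -> bool:
    """True if the container mounts host /data at /data as a bind mount."""
    for mount in json.loads(mounts_json):
        if mount.get("Destination") == "/data":
            # A named volume reports Type "volume" and a Source under
            # /var/lib/docker/volumes/, which is the misconfigured case.
            return mount.get("Type") == "bind" and mount.get("Source") == "/data"
    # No mount at /data at all: writes would land in the container layer.
    return False
```

Returning False for a missing mount as well as for a named volume keeps the check strict: both cases mean durable state is not landing on the host's /data.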

Where the schema lives

Migrations under shared/josh_substrate/src/josh_substrate/migrations/versions/ are the source of truth. Per-source docs in docs/sources/ capture probe findings, endpoint specs, parser notes, and indicative schema sketches — but indicative DDL there is illustrative, not canonical. When per-source docs and migrations disagree, migrations win. This avoids 30 places to update for any cross-cutting schema change.

Storage stack

The substrate is SQLite, end to end. Same file format, same query surface, same migrations across every deployment. Decision locked 2026-05-07.

What we use

Concern	Stack
Database	SQLite 3 in WAL mode, `busy_timeout=10000`, `BEGIN IMMEDIATE` for writers
Full-text search	FTS5 with BM25 + per-column weights via `bm25(table, w_title, w_abstract, w_action, w_body)`. Native phrase queries, AND/OR/NOT, prefix match.
Vector search (today)	`sqlite-vec` stable releases — brute-force vectors with binary quantization (BQ) + rescore. ~10–50ms latency at 1M chunks; ~95% recall vs full-precision float.
Vector search (later)	`vec1` (Dan Kennedy / sqlite.org, IVFADC + OPQ). Adopt when it cuts a first release. Migration is a `CREATE TABLE … vec1(…)` swap, not a schema redesign. Until then we ship on `sqlite-vec`.
Migrations	Alembic with the `sqlite+aiosqlite://` driver. Same package layout as the Postgres-shaped iteration.
Backup	`restic` — nightly snapshot of the whole `/data` tree to cold-tier S3-compatible object storage. The substrate is regenerable public data, so a 24h RPO is acceptable; recovery is "restore last night + re-run the day's ingester delta." See substrate-nightly-backup. Litestream WAL streaming was evaluated and declined for the substrate — substrate-litestream-backup records the rationale.
Replication (future)	Substrate read scale-out is unsolved and unscheduled — a single node is sufficient for v1. Evaluate the options (read replica, successor host, or libSQL) if and when measured read load demands it.
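To make the full-text row concrete, here is a small sketch of FTS5 ranking with per-column BM25 weights. The docs_fts table, the sample rows, and the weights (10, 4, 2, 1) are hypothetical; the real table names and weights live in the migrations. It assumes the SQLite build behind Python's sqlite3 module has FTS5 compiled in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column order matters: bm25() weights are positional.
conn.execute(
    "CREATE VIRTUAL TABLE docs_fts USING fts5(title, abstract, action, body)"
)
conn.executemany(
    "INSERT INTO docs_fts (rowid, title, abstract, action, body) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        (1, "tariff schedule", "annual update", "final rule", "general notice text"),
        (2, "annual schedule", "general update", "final rule", "tariff notice text"),
        (3, "fishery quota", "seasonal update", "proposed rule", "general notice text"),
        (4, "fishery permits", "seasonal update", "proposed rule", "general notice text"),
        (5, "grazing permits", "seasonal update", "proposed rule", "general notice text"),
    ],
)


def search(conn: sqlite3.Connection, query: str, limit: int = 10):
    # bm25() returns more negative values for better matches, so plain
    # ascending ORDER BY puts the best hit first.
    return conn.execute(
        "SELECT rowid, bm25(docs_fts, 10.0, 4.0, 2.0, 1.0) AS score "
        "FROM docs_fts WHERE docs_fts MATCH ? "
        "ORDER BY bm25(docs_fts, 10.0, 4.0, 2.0, 1.0) LIMIT ?",
        (query, limit),
    ).fetchall()
```

With these weights a title hit outranks a body hit for the same term: search(conn, "tariff") lists row 1 before row 2. The same function accepts prefix queries such as tar* and phrase queries such as "final rule".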

Why SQLite, why now

FTS5 BM25 ranks better than Postgres ts_rank_cd: ts_rank_cd uses no IDF and applies document-length normalization only through an opt-in flag, while FTS5 ships native BM25 with column weights. Switching upgrades search ranking rather than downgrading it.
sqlite-vec BQ+rescore handles our scale. At our projected ~50M chunks (1024-dim Arctic-L), brute-force float would be infeasible, but BQ scan (32× compression, ~95% recall) is interactive. vec1 will give pgvector-class ANN performance when it releases.
Single-file deploy. "The federal policy substrate that ships as a single file" — pip install josh-substrate && josh init instead of provisioning Postgres + accessory containers. Easy to demo, easy to verify.
Lower memory footprint — leaves plenty of the host's 32 GB free for page cache and the embedder workload.
Operational simplicity — cp josh.db for a cold copy, restic for nightly off-host backup. No accessory container, no auth setup, no port management.
Schema porting cost is small now because the previous Postgres-shaped iteration was just rolled back. Locking in SQLite before per-source ingestion is rebuilt avoids paying the cost twice.
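The BQ+rescore idea and the 32× figure above can be illustrated in a few lines of pure Python. This is a conceptual sketch, not the sqlite-vec API: the function names are hypothetical, the oversampling factor is arbitrary, and the footprint arithmetic assumes full-precision vectors are stored as 32-bit floats.

```python
def quantize(vec) -> int:
    """Binary quantization: keep one sign bit per dimension, packed in an int."""
    bits = 0
    for i, x in enumerate(vec):
        if x > 0:
            bits |= 1 << i
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two packed codes."""
    return bin(a ^ b).count("1")


def dot(a, b) -> float:
    return sum(x * y for x, y in zip(a, b))


def bq_search(query, vectors, codes, k=5, oversample=4):
    q_code = quantize(query)
    # Stage 1: cheap scan over 1-bit codes, keep k * oversample candidates.
    candidates = sorted(
        range(len(codes)), key=lambda i: hamming(q_code, codes[i])
    )[: k * oversample]
    # Stage 2: rescore only those candidates with full-precision dot products.
    return sorted(candidates, key=lambda i: dot(query, vectors[i]), reverse=True)[:k]


def index_bytes(n_chunks: int, dim: int, bits_per_dim: int) -> int:
    """Raw vector storage in bytes, ignoring any index or row overhead."""
    return n_chunks * dim * bits_per_dim // 8
```

Under the float32 assumption, index_bytes(50_000_000, 1024, 32) is 204,800,000,000 bytes (about 205 GB), far beyond the host's 32 GB of RAM, while index_bytes(50_000_000, 1024, 1) is 6,400,000,000 bytes (6.4 GB). That ratio is where the 32× compression figure comes from.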

Tradeoffs we accept

Single writer with WAL. The ingester is batch-shaped (one source at a time, can serialize) and the harness is read-mostly. Manageable today; revisit if a multi-tenant deployment ever needs concurrent writes from many users (Turso libSQL MVCC or Bedrock-style coordination are escape hatches).
No native arrays. Junction tables instead of text[] (e.g., fr_document_rins(fr_document_id, rin)). Standard SQL pattern.
No JSONB. SQLite's JSON1 covers most needs; path indexing is rarer in our access pattern (we mostly read raw_json whole for re-parse).
No pg_trgm. Not used in current schema; FTS5 prefix matching covers the partial-match needs we have today (it is prefix matching, not trigram similarity).
sqlite-vec ANN is alpha; we ship on BQ+rescore until vec1 releases. Acceptable tradeoff given recall numbers.
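The junction-table pattern from the "No native arrays" tradeoff looks roughly like this. Only fr_document_rins(fr_document_id, rin) comes from the text above; the fr_documents table shape, the index name, and the sample RIN values are hypothetical, and the canonical DDL lives in the migrations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys=ON")
conn.executescript(
    """
    CREATE TABLE fr_documents (
        id    INTEGER PRIMARY KEY,
        title TEXT NOT NULL
    );
    -- One row per (document, RIN) pair replaces a Postgres text[] column.
    CREATE TABLE fr_document_rins (
        fr_document_id INTEGER NOT NULL
            REFERENCES fr_documents(id) ON DELETE CASCADE,
        rin            TEXT NOT NULL,
        PRIMARY KEY (fr_document_id, rin)
    ) WITHOUT ROWID;
    -- Reverse lookup: which documents cite a given RIN.
    CREATE INDEX fr_document_rins_by_rin ON fr_document_rins (rin);
    """
)
conn.executemany(
    "INSERT INTO fr_documents (id, title) VALUES (?, ?)",
    [(1, "Doc A"), (2, "Doc B"), (3, "Doc C")],
)
conn.executemany(
    "INSERT INTO fr_document_rins (fr_document_id, rin) VALUES (?, ?)",
    [(1, "0000-AA01"), (2, "0000-AA01"), (2, "0000-AA02"), (3, "0000-AA03")],
)


def documents_for_rin(conn: sqlite3.Connection, rin: str) -> list[int]:
    rows = conn.execute(
        "SELECT d.id FROM fr_documents AS d "
        "JOIN fr_document_rins AS r ON r.fr_document_id = d.id "
        "WHERE r.rin = ? ORDER BY d.id",
        (rin,),
    )
    return [row[0] for row in rows]
```

The composite primary key prevents duplicate pairs, and the secondary index on rin keeps the reverse lookup from scanning the whole junction table.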

When to reconsider

Don't lightly. Reasons we'd genuinely re-evaluate:

Vector scale or latency stops being workable even with vec1 IVFADC+OPQ — i.e., we measure unacceptable agent retrieval times at our actual corpus size.
A procurement requirement hard-blocks on "Postgres only" (perception-driven, but real if it shows up).
A multi-tenant concurrent-write profile turns out to demand more than libSQL MVCC or shard-per-project can give us.

Deployment

Development happens on the server for runtime work, but the repo lives in your working directory and is the source of truth for code, configs, and docs. Kamal builds on ritz (remote builder) and deploys to the josh server.

One image, multiple roles. A single root Dockerfile installs shared/josh_substrate + all service packages; the root config/deploy.yml defines roles (web, ingester, and an opt-in embedder) with different CMDs against the same image. kamal setup from project root brings up every active role; kamal deploy --roles=web (etc.) deploys selectively. See repo structure for the rationale.

Deployment is Kamal-only. We removed docker-compose.yml: local development happens against the deployed substrate, and production deploys go through Kamal. When the OSS Foundation packaging ships at Step 1 launch, we'll add an OSS-friendly path sized for self-hosters (a single docker run, a prebuilt image, or a compose file). Until then, Kamal is the only deploy path. See new host setup for provisioning a new host from scratch.