Architecture
The substrate that everything else hangs off: the server, the volume, the containers, and the storage stack. This is the load-bearing design of Josh Foundation v1.
Containers
Step 1 ships two Docker containers (deployed via Kamal). A future stage will
add a third (josh-web for the agent UI). SQLite is embedded — no separate
database container.
| Container | Role |
|---|---|
| josh-core | FastAPI. Substrate REST API + MCP server. Owns the SQLite file at /data/josh.db (volume-mounted). What external agents (Cowork, Cursor, ChatGPT desktop, custom) call with a token. |
| josh-ingester | Headless ETL workers. Pulls public federal data on a schedule, parses, normalizes, writes into the same SQLite file (volume mounted from the host). Run state lives in SQLite tables (ingestion_runs, ingestion_logs, etc.). No UI of its own. |
| josh-web (planned) | Next.js. Future agent UI layered on the substrate. Not in v1. |
Naming choice: josh-core, not josh-api — half its job is non-AI
substrate access. The name reflects that.
The CLI (josh) is a separate binary, not a container — it's a client of
josh-core's REST API.
Concurrent access to one SQLite file from two containers is supported via WAL
mode + filesystem-level locking — both containers share /data/josh.db via a
host bind mount. SQLite serializes writers (busy_timeout=10000ms,
BEGIN IMMEDIATE for write transactions) and our ingester is batch-shaped, so
this is a non-issue at v1 scale.
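A minimal sketch of that connection discipline, using Python's standard sqlite3 module. The helper names and the single-column shape of ingestion_runs are assumptions made for illustration; only the pragmas and BEGIN IMMEDIATE come from the description above.

```python
import sqlite3


def connect(db_path: str) -> sqlite3.Connection:
    """Open the substrate file with the settings described above."""
    # isolation_level=None puts the sqlite3 module in autocommit mode, so
    # transactions begin only where we issue BEGIN explicitly.
    conn = sqlite3.connect(db_path, isolation_level=None)
    conn.execute("PRAGMA journal_mode=WAL")    # persistent, stored in the file
    conn.execute("PRAGMA busy_timeout=10000")  # per-connection, in milliseconds
    return conn


def record_run(conn: sqlite3.Connection, source: str) -> None:
    """Write one row inside a BEGIN IMMEDIATE transaction."""
    # BEGIN IMMEDIATE takes the write lock up front, so contention surfaces
    # here (after busy_timeout) rather than midway through the transaction.
    conn.execute("BEGIN IMMEDIATE")
    try:
        # Hypothetical column layout; the real table is defined by migrations.
        conn.execute("INSERT INTO ingestion_runs (source) VALUES (?)", (source,))
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

Because busy_timeout is a per-connection setting, each container would need to apply it on every connection it opens; journal_mode=WAL persists in the database file once set.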
Server
OVHcloud Advance-1 2024 dedicated bare-metal server, Ubuntu 24.04 LTS, Vint Hill VA datacenter. Single node for v1. Ordered 2026-05-10, provisioned 2026-05-11 after the prior DigitalOcean droplet was destroyed; the rationale (cost-per-TB at our projected backfill volumes, runway under conservative + aggressive scenarios, drive-failure workflow) lives in substrate-bare-metal-host.
| Resource | Size |
|---|---|
| CPU | AMD EPYC 4244P (6c / 12t, 3.8–5.1 GHz boost) |
| RAM | 32 GB DDR5 ECC at 5200 MHz (upgradable to 192 GB later, ~hours of downtime) |
| Swap | none (add if memory pressure shows) |
| Storage | 4 × 960 GB NVMe SSD Enterprise in Soft RAID 10 (mdadm) → ~1.92 TB usable. / and /data both ride the same array; /data is a directory inside the root filesystem. |
| Network | 3 Gbps public unmetered + 25 Gbps private unmetered, anti-DDoS included |
| Cost | ~$165/mo (free install fee on 2024 chipset) |
SQLite has a much smaller memory footprint than Postgres would, so the 32 GB ceiling is generous: most of it is page cache for the substrate file and headroom for the embedder when query-time embeddings load. The RAID 10 array is the data host (see below); root and data share it because the machine has no separate physical device to put data on.
```bash
ssh josh            # interactive
ssh josh 'command'  # one-off command
```
Configured in ~/.ssh/config as user root, key ~/.ssh/id_rsa.
Where data lives
Architectural commitment: every byte of durable substrate state lives under
/data; everything else on the host is treated as reproducible from
kamal deploy. On bare metal both / and /data sit on the same RAID 10
array, so the boundary is now a convention rather than a physically separate
device — but the convention is what keeps the substrate portable. To migrate to
a successor host: rsync -avzP josh-old:/data/ josh-new:/data/, repoint Kamal,
restart containers. Nothing outside /data needs to come along.
Concretely:
| Path on host | What it is |
|---|---|
| /data/josh.db (+ josh.db-wal, josh.db-shm) | The substrate SQLite file in WAL mode |
| /data/corpus/<source>/bodies/{raw,markdown}/... | Raw fetched payloads + normalized Markdown bodies (per ingestion architecture) |
| /data/locks/ingest-<source>.lock | Per-source flock advisory locks for the ingester |
| /data/backups/ | Local snapshot staging before restic ships to S3-compatible target |
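The per-source lock files in the table can be exercised with a short sketch like the one below, built on the standard fcntl module. The source_lock helper is hypothetical, and failing fast with LOCK_NB (rather than waiting for the lock) is an assumption about how the ingester behaves.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def source_lock(lock_dir: str, source: str):
    """Hold an exclusive advisory lock for one source.

    Raises BlockingIOError if another worker already holds it.
    """
    path = os.path.join(lock_dir, f"ingest-{source}.lock")
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        # LOCK_NB makes a second worker fail fast instead of queueing.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        yield path
    finally:
        os.close(fd)  # closing the descriptor releases the lock
```

A worker would wrap one source's run in `with source_lock("/data/locks", source):` and treat BlockingIOError as "another run is already in progress". The lock is advisory, so it only coordinates processes that also call flock on the same file.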
Redundancy comes from Soft RAID 10 (mdadm) across all four NVMe drives —
striped mirrors, single-drive fault tolerance, double the IOPS of RAID 1. Drive
failure workflow is "file ticket with OVHcloud → they swap the failed disk in the
chassis → mdadm --add /dev/md0 /dev/<new> rebuilds the array." A few hours
degraded, then back to healthy. cat /proc/mdstat is the canonical health check.
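If the health check is ever automated, a parser along these lines could flag a degraded array from the contents of /proc/mdstat. This is a sketch, not an existing tool in the repo: the function name is made up, and it assumes the usual "[expected/active] [UU_U]" status field and array names of the form md0, md1, and so on.

```python
import re

# Matches the member-status field, e.g. "[4/4] [UUUU]" or "[4/3] [UU_U]".
_STATUS = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")


def degraded_arrays(mdstat_text: str) -> list:
    """Return names of md arrays with fewer active members than expected."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        status = _STATUS.search(line)
        if status and current:
            expected, active, flags = status.groups()
            # An underscore marks a missing or failed member.
            if int(active) < int(expected) or "_" in flags:
                degraded.append(current)
            current = None
    return degraded
```

Calling `degraded_arrays(open("/proc/mdstat").read())` on the host returns an empty list when every member is up.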
Anything not under /data is throwaway — OS files, Docker images, the cloned
repo, and runtime caches all rebuild from a fresh kamal setup. The OS partition
uses ~5% of the array, so this is a small ask in practice. The previous
DigitalOcean shape physically enforced this boundary with a separable
block-storage volume; on bare metal we trade that physical guarantee for
substantially more runway and IOPS, and rely on the convention plus the nightly
backup as the durability story.
Kamal / Docker mount pattern
Containers must bind-mount the host's /data directory into the container at
/data so writes go to the redundant array, not into the container's writable
layer or a Docker named volume under /var/lib/docker/volumes/.
In the project-root config/deploy.yml (one Kamal service josh with multiple
roles built from the same image):
```yaml
volumes:
  - "/data:/data"

env:
  clear:
    SUBSTRATE_DB_PATH: /data/josh.db
    CORPUS_DIR: /data/corpus
    LOCK_DIR: /data/locks
    HF_HOME: /data/cache/huggingface
```
What NOT to do:
- ❌ Named Docker volumes (postgres_data:/var/lib/postgresql/data). Docker stores these under /var/lib/docker/volumes/, which doesn't carry across hosts on rsync and breaks the "everything durable lives in /data" convention.
- ❌ In-container writes to non-mounted paths. A container that writes to /app/data/josh.db puts the file in the container's writable layer, which is destroyed on container swap/recreate. Always write to a bind-mounted path.
- ❌ Bind-mounting subdirectories instead of /data. Mounting /data/josh.db:/app/josh.db works but fragments the contract. Mount the whole /data and let the container access the substrate, corpus, and locks through it.
The earlier josh-postgres/ container (now removed) used a named volume — one of
the reasons it was dropped alongside the SQLite swap.
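One way to catch the "in-container write to a non-mounted path" mistake early is a startup guard over the path variables from config/deploy.yml. The sketch below is hypothetical (no such function is known to exist in the codebase) and purely lexical: it checks that each configured path sits under /data, not that /data is actually bind-mounted. Treating an unset variable as a failure is also an assumption.

```python
import posixpath
from pathlib import PurePosixPath

DATA_ROOT = PurePosixPath("/data")
PATH_VARS = ("SUBSTRATE_DB_PATH", "CORPUS_DIR", "LOCK_DIR", "HF_HOME")


def paths_outside_data(env: dict) -> list:
    """Return the names of path variables that are unset or not under /data."""
    bad = []
    for name in PATH_VARS:
        value = env.get(name)
        if value is None:
            bad.append(name)
            continue
        # normpath collapses ".." so "/data/../tmp" is judged as "/tmp".
        path = PurePosixPath(posixpath.normpath(value))
        if path != DATA_ROOT and DATA_ROOT not in path.parents:
            bad.append(name)
    return bad
```

A service could call `paths_outside_data(dict(os.environ))` at boot and refuse to start if the result is non-empty.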
Verification after each Kamal deploy:
```bash
ssh josh 'docker inspect <container> --format "{{ range .Mounts }}{{ .Source }} -> {{ .Destination }} ({{ .Type }}){{ println }}{{ end }}"'
```
Expected output includes /data -> /data (bind). If it shows volume instead of
bind, the deploy is misconfigured.
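The same check can be scripted against the JSON that a plain docker inspect prints (without --format). The sketch below assumes that output is piped in as a string; the function name is illustrative.

```python
import json


def data_mount_is_bind(inspect_json: str) -> bool:
    """True if a container has /data bind-mounted from the host's /data."""
    containers = json.loads(inspect_json)  # docker inspect prints a JSON array
    for container in containers:
        for mount in container.get("Mounts", []):
            if mount.get("Destination") == "/data":
                # A named volume reports Type "volume" and a Source under
                # /var/lib/docker/volumes/, which fails both conditions.
                return mount.get("Type") == "bind" and mount.get("Source") == "/data"
    return False
```

A post-deploy hook could run `ssh josh 'docker inspect <container>'`, pass stdout to this function, and fail the deploy when it returns False.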
Where the schema lives
Migrations under shared/josh_substrate/src/josh_substrate/migrations/versions/
are the source of truth. Per-source docs in docs/sources/ capture probe
findings, endpoint specs, parser notes, and indicative schema sketches — but
indicative DDL there is illustrative, not canonical. When per-source docs and
migrations disagree, migrations win. This avoids 30 places to update for any
cross-cutting schema change.
Storage stack
The substrate is SQLite, end to end. Same file format, same query surface, same migrations across every deployment. Decision locked 2026-05-07.
What we use
| Concern | Stack |
|---|---|
| Database | SQLite 3 in WAL mode, busy_timeout=10000, BEGIN IMMEDIATE for writers |
| Full-text search | FTS5 with BM25 + per-column weights via bm25(table, w_title, w_abstract, w_action, w_body). Native phrase queries, AND/OR/NOT, prefix match. |
| Vector search (today) | sqlite-vec stable releases — brute-force vectors with binary quantization (BQ) + rescore. ~10–50ms latency at 1M chunks; ~95% recall vs full-precision float. |
| Vector search (later) | vec1 (Dan Kennedy / sqlite.org, IVFADC + OPQ). Adopt when it cuts a first release. Migration is a CREATE TABLE … vec1(…) swap, not a schema redesign. Until then we ship on sqlite-vec. |
| Migrations | Alembic with the sqlite+aiosqlite:// driver. Same package layout as the Postgres-shaped iteration. |
| Backup | restic — nightly snapshot of the whole /data tree to cold-tier S3-compatible object storage. The substrate is regenerable public data, so a 24h RPO is acceptable; recovery is "restore last night + re-run the day's ingester delta." See substrate-nightly-backup. Litestream WAL streaming was evaluated and declined for the substrate — substrate-litestream-backup records the rationale. |
| Replication (future) | Substrate read scale-out is unsolved and unscheduled — a single node is sufficient for v1. Evaluate the options (read replica, successor host, or libSQL) if and when measured read load demands it. |
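To make the full-text row concrete, here is a small sketch of FTS5 ranking with per-column BM25 weights. The docs table, its sample rows, and the weight values are invented for illustration; only the four column roles (title, abstract, action, body) and the bm25() call shape come from the table above. It assumes the Python build's SQLite was compiled with FTS5.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE docs USING fts5(title, abstract, action, body)")
conn.executemany(
    "INSERT INTO docs (title, abstract, action, body) VALUES (?, ?, ?, ?)",
    [
        ("Clean water rule", "Revises effluent limits", "Final rule", "Applies to discharges."),
        ("Air quality notice", "Comment period extended", "Notice", "Mentions the clean water rule once."),
        ("Grazing permits", "Fee schedule update", "Proposed rule", "Covers federal rangeland."),
    ],
)


def search(query: str) -> list:
    """Return titles ranked by weighted BM25, best match first."""
    # bm25() returns lower (more negative) values for better matches, so an
    # ascending sort puts the best hit first. Weights follow column order:
    # title, abstract, action, body.
    rows = conn.execute(
        "SELECT title FROM docs WHERE docs MATCH ? "
        "ORDER BY bm25(docs, 10.0, 4.0, 2.0, 1.0)",
        (query,),
    )
    return [title for (title,) in rows]
```

With these weights a phrase hit in the title outranks the same phrase appearing only in the body, and a query such as `graz*` exercises prefix matching.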
Why SQLite, why now
- FTS5 BM25 ranks better than Postgres ts_rank_cd. ts_rank_cd has no IDF term and applies no document-length normalization by default; FTS5 ships native BM25 with column weights. Switching upgrades search ranking rather than downgrading it.
- sqlite-vec BQ+rescore handles our scale. At our projected ~50M chunks (1024-dim Arctic-L), brute-force float would be infeasible, but BQ scan (32× compression, ~95% recall) is interactive. vec1 will give pgvector-class ANN performance when it releases.
- Single-file deploy. "The federal policy substrate that ships as a single file": pip install josh-substrate && josh init instead of provisioning Postgres + accessory containers. Easy to demo, easy to verify.
- Lower memory footprint. Leaves plenty of the host's 32 GB free for page cache and the embedder workload.
- Operational simplicity. cp josh.db for a cold copy, restic for nightly off-host backup. No accessory container, no auth setup, no port management.
- Schema porting cost is small now because the previous Postgres-shaped iteration was just rolled back. Locking in SQLite before per-source ingestion is rebuilt avoids paying the cost twice.
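The BQ+rescore idea in the list above can be illustrated without any vector library. The sketch below is a pure-Python toy, not the sqlite-vec API: it keeps one sign bit per dimension (which is where the 32× figure comes from relative to float32), scans by Hamming distance, then rescores a small shortlist with full-precision dot products. The dimension count, dataset size, and oversample factor are arbitrary choices for the demo.

```python
import random


def quantize(vec):
    """Pack the sign of each dimension into one bit of an int."""
    bits = 0
    for i, x in enumerate(vec):
        if x > 0:
            bits |= 1 << i
    return bits


def hamming(a, b):
    """Number of differing bits between two packed codes."""
    return bin(a ^ b).count("1")


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def search(query, vectors, codes, k=3, oversample=4):
    """Coarse Hamming scan over 1-bit codes, then exact rescore of the shortlist."""
    q_code = quantize(query)
    # Stage 1: rank every row by Hamming distance on the compact codes.
    shortlist = sorted(range(len(codes)), key=lambda i: hamming(q_code, codes[i]))
    shortlist = shortlist[: k * oversample]
    # Stage 2: rescore only the shortlist with full-precision dot products.
    shortlist.sort(key=lambda i: dot(query, vectors[i]), reverse=True)
    return shortlist[:k]


# Toy data: 200 random 64-dimensional vectors, queried with one of them.
rng = random.Random(0)
vectors = [[rng.gauss(0.0, 1.0) for _ in range(64)] for _ in range(200)]
codes = [quantize(v) for v in vectors]
top = search(vectors[7], vectors, codes)
```

The oversample factor is the recall knob: a larger shortlist costs more exact comparisons but gives the rescore stage more chances to recover neighbours that the 1-bit codes ranked poorly.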
Tradeoffs we accept
- Single writer with WAL. The ingester is batch-shaped (one source at a time, can serialize) and the harness is read-mostly. Manageable today; revisit if a multi-tenant deployment ever needs concurrent writes from many users (Turso libSQL MVCC or Bedrock-style coordination are escape hatches).
- No native arrays. Junction tables instead of text[] (e.g., fr_document_rins(fr_document_id, rin)). Standard SQL pattern.
- No JSONB. SQLite's JSON1 covers most needs; path indexing is rarer in our access pattern (we mostly read raw_json whole for re-parse).
- No pg_trgm. Not used in current schema; FTS5 prefix matching covers fuzzy needs.
- sqlite-vec ANN is alpha; we ship on BQ+rescore until vec1 releases. Acceptable tradeoff given recall numbers.
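The array and JSON tradeoffs look roughly like this in practice. The fr_document_rins table and the raw_json column are named above; the parent fr_documents table, its columns, and the placeholder RIN values are assumptions for the sake of a runnable sketch, and the canonical DDL lives in the migrations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE fr_documents (
        id INTEGER PRIMARY KEY,
        raw_json TEXT NOT NULL
    );
    -- One row per (document, RIN) pair stands in for a Postgres text[] column.
    CREATE TABLE fr_document_rins (
        fr_document_id INTEGER NOT NULL REFERENCES fr_documents(id),
        rin TEXT NOT NULL,
        PRIMARY KEY (fr_document_id, rin)
    );
    """
)
conn.execute(
    "INSERT INTO fr_documents (id, raw_json) VALUES (1, ?)",
    ('{"title": "Example rule"}',),
)
conn.executemany(
    "INSERT INTO fr_document_rins (fr_document_id, rin) VALUES (?, ?)",
    [(1, "0000-AA00"), (1, "0000-AA01")],  # placeholder RINs
)

# Membership test that would be `'0000-AA00' = ANY(rins)` on a text[] column.
doc_ids = [row[0] for row in conn.execute(
    "SELECT fr_document_id FROM fr_document_rins WHERE rin = ?", ("0000-AA00",)
)]

# Reassemble the list when a caller wants the array shape back.
rins = [row[0] for row in conn.execute(
    "SELECT rin FROM fr_document_rins WHERE fr_document_id = ? ORDER BY rin", (1,)
)]

# JSON1 path extraction, for the occasional field read without a full re-parse.
title = conn.execute(
    "SELECT json_extract(raw_json, '$.title') FROM fr_documents WHERE id = 1"
).fetchone()[0]
```

An index on fr_document_rins(rin) would serve the reverse lookup (documents by RIN) that a GIN index on the array column would otherwise have provided.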
When to reconsider
Don't lightly. Reasons we'd genuinely re-evaluate:
- Vector scale or latency stops being workable even with vec1 IVFADC+OPQ, i.e., we measure unacceptable agent retrieval times at our actual corpus size.
- A procurement requirement hard-blocks on "Postgres only" (perception-driven, but real if it shows up).
- A multi-tenant concurrent-write profile turns out to demand more than libSQL MVCC or shard-per-project can give us.
Deployment
Development happens on the server for runtime work, but the repo lives in your
working directory and is the source of truth for code, configs, and docs. Kamal
builds on ritz (remote builder) and deploys to the josh server.
One image, multiple roles. A single root Dockerfile installs
shared/josh_substrate + all service packages; the root config/deploy.yml
defines roles (web, ingester, and an opt-in embedder) with different CMDs
against the same image. kamal setup from project root brings up every active
role; kamal deploy --roles=web (etc.) deploys selectively. See
repo structure for the rationale.
Deployment is Kamal-only. We removed docker-compose.yml: local development
happens against the deployed substrate, and production happens via Kamal. When the
OSS Foundation packaging ships at Step 1 launch, we'll add an OSS-friendly path
(a single docker run, a prebuilt image, or a compose file) sized for
self-hosters. Until then, Kamal is the only deploy path. See
new host setup for provisioning a new host from
scratch.