New ARCHITECTURE.md pulls the desktop app (TECHNICAL.md) and the
license server (LICENSE-SERVER.md) into a single picture — the two
were never reconciled into an end-to-end view before.
Contents:
§1. System diagram (ASCII) showing operator laptop, license
server stack (nginx → FastAPI → Postgres), Postmark, Gumroad,
and the buyer's machine — with the three primary flows
(sale, manual mint, offline activation) traced through it.
§2. Tech stack diagram, layered: desktop / server / operator /
external SaaS, with version pins.
§3. Trust + isolation boundaries table — what crosses each one
and what the threat model is.
§4. "Where things are stored" — paths, tables, files.
§5. Pointers to the deeper per-component docs.
ASCII over Mermaid since the repo's Gitea version is unknown and
plain text renders in every viewer / IDE / raw `cat`.
LICENSE-SERVER.md status flipped from "design proposal, not built"
to "deployed (PR 1 + PR 2 code merged)" — that was stale since
the PR 1 deploy yesterday.
TECHNICAL.md and ADMIN.md gain one-line pointers to ARCHITECTURE.md
so people land at the unified view when looking for "how does it
all fit together".
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ARCHITECTURE — end-to-end view
Stitches the desktop app (TECHNICAL.md) and the license server
(LICENSE-SERVER.md) into a single picture. Read this first for "how
does it all fit together"; drill into the per-component docs for
detail.
1. System diagram
┌────────────────────────────────────────────────────────────────────────┐
│ OPERATOR / DEVELOPER LAPTOP │
│ │
│ git clone / push ←─── code lives in git.invixiom.com │
│ datatools-admin CLI ─── manual mints, list, revoke ─────┐ │
│ ssh -L 8090:127.0.0.1:8090 ───── tunnel for /internal/* ─────┤ │
└────────────────────────────────────────────────────────────────────────┘
│
┌─────────────────────────────────────────────────────────┘
│
│ internal Bearer-auth API (over SSH tunnel only)
▼
┌────────────────────────────────────────────────────────────────────────┐
│ LICENSE SERVER — 46.225.166.142 │
│ ───────────────────────────────────────────────────────────────── │
│ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ nginx 1.24 (TLS termination, public reverse proxy) │ │
│ │ │ │
│ │ datatools.unalogix.com → static placeholder │ │
│ │ licenses.datatools.unalogix.com → 127.0.0.1:8090 (FastAPI) │ │
│ │ /internal/* on public surface → blocked (404) │ │
│ └────────────────────────────┬─────────────────────────────────────┘ │
│ │ │
│ ┌────────────────────────────▼─────────────────────────────────────┐ │
│ │ FastAPI app — datatools-api (Docker container, UID 10001) │ │
│ │ │ │
│ │ ┌──────────────────┐ ┌──────────────────┐ ┌───────────────┐ │ │
│ │ │ /webhooks/* │ │ /internal/* │ │ /health │ │ │
│ │ │ (storefronts) │ │ (Bearer-auth) │ │ (public) │ │ │
│ │ └────────┬─────────┘ └────────┬─────────┘ └───────────────┘ │ │
│ │ │ │ │ │
│ │ ▼ ▼ │ │
│ │ ┌────────────────────────────────────────┐ │ │
│ │ │ SourceAdapter (Protocol) — normalized │ │ │
│ │ │ • ManualAdapter • GumroadAdapter │ │ │
│ │ │ • (LemonSqueezy, Stripe — future) │ │ │
│ │ └────────────────┬───────────────────────┘ │ │
│ │ │ SaleEvent / RefundEvent │ │
│ │ ▼ │ │
│ │ ┌────────────────────────────────────────┐ │ │
│ │ │ mint_from_sale() │ │ │
│ │ │ • Ed25519 sign via PyCA cryptography │ │ │
│ │ │ • idempotent on (source, order_id) │ │ │
│ │ └────────────────┬───────────────────────┘ │ │
│ └────────────────────┼─────────────────────────────────────────────┘ │
│ │ SQL │
│ ┌────────────────────▼─────────────────────────────────────────────┐ │
│ │ Postgres 16 — datatools-postgres (container, vol pg_data) │ │
│ │ • licenses — authoritative customer record │ │
│ │ • gumroad_events — webhook audit log (idempotency, replay) │ │
│ └──────────────────────────────────────────────────────────────────┘ │
└───────────────────────┬────────────────────────────────┬───────────────┘
│ │
┌───────────┘ └──────────┐
│ POST /email (httpx) Gumroad Ping│
▼ POST │
┌───────────────────┐ ┌─────────────▼──┐
│ Postmark │ │ Gumroad │
│ (transactional │ │ (storefront, │
│ email) │ │ payments) │
└───────┬───────────┘ └────────────────┘
│ DKIM-signed email with license blob ▲
▼ │
┌────────────────────────────────────────────────────────────────┴───────┐
│ BUYER'S MACHINE │
│ │
│ Receives email ──► copies DTLIC2: blob ──► pastes into desktop app │
│ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ DataTools desktop (Python 3.12 + Streamlit + Typer CLIs) │ │
│ │ │ │
│ │ ┌────────────────────────────────────────────────────────────┐ │ │
│ │ │ Activate screen — verifies blob signature │ │ │
│ │ │ against EMBEDDED Ed25519 public key │ │ │
│ │ │ (NO network call to the license server, ever) │ │ │
│ │ └─────────────────────────┬──────────────────────────────────┘ │ │
│ │ ▼ │ │
│ │ ~/.datatools/license.json (signed blob, mode 644, on disk) │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │
│ Pays via web browser ─────► Gumroad ────► (kicks off the Ping) │
└────────────────────────────────────────────────────────────────────────┘
Three primary flows, distinguishable by where each one enters the diagram:
- Sale → fulfillment (the automated path): Buyer pays at Gumroad → Gumroad fires Ping to `licenses.datatools.unalogix.com/webhooks/gumroad?secret=…` → nginx → FastAPI → audit-log row → adapter normalizes payload → `mint_from_sale` writes the `licenses` row + Ed25519-signs the blob → Postmark emails the buyer their blob. End-to-end latency: a few hundred milliseconds.
- Manual mint (operator path: comps, support replacements): Operator opens SSH tunnel → `datatools-admin mint` → `/internal/mint` (Bearer-authed, never publicly reachable) → same `mint_from_sale` path → blob returned in HTTP response. Operator delivers to buyer out-of-band.
- Activation (buyer path, fully offline): Buyer pastes blob into desktop's Activate screen → desktop verifies the Ed25519 signature against the public key embedded in the shipped binary → license written to `~/.datatools/license.json`. The desktop app makes no network calls to the license server at any point. This preserves the "your data never leaves your computer" promise (DECISIONS.md §9b).
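The "idempotent on (source, order_id)" property shared by the sale and manual-mint flows can be sketched as follows. This is a minimal illustration, not the server's actual code: it uses stdlib `sqlite3` in place of Postgres + SQLAlchemy, the `SaleEvent` fields and the payload layout are assumptions, and `sign` is a caller-supplied stand-in for the Ed25519 signing step.

```python
import sqlite3
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class SaleEvent:
    """Normalized sale as an adapter might emit it (field names are assumed)."""
    source: str    # e.g. "gumroad" or "manual"
    order_id: str
    email: str
    tier: str


def init_db(conn: sqlite3.Connection) -> None:
    # The UNIQUE constraint is what enforces one license per (source, order_id).
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS licenses (
            source   TEXT NOT NULL,
            order_id TEXT NOT NULL,
            email    TEXT NOT NULL,
            tier     TEXT NOT NULL,
            blob     TEXT NOT NULL,
            UNIQUE (source, order_id)
        )
        """
    )


def _existing_blob(conn: sqlite3.Connection, event: SaleEvent) -> Optional[str]:
    row = conn.execute(
        "SELECT blob FROM licenses WHERE source = ? AND order_id = ?",
        (event.source, event.order_id),
    ).fetchone()
    return row[0] if row else None


def mint_from_sale(
    conn: sqlite3.Connection,
    event: SaleEvent,
    sign: Callable[[bytes], str],
) -> Optional[str]:
    """Return the license blob for this sale, minting it at most once."""
    blob = _existing_blob(conn, event)
    if blob is not None:
        return blob  # replayed delivery: hand back the original license

    # Hypothetical payload layout; the real blob contents are not shown here.
    payload = f"{event.email}|{event.tier}|{event.order_id}".encode("utf-8")
    blob = sign(payload)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO licenses (source, order_id, email, tier, blob) "
                "VALUES (?, ?, ?, ?, ?)",
                (event.source, event.order_id, event.email, event.tier, blob),
            )
    except sqlite3.IntegrityError:
        # A concurrent delivery won the race; return its blob instead of ours.
        return _existing_blob(conn, event)
    return blob
```

Under these assumptions, a webhook that Gumroad retries produces the same blob and a single `licenses` row, so the buyer never ends up with two different licenses for one order.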
2. Tech stack
Layered view of what technology lives where. "External SaaS" entries are services we depend on but don't operate.
┌────────────────────────────────────────────────────────────────────────┐
│ DESKTOP APP (shipped binary, runs on buyer's box) │
├──────────────────┬─────────────────────────────────────────────────────┤
│ GUI │ Streamlit 1.35 — local web server, browser opens │
│ CLI │ Typer 0.12 — per-tool entry points │
│ Core logic │ pandas 2.x, numpy, rapidfuzz, charset-normalizer │
│ Crypto (verify) │ PyCA cryptography — Ed25519 public-key verify only │
│ Storage │ ~/.datatools/license.json (file, mode 644) │
│ Internationalization │ i18n via JSON catalogs in src/i18n/ │
│ Build │ PyInstaller — one-file binary, per OS │
│ Runtimes │ Python 3.12 (bundled into installer) │
│ Platforms │ Windows · macOS · Linux │
└──────────────────┴─────────────────────────────────────────────────────┘
┌────────────────────────────────────────────────────────────────────────┐
│ LICENSE SERVER (this box; non-buyer-facing) │
├──────────────────┬─────────────────────────────────────────────────────┤
│ Edge │ nginx 1.24 + Let's Encrypt (auto-renew via timer) │
│ HTTP framework │ FastAPI 0.119 + Starlette + Pydantic v2 │
│ ASGI server │ uvicorn 0.39 (+uvloop, +httptools, +watchfiles) │
│ Form parsing │ python-multipart (for Gumroad form-encoded Pings) │
│ ORM │ SQLAlchemy 2.0 │
│ Migrations │ Alembic 1.18 (one initial migration so far) │
│ Database │ Postgres 16-alpine (containerized, single node) │
│ Database driver │ psycopg 3.3 (with binary wheel) │
│ Crypto (sign) │ PyCA cryptography — Ed25519 private-key sign │
│ HTTP client │ httpx 0.28 (Postmark calls, test mocking) │
│ Config │ Pydantic Settings + YAML (products.yaml) │
│ Container │ Docker + Docker Compose v2 plugin │
│ Image base │ python:3.12-slim │
│ Process user │ UID 10001 (non-root `app` user defined in image) │
│ Logging │ stdlib `logging` to container stdout → docker logs │
│ Host OS │ Ubuntu 24.04 LTS │
└──────────────────┴─────────────────────────────────────────────────────┘
┌────────────────────────────────────────────────────────────────────────┐
│ OPERATOR / DEVELOPER MACHINE │
├──────────────────┬─────────────────────────────────────────────────────┤
│ Source control │ git → self-hosted Gitea (git.invixiom.com) │
│ Admin CLI │ Typer (src/admin_cli.py) │
│ Server access │ SSH tunnel for /internal/* (no public exposure) │
│ Break-glass │ scripts/generate_license.py (offline-only mints, │
│ │ used when the license server is unreachable) │
│ Test runner │ pytest 8.3 + SQLite in-memory (no docker required) │
│ Smoke test │ bash + docker compose (server/scripts/smoke.sh) │
└──────────────────┴─────────────────────────────────────────────────────┘
┌────────────────────────────────────────────────────────────────────────┐
│ EXTERNAL SaaS / dependencies │
├──────────────────┬─────────────────────────────────────────────────────┤
│ Storefront │ Gumroad — Ping webhook to /webhooks/gumroad │
│ Transactional │ Postmark — HTTP API for license-delivery emails │
│ email │ (LoggingEmailService fallback when token unset) │
│ TLS CA │ Let's Encrypt — ACME HTTP-01 challenge via certbot │
│ Authoritative │ supercp / cPanel (your DNS host for unalogix.com) │
│ DNS │ — Cloudflare front-door deferred │
│ Source hosting │ Self-hosted Gitea (git.invixiom.com) — not on the │
│ │ datatools box; shares the same physical host │
└──────────────────┴─────────────────────────────────────────────────────┘
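The "LoggingEmailService fallback when token unset" entry above could look roughly like the sketch below. Only the `LoggingEmailService` name comes from this document; `EmailService`, `PostmarkEmailService`, `make_email_service`, the sender address, and the subject line are hypothetical. The HTTP call is injected as `post` (for example `httpx.post`) so the sketch stays dependency-free.

```python
import logging
from typing import Any, Callable, Optional, Protocol

logger = logging.getLogger("datatools.email")

POSTMARK_URL = "https://api.postmarkapp.com/email"


class EmailService(Protocol):
    def send_license(self, to: str, blob: str) -> None: ...


class LoggingEmailService:
    """Fallback used when no Postmark token is configured: log, don't send."""

    def send_license(self, to: str, blob: str) -> None:
        logger.info("would send license to %s (%d-char blob)", to, len(blob))


class PostmarkEmailService:
    """Sends the blob through Postmark's HTTP API via an injected POST callable."""

    def __init__(self, token: str, post: Callable[..., Any], sender: str) -> None:
        self._token = token
        self._post = post      # e.g. httpx.post (assumed, not imported here)
        self._sender = sender  # must be a Postmark verified sender

    def send_license(self, to: str, blob: str) -> None:
        self._post(
            POSTMARK_URL,
            headers={
                "Accept": "application/json",
                "X-Postmark-Server-Token": self._token,
            },
            json={
                "From": self._sender,
                "To": to,
                "Subject": "Your DataTools license",  # hypothetical subject
                "TextBody": f"Paste this into the Activate screen:\n\n{blob}\n",
            },
        )


def make_email_service(
    token: Optional[str],
    post: Callable[..., Any],
    sender: str = "licenses@example.com",  # placeholder address
) -> EmailService:
    # An unset or empty token selects the logging fallback.
    if not token:
        return LoggingEmailService()
    return PostmarkEmailService(token, post, sender)
```

One reason to structure it this way is that the test suite and a fresh deployment without a Postmark token can exercise the full mint path without sending real mail.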
3. Trust + isolation boundaries
Worth tracing explicitly because the threat model differs at each boundary:
| Boundary | What crosses it | Trust model |
|---|---|---|
| Buyer ↔ Gumroad | Payment, buyer details | Out of scope — Gumroad's problem |
| Gumroad → license server (webhook) | POST authenticated by a shared secret in the URL (not a payload signature) | URL secret check; non-matching = 404 (no info leak); audit-log everything regardless |
| License server → Postmark | DKIM-signed transactional mail | Postmark verified-sender domain; HTTP API auth via server token |
| License server → Postgres | SQL over local docker bridge | Same compose project; password from on-disk secret file |
| Operator → license server (`/internal/*`) | Bearer token over SSH tunnel | Token only on disk + in the operator's env; nginx blocks `/internal/*` publicly as defense-in-depth |
| License server → buyer (email) | Plaintext blob in inbox | Buyer's email account hygiene; we deliberately don't encrypt — blob is self-protecting (signature) |
| Buyer → desktop app (activation) | Signed blob pasted in | Verified against pubkey embedded in the shipped binary; no network call |
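As an illustration of the webhook row in the table above, the URL secret check might be sketched like this. The function name, parameters, and audit-record shape are hypothetical; what it takes from this document is the behaviour: every delivery is audit-logged, and a non-matching secret yields a plain 404.

```python
import hmac
from typing import Any, Callable, Dict, Mapping


def handle_gumroad_ping(
    query_secret: str,
    form: Mapping[str, str],
    expected_secret: str,
    audit_log: Callable[[Dict[str, Any]], None],
    process: Callable[[Mapping[str, str]], None],
) -> int:
    """Return the HTTP status code for one webhook delivery."""
    # Constant-time comparison, so response timing does not leak how much
    # of the secret an attacker guessed correctly.
    secret_ok = hmac.compare_digest(
        query_secret.encode("utf-8"), expected_secret.encode("utf-8")
    )

    # Audit-log everything regardless of whether the secret matched.
    audit_log({"secret_ok": secret_ok, "payload": dict(form)})

    if not secret_ok:
        # 404 rather than 401/403: the endpoint looks nonexistent to probes.
        return 404

    process(form)
    return 200
```

Logging before the check means a misconfigured secret on the Gumroad side still leaves a replayable record of each sale.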
The single most important property to preserve: the desktop app never talks to the license server. All trust in the desktop comes from the embedded public key + the signed blob. This is what makes the offline activation guarantee real, and what keeps a license-server outage from breaking buyers who've already activated.
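The offline check can be sketched with PyCA `cryptography`, the library named in the stack tables. The `DTLIC2:` prefix appears in this document, but the encoding after it is an assumption here: base64url of the payload followed by the 64-byte signature. `encode_blob` and `verify_blob` are hypothetical names.

```python
import base64
from typing import Optional

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)

PREFIX = "DTLIC2:"
SIG_LEN = 64  # Ed25519 signatures are always 64 bytes


def encode_blob(private_key: Ed25519PrivateKey, payload: bytes) -> str:
    """Server side: sign the payload and wrap it in the (assumed) blob format."""
    signature = private_key.sign(payload)
    return PREFIX + base64.urlsafe_b64encode(payload + signature).decode("ascii")


def verify_blob(public_key: Ed25519PublicKey, blob: str) -> Optional[bytes]:
    """Desktop side: return the payload if the signature checks out, else None.

    Uses only the embedded public key; no network access is involved.
    """
    blob = blob.strip()
    if not blob.startswith(PREFIX):
        return None
    try:
        raw = base64.urlsafe_b64decode(blob[len(PREFIX):])
    except ValueError:  # covers binascii.Error (bad padding) as well
        return None
    if len(raw) <= SIG_LEN:
        return None
    payload, signature = raw[:-SIG_LEN], raw[-SIG_LEN:]
    try:
        public_key.verify(signature, payload)
    except InvalidSignature:
        return None
    return payload
```

In this sketch the desktop side needs only the `Ed25519PublicKey`, so nothing secret ships in the binary, and a blob altered in transit or signed with a different key fails verification.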
4. Where things are stored
| Lives on… | Path / location | Contents |
|---|---|---|
| Buyer's machine | `~/.datatools/license.json` | Activated license blob |
| Buyer's machine | Postmark email | Delivery copy of the blob |
| License server | `licenses` table (Postgres) | Authoritative customer record: name, email, tier, blob, source, order ID, promotion, amount paid |
| License server | `gumroad_events` table | Append-only webhook delivery audit log |
| License server | `/srv/datatools-license/secrets/` | Postgres password, admin Bearer token, (PR 2) Postmark token + Gumroad secret |
| License server | `/etc/letsencrypt/live/datatools.unalogix.com/` | TLS cert + key |
| Operator's laptop | `~/.datatools-creator/issued.jsonl` | Creator-side issuance log (pre-server era, kept as a break-glass backup) |
| Operator's laptop | Git clone of this repo | Source code, including server/config/products.yaml |
| Gitea | This repo's commits | Everything except secrets |
5. Related docs
| Doc | Scope |
|---|---|
| TECHNICAL.md | Desktop app internals (core libs, GUI, CLIs) |
| LICENSE-SERVER.md | Server architecture rationale + DB schema |
| SETUP-LICENSE-SERVER.md | Server install runbook (DNS, packages, nginx, TLS, Postgres) |
| ADMIN.md | Day-2 operations (minting, rotation, inspection) |
| DECISIONS.md | Architecture decision records (§9b = no online activation check) |
| USER-GUIDE.md | Buyer-facing documentation |