From 624f99653e756d74b5b500c1ebb1cd268362277d Mon Sep 17 00:00:00 2001 From: Michael Date: Thu, 14 May 2026 01:59:05 +0000 Subject: [PATCH] docs(arch): end-to-end system + tech-stack diagrams MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit New ARCHITECTURE.md pulls the desktop app (TECHNICAL.md) and the license server (LICENSE-SERVER.md) into a single picture — the two were never reconciled into an end-to-end view before. Contents: §1. System diagram (ASCII) showing operator laptop, license server stack (nginx → FastAPI → Postgres), Postmark, Gumroad, and the buyer's machine — with the three primary flows (sale, manual mint, offline activation) traced through it. §2. Tech stack diagram, layered: desktop / server / operator / external SaaS, with version pins. §3. Trust + isolation boundaries table — what crosses each one and what the threat model is. §4. "Where things are stored" — paths, tables, files. §5. Pointers to the deeper per-component docs. ASCII over Mermaid since the repo's Gitea version is unknown and plain text renders in every viewer / IDE / raw `cat`. LICENSE-SERVER.md status flipped from "design proposal, not built" to "deployed (PR 1 + PR 2 code merged)" — that was stale since the PR 1 deploy yesterday. TECHNICAL.md and ADMIN.md gain one-line pointers to ARCHITECTURE.md so people land at the unified view when looking for "how does it all fit together". Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/ADMIN.md | 2 + docs/ARCHITECTURE.md | 241 +++++++++++++++++++++++++++++++++++++++++ docs/LICENSE-SERVER.md | 8 +- docs/TECHNICAL.md | 3 + 4 files changed, 251 insertions(+), 3 deletions(-) create mode 100644 docs/ARCHITECTURE.md diff --git a/docs/ADMIN.md b/docs/ADMIN.md index e0f896e..25fb36f 100644 --- a/docs/ADMIN.md +++ b/docs/ADMIN.md @@ -7,6 +7,8 @@ through the live server, where state lives on the box, how to rotate secrets, generating the signing keypair, the dev vs. 
production key story, and how to recover from key loss. +For the end-to-end system + tech stack diagrams, see `ARCHITECTURE.md`. + --- ## Live deployment (PR 1) diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md new file mode 100644 index 0000000..ed88372 --- /dev/null +++ b/docs/ARCHITECTURE.md @@ -0,0 +1,241 @@ +# ARCHITECTURE — end-to-end view + +Stitches the desktop app (`TECHNICAL.md`) and the license server +(`LICENSE-SERVER.md`) into a single picture. Read this first for "how +does it all fit together"; drill into the per-component docs for +detail. + +--- + +## 1. System diagram + +``` +┌────────────────────────────────────────────────────────────────────────┐ +│ OPERATOR / DEVELOPER LAPTOP │ +│ │ +│ git clone / push ←─── code lives in git.invixiom.com │ +│ datatools-admin CLI ─── manual mints, list, revoke ─────┐ │ +│ ssh -L 8090:127.0.0.1:8090 ───── tunnel for /internal/* ─────┤ │ +└────────────────────────────────────────────────────────────────────────┘ + │ + ┌─────────────────────────────────────────────────────────┘ + │ + │ internal Bearer-auth API (over SSH tunnel only) + ▼ +┌────────────────────────────────────────────────────────────────────────┐ +│ LICENSE SERVER — 46.225.166.142 │ +│ ───────────────────────────────────────────────────────────────── │ +│ │ +│ ┌──────────────────────────────────────────────────────────────────┐ │ +│ │ nginx 1.24 (TLS termination, public reverse proxy) │ │ +│ │ │ │ +│ │ datatools.unalogix.com → static placeholder │ │ +│ │ licenses.datatools.unalogix.com → 127.0.0.1:8090 (FastAPI) │ │ +│ │ /internal/* on public surface → blocked (404) │ │ +│ └────────────────────────────┬─────────────────────────────────────┘ │ +│ │ │ +│ ┌────────────────────────────▼─────────────────────────────────────┐ │ +│ │ FastAPI app — datatools-api (Docker container, UID 10001) │ │ +│ │ │ │ +│ │ ┌──────────────────┐ ┌──────────────────┐ ┌───────────────┐ │ │ +│ │ │ /webhooks/* │ │ /internal/* │ │ /health │ │ │ +│ │ │ (storefronts) │ │ 
(Bearer-auth) │ │ (public) │ │ │ +│ │ └────────┬─────────┘ └────────┬─────────┘ └───────────────┘ │ │ +│ │ │ │ │ │ +│ │ ▼ ▼ │ │ +│ │ ┌────────────────────────────────────────┐ │ │ +│ │ │ SourceAdapter (Protocol) — normalized │ │ │ +│ │ │ • ManualAdapter • GumroadAdapter │ │ │ +│ │ │ • (LemonSqueezy, Stripe — future) │ │ │ +│ │ └────────────────┬───────────────────────┘ │ │ +│ │ │ SaleEvent / RefundEvent │ │ +│ │ ▼ │ │ +│ │ ┌────────────────────────────────────────┐ │ │ +│ │ │ mint_from_sale() │ │ │ +│ │ │ • Ed25519 sign via PyCA cryptography │ │ │ +│ │ │ • idempotent on (source, order_id) │ │ │ +│ │ └────────────────┬───────────────────────┘ │ │ +│ └────────────────────┼─────────────────────────────────────────────┘ │ +│ │ SQL │ +│ ┌────────────────────▼─────────────────────────────────────────────┐ │ +│ │ Postgres 16 — datatools-postgres (container, vol pg_data) │ │ +│ │ • licenses — authoritative customer record │ │ +│ │ • gumroad_events — webhook audit log (idempotency, replay) │ │ +│ └──────────────────────────────────────────────────────────────────┘ │ +└───────────────────────┬────────────────────────────────┬───────────────┘ + │ │ + ┌───────────┘ └──────────┐ + │ POST /email (httpx) Gumroad Ping│ + ▼ POST │ + ┌───────────────────┐ ┌─────────────▼──┐ + │ Postmark │ │ Gumroad │ + │ (transactional │ │ (storefront, │ + │ email) │ │ payments) │ + └───────┬───────────┘ └────────────────┘ + │ DKIM-signed email with license blob ▲ + ▼ │ +┌────────────────────────────────────────────────────────────────┴───────┐ +│ BUYER'S MACHINE │ +│ │ +│ Receives email ──► copies DTLIC2: blob ──► pastes into desktop app │ +│ │ +│ ┌──────────────────────────────────────────────────────────────────┐ │ +│ │ DataTools desktop (Python 3.12 + Streamlit + Typer CLIs) │ │ +│ │ │ │ +│ │ ┌────────────────────────────────────────────────────────────┐ │ │ +│ │ │ Activate screen — verifies blob signature │ │ │ +│ │ │ against EMBEDDED Ed25519 public key │ │ │ +│ │ │ (NO network call to the 
license server, ever) │ │ │
+│  │  └─────────────────────────┬──────────────────────────────────┘  │  │
+│  │                            ▼                                     │  │
+│  │  ~/.datatools/license.json (signed blob, mode 644, on disk)      │  │
+│  └──────────────────────────────────────────────────────────────────┘  │
+│                                                                        │
+│  Pays via web browser ─────► Gumroad ────► (kicks off the Ping)        │
+└────────────────────────────────────────────────────────────────────────┘
+```
+
+**Three primary flows**, each entering the diagram at a different
+point:
+
+1. **Sale → fulfillment** (the automated path)
+   Buyer pays at Gumroad → Gumroad fires Ping to
+   `licenses.datatools.unalogix.com/webhooks/gumroad?secret=…` → nginx
+   → FastAPI → audit-log row → adapter normalizes payload → `mint_from_sale`
+   writes the `licenses` row + Ed25519-signs the blob → Postmark emails
+   the buyer their blob. End-to-end latency: a few hundred milliseconds.
+
+2. **Manual mint** (operator path: comps, support replacements)
+   Operator opens SSH tunnel → `datatools-admin mint` → `/internal/mint`
+   (Bearer-authed, never publicly reachable) → same `mint_from_sale`
+   path → blob returned in HTTP response. Operator delivers to buyer
+   out-of-band.
+
+3. **Activation** (buyer path, fully offline)
+   Buyer pastes blob into desktop's Activate screen → desktop verifies
+   the Ed25519 signature against the public key **embedded in the
+   shipped binary** → license written to `~/.datatools/license.json`.
+   The desktop app makes **no network calls** to the license server at
+   any point. This preserves the "your data never leaves your computer"
+   promise (`DECISIONS.md §9b`).
+
+---
+
+## 2. Tech stack
+
+Layered view of what technology lives where. "External SaaS" entries
+are services we depend on but don't operate.
+ +``` +┌────────────────────────────────────────────────────────────────────────┐ +│ DESKTOP APP (shipped binary, runs on buyer's box) │ +├──────────────────┬─────────────────────────────────────────────────────┤ +│ GUI │ Streamlit 1.35 — local web server, browser opens │ +│ CLI │ Typer 0.12 — per-tool entry points │ +│ Core logic │ pandas 2.x, numpy, rapidfuzz, charset-normalizer │ +│ Crypto (verify) │ PyCA cryptography — Ed25519 public-key verify only │ +│ Storage │ ~/.datatools/license.json (file, mode 644) │ +│ Internationalization │ i18n via JSON catalogs in src/i18n/ │ +│ Build │ PyInstaller — one-file binary, per OS │ +│ Runtimes │ Python 3.12 (bundled into installer) │ +│ Platforms │ Windows · macOS · Linux │ +└──────────────────┴─────────────────────────────────────────────────────┘ + +┌────────────────────────────────────────────────────────────────────────┐ +│ LICENSE SERVER (this box; non-buyer-facing) │ +├──────────────────┬─────────────────────────────────────────────────────┤ +│ Edge │ nginx 1.24 + Let's Encrypt (auto-renew via timer) │ +│ HTTP framework │ FastAPI 0.119 + Starlette + Pydantic v2 │ +│ ASGI server │ uvicorn 0.39 (+uvloop, +httptools, +watchfiles) │ +│ Form parsing │ python-multipart (for Gumroad form-encoded Pings) │ +│ ORM │ SQLAlchemy 2.0 │ +│ Migrations │ Alembic 1.18 (one initial migration so far) │ +│ Database │ Postgres 16-alpine (containerized, single node) │ +│ Database driver │ psycopg 3.3 (with binary wheel) │ +│ Crypto (sign) │ PyCA cryptography — Ed25519 private-key sign │ +│ HTTP client │ httpx 0.28 (Postmark calls, test mocking) │ +│ Config │ Pydantic Settings + YAML (products.yaml) │ +│ Container │ Docker + Docker Compose v2 plugin │ +│ Image base │ python:3.12-slim │ +│ Process user │ UID 10001 (non-root `app` user defined in image) │ +│ Logging │ stdlib `logging` to container stdout → docker logs │ +│ Host OS │ Ubuntu 24.04 LTS │ +└──────────────────┴─────────────────────────────────────────────────────┘ + 
+┌────────────────────────────────────────────────────────────────────────┐ +│ OPERATOR / DEVELOPER MACHINE │ +├──────────────────┬─────────────────────────────────────────────────────┤ +│ Source control │ git → self-hosted Gitea (git.invixiom.com) │ +│ Admin CLI │ Typer (src/admin_cli.py) │ +│ Server access │ SSH tunnel for /internal/* (no public exposure) │ +│ Break-glass │ scripts/generate_license.py (offline-only mints, │ +│ │ used when the license server is unreachable) │ +│ Test runner │ pytest 8.3 + SQLite in-memory (no docker required) │ +│ Smoke test │ bash + docker compose (server/scripts/smoke.sh) │ +└──────────────────┴─────────────────────────────────────────────────────┘ + +┌────────────────────────────────────────────────────────────────────────┐ +│ EXTERNAL SaaS / dependencies │ +├──────────────────┬─────────────────────────────────────────────────────┤ +│ Storefront │ Gumroad — Ping webhook to /webhooks/gumroad │ +│ Transactional │ Postmark — HTTP API for license-delivery emails │ +│ email │ (LoggingEmailService fallback when token unset) │ +│ TLS CA │ Let's Encrypt — ACME HTTP-01 challenge via certbot │ +│ Authoritative │ supercp / cPanel (your DNS host for unalogix.com) │ +│ DNS │ — Cloudflare front-door deferred │ +│ Source hosting │ Self-hosted Gitea (git.invixiom.com) — not on the │ +│ │ datatools box; shares the same physical host │ +└──────────────────┴─────────────────────────────────────────────────────┘ +``` + +--- + +## 3. 
Trust + isolation boundaries
+
+These are worth tracing explicitly because the threat model differs
+at each boundary:
+
+| Boundary | What crosses it | Trust model |
+|---|---|---|
+| Buyer ↔ Gumroad | Payment, buyer details | Out of scope (Gumroad's responsibility) |
+| Gumroad → license server (webhook) | POST carrying a shared secret in the URL (`?secret=…`) | URL secret check; non-matching = 404 (no info leak); audit-log everything regardless |
+| License server → Postmark | HTTP API request (`POST /email`) carrying the license email; Postmark sends it DKIM-signed | Postmark verified-sender domain; HTTP API auth via server token |
+| License server → Postgres | SQL over local docker bridge | Same compose project; password from on-disk secret file |
+| Operator → license server (`/internal/*`) | Bearer token over SSH tunnel | Token only on disk + in the operator's env; nginx blocks `/internal/*` publicly as defense-in-depth |
+| License server → buyer (email) | Plaintext blob in inbox | Buyer's email account hygiene; we deliberately don't encrypt, because the blob is signed and therefore tamper-evident |
+| Buyer → desktop app (activation) | Signed blob pasted in | Verified against pubkey **embedded in the shipped binary**; no network call |
+
+The single most important property to preserve: **the desktop app
+never talks to the license server.** All trust in the desktop comes
+from the embedded public key + the signed blob. This is what makes
+the offline activation guarantee real, and what keeps a license-server
+outage from breaking buyers who've already activated.
+
+---
+
+## 4. 
Where things are stored + +| Lives on… | Path / location | Contents | +|---|---|---| +| Buyer's machine | `~/.datatools/license.json` | Activated license blob | +| Buyer's machine | Postmark email | Delivery copy of the blob | +| License server | `licenses` table (Postgres) | Authoritative customer record — name, email, tier, blob, source, order ID, promotion, amount paid | +| License server | `gumroad_events` table | Append-only webhook delivery audit log | +| License server | `/srv/datatools-license/secrets/` | Postgres password, admin Bearer token, (PR 2) Postmark token + Gumroad secret | +| License server | `/etc/letsencrypt/live/datatools.unalogix.com/` | TLS cert + key | +| Operator's laptop | `~/.datatools-creator/issued.jsonl` | Creator-side issuance log (pre-server era, kept as a break-glass backup) | +| Operator's laptop | Git clone of this repo | Source code, including `server/config/products.yaml` | +| Gitea | This repo's commits | Everything except secrets | + +--- + +## 5. Related docs + +| Doc | Scope | +|---|---| +| `TECHNICAL.md` | Desktop app internals (core libs, GUI, CLIs) | +| `LICENSE-SERVER.md` | Server architecture rationale + DB schema | +| `SETUP-LICENSE-SERVER.md` | Server install runbook (DNS, packages, nginx, TLS, Postgres) | +| `ADMIN.md` | Day-2 operations (minting, rotation, inspection) | +| `DECISIONS.md` | Architecture decision records — `§9b` = no online activation check | +| `USER-GUIDE.md` | Buyer-facing documentation | diff --git a/docs/LICENSE-SERVER.md b/docs/LICENSE-SERVER.md index 3b5282d..1f9844f 100644 --- a/docs/LICENSE-SERVER.md +++ b/docs/LICENSE-SERVER.md @@ -1,7 +1,9 @@ -# LICENSE-SERVER — Future online issuance & record-keeping +# LICENSE-SERVER — online issuance & record-keeping -**Status:** design proposal. Not built. The current system is -fully offline (see `ADMIN.md`). +**Status:** **deployed (PR 1 + PR 2 code merged)**. Live at +`licenses.datatools.unalogix.com`. 
See `ADMIN.md §"Live deployment"` +for day-2 operations, and `ARCHITECTURE.md` for the end-to-end +diagram including the desktop and storefronts. This doc describes the smallest useful server we could build to replace the manual mint-and-paste workflow, without compromising the diff --git a/docs/TECHNICAL.md b/docs/TECHNICAL.md index 136579b..6e44072 100644 --- a/docs/TECHNICAL.md +++ b/docs/TECHNICAL.md @@ -3,6 +3,9 @@ > Creator-only. Do not ship to buyers. > **Version**: 1.6 · **Updated**: 2026-05-01 +For the end-to-end picture (desktop app + license server + storefronts ++ email), see `ARCHITECTURE.md`. This doc focuses on desktop internals. + ## 1. Architecture - **Dual interface**: CLI + GUI, both wrapping the same `src/core/` library.
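
Reviewer note on `mint_from_sale()`: the system diagram in `ARCHITECTURE.md` describes the mint step as "idempotent on (source, order_id)", which is what lets a replayed Gumroad Ping return the already-issued license instead of minting a second one. The sketch below shows one way that property could be enforced with a unique constraint. It is an illustration under stated assumptions, not the server's actual code: the column set, the function name `mint_from_sale_sketch`, and the placeholder blob are all hypothetical, and it uses SQLite in memory (as the test runner row in the tech stack does) rather than Postgres.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create an in-memory DB with a simplified, hypothetical licenses table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE licenses (
            id       INTEGER PRIMARY KEY,
            source   TEXT NOT NULL,
            order_id TEXT NOT NULL,
            email    TEXT NOT NULL,
            blob     TEXT NOT NULL,
            UNIQUE (source, order_id)  -- the idempotency key
        )
        """
    )
    return conn


def mint_from_sale_sketch(
    conn: sqlite3.Connection, source: str, order_id: str, email: str
) -> tuple[str, bool]:
    """Return (blob, created); a replayed sale returns the stored blob unchanged."""
    # Placeholder only: a real server would produce an Ed25519-signed blob here.
    candidate = f"UNSIGNED-PLACEHOLDER:{source}:{order_id}"
    with conn:  # commits on success, rolls back on error
        # SQLite spelling; the Postgres equivalent is ON CONFLICT DO NOTHING.
        cur = conn.execute(
            "INSERT OR IGNORE INTO licenses (source, order_id, email, blob) "
            "VALUES (?, ?, ?, ?)",
            (source, order_id, email, candidate),
        )
        created = cur.rowcount == 1  # 0 when the key already existed
        row = conn.execute(
            "SELECT blob FROM licenses WHERE source = ? AND order_id = ?",
            (source, order_id),
        ).fetchone()
    return row[0], created


if __name__ == "__main__":
    db = open_db()
    first = mint_from_sale_sketch(db, "gumroad", "order-123", "buyer@example.com")
    replay = mint_from_sale_sketch(db, "gumroad", "order-123", "buyer@example.com")
    print(first)
    print(replay)
```

Putting the uniqueness rule in the database rather than in application code means two concurrent deliveries of the same webhook cannot both insert a row. Because the key includes `source`, a manual mint and a Gumroad sale that happen to share an order ID stay distinct.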