docs(arch): end-to-end system + tech-stack diagrams

New ARCHITECTURE.md pulls the desktop app (TECHNICAL.md) and the
license server (LICENSE-SERVER.md) into a single picture — the two
were never reconciled into an end-to-end view before.

Contents:
  §1. System diagram (ASCII) showing operator laptop, license
      server stack (nginx → FastAPI → Postgres), Postmark, Gumroad,
      and the buyer's machine — with the three primary flows
      (sale, manual mint, offline activation) traced through it.
  §2. Tech stack diagram, layered: desktop / server / operator /
      external SaaS, with version pins.
  §3. Trust + isolation boundaries table — what crosses each one
      and what the threat model is.
  §4. "Where things are stored" — paths, tables, files.
  §5. Pointers to the deeper per-component docs.

ASCII over Mermaid since the repo's Gitea version is unknown and
plain text renders in every viewer / IDE / raw `cat`.

LICENSE-SERVER.md status flipped from "design proposal, not built"
to "deployed (PR 1 + PR 2 code merged)" — that was stale since
the PR 1 deploy yesterday.

TECHNICAL.md and ADMIN.md gain one-line pointers to ARCHITECTURE.md
so people land at the unified view when looking for "how does it
all fit together".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 01:59:05 +00:00
parent 86ad21db79
commit 624f99653e
4 changed files with 251 additions and 3 deletions


@@ -7,6 +7,8 @@ through the live server, where state lives on the box, how to rotate secrets,
generating the signing keypair, the dev vs. production key story, and how to
recover from key loss.
For the end-to-end system + tech stack diagrams, see `ARCHITECTURE.md`.
---
## Live deployment (PR 1)

docs/ARCHITECTURE.md Normal file

@@ -0,0 +1,241 @@
# ARCHITECTURE — end-to-end view
Stitches the desktop app (`TECHNICAL.md`) and the license server
(`LICENSE-SERVER.md`) into a single picture. Read this first for "how
does it all fit together"; drill into the per-component docs for
detail.
---
## 1. System diagram
```
┌────────────────────────────────────────────────────────────────────────┐
│  OPERATOR / DEVELOPER LAPTOP                                           │
│                                                                        │
│  git clone / push          ←─── code lives in git.invixiom.com         │
│  datatools-admin CLI       ─── manual mints, list, revoke ─────┐       │
│  ssh -L 8090:127.0.0.1:8090  ───── tunnel for /internal/* ─────┤       │
└────────────────────────────────────────────────────────────────────────┘
       ┌─────────────────────────────────────────────────────────┘
       │ internal Bearer-auth API (over SSH tunnel only)
┌────────────────────────────────────────────────────────────────────────┐
│  LICENSE SERVER — 46.225.166.142                                       │
│  ────────────────────────────────────────────────────────────────────  │
│                                                                        │
│  ┌──────────────────────────────────────────────────────────────────┐  │
│  │ nginx 1.24 (TLS termination, public reverse proxy)               │  │
│  │                                                                  │  │
│  │  datatools.unalogix.com           → static placeholder           │  │
│  │  licenses.datatools.unalogix.com  → 127.0.0.1:8090 (FastAPI)     │  │
│  │  /internal/* on public surface    → blocked (404)                │  │
│  └────────────────────────────┬─────────────────────────────────────┘  │
│                               │                                        │
│  ┌────────────────────────────▼─────────────────────────────────────┐  │
│  │ FastAPI app — datatools-api (Docker container, UID 10001)        │  │
│  │                                                                  │  │
│  │  ┌──────────────────┐  ┌──────────────────┐  ┌───────────────┐   │  │
│  │  │ /webhooks/*      │  │ /internal/*      │  │ /health       │   │  │
│  │  │ (storefronts)    │  │ (Bearer-auth)    │  │ (public)      │   │  │
│  │  └────────┬─────────┘  └────────┬─────────┘  └───────────────┘   │  │
│  │           │                     │                                │  │
│  │           ▼                     ▼                                │  │
│  │   ┌────────────────────────────────────────┐                     │  │
│  │   │ SourceAdapter (Protocol) — normalized  │                     │  │
│  │   │   • ManualAdapter   • GumroadAdapter   │                     │  │
│  │   │   • (LemonSqueezy, Stripe — future)    │                     │  │
│  │   └────────────────┬───────────────────────┘                     │  │
│  │                    │ SaleEvent / RefundEvent                     │  │
│  │                    ▼                                             │  │
│  │   ┌────────────────────────────────────────┐                     │  │
│  │   │ mint_from_sale()                       │                     │  │
│  │   │   • Ed25519 sign via PyCA cryptography │                     │  │
│  │   │   • idempotent on (source, order_id)   │                     │  │
│  │   └────────────────┬───────────────────────┘                     │  │
│  └────────────────────┼─────────────────────────────────────────────┘  │
│                       │ SQL                                            │
│  ┌────────────────────▼─────────────────────────────────────────────┐  │
│  │ Postgres 16 — datatools-postgres (container, vol pg_data)        │  │
│  │   • licenses        — authoritative customer record              │  │
│  │   • gumroad_events  — webhook audit log (idempotency, replay)    │  │
│  └──────────────────────────────────────────────────────────────────┘  │
└───────────────────────┬────────────────────────────────┬───────────────┘
                        │                                │
            ┌───────────┘                                └──────────┐
            │ POST /email (httpx)                       Gumroad Ping│
            ▼                                                  POST │
    ┌───────────────────┐                             ┌─────────────▼──┐
    │ Postmark          │                             │ Gumroad        │
    │ (transactional    │                             │ (storefront,   │
    │  email)           │                             │  payments)     │
    └───────┬───────────┘                             └────────────────┘
            │ DKIM-signed email with license blob                ▲
            ▼                                                    │
┌────────────────────────────────────────────────────────────────┴───────┐
│  BUYER'S MACHINE                                                       │
│                                                                        │
│  Receives email ──► copies DTLIC2: blob ──► pastes into desktop app    │
│                                                                        │
│  ┌──────────────────────────────────────────────────────────────────┐  │
│  │ DataTools desktop (Python 3.12 + Streamlit + Typer CLIs)         │  │
│  │                                                                  │  │
│  │  ┌────────────────────────────────────────────────────────────┐  │  │
│  │  │ Activate screen — verifies blob signature                  │  │  │
│  │  │                   against EMBEDDED Ed25519 public key      │  │  │
│  │  │ (NO network call to the license server, ever)              │  │  │
│  │  └─────────────────────────┬──────────────────────────────────┘  │  │
│  │                            ▼                                     │  │
│  │  ~/.datatools/license.json (signed blob, mode 644, on disk)      │  │
│  └──────────────────────────────────────────────────────────────────┘  │
│                                                                        │
│  Pays via web browser ─────► Gumroad ────► (kicks off the Ping)        │
└────────────────────────────────────────────────────────────────────────┘
```
**Three primary flows**, distinguishable by where each one starts in
the diagram:
1. **Sale → fulfillment** (the automated path)
Buyer pays at Gumroad → Gumroad fires Ping to
`licenses.datatools.unalogix.com/webhooks/gumroad?secret=…` → nginx
→ FastAPI → audit-log row → adapter normalizes payload → `mint_from_sale`
writes the `licenses` row + Ed25519-signs the blob → Postmark emails
the buyer their blob. End-to-end latency: a few hundred milliseconds.
2. **Manual mint** (operator path — comps, support replacements)
Operator opens SSH tunnel → `datatools-admin mint` → `/internal/mint`
(Bearer-authed, never publicly reachable) → same `mint_from_sale`
path → blob returned in HTTP response. Operator delivers to buyer
out-of-band.
3. **Activation** (buyer path — fully offline)
Buyer pastes blob into desktop's Activate screen → desktop verifies
the Ed25519 signature against the public key **embedded in the
shipped binary** → license written to `~/.datatools/license.json`.
The desktop app makes **no network calls** to the license server at
any point. This preserves the "your data never leaves your computer"
promise (`DECISIONS.md §9b`).
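
The first two flows share one property worth seeing concretely: a redelivered webhook must not mint a second license. The sketch below is illustrative only, not the server's code. It assumes a `UNIQUE (source, order_id)` constraint and uses in-memory SQLite with a random placeholder blob. The `SaleEvent` fields, the payload keys, and the table columns are hypothetical; the real `mint_from_sale` signs the blob with Ed25519 and writes to Postgres.

```python
import sqlite3
import uuid
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class SaleEvent:
    """Normalized sale, independent of the storefront that produced it."""
    source: str
    order_id: str
    email: str


class SourceAdapter(Protocol):
    """Anything that can turn a raw storefront payload into a SaleEvent."""
    def normalize(self, payload: dict[str, str]) -> SaleEvent: ...


class GumroadAdapter:
    def normalize(self, payload: dict[str, str]) -> SaleEvent:
        # Payload keys are illustrative, not Gumroad's documented schema.
        return SaleEvent("gumroad", payload["order_id"], payload["email"])


def open_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE licenses ("
        " source TEXT NOT NULL, order_id TEXT NOT NULL,"
        " email TEXT NOT NULL, blob TEXT NOT NULL,"
        " UNIQUE (source, order_id))"
    )
    return db


def mint_from_sale(db: sqlite3.Connection, event: SaleEvent) -> str:
    """Return the blob for this order, minting it only on first sight."""
    candidate = "DTLIC2:" + uuid.uuid4().hex  # placeholder, NOT a signed blob
    # The UNIQUE constraint turns a repeat delivery into a no-op insert.
    db.execute(
        "INSERT OR IGNORE INTO licenses (source, order_id, email, blob)"
        " VALUES (?, ?, ?, ?)",
        (event.source, event.order_id, event.email, candidate),
    )
    db.commit()
    (stored,) = db.execute(
        "SELECT blob FROM licenses WHERE source = ? AND order_id = ?",
        (event.source, event.order_id),
    ).fetchone()
    return stored
```

Letting the database enforce uniqueness, rather than checking first and inserting second, keeps the behaviour correct even if two deliveries of the same order arrive concurrently.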
---
## 2. Tech stack
Layered view of what technology lives where. "External SaaS" entries
are services we depend on but don't operate.
```
┌────────────────────────────────────────────────────────────────────────┐
│ DESKTOP APP (shipped binary, runs on buyer's box)                      │
├──────────────────┬─────────────────────────────────────────────────────┤
│ GUI              │ Streamlit 1.35 — local web server, browser opens    │
│ CLI              │ Typer 0.12 — per-tool entry points                  │
│ Core logic       │ pandas 2.x, numpy, rapidfuzz, charset-normalizer    │
│ Crypto (verify)  │ PyCA cryptography — Ed25519 public-key verify only  │
│ Storage          │ ~/.datatools/license.json (file, mode 644)          │
│ i18n             │ JSON catalogs in src/i18n/                          │
│ Build            │ PyInstaller — one-file binary, per OS               │
│ Runtimes         │ Python 3.12 (bundled into installer)                │
│ Platforms        │ Windows · macOS · Linux                             │
└──────────────────┴─────────────────────────────────────────────────────┘
┌────────────────────────────────────────────────────────────────────────┐
│ LICENSE SERVER (this box; non-buyer-facing)                            │
├──────────────────┬─────────────────────────────────────────────────────┤
│ Edge             │ nginx 1.24 + Let's Encrypt (auto-renew via timer)   │
│ HTTP framework   │ FastAPI 0.119 + Starlette + Pydantic v2             │
│ ASGI server      │ uvicorn 0.39 (+uvloop, +httptools, +watchfiles)     │
│ Form parsing     │ python-multipart (for Gumroad form-encoded Pings)   │
│ ORM              │ SQLAlchemy 2.0                                      │
│ Migrations       │ Alembic 1.18 (one initial migration so far)         │
│ Database         │ Postgres 16-alpine (containerized, single node)     │
│ Database driver  │ psycopg 3.3 (with binary wheel)                     │
│ Crypto (sign)    │ PyCA cryptography — Ed25519 private-key sign        │
│ HTTP client      │ httpx 0.28 (Postmark calls, test mocking)           │
│ Config           │ Pydantic Settings + YAML (products.yaml)            │
│ Container        │ Docker + Docker Compose v2 plugin                   │
│ Image base       │ python:3.12-slim                                    │
│ Process user     │ UID 10001 (non-root `app` user defined in image)    │
│ Logging          │ stdlib `logging` to container stdout → docker logs  │
│ Host OS          │ Ubuntu 24.04 LTS                                    │
└──────────────────┴─────────────────────────────────────────────────────┘
┌────────────────────────────────────────────────────────────────────────┐
│ OPERATOR / DEVELOPER MACHINE                                           │
├──────────────────┬─────────────────────────────────────────────────────┤
│ Source control   │ git → self-hosted Gitea (git.invixiom.com)          │
│ Admin CLI        │ Typer (src/admin_cli.py)                            │
│ Server access    │ SSH tunnel for /internal/* (no public exposure)     │
│ Break-glass      │ scripts/generate_license.py (offline-only mints,    │
│                  │ used when the license server is unreachable)        │
│ Test runner      │ pytest 8.3 + SQLite in-memory (no docker required)  │
│ Smoke test       │ bash + docker compose (server/scripts/smoke.sh)     │
└──────────────────┴─────────────────────────────────────────────────────┘
┌────────────────────────────────────────────────────────────────────────┐
│ EXTERNAL SaaS / dependencies                                           │
├──────────────────┬─────────────────────────────────────────────────────┤
│ Storefront       │ Gumroad — Ping webhook to /webhooks/gumroad         │
│ Transactional    │ Postmark — HTTP API for license-delivery emails     │
│ email            │ (LoggingEmailService fallback when token unset)     │
│ TLS CA           │ Let's Encrypt — ACME HTTP-01 challenge via certbot  │
│ Authoritative    │ supercp / cPanel (your DNS host for unalogix.com)   │
│ DNS              │ — Cloudflare front-door deferred                    │
│ Source hosting   │ Self-hosted Gitea (git.invixiom.com) — not on the   │
│                  │ datatools box; shares the same physical host        │
└──────────────────┴─────────────────────────────────────────────────────┘
```
---
## 3. Trust + isolation boundaries
Worth tracing explicitly because the threat model differs at each
boundary:
| Boundary | What crosses it | Trust model |
|---|---|---|
| Buyer ↔ Gumroad | Payment, buyer details | Out of scope — Gumroad's problem |
| Gumroad → license server (webhook) | Form-encoded POST authenticated by a shared secret in the URL | URL secret check; non-matching = 404 (no info leak); audit-log everything regardless |
| License server → Postmark | DKIM-signed transactional mail | Postmark verified-sender domain; HTTP API auth via server token |
| License server → Postgres | SQL over local docker bridge | Same compose project; password from on-disk secret file |
| Operator → license server (`/internal/*`) | Bearer token over SSH tunnel | Token only on disk + in the operator's env; nginx blocks `/internal/*` publicly as defense-in-depth |
| License server → buyer (email) | Plaintext blob in inbox | Buyer's email account hygiene; we deliberately don't encrypt — blob is self-protecting (signature) |
| Buyer → desktop app (activation) | Signed blob pasted in | Verified against pubkey **embedded in the shipped binary**; no network call |
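
The Gumroad webhook row can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the server's actual handler: `handle_ping`, the in-memory `audit_log` (standing in for `gumroad_events`), and the status-code-only return value are all hypothetical. It shows the two properties the table names: every delivery is logged whether or not the secret matches, and a mismatch is answered with a plain 404.

```python
import hmac
from typing import Optional

# Stand-in for the gumroad_events table (hypothetical, in-memory only).
audit_log: list[dict] = []


def handle_ping(
    provided_secret: Optional[str], expected_secret: str, payload: dict
) -> int:
    """Return the HTTP status code for one webhook delivery."""
    # compare_digest runs in constant time, so the response time does not
    # reveal how many leading characters of a guessed secret were right.
    ok = provided_secret is not None and hmac.compare_digest(
        provided_secret.encode(), expected_secret.encode()
    )
    # Log first, decide second: rejected deliveries are audited too.
    audit_log.append({"payload": payload, "secret_ok": ok})
    return 200 if ok else 404
```

Returning 404 rather than 401 or 403 means a probe with the wrong secret cannot tell the endpoint exists.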
The single most important property to preserve: **the desktop app
never talks to the license server.** All trust in the desktop comes
from the embedded public key + the signed blob. This is what makes
the offline activation guarantee real, and what keeps a license-server
outage from breaking buyers who've already activated.
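
A minimal sketch of why that property holds, using the PyCA `cryptography` Ed25519 API named in the tech stack. The blob layout here (a `DTLIC2:` prefix followed by base64url of a 64-byte signature plus a JSON payload) is an assumption made for illustration; this document does not specify the real encoding, and `sign_blob` / `verify_blob` are hypothetical names. The point is that verification needs only the public key and the blob, so no network call is involved.

```python
import base64
import json
from typing import Optional

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)

PREFIX = "DTLIC2:"
SIG_LEN = 64  # Ed25519 signatures are always 64 bytes


def sign_blob(private_key: Ed25519PrivateKey, payload: dict) -> str:
    """Server side: sign a canonical JSON payload (assumed layout)."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    signature = private_key.sign(body)
    return PREFIX + base64.urlsafe_b64encode(signature + body).decode()


def verify_blob(public_key: Ed25519PublicKey, blob: str) -> Optional[dict]:
    """Desktop side: return the payload if the signature checks out, else None."""
    if not blob.startswith(PREFIX):
        return None
    try:
        raw = base64.urlsafe_b64decode(blob[len(PREFIX):])
    except ValueError:  # binascii.Error is a ValueError subclass
        return None
    if len(raw) <= SIG_LEN:
        return None
    signature, body = raw[:SIG_LEN], raw[SIG_LEN:]
    try:
        public_key.verify(signature, body)  # raises on any mismatch
    except InvalidSignature:
        return None
    return json.loads(body)
```

In the shipped app the public key would be a constant compiled into the binary rather than derived from a private key at runtime, which is what lets already-activated buyers keep working through a server outage.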
---
## 4. Where things are stored
| Lives on… | Path / location | Contents |
|---|---|---|
| Buyer's machine | `~/.datatools/license.json` | Activated license blob |
| Buyer's email inbox | Email delivered via Postmark | Delivery copy of the blob |
| License server | `licenses` table (Postgres) | Authoritative customer record — name, email, tier, blob, source, order ID, promotion, amount paid |
| License server | `gumroad_events` table | Append-only webhook delivery audit log |
| License server | `/srv/datatools-license/secrets/` | Postgres password, admin Bearer token, (PR 2) Postmark token + Gumroad secret |
| License server | `/etc/letsencrypt/live/datatools.unalogix.com/` | TLS cert + key |
| Operator's laptop | `~/.datatools-creator/issued.jsonl` | Creator-side issuance log (pre-server era, kept as a break-glass backup) |
| Operator's laptop | Git clone of this repo | Source code, including `server/config/products.yaml` |
| Gitea | This repo's commits | Everything except secrets |
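
As a rough illustration of the first row, the sketch below writes an activated blob to `.datatools/license.json` under a given home directory with mode 644. The JSON shape (`{"blob": ...}`) and the `save_license` helper are assumptions for this example; the desktop app's real file schema is not described here. Taking the home directory as a parameter keeps the function testable without touching the real `~`.

```python
import json
import os
import tempfile
from pathlib import Path


def save_license(blob: str, home: Path) -> Path:
    """Write the license file atomically and return its path."""
    target = home / ".datatools" / "license.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temp file in the same directory, then rename over the
    # target, so a crash mid-write never leaves a truncated license.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump({"blob": blob}, fh)
    os.chmod(tmp_name, 0o644)  # mkstemp creates the file as 0600
    os.replace(tmp_name, target)
    return target
```

A call such as `save_license(blob, Path.home())` would then produce the path listed in the table.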
---
## 5. Related docs
| Doc | Scope |
|---|---|
| `TECHNICAL.md` | Desktop app internals (core libs, GUI, CLIs) |
| `LICENSE-SERVER.md` | Server architecture rationale + DB schema |
| `SETUP-LICENSE-SERVER.md` | Server install runbook (DNS, packages, nginx, TLS, Postgres) |
| `ADMIN.md` | Day-2 operations (minting, rotation, inspection) |
| `DECISIONS.md` | Architecture decision records — `§9b` = no online activation check |
| `USER-GUIDE.md` | Buyer-facing documentation |


@@ -1,7 +1,9 @@
-# LICENSE-SERVER — Future online issuance & record-keeping
+# LICENSE-SERVER — online issuance & record-keeping
-**Status:** design proposal. Not built. The current system is
-fully offline (see `ADMIN.md`).
+**Status:** **deployed (PR 1 + PR 2 code merged)**. Live at
+`licenses.datatools.unalogix.com`. See `ADMIN.md §"Live deployment"`
+for day-2 operations, and `ARCHITECTURE.md` for the end-to-end
+diagram including the desktop and storefronts.
This doc describes the smallest useful server we could build to
replace the manual mint-and-paste workflow, without compromising the


@@ -3,6 +3,9 @@
> Creator-only. Do not ship to buyers.
> **Version**: 1.6 · **Updated**: 2026-05-01
For the end-to-end picture (desktop app + license server + storefronts
+ email), see `ARCHITECTURE.md`. This doc focuses on desktop internals.
## 1. Architecture
- **Dual interface**: CLI + GUI, both wrapping the same `src/core/` library.