sec(license): Ed25519 sigs + production-safe tripwire

Two coupled hardening upgrades.

1. Asymmetric signatures (HMAC → Ed25519)

The previous HMAC scheme used a symmetric secret that any motivated
reverse engineer could pull out of the shipped binary and use to
mint blobs for any tier / name / email. With Ed25519, the binary
ships only the public verification key; the signing key never
leaves the seller's environment, so binary compromise no longer
yields forgery.

- src/license/crypto.py rewritten around
  cryptography.hazmat.primitives.asymmetric.ed25519. Same public
  API surface (sign/verify/encode_blob/decode_blob), same canonical
  JSON encoding — drop-in for the manager / cli / GUI layers.
- DATATOOLS_LICENSE_PRIVKEY (seller-side) and
  DATATOOLS_LICENSE_PUBKEY (build-time) env vars supply the keys;
  the in-source dev keypair (src/license/_dev_keypair.py) is
  derived deterministically from a seed phrase for reproducible
  builds and tests.
- Blob prefix bumped DTLIC1: → DTLIC2:. Decoding a DTLIC1 blob
  surfaces a clear "old format" error rather than a confusing
  signature mismatch.
- scripts/generate_keypair.py mints fresh production keypairs for
  the seller (run once, stash the private key offline).
- Adds cryptography>=41,<46 to requirements.txt (was an undeclared
  transitive dep).

2. Production-safe tripwire

assert_production_safe() refuses to boot a frozen / shipped build
when either:

- DATATOOLS_DEV_MODE=1 is set (would unconditionally bypass every
  license check — fine in source/test but catastrophic in a buyer
  install).
- The active verification key is still the embedded dev key (the
  build pipeline forgot to set DATATOOLS_LICENSE_PUBKEY).

No-op in source / pytest runs (sys.frozen is unset) so test
fixtures and dev workflows keep working without ceremony. Called
from src/cli_license_guard.guard() and from hide_streamlit_chrome
— so it fires on every CLI invocation and every GUI page load.

Tests: 49 license-layer unit tests (was 40); added Ed25519
wrong-key rejection, dev-keypair seed pin, blob v2 prefix, v1
rejection with clear message, and four production-safe scenarios
(no-op in source, fires on DEV_MODE in frozen, fires on dev key in
frozen, passes in frozen with prod pubkey). Total: 2024 → 2033.

Docs (REQUIREMENTS §17a, DEVELOPER licensing recipe, DECISIONS
§9b + decision log) updated with the new threat-model write-up,
key-storage workflow, and tripwire behaviour.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 17:34:48 +00:00
parent d32b58e61a
commit e534fb4989
12 changed files with 549 additions and 75 deletions

@@ -178,6 +178,8 @@ $49-79/bundle · $149 full suite (when 3+ exist).
| May 13 (v1.6) | Ship licensing: 1-year HMAC-signed blobs, name+email registration, offline verification, tier-scaffolded for future SKUs | Unlock the lifetime-update business model without recurring infra. Honor-system DRM (HMAC + 30-day refund) — sufficient at $49. See §9b below. |
| May 13 (v1.6) | Add Lite SKU (Dedup + Text Cleaner + Format Standardizer) | Lower-priced entry point for buyers who only need the three universal tools. Per-tool feature gating + lock badges on the home grid surface the upgrade path. See §9b. |
| May 13 (v1.6) | Remove user-facing free trial | A 1-year all-features trial undercut the paid Lite SKU. Paid-only keeps tier economics clean. Internal ``_mint`` API still exists for tests and the seller's key generator. See §9b. |
| May 13 (v1.6) | Upgrade license crypto: HMAC → Ed25519 (asymmetric) | HMAC's symmetric secret was extractable from the shipped binary — anyone with the binary could mint blobs. Ed25519 splits sign (seller) from verify (binary), so binary compromise doesn't let an attacker forge licenses. Blob prefix bumped DTLIC1 → DTLIC2. See §9b. |
| May 13 (v1.6) | Add ``assert_production_safe`` tripwire | A shipped build with ``DATATOOLS_DEV_MODE=1`` or the in-source dev pubkey would silently defeat licensing. The tripwire refuses to boot such a build. No-op in source / pytest runs. See §9b. |
## 9b. Licensing model
@@ -191,7 +193,13 @@ $49-79/bundle · $149 full suite (when 3+ exist).
| Time-bombed binary (PyInstaller --no-license) | Rejected. Can't deliver renewals without re-shipping the installer. |
| Hardware-locked license | Rejected. Friction on legitimate device-swaps; doesn't match the buyer persona's tolerance. |
**Threat model**: a motivated reverse engineer can pull the HMAC secret out of the binary, mint their own licenses, and bypass the check. That's acceptable — the goal is to discourage casual blob-sharing among non-technical buyers, not stop targeted piracy. The 30-day refund window covers the same gap from a different angle (anyone who shares their blob is implicitly authorizing the buyer to issue them a refund-on-demand).
**Threat model** (v1.6 — Ed25519): the binary ships only the public key. A motivated reverse engineer who pulls everything out of the binary has the verification key but not the signing key — they can't mint new licenses. The earlier HMAC scheme had this hole; the asymmetric upgrade closes it. The remaining attack surface is:
- Re-signing with a forked binary that ships an attacker-controlled pubkey + auto-grants licenses. Costs more effort than the price of a legitimate copy and the result is per-fork, not shareable.
- Hooking the verification call to always return True. Defeats DRM entirely but only on the attacker's own machine — they could just write down "I unlocked DataTools" and skip the work.
- Setting ``DATATOOLS_DEV_MODE=1`` to bypass checks. **Refused in shipped builds** by ``assert_production_safe``; works in source/test runs only.
The 30-day refund window covers casual blob sharing from a different angle (anyone who shares their blob is implicitly authorizing the buyer to issue them a refund-on-demand).
**What's enforced**:
- License blob signature must match (HMAC-SHA256 with the build secret).

@@ -143,9 +143,35 @@ require_feature(FeatureFlag.DEDUPLICATOR)
```
**Storage**: ``~/.datatools/license.json`` (override via
``DATATOOLS_LICENSE_PATH``). Signed locally with HMAC-SHA256 using a
secret read from ``DATATOOLS_LICENSE_SECRET`` (build-time replace; the
in-repo default is a development placeholder).
``DATATOOLS_LICENSE_PATH``). Signed with Ed25519 (asymmetric) — the
seller's private key signs; the buyer's binary verifies with the
embedded public key.
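As a rough illustration of the sign/verify split, the sketch below signs a canonical JSON payload with an Ed25519 private key and verifies it with the matching public key. The envelope layout (base64-encoded JSON with ``payload`` and ``sig`` fields), the ``LicenseFormatError`` class, and the function signatures are assumptions made for this example; only the ``DTLIC2:`` prefix, the explicit ``DTLIC1:`` rejection, and the use of ``cryptography``'s Ed25519 primitives come from the change itself.

```python
import base64
import json

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)

PREFIX_V1 = "DTLIC1:"  # old HMAC-era blobs
PREFIX_V2 = "DTLIC2:"  # Ed25519 blobs


class LicenseFormatError(ValueError):
    """Hypothetical error type for outdated, malformed, or forged blobs."""


def canonical_json(data: dict) -> bytes:
    # Sorted keys and fixed separators, so signer and verifier see identical bytes.
    return json.dumps(data, sort_keys=True, separators=(",", ":")).encode("utf-8")


def encode_blob(payload: dict, private_key: Ed25519PrivateKey) -> str:
    signature = private_key.sign(canonical_json(payload))  # 64-byte signature
    envelope = {"payload": payload, "sig": signature.hex()}  # assumed layout
    encoded = base64.urlsafe_b64encode(canonical_json(envelope)).decode("ascii")
    return PREFIX_V2 + encoded


def decode_blob(blob: str, public_key: Ed25519PublicKey) -> dict:
    if blob.startswith(PREFIX_V1):
        # Fail with a clear message instead of a confusing signature mismatch.
        raise LicenseFormatError("old license format (DTLIC1); a DTLIC2 blob is required")
    if not blob.startswith(PREFIX_V2):
        raise LicenseFormatError("unrecognized license prefix")
    envelope = json.loads(base64.urlsafe_b64decode(blob[len(PREFIX_V2):]))
    payload = envelope["payload"]
    try:
        public_key.verify(bytes.fromhex(envelope["sig"]), canonical_json(payload))
    except InvalidSignature as exc:
        raise LicenseFormatError("signature does not match") from exc
    return payload
```

Because verification needs only the public key, nothing in the decode path can be repurposed to mint a new blob.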
**Key material**:
| Variable | Who has it | Where it's used |
|---|---|---|
| ``DATATOOLS_LICENSE_PRIVKEY`` | Seller only | ``scripts/generate_license.py`` reads it to mint a buyer's blob; ``scripts/generate_keypair.py`` writes a fresh one |
| ``DATATOOLS_LICENSE_PUBKEY`` | Every shipped binary | Verification at activation time; set at build time via PyInstaller env |
If neither env var is set, ``src.license.crypto`` falls back to the
deterministic dev keypair in ``src/license/_dev_keypair.py``. The
dev key is in source on purpose (so tests work without secrets),
but a frozen build that's still using it is a build-config bug —
``assert_production_safe`` refuses to start such a binary.
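The fallback and the tripwire can be pictured with a minimal sketch like the one below. It is not the project's implementation: ``DEV_PUBKEY_HEX`` is a placeholder for the key derived in ``src/license/_dev_keypair.py``, ``UnsafeProductionBuild`` and ``active_pubkey_hex`` are hypothetical names, and the ``env`` / ``frozen`` parameters exist only to make the example testable. For brevity it looks at the public key alone.

```python
import os
import sys

# Placeholder only: stands in for the key derived in src/license/_dev_keypair.py.
DEV_PUBKEY_HEX = "00" * 32


class UnsafeProductionBuild(RuntimeError):
    """Hypothetical error raised when a frozen build is misconfigured."""


def active_pubkey_hex(env=None) -> str:
    # Prefer the build-time key; fall back to the in-source dev key.
    env = os.environ if env is None else env
    configured = env.get("DATATOOLS_LICENSE_PUBKEY", "").strip().lower()
    return configured or DEV_PUBKEY_HEX


def assert_production_safe(env=None, frozen=None) -> None:
    env = os.environ if env is None else env
    if frozen is None:
        # PyInstaller sets sys.frozen in bundled builds; it is absent in source runs.
        frozen = bool(getattr(sys, "frozen", False))
    if not frozen:
        return  # source / pytest run: no-op
    if env.get("DATATOOLS_DEV_MODE") == "1":
        raise UnsafeProductionBuild("DATATOOLS_DEV_MODE=1 is set in a frozen build")
    if active_pubkey_hex(env) == DEV_PUBKEY_HEX:
        raise UnsafeProductionBuild("frozen build still verifies against the dev key")
```

Passing ``frozen=True`` explicitly is one way to exercise the frozen-build branches from a test without actually bundling the application.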
**First-time setup for shipped builds**:
1. ``python scripts/generate_keypair.py --output prod-keys.env`` —
creates a fresh keypair.
2. Stash ``DATATOOLS_LICENSE_PRIVKEY`` somewhere safe (password
manager / KMS). Lose it and you can't issue renewals without
reshipping a new build with a new public key.
3. Configure the PyInstaller build env with
``DATATOOLS_LICENSE_PUBKEY=<hex>`` so the shipped binary
verifies against the production key.
4. Mint buyer licenses with
``DATATOOLS_LICENSE_PRIVKEY=<hex> python scripts/generate_license.py ...``.
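For orientation, step 1 could look roughly like the sketch below, which generates an Ed25519 keypair and renders both halves as the 32-byte raw hex values the env vars expect. The helper names ``generate_keypair_hex`` and ``format_env_file`` are hypothetical, and the env-file layout is an assumption; the actual ``scripts/generate_keypair.py`` may differ.

```python
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey


def generate_keypair_hex() -> tuple[str, str]:
    """Return (private_hex, public_hex), each encoding 32 raw bytes."""
    private_key = Ed25519PrivateKey.generate()
    private_raw = private_key.private_bytes(
        encoding=serialization.Encoding.Raw,
        format=serialization.PrivateFormat.Raw,
        encryption_algorithm=serialization.NoEncryption(),
    )
    public_raw = private_key.public_key().public_bytes(
        encoding=serialization.Encoding.Raw,
        format=serialization.PublicFormat.Raw,
    )
    return private_raw.hex(), public_raw.hex()


def format_env_file(private_hex: str, public_hex: str) -> str:
    # Assumed layout: one KEY=value pair per line.
    return (
        f"DATATOOLS_LICENSE_PRIVKEY={private_hex}\n"
        f"DATATOOLS_LICENSE_PUBKEY={public_hex}\n"
    )


if __name__ == "__main__":
    print(format_env_file(*generate_keypair_hex()), end="")
```

Only the ``DATATOOLS_LICENSE_PUBKEY`` line belongs in the build environment; the private half stays with the seller.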
**Dev bypass**: ``DATATOOLS_DEV_MODE=1`` short-circuits every check.
The test suite's autouse fixture sets this so existing tests don't

@@ -174,10 +174,11 @@ and proceeds.
- **Dev**: pytest, tox.
## 16. Test coverage
- 2,024 tests passing, 0 skipped, 0 xfailed.
- 1,859 core + CLI tests (run with `pytest -m 'not gui'` for a quick loop).
Includes 40 license-layer unit tests, 25 license-CLI tests, and
17 Lite-tier feature-map + guard tests.
- 2,033 tests passing, 0 skipped, 0 xfailed.
- 1,868 core + CLI tests (run with `pytest -m 'not gui'` for a quick loop).
Includes 49 license-layer unit tests (Ed25519 sign/verify, dev-key
derivation, production-safe tripwire, schema), 25 license-CLI
tests, and 17 Lite-tier feature-map + guard tests.
- 165 GUI tests under `tests/gui/` driving Streamlit pages via `AppTest`
(smoke + EN/ES localization, chrome, gate, workflows, dedup review,
advanced panels, error paths, findings panel, activation +
@@ -194,8 +195,14 @@ and proceeds.
## 17a. Licensing
- **Storage**: ``~/.datatools/license.json`` (or
``$DATATOOLS_LICENSE_PATH`` override). Signed locally with
HMAC-SHA256.
``$DATATOOLS_LICENSE_PATH`` override). Signed with Ed25519
(asymmetric).
- **Crypto**: Ed25519. The seller holds the private key; every
shipped binary embeds only the public key. A motivated reverse
engineer who pulls everything out of the binary still can't sign
new licenses. Keys are 32 bytes raw, exposed as hex via
``DATATOOLS_LICENSE_PRIVKEY`` (seller-side) and
``DATATOOLS_LICENSE_PUBKEY`` (build-time bake-in).
- **Activation**: buyer pastes a base64-encoded license blob
(``DTLIC1:...``) on first launch; app verifies the signature
offline + matches the buyer-entered name/email to the embedded
@@ -226,10 +233,17 @@ and proceeds.
- **Lock badge**: the home grid shows a red 🔒 Locked pill on tool
cards the current tier doesn't unlock.
- **Dev bypass**: ``DATATOOLS_DEV_MODE=1`` skips every check (used by
the test suite and during development).
the test suite and during development). **Refused in shipped
builds** by the production-safe tripwire.
- **Production-safe tripwire**: ``assert_production_safe()`` runs at
startup in every frozen build. Refuses to boot when ``DEV_MODE``
is set or the verification key is still the embedded dev key
(i.e., the build pipeline forgot to override
``DATATOOLS_LICENSE_PUBKEY``). No-op in source / pytest runs.
- **No internet**: signature verification is fully offline. The
shipped binary embeds the verification secret; see
``docs/DECISIONS.md`` for the threat-model discussion.
shipped binary embeds only the public key; the private key never
leaves the seller. See ``docs/DECISIONS.md`` for the threat-model
discussion.
## 18. Error handling
- Structured hierarchy: `DataToolsError` → `InputValidationError`, `ConfigError`, `FileFormatError`, `FileAccessError`.