diff --git a/docs/DECISIONS.md b/docs/DECISIONS.md index 94fb5dc..a9d8564 100644 --- a/docs/DECISIONS.md +++ b/docs/DECISIONS.md @@ -178,6 +178,8 @@ $49-79/bundle · $149 full suite (when 3+ exist). | May 13 (v1.6) | Ship licensing: 1-year HMAC-signed blobs, name+email registration, offline verification, tier-scaffolded for future SKUs | Unlock the lifetime-update business model without recurring infra. Honor-system DRM (HMAC + 30-day refund) — sufficient at $49. See §9b below. | | May 13 (v1.6) | Add Lite SKU (Dedup + Text Cleaner + Format Standardizer) | Lower-priced entry point for buyers who only need the three universal tools. Per-tool feature gating + lock badges on the home grid surface the upgrade path. See §9b. | | May 13 (v1.6) | Remove user-facing free trial | A 1-year all-features trial undercut the paid Lite SKU. Paid-only keeps tier economics clean. Internal ``_mint`` API still exists for tests and the seller's key generator. See §9b. | +| May 13 (v1.6) | Upgrade license crypto: HMAC → Ed25519 (asymmetric) | HMAC's symmetric secret was extractable from the shipped binary — anyone with the binary could mint blobs. Ed25519 splits sign (seller) from verify (binary), so binary compromise doesn't let an attacker forge licenses. Blob prefix bumped DTLIC1 → DTLIC2. See §9b. | +| May 13 (v1.6) | Add ``assert_production_safe`` tripwire | A shipped build with ``DATATOOLS_DEV_MODE=1`` or the in-source dev pubkey would silently defeat licensing. The tripwire refuses to boot such a build. No-op in source / pytest runs. See §9b. | ## 9b. Licensing model @@ -191,7 +193,13 @@ $49-79/bundle · $149 full suite (when 3+ exist). | Time-bombed binary (PyInstaller --no-license) | Rejected. Can't deliver renewals without re-shipping the installer. | | Hardware-locked license | Rejected. Friction on legitimate device-swaps; doesn't match the buyer persona's tolerance. 
| -**Threat model**: a motivated reverse engineer can pull the HMAC secret out of the binary, mint their own licenses, and bypass the check. That's acceptable — the goal is to discourage casual blob-sharing among non-technical buyers, not stop targeted piracy. The 30-day refund window covers the same gap from a different angle (anyone who shares their blob is implicitly authorizing the buyer to issue them a refund-on-demand). +**Threat model** (v1.6 — Ed25519): the binary ships only the public key. A motivated reverse engineer who pulls everything out of the binary has the verification key but not the signing key — they can't mint new licenses. The earlier HMAC scheme had this hole; the asymmetric upgrade closes it. The remaining attack surface is: + +- Re-signing with a forked binary that ships an attacker-controlled pubkey + auto-grants licenses. Costs more effort than the price of a legitimate copy and the result is per-fork, not shareable. +- Hooking the verification call to always return True. Defeats DRM entirely but only on the attacker's own machine — they could just write down "I unlocked DataTools" and skip the work. +- Setting ``DATATOOLS_DEV_MODE=1`` to bypass checks. **Refused in shipped builds** by ``assert_production_safe``; works in source/test runs only. + +The 30-day refund window covers casual blob sharing from a different angle (anyone who shares their blob is implicitly authorizing the buyer to issue them a refund-on-demand). **What's enforced**: - License blob signature must match (HMAC-SHA256 with the build secret). diff --git a/docs/DEVELOPER.md b/docs/DEVELOPER.md index 6a5e448..3cbb88e 100644 --- a/docs/DEVELOPER.md +++ b/docs/DEVELOPER.md @@ -143,9 +143,35 @@ require_feature(FeatureFlag.DEDUPLICATOR) ``` **Storage**: ``~/.datatools/license.json`` (override via -``DATATOOLS_LICENSE_PATH``). 
Signed locally with HMAC-SHA256 using a -secret read from ``DATATOOLS_LICENSE_SECRET`` (build-time replace; the -in-repo default is a development placeholder). +``DATATOOLS_LICENSE_PATH``). Signed with Ed25519 (asymmetric) — the +seller's private key signs; the buyer's binary verifies with the +embedded public key. + +**Key material**: + +| Variable | Who has it | Where it's used | +|---|---|---| +| ``DATATOOLS_LICENSE_PRIVKEY`` | Seller only | ``scripts/generate_license.py`` (mint a buyer's blob), ``scripts/generate_keypair.py`` writes a fresh one | +| ``DATATOOLS_LICENSE_PUBKEY`` | Every shipped binary | Verification at activation time; set at build time via PyInstaller env | + +If neither env var is set, ``src.license.crypto`` falls back to the +deterministic dev keypair in ``src/license/_dev_keypair.py``. The +dev key is in source on purpose (so tests work without secrets), +but a frozen build that's still using it is a build-config bug — +:func:`assert_production_safe` refuses to start such a binary. + +**First-time setup for shipped builds**: + +1. ``python scripts/generate_keypair.py --output prod-keys.env`` — + creates a fresh keypair. +2. Stash ``DATATOOLS_LICENSE_PRIVKEY`` somewhere safe (password + manager / KMS). Lose it and you can't issue renewals without + reshipping a new build with a new public key. +3. Configure the PyInstaller build env with + ``DATATOOLS_LICENSE_PUBKEY=`` so the shipped binary + verifies against the production key. +4. Mint buyer licenses with + ``DATATOOLS_LICENSE_PRIVKEY= python scripts/generate_license.py ...``. **Dev bypass**: ``DATATOOLS_DEV_MODE=1`` short-circuits every check. The test suite's autouse fixture sets this so existing tests don't diff --git a/docs/REQUIREMENTS.md b/docs/REQUIREMENTS.md index 1d8e42f..064bec0 100644 --- a/docs/REQUIREMENTS.md +++ b/docs/REQUIREMENTS.md @@ -174,10 +174,11 @@ and proceeds. - **Dev**: pytest, tox. ## 16. Test coverage -- 2,024 tests passing, 0 skipped, 0 xfailed. 
- - 1,859 core + CLI tests (run with `pytest -m 'not gui'` for a quick loop). - Includes 40 license-layer unit tests, 25 license-CLI tests, and - 17 Lite-tier feature-map + guard tests. +- 2,033 tests passing, 0 skipped, 0 xfailed. + - 1,868 core + CLI tests (run with `pytest -m 'not gui'` for a quick loop). + Includes 49 license-layer unit tests (Ed25519 sign/verify, dev-key + derivation, production-safe tripwire, schema), 25 license-CLI + tests, and 17 Lite-tier feature-map + guard tests. - 165 GUI tests under `tests/gui/` driving Streamlit pages via `AppTest` (smoke + EN/ES localization, chrome, gate, workflows, dedup review, advanced panels, error paths, findings panel, activation + @@ -194,8 +195,14 @@ and proceeds. ## 17a. Licensing - **Storage**: ``~/.datatools/license.json`` (or - ``$DATATOOLS_LICENSE_PATH`` override). Signed locally with - HMAC-SHA256. + ``$DATATOOLS_LICENSE_PATH`` override). Signed with Ed25519 + (asymmetric). +- **Crypto**: Ed25519. The seller holds the private key; every + shipped binary embeds only the public key. A motivated reverse + engineer who pulls everything out of the binary still can't sign + new licenses. Keys are 32 bytes raw, exposed as hex via + ``DATATOOLS_LICENSE_PRIVKEY`` (seller-side) and + ``DATATOOLS_LICENSE_PUBKEY`` (build-time bake-in). - **Activation**: buyer pastes a base64-encoded license blob (``DTLIC1:...``) on first launch; app verifies the signature offline + matches the buyer-entered name/email to the embedded @@ -226,10 +233,17 @@ and proceeds. - **Lock badge**: the home grid shows a red 🔒 Locked pill on tool cards the current tier doesn't unlock. - **Dev bypass**: ``DATATOOLS_DEV_MODE=1`` skips every check (used by - the test suite and during development). + the test suite and during development). **Refused in shipped + builds** by the production-safe tripwire. +- **Production-safe tripwire**: ``assert_production_safe()`` runs at + startup in every frozen build. 
Refuses to boot when ``DEV_MODE`` + is set or the verification key is still the embedded dev key + (i.e., the build pipeline forgot to override + ``DATATOOLS_LICENSE_PUBKEY``). No-op in source / pytest runs. - **No internet**: signature verification is fully offline. The - shipped binary embeds the verification secret; see - ``docs/DECISIONS.md`` for the threat-model discussion. + shipped binary embeds only the public key; the private key never + leaves the seller. See ``docs/DECISIONS.md`` for the threat-model + discussion. ## 18. Error handling - Structured hierarchy: `DataToolsError` → `InputValidationError`, `ConfigError`, `FileFormatError`, `FileAccessError`. diff --git a/requirements.txt b/requirements.txt index a261810..2cf200d 100644 --- a/requirements.txt +++ b/requirements.txt @@ -8,3 +8,4 @@ tqdm>=4.66,<5 typer>=0.12,<1 phonenumbers>=8.13,<9 streamlit>=1.35,<2 +cryptography>=41,<46 diff --git a/scripts/generate_keypair.py b/scripts/generate_keypair.py new file mode 100644 index 0000000..2dd7c41 --- /dev/null +++ b/scripts/generate_keypair.py @@ -0,0 +1,106 @@ +#!/usr/bin/env python3 +"""Generate a fresh Ed25519 keypair for production license signing. + +**Creator-only.** Run once, write the private key somewhere safe, +configure the build pipeline with the public key. + +Usage:: + + python scripts/generate_keypair.py + python scripts/generate_keypair.py --json + python scripts/generate_keypair.py --output keys.txt + +The output looks like:: + + DATATOOLS_LICENSE_PRIVKEY=<64 hex chars> # KEEP SECRET + DATATOOLS_LICENSE_PUBKEY=<64 hex chars> # BAKE INTO BUILD + +The private key never goes near the buyer-facing binary. 
Stash it in +a password manager / KMS / hardware token; the only places it gets +loaded are: + +- ``scripts/generate_license.py`` when minting a buyer's blob +- Your CI's signing step, if you've automated blob minting + +The public key gets set as ``DATATOOLS_LICENSE_PUBKEY`` in the +PyInstaller build env (so the shipped binary verifies against it), +and the production-safe runtime check refuses to start any frozen +build that's still using the in-source dev key. +""" + +from __future__ import annotations + +import argparse +import json +import sys +from pathlib import Path + +from cryptography.hazmat.primitives import serialization +from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey + + +def generate() -> tuple[str, str]: + """Return ``(private_hex, public_hex)`` for a fresh keypair.""" + priv = Ed25519PrivateKey.generate() + priv_hex = priv.private_bytes( + encoding=serialization.Encoding.Raw, + format=serialization.PrivateFormat.Raw, + encryption_algorithm=serialization.NoEncryption(), + ).hex() + pub_hex = priv.public_key().public_bytes( + encoding=serialization.Encoding.Raw, + format=serialization.PublicFormat.Raw, + ).hex() + return priv_hex, pub_hex + + +def main(argv: list[str] | None = None) -> int: + p = argparse.ArgumentParser(description=__doc__.splitlines()[0]) + p.add_argument("--json", action="store_true", help="Emit JSON instead of env-file format.") + p.add_argument("--output", "-o", type=Path, default=None, help="Write to this file instead of stdout.") + args = p.parse_args(argv) + + priv_hex, pub_hex = generate() + + if args.json: + payload = json.dumps( + {"private_key": priv_hex, "public_key": pub_hex}, + indent=2, + ) + else: + payload = ( + f"# DataTools license keypair — generated by generate_keypair.py\n" + f"# KEEP THE PRIVATE KEY SECRET. 
Lose it and your existing\n" + f"# licenses can't be renewed (you'd have to ship a new build\n" + f"# with a new public key and re-issue every active license).\n" + f"\n" + f"DATATOOLS_LICENSE_PRIVKEY={priv_hex}\n" + f"DATATOOLS_LICENSE_PUBKEY={pub_hex}\n" + ) + + if args.output: + args.output.touch(mode=0o600, exist_ok=True) # owner-only before the key is written + try: + args.output.chmod(0o600) # tighten a pre-existing file; best-effort on Windows + except OSError: + pass + args.output.write_text(payload + "\n", encoding="utf-8") + print(f"Wrote {args.output} (owner-only where supported)", file=sys.stderr) + else: + print(payload) + + print( + "\nNext steps:\n" + " 1. Store the private key in your password manager.\n" + " 2. Bake the public key into the PyInstaller build:\n" + " DATATOOLS_LICENSE_PUBKEY= pyinstaller ...\n" + " 3. Mint buyer licenses by setting the private key:\n" + " DATATOOLS_LICENSE_PRIVKEY= " + "python scripts/generate_license.py --name 'Buyer' --email b@x.com\n", + file=sys.stderr, + ) + return 0 + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/src/cli_license_guard.py b/src/cli_license_guard.py index 1ec6305..aec3148 100644 --- a/src/cli_license_guard.py +++ b/src/cli_license_guard.py @@ -53,9 +53,14 @@ def guard(feature: str | None = None) -> None: InvalidLicenseError, LicenseError, UnsupportedFeatureError, + assert_production_safe, get_manager, ) + # Refuse to run a misconfigured shipped build. No-op in + # development / pytest runs. + assert_production_safe() + mgr = get_manager() if mgr.dev_mode: return diff --git a/src/gui/components/_legacy.py b/src/gui/components/_legacy.py index e6fdcda..1e52f07 100644 --- a/src/gui/components/_legacy.py +++ b/src/gui/components/_legacy.py @@ -89,6 +89,12 @@ def hide_streamlit_chrome(*, gate_license: bool = True) -> None: can render its own form without recursion. 
""" st.markdown(_HIDE_CHROME_CSS, unsafe_allow_html=True) + # Production-safe check runs first so a misconfigured shipped + # build refuses to render anything (rather than rendering a + # broken activation form that doesn't accept real blobs). + # No-op in source / pytest runs. + from src.license import assert_production_safe + assert_production_safe() # Imported lazily so this module stays importable in environments # where the i18n packs haven't been laid out (e.g. unit tests of # individual legacy helpers). diff --git a/src/license/__init__.py b/src/license/__init__.py index 5f7b60e..676dbe5 100644 --- a/src/license/__init__.py +++ b/src/license/__init__.py @@ -34,12 +34,21 @@ from .errors import ( UnsupportedFeatureError, ) from .features import FEATURES_BY_TIER, all_features_for_tier -from .manager import LicenseManager, current_state, get_manager, require_feature +from .manager import ( + LicenseManager, + ProductionBuildError, + assert_production_safe, + current_state, + get_manager, + require_feature, +) from .schema import FeatureFlag, License, Tier __all__ = [ # Manager "LicenseManager", + "ProductionBuildError", + "assert_production_safe", "current_state", "get_manager", "require_feature", diff --git a/src/license/_dev_keypair.py b/src/license/_dev_keypair.py new file mode 100644 index 0000000..a5761dd --- /dev/null +++ b/src/license/_dev_keypair.py @@ -0,0 +1,73 @@ +"""**Development-only** Ed25519 keypair embedded in the source tree. + +This pair lets developers run / test / sign locally without needing +the production private key. Both values are deterministic from a +seed string (``hashlib.sha256(SEED).digest()``) so any contributor +checking out the source gets the same keys — which is exactly what +makes this keypair unsafe for production. + +============================================================ +DO NOT SHIP THIS KEYPAIR. +============================================================ + +For shipped builds: + +1. 
Run ``scripts/generate_keypair.py`` to produce a fresh production + keypair. +2. Stash the **private** key in your password manager / KMS. +3. In the PyInstaller build pipeline, set the env var + ``DATATOOLS_LICENSE_PUBKEY=`` so the + shipped binary verifies against the production key, not this dev + key. +4. The production-safe runtime check (``assert_production_safe``) + refuses to start a frozen build that's still verifying against + this dev key — that's the tripwire that catches a missing build + step. + +The matching seed phrase below is in source on purpose; rotating +the dev key means changing it here AND regenerating every test +fixture that hard-codes a blob. The seed includes the words +"DEV-seed-NOT-FOR-PRODUCTION" specifically so a string-grep against +a shipped binary would flag a missing build override immediately. +""" + +from __future__ import annotations + +import hashlib + +# The seed phrase. Hashed to 32 bytes → Ed25519 private-key seed. +DEV_SEED_PHRASE: bytes = ( + b"datatools-license-v2-DEV-seed-NOT-FOR-PRODUCTION" +) + +# Derived constants, hard-coded so that importing this module does +# no crypto work. ``test_dev_keypair_matches_seed`` in +# ``tests/test_license.py`` pins them to the seed phrase above. +DEV_PRIVATE_KEY_HEX: str = ( + "0bdc196f098b84ed155bacbd00061d4fff2cb68e10109f94332f1fc7de194cdb" +) +DEV_PUBLIC_KEY_HEX: str = ( + "1cbef16b7826dd364ac0c7187d42c2ee00d76486e42389db05efa45dd1ade78a" +) + + +def _derive_from_seed() -> tuple[str, str]: + """Re-derive the dev keypair from the seed phrase. 
Used by the + unit test that pins the constants above to the seed.""" + from cryptography.hazmat.primitives.asymmetric.ed25519 import ( + Ed25519PrivateKey, + ) + from cryptography.hazmat.primitives import serialization + + seed = hashlib.sha256(DEV_SEED_PHRASE).digest() + priv = Ed25519PrivateKey.from_private_bytes(seed) + priv_hex = priv.private_bytes( + encoding=serialization.Encoding.Raw, + format=serialization.PrivateFormat.Raw, + encryption_algorithm=serialization.NoEncryption(), + ).hex() + pub_hex = priv.public_key().public_bytes( + encoding=serialization.Encoding.Raw, + format=serialization.PublicFormat.Raw, + ).hex() + return priv_hex, pub_hex diff --git a/src/license/crypto.py b/src/license/crypto.py index 580478a..1f4b053 100644 --- a/src/license/crypto.py +++ b/src/license/crypto.py @@ -1,85 +1,150 @@ -"""HMAC sign/verify for license blobs. +"""Ed25519 sign/verify for license blobs. -The signing secret is read from ``$DATATOOLS_LICENSE_SECRET`` if -present, otherwise from the build-time constant below. Replace the -constant at build time (via PyInstaller hook or a sed step in the -build pipeline) so the shipped binary has a different secret from -this repo's source tree. +Asymmetric model: -Threat model: honor-system DRM. A motivated reverse engineer can pull -the secret out of the binary, sign their own licenses, and bypass the -check. That's expected for $49 desktop software — the goal is to -discourage casual sharing, not stop targeted piracy. The 30-day -refund policy and the personal-name embedded in every license cover -the same gap from a different angle. +- **Private key** (32 bytes) lives with the seller only. It signs the + buyer's name/email/tier/etc into a license blob via + ``scripts/generate_license.py``. +- **Public key** (32 bytes) is embedded in every shipped binary. The + binary uses it to verify blobs at activation time. 
+ +The split means a motivated reverse engineer who pulls everything out +of the binary still can't sign new licenses — they'd need the private +key, which never leaves the seller's environment. This is the key +upgrade vs. the v1 HMAC scheme: HMAC's symmetric secret was trivially +extractable, so anyone with the binary could mint blobs for any tier. + +Keys come from (in priority order): + +1. ``$DATATOOLS_LICENSE_PRIVKEY`` / ``$DATATOOLS_LICENSE_PUBKEY`` — + hex-encoded raw bytes. The build pipeline sets the pubkey here. +2. The dev-only constants in ``_dev_keypair.py`` — deterministic from + a seed, embedded in the source tree for local development and + testing. **Never** ship a binary that still uses these. + +A frozen / shipped build verifying against the dev key is a build +configuration error — ``assert_production_safe`` (see +``.manager``) fires loudly on startup in that case. + +Blob format: ``DTLIC2:`` + base64-encoded JSON. The version prefix +bumped from ``DTLIC1`` to ``DTLIC2`` when we switched from HMAC to +Ed25519, so old v1 blobs surface a clear "old format" error rather +than a confusing "signature mismatch". """ from __future__ import annotations import base64 -import hashlib -import hmac import json import os from typing import Any -# Build-time default. Replace via env var in shipped builds; keep this -# constant non-empty so unit tests have a stable verification key. 
-_DEFAULT_SECRET = ( - "datatools-license-v1-development-secret-" - "replace-at-build-time-via-DATATOOLS_LICENSE_SECRET" +from cryptography.exceptions import InvalidSignature +from cryptography.hazmat.primitives.asymmetric.ed25519 import ( + Ed25519PrivateKey, + Ed25519PublicKey, ) +from ._dev_keypair import DEV_PRIVATE_KEY_HEX, DEV_PUBLIC_KEY_HEX -def _secret_bytes() -> bytes: - """Return the active HMAC secret as bytes.""" - return os.environ.get("DATATOOLS_LICENSE_SECRET", _DEFAULT_SECRET).encode("utf-8") +# --------------------------------------------------------------------------- +# Key material +# --------------------------------------------------------------------------- + +def _privkey_hex() -> str: + """Hex-encoded raw Ed25519 private-key bytes. + + Read from ``$DATATOOLS_LICENSE_PRIVKEY`` first (where the seller + stashes their real key), falling back to the dev seed-derived + constant. The dev fallback only matters during testing / + development; a shipped build calling :func:`sign` is a bug (only + the seller's ``scripts/generate_license.py`` does that). + """ + return os.environ.get("DATATOOLS_LICENSE_PRIVKEY") or DEV_PRIVATE_KEY_HEX + + +def _pubkey_hex() -> str: + """Hex-encoded raw Ed25519 public-key bytes. + + Read from ``$DATATOOLS_LICENSE_PUBKEY`` first (the build + pipeline sets this), falling back to the dev key. + """ + return os.environ.get("DATATOOLS_LICENSE_PUBKEY") or DEV_PUBLIC_KEY_HEX + + +def _privkey() -> Ed25519PrivateKey: + return Ed25519PrivateKey.from_private_bytes(bytes.fromhex(_privkey_hex())) + + +def _pubkey() -> Ed25519PublicKey: + return Ed25519PublicKey.from_public_bytes(bytes.fromhex(_pubkey_hex())) + + +def is_using_dev_key() -> bool: + """True when the active **public** key matches the embedded dev key. + + Used by :func:`.manager.assert_production_safe` to catch frozen + builds whose pubkey wasn't overridden at build time. 
+ """ + return _pubkey_hex().strip().lower() == DEV_PUBLIC_KEY_HEX + + +# --------------------------------------------------------------------------- +# Canonical encoding (same bytes as v1; only the signature scheme changed) +# --------------------------------------------------------------------------- def _canonical_bytes(payload: dict[str, Any]) -> bytes: - """Canonical JSON encoding for the HMAC input. + """Canonical JSON encoding for the signature input. ``sort_keys=True`` + ``separators=(",", ":")`` produce a byte-for- byte deterministic representation across Python versions and OS - locales. Without that, two structurally-identical dicts could hash - to different signatures. + locales. Without that, two structurally-identical dicts could + produce different signatures. """ return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8") +# --------------------------------------------------------------------------- +# Sign / verify +# --------------------------------------------------------------------------- + def sign(payload: dict[str, Any]) -> str: - """Compute the HMAC-SHA256 hex digest over *payload*. + """Produce an Ed25519 signature over *payload*, hex-encoded. - *payload* MUST NOT contain a ``signature`` key — that's the field - we're computing. The caller is responsible for stripping it. + Caller must strip any existing ``signature`` field — the function + signs whatever it's given, including a stale signature, which + would never verify because verify recomputes from a fresh + no-``signature`` canonical form. """ - digest = hmac.new(_secret_bytes(), _canonical_bytes(payload), hashlib.sha256) - return digest.hexdigest() + sig_bytes = _privkey().sign(_canonical_bytes(payload)) + return sig_bytes.hex() -def verify(payload: dict[str, Any], signature: str) -> bool: - """Constant-time compare between the recomputed HMAC and *signature*. - - Returns ``True`` on a match. 
Uses :func:`hmac.compare_digest` so a - timing oracle can't be used to recover the secret one byte at a - time — overkill for honor-system DRM, but free. - """ - expected = sign(payload) - return hmac.compare_digest(expected.encode("ascii"), signature.encode("ascii")) +def verify(payload: dict[str, Any], signature_hex: str) -> bool: + """Verify *signature_hex* against *payload*. Returns True/False and never + raises on a bad signature (missing, ``None``, or malformed hex is just False).""" + try: + sig_bytes = bytes.fromhex(signature_hex) + except (TypeError, ValueError): + return False + try: + _pubkey().verify(sig_bytes, _canonical_bytes(payload)) + return True + except InvalidSignature: + return False # --------------------------------------------------------------------------- # Blob encoding / decoding # --------------------------------------------------------------------------- -# A "license blob" is the artifact the buyer pastes into the activation -# form. It's a base64-encoded JSON dict containing every license field -# *plus* the signature. We choose base64 over raw JSON so the blob is -# one paste-able token (no whitespace surprises) and so a typo -# truncates the blob into an obviously-invalid form rather than a -# subtly-mutated payload. - -_BLOB_PREFIX = "DTLIC1:" +# Buyers paste this whole token into the activation page. The prefix +# is the version marker: +# DTLIC1 — old HMAC scheme (no longer accepted) +# DTLIC2 — Ed25519 (current) +_BLOB_PREFIX = "DTLIC2:" +_OLD_PREFIX = "DTLIC1:" def encode_blob(payload_with_signature: dict[str, Any]) -> str: @@ -92,10 +157,15 @@ def encode_blob(payload_with_signature: dict[str, Any]) -> str: def decode_blob(blob: str) -> dict[str, Any]: """Reverse of :func:`encode_blob`. 
Raises ``ValueError`` on a - blob that doesn't carry the expected prefix or doesn't decode - cleanly — both surface as :class:`InvalidLicenseError` at the - manager layer.""" + blob that doesn't carry the expected prefix, doesn't decode + cleanly, or carries the v1 prefix (which we no longer accept).""" s = blob.strip() + if s.startswith(_OLD_PREFIX): + raise ValueError( + f"License blob is the old {_OLD_PREFIX!r} format. v1 blobs " + "used a symmetric secret that has since been retired — " + "request a new blob from support." + ) if not s.startswith(_BLOB_PREFIX): raise ValueError( f"License blob missing {_BLOB_PREFIX!r} prefix. " diff --git a/src/license/manager.py b/src/license/manager.py index ee6573d..8832575 100644 --- a/src/license/manager.py +++ b/src/license/manager.py @@ -6,6 +6,7 @@ constructor for full isolation. Lifecycle:: + assert_production_safe() # guard against build-config errors mgr = get_manager() if not mgr.is_activated(): mgr.activate_from_blob(blob, name, email) @@ -17,6 +18,7 @@ from __future__ import annotations import os import re +import sys import uuid from dataclasses import dataclass from datetime import datetime, timezone @@ -468,3 +470,69 @@ def current_state() -> LicenseState: def require_feature(feature: str | FeatureFlag) -> License: return get_manager().require_feature(feature) + + +# --------------------------------------------------------------------------- +# Production-build sanity check +# --------------------------------------------------------------------------- + +class ProductionBuildError(RuntimeError): + """Raised when a frozen / shipped build is misconfigured in a way + that would defeat licensing. Always loud, always fatal — the + binary must not boot in this state.""" + + +def _is_shipped_build() -> bool: + """True when running from a PyInstaller bundle (``sys.frozen``). + + Set automatically by PyInstaller; not set in source / pytest + runs. 
The whole purpose of the prod-safe check is to enforce + invariants that only matter in a shipped build, so the rest of + the codebase can stay flexible. + """ + return getattr(sys, "frozen", False) + + +def assert_production_safe() -> None: + """Fail loudly if a shipped build is misconfigured. + + Two tripwires: + + 1. ``DATATOOLS_DEV_MODE`` is set in a frozen build. The dev-mode + env var unconditionally bypasses license verification — if a + buyer's installer somehow ships it enabled (build pipeline + bug, mis-set environment), every license check is a no-op. + Refuse to start instead. + + 2. The active verification key is still the dev key. The build + pipeline is supposed to override + ``DATATOOLS_LICENSE_PUBKEY`` with the production key; if it + didn't, the binary will reject every legitimate license + (signed with the prod private key) AND would *accept* + anything signed with the dev key (which is checked into the + source tree). Refuse to start. + + No-ops in non-frozen runs (development, tests) so the dev key + + dev mode keep working in those contexts. Production builds + call this from :func:`src.cli_license_guard.guard` and + :func:`src.gui.components.hide_streamlit_chrome`. + """ + if not _is_shipped_build(): + return + + if _truthy_env("DATATOOLS_DEV_MODE"): + raise ProductionBuildError( + "DATATOOLS_DEV_MODE is set in a shipped build. This env " + "var disables every license check and must never be set " + "on a buyer machine. If you see this message in a release " + "build, the install was misconfigured — contact support." + ) + + if crypto.is_using_dev_key(): + raise ProductionBuildError( + "Shipped build is verifying against the development " + "license key. The build pipeline must set " + "DATATOOLS_LICENSE_PUBKEY to the production public key " + "before packaging. This binary will reject every real " + "license blob — re-download from the official channel." 
+ ) diff --git a/tests/test_license.py b/tests/test_license.py index f235e62..77672ec 100644 --- a/tests/test_license.py +++ b/tests/test_license.py @@ -35,9 +35,9 @@ from src.license import ( UnsupportedFeatureError, ) from src.license.crypto import ( - _DEFAULT_SECRET, decode_blob, encode_blob, + is_using_dev_key, sign, verify, ) @@ -138,13 +138,26 @@ class TestSignAndVerify: bad = sig[:-1] + ("0" if sig[-1] != "0" else "1") assert verify(payload, bad) is False - def test_sign_respects_secret_env_override(self, monkeypatch): + def test_sign_respects_privkey_env_override(self, monkeypatch): + # Use a different valid Ed25519 private key (32 bytes hex). + # Picked arbitrarily; doesn't need to match the dev key. + alt_priv = "00" * 32 payload = {"a": 1} - monkeypatch.setenv("DATATOOLS_LICENSE_SECRET", "alternate") - alt = sign(payload) - monkeypatch.delenv("DATATOOLS_LICENSE_SECRET", raising=False) - default = sign(payload) - assert alt != default + monkeypatch.setenv("DATATOOLS_LICENSE_PRIVKEY", alt_priv) + alt_sig = sign(payload) + monkeypatch.delenv("DATATOOLS_LICENSE_PRIVKEY", raising=False) + default_sig = sign(payload) + assert alt_sig != default_sig + + def test_verify_with_wrong_pubkey_returns_false(self, monkeypatch): + # Sign with the dev key (default), then swap the pubkey and + # confirm verification fails. + payload = {"a": 1} + sig = sign(payload) + # 32-byte hex that isn't the matching dev pubkey. + wrong_pub = "11" * 32 + monkeypatch.setenv("DATATOOLS_LICENSE_PUBKEY", wrong_pub) + assert verify(payload, sig) is False def test_canonical_form_is_key_order_invariant(self): a = {"x": 1, "y": 2} @@ -159,17 +172,26 @@ class TestBlobEncodeDecode: again = decode_blob(blob) assert again == payload - def test_blob_has_human_readable_prefix(self): + def test_blob_uses_v2_prefix(self): + """v1.6 switched HMAC → Ed25519; blob version bumped to DTLIC2. 
+ Pin the prefix so any future scheme change is intentional.""" blob = encode_blob({"x": 1}) - assert blob.startswith("DTLIC1:") + assert blob.startswith("DTLIC2:") def test_decode_rejects_missing_prefix(self): - with pytest.raises(ValueError, match="DTLIC1"): + with pytest.raises(ValueError, match="DTLIC2"): decode_blob("not-a-blob") + def test_decode_rejects_v1_blob_with_clear_message(self): + """A v1 (HMAC) blob must surface a clear 'old format' message + rather than 'signature mismatch' — buyers redeeming an old + delivery email need to know to request a new blob.""" + with pytest.raises(ValueError, match="DTLIC1"): + decode_blob("DTLIC1:eyJhIjogMX0=") + def test_decode_rejects_bad_base64(self): with pytest.raises(ValueError, match="base64"): - decode_blob("DTLIC1:!!!notbase64!!!") + decode_blob("DTLIC2:!!!notbase64!!!") def test_decode_rejects_truncated_blob(self): blob = encode_blob({"x": 1}) @@ -178,6 +200,72 @@ class TestBlobEncodeDecode: decode_blob(truncated) +class TestDevKeypair: + """The embedded dev keypair must match the seed phrase so anyone + reproducing the build gets the same values. Catches a hand-edit + to ``_dev_keypair.py`` that drifts the constants from the seed.""" + + def test_dev_keypair_matches_seed(self): + from src.license._dev_keypair import ( + DEV_PRIVATE_KEY_HEX, + DEV_PUBLIC_KEY_HEX, + _derive_from_seed, + ) + derived_priv, derived_pub = _derive_from_seed() + assert derived_priv == DEV_PRIVATE_KEY_HEX + assert derived_pub == DEV_PUBLIC_KEY_HEX + + def test_is_using_dev_key_true_by_default(self, monkeypatch): + monkeypatch.delenv("DATATOOLS_LICENSE_PUBKEY", raising=False) + assert is_using_dev_key() is True + + def test_is_using_dev_key_false_when_overridden(self, monkeypatch): + monkeypatch.setenv("DATATOOLS_LICENSE_PUBKEY", "22" * 32) + assert is_using_dev_key() is False + + +class TestProductionSafe: + """``assert_production_safe`` is a tripwire that fires only in + frozen / shipped builds. 
Tests simulate the frozen state via + monkeypatching ``sys.frozen``.""" + + def test_no_op_in_source_run(self): + # Default test run: sys.frozen is unset; nothing should raise. + from src.license import assert_production_safe + assert_production_safe() # no exception + + def test_raises_on_dev_mode_in_frozen_build(self, monkeypatch): + from src.license import ( + ProductionBuildError, + assert_production_safe, + ) + monkeypatch.setattr("sys.frozen", True, raising=False) + monkeypatch.setenv("DATATOOLS_DEV_MODE", "1") + # Override pubkey so the dev-key check doesn't fire first. + monkeypatch.setenv("DATATOOLS_LICENSE_PUBKEY", "22" * 32) + with pytest.raises(ProductionBuildError, match="DATATOOLS_DEV_MODE"): + assert_production_safe() + + def test_raises_on_dev_key_in_frozen_build(self, monkeypatch): + from src.license import ( + ProductionBuildError, + assert_production_safe, + ) + monkeypatch.setattr("sys.frozen", True, raising=False) + monkeypatch.delenv("DATATOOLS_DEV_MODE", raising=False) + monkeypatch.delenv("DATATOOLS_LICENSE_PUBKEY", raising=False) + with pytest.raises(ProductionBuildError, match="development license key"): + assert_production_safe() + + def test_passes_in_frozen_build_with_prod_pubkey(self, monkeypatch): + from src.license import assert_production_safe + monkeypatch.setattr("sys.frozen", True, raising=False) + monkeypatch.delenv("DATATOOLS_DEV_MODE", raising=False) + monkeypatch.setenv("DATATOOLS_LICENSE_PUBKEY", "22" * 32) + # Should not raise. + assert_production_safe() + + # --------------------------------------------------------------------------- # Features # ---------------------------------------------------------------------------