sec(license): Ed25519 sigs + production-safe tripwire

Two coupled hardening upgrades.

1. Asymmetric signatures (HMAC → Ed25519)

The previous HMAC scheme used a symmetric secret that any motivated
reverse engineer could pull out of the shipped binary and use to
mint blobs for any tier / name / email. With Ed25519, the binary
ships only the public verification key; the signing key never
leaves the seller's environment, so binary compromise no longer
yields forgery.

- src/license/crypto.py rewritten around
  cryptography.hazmat.primitives.asymmetric.ed25519. Same public
  API surface (sign/verify/encode_blob/decode_blob), same canonical
  JSON encoding — drop-in for the manager / cli / GUI layers.
- DATATOOLS_LICENSE_PRIVKEY (seller-side) and
  DATATOOLS_LICENSE_PUBKEY (build-time) env vars supply the keys;
  the in-source dev keypair (src/license/_dev_keypair.py)
  deterministically derives from a seed phrase for repro builds and
  tests.
- Blob prefix bumped DTLIC1: → DTLIC2:. Decoding a DTLIC1 blob
  surfaces a clear "old format" error rather than a confusing
  signature mismatch.
- scripts/generate_keypair.py mints fresh production keypairs for
  the seller (run once, stash the private key offline). Adds
  cryptography>=41,<46 to requirements.txt (was an undeclared
  transitive dep).
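
As a rough illustration of the sign/verify split described in item 1 (not the actual contents of src/license/crypto.py), a minimal sketch built on the same `cryptography` Ed25519 primitives might look like the following. The explicit key-object parameters and the payload fields are assumptions made for the example.

```python
import json
from typing import Any

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)


def canonical_bytes(payload: dict[str, Any]) -> bytes:
    """Deterministic JSON encoding used as the signature input."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(priv: Ed25519PrivateKey, payload: dict[str, Any]) -> str:
    """Seller side: hex-encoded Ed25519 signature over the payload."""
    return priv.sign(canonical_bytes(payload)).hex()


def verify(pub: Ed25519PublicKey, payload: dict[str, Any], signature_hex: str) -> bool:
    """Buyer side: True only when the signature matches this payload and key."""
    try:
        pub.verify(bytes.fromhex(signature_hex), canonical_bytes(payload))
    except (InvalidSignature, ValueError, TypeError):
        # Bad signature, malformed hex, or a non-string value all count as "no".
        return False
    return True


# Hypothetical keys and payload; the field names are illustrative only.
seller_key = Ed25519PrivateKey.generate()   # stays with the seller
shipped_pub = seller_key.public_key()       # the only key a binary would carry
payload = {"name": "Ada", "email": "ada@example.com", "tier": "lite"}
sig = sign(seller_key, payload)

ok = verify(shipped_pub, payload, sig)
tampered = verify(shipped_pub, {**payload, "tier": "full"}, sig)
wrong_key = verify(Ed25519PrivateKey.generate().public_key(), payload, sig)
print(ok, tampered, wrong_key)  # True False False
```

Holding only `shipped_pub`, a caller can check signatures but has nothing to pass to `sign`, which is the property the upgrade relies on.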

2. Production-safe tripwire

assert_production_safe() refuses to boot a frozen / shipped build
when either:

- DATATOOLS_DEV_MODE=1 is set (would unconditionally bypass every
  license check — fine in source/test but catastrophic in a buyer
  install).
- The active verification key is still the embedded dev key (the
  build pipeline forgot to set DATATOOLS_LICENSE_PUBKEY).

No-op in source / pytest runs (sys.frozen is unset) so test
fixtures and dev workflows keep working without ceremony. Called
from src/cli_license_guard.guard() and from hide_streamlit_chrome
— so it fires on every CLI invocation and every GUI page load.

Tests: 49 license-layer unit tests (was 40); added Ed25519
wrong-key rejection, dev-keypair seed pin, blob v2 prefix, v1
rejection with clear message, and four production-safe scenarios
(no-op in source, fires on DEV_MODE in frozen, fires on dev key in
frozen, passes in frozen with prod pubkey). Total: 2024 → 2033.

Docs (REQUIREMENTS §17a, DEVELOPER licensing recipe, DECISIONS
§9b + decision log) updated with the new threat-model write-up,
key-storage workflow, and tripwire behaviour.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 17:34:48 +00:00
parent d32b58e61a
commit e534fb4989
12 changed files with 549 additions and 75 deletions

@@ -178,6 +178,8 @@ $49-79/bundle · $149 full suite (when 3+ exist).
 | May 13 (v1.6) | Ship licensing: 1-year HMAC-signed blobs, name+email registration, offline verification, tier-scaffolded for future SKUs | Unlock the lifetime-update business model without recurring infra. Honor-system DRM (HMAC + 30-day refund) — sufficient at $49. See §9b below. |
 | May 13 (v1.6) | Add Lite SKU (Dedup + Text Cleaner + Format Standardizer) | Lower-priced entry point for buyers who only need the three universal tools. Per-tool feature gating + lock badges on the home grid surface the upgrade path. See §9b. |
 | May 13 (v1.6) | Remove user-facing free trial | A 1-year all-features trial undercut the paid Lite SKU. Paid-only keeps tier economics clean. Internal ``_mint`` API still exists for tests and the seller's key generator. See §9b. |
+| May 13 (v1.6) | Upgrade license crypto: HMAC → Ed25519 (asymmetric) | HMAC's symmetric secret was extractable from the shipped binary — anyone with the binary could mint blobs. Ed25519 splits sign (seller) from verify (binary), so binary compromise doesn't let an attacker forge licenses. Blob prefix bumped DTLIC1 → DTLIC2. See §9b. |
+| May 13 (v1.6) | Add ``assert_production_safe`` tripwire | A shipped build with ``DATATOOLS_DEV_MODE=1`` or the in-source dev pubkey would silently defeat licensing. The tripwire refuses to boot such a build. No-op in source / pytest runs. See §9b. |
 
 ## 9b. Licensing model
@@ -191,7 +193,13 @@ $49-79/bundle · $149 full suite (when 3+ exist).
 | Time-bombed binary (PyInstaller --no-license) | Rejected. Can't deliver renewals without re-shipping the installer. |
 | Hardware-locked license | Rejected. Friction on legitimate device-swaps; doesn't match the buyer persona's tolerance. |
 
-**Threat model**: a motivated reverse engineer can pull the HMAC secret out of the binary, mint their own licenses, and bypass the check. That's acceptable — the goal is to discourage casual blob-sharing among non-technical buyers, not stop targeted piracy. The 30-day refund window covers the same gap from a different angle (anyone who shares their blob is implicitly authorizing the buyer to issue them a refund-on-demand).
+**Threat model** (v1.6 — Ed25519): the binary ships only the public key. A motivated reverse engineer who pulls everything out of the binary has the verification key but not the signing key — they can't mint new licenses. The earlier HMAC scheme had this hole; the asymmetric upgrade closes it. The remaining attack surface is:
+
+- Re-signing with a forked binary that ships an attacker-controlled pubkey + auto-grants licenses. Costs more effort than the price of a legitimate copy and the result is per-fork, not shareable.
+- Hooking the verification call to always return True. Defeats DRM entirely but only on the attacker's own machine — they could just write down "I unlocked DataTools" and skip the work.
+- Setting ``DATATOOLS_DEV_MODE=1`` to bypass checks. **Refused in shipped builds** by ``assert_production_safe``; works in source/test runs only.
+
+The 30-day refund window covers casual blob sharing from a different angle (anyone who shares their blob is implicitly authorizing the buyer to issue them a refund-on-demand).
 
 **What's enforced**:
 - License blob signature must match (HMAC-SHA256 with the build secret).
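
The hole that the threat-model paragraph attributes to the earlier HMAC scheme can be illustrated with the standard library alone. This is a sketch under assumptions: the secret value and payload fields are made up, and the helper names are not taken from the repository.

```python
import hashlib
import hmac
import json
from typing import Any


def canonical_bytes(payload: dict[str, Any]) -> bytes:
    """Deterministic JSON encoding used as the MAC input."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def hmac_sign(secret: bytes, payload: dict[str, Any]) -> str:
    """Signing and verifying both need the same secret."""
    return hmac.new(secret, canonical_bytes(payload), hashlib.sha256).hexdigest()


def hmac_verify(secret: bytes, payload: dict[str, Any], signature: str) -> bool:
    return hmac.compare_digest(hmac_sign(secret, payload), signature)


# To verify offline, the shipped binary has to contain the secret
# (hypothetical value shown here).
embedded_secret = b"hypothetical-secret-recovered-from-the-binary"

# Whoever extracts that secret can sign a payload for any tier.
forged = {"name": "Mallory", "email": "m@example.com", "tier": "full"}
forged_sig = hmac_sign(embedded_secret, forged)
accepted = hmac_verify(embedded_secret, forged, forged_sig)
print(accepted)  # True
```

With Ed25519 the value embedded in the binary can only verify, so the equivalent of `hmac_sign` is unavailable to someone who has only the binary.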

@@ -143,9 +143,35 @@ require_feature(FeatureFlag.DEDUPLICATOR)
``` ```
 **Storage**: ``~/.datatools/license.json`` (override via
-``DATATOOLS_LICENSE_PATH``). Signed locally with HMAC-SHA256 using a
-secret read from ``DATATOOLS_LICENSE_SECRET`` (build-time replace; the
-in-repo default is a development placeholder).
+``DATATOOLS_LICENSE_PATH``). Signed with Ed25519 (asymmetric) — the
+seller's private key signs; the buyer's binary verifies with the
+embedded public key.
+
+**Key material**:
+
+| Variable | Who has it | Where it's used |
+|---|---|---|
+| ``DATATOOLS_LICENSE_PRIVKEY`` | Seller only | ``scripts/generate_license.py`` (mint a buyer's blob), ``scripts/generate_keypair.py`` writes a fresh one |
+| ``DATATOOLS_LICENSE_PUBKEY`` | Every shipped binary | Verification at activation time; set at build time via PyInstaller env |
+
+If neither env var is set, ``src.license.crypto`` falls back to the
+deterministic dev keypair in ``src/license/_dev_keypair.py``. The
+dev key is in source on purpose (so tests work without secrets),
+but a frozen build that's still using it is a build-config bug —
+:func:`assert_production_safe` refuses to start such a binary.
+
+**First-time setup for shipped builds**:
+
+1. ``python scripts/generate_keypair.py --output prod-keys.env`` —
+   creates a fresh keypair.
+2. Stash ``DATATOOLS_LICENSE_PRIVKEY`` somewhere safe (password
+   manager / KMS). Lose it and you can't issue renewals without
+   reshipping a new build with a new public key.
+3. Configure the PyInstaller build env with
+   ``DATATOOLS_LICENSE_PUBKEY=<hex>`` so the shipped binary
+   verifies against the production key.
+4. Mint buyer licenses with
+   ``DATATOOLS_LICENSE_PRIVKEY=<hex> python scripts/generate_license.py ...``.
 
 **Dev bypass**: ``DATATOOLS_DEV_MODE=1`` short-circuits every check.
 The test suite's autouse fixture sets this so existing tests don't
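
Before a value from the generated env file is pasted into a build pipeline, it can be worth confirming that it really is 32 raw bytes in hex. The sketch below assumes the `KEY=VALUE` file layout that `scripts/generate_keypair.py` writes; `load_env_file` and `parse_key_hex` are hypothetical helpers, not functions from the repository.

```python
def load_env_file(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            values[key.strip()] = value.strip()
    return values


def parse_key_hex(value: str, *, name: str) -> bytes:
    """Decode a hex-encoded raw Ed25519 key and check that it is 32 bytes."""
    try:
        raw = bytes.fromhex(value.strip())
    except ValueError as exc:
        raise ValueError(f"{name} is not valid hex") from exc
    if len(raw) != 32:
        raise ValueError(f"{name} must be 64 hex chars (32 bytes), got {len(raw)} bytes")
    return raw


# Placeholder key material, not a real key.
env_text = "# generated keypair\n\nDATATOOLS_LICENSE_PUBKEY=" + "ab" * 32 + "\n"
parsed = load_env_file(env_text)
pub_raw = parse_key_hex(parsed["DATATOOLS_LICENSE_PUBKEY"], name="DATATOOLS_LICENSE_PUBKEY")
print(len(pub_raw))  # 32
```

A check like this catches a truncated paste early, before it surfaces as a confusing verification failure in a shipped build.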

@@ -174,10 +174,11 @@ and proceeds.
 - **Dev**: pytest, tox.
 
 ## 16. Test coverage
-- 2,024 tests passing, 0 skipped, 0 xfailed.
-- 1,859 core + CLI tests (run with `pytest -m 'not gui'` for a quick loop).
-  Includes 40 license-layer unit tests, 25 license-CLI tests, and
-  17 Lite-tier feature-map + guard tests.
+- 2,033 tests passing, 0 skipped, 0 xfailed.
+- 1,868 core + CLI tests (run with `pytest -m 'not gui'` for a quick loop).
+  Includes 49 license-layer unit tests (Ed25519 sign/verify, dev-key
+  derivation, production-safe tripwire, schema), 25 license-CLI
+  tests, and 17 Lite-tier feature-map + guard tests.
 - 165 GUI tests under `tests/gui/` driving Streamlit pages via `AppTest`
   (smoke + EN/ES localization, chrome, gate, workflows, dedup review,
   advanced panels, error paths, findings panel, activation +
@@ -194,8 +195,14 @@ and proceeds.
 ## 17a. Licensing
 
 - **Storage**: ``~/.datatools/license.json`` (or
-  ``$DATATOOLS_LICENSE_PATH`` override). Signed locally with
-  HMAC-SHA256.
+  ``$DATATOOLS_LICENSE_PATH`` override). Signed with Ed25519
+  (asymmetric).
+- **Crypto**: Ed25519. The seller holds the private key; every
+  shipped binary embeds only the public key. A motivated reverse
+  engineer who pulls everything out of the binary still can't sign
+  new licenses. Keys are 32 bytes raw, exposed as hex via
+  ``DATATOOLS_LICENSE_PRIVKEY`` (seller-side) and
+  ``DATATOOLS_LICENSE_PUBKEY`` (build-time bake-in).
 - **Activation**: buyer pastes a base64-encoded license blob
   (``DTLIC1:...``) on first launch; app verifies the signature
   offline + matches the buyer-entered name/email to the embedded
@@ -226,10 +233,17 @@ and proceeds.
 - **Lock badge**: the home grid shows a red 🔒 Locked pill on tool
   cards the current tier doesn't unlock.
 - **Dev bypass**: ``DATATOOLS_DEV_MODE=1`` skips every check (used by
-  the test suite and during development).
+  the test suite and during development). **Refused in shipped
+  builds** by the production-safe tripwire.
+- **Production-safe tripwire**: ``assert_production_safe()`` runs at
+  startup in every frozen build. Refuses to boot when ``DEV_MODE``
+  is set or the verification key is still the embedded dev key
+  (i.e., the build pipeline forgot to override
+  ``DATATOOLS_LICENSE_PUBKEY``). No-op in source / pytest runs.
 - **No internet**: signature verification is fully offline. The
-  shipped binary embeds the verification secret; see
-  ``docs/DECISIONS.md`` for the threat-model discussion.
+  shipped binary embeds only the public key; the private key never
+  leaves the seller. See ``docs/DECISIONS.md`` for the threat-model
+  discussion.
 
 ## 18. Error handling
 - Structured hierarchy: `DataToolsError` → `InputValidationError`, `ConfigError`, `FileFormatError`, `FileAccessError`.

@@ -8,3 +8,4 @@ tqdm>=4.66,<5
 typer>=0.12,<1
 phonenumbers>=8.13,<9
 streamlit>=1.35,<2
+cryptography>=41,<46

scripts/generate_keypair.py (new file, 106 lines)

@@ -0,0 +1,106 @@
+#!/usr/bin/env python3
+"""Generate a fresh Ed25519 keypair for production license signing.
+
+**Creator-only.** Run once, write the private key somewhere safe,
+configure the build pipeline with the public key.
+
+Usage::
+
+    python scripts/generate_keypair.py
+    python scripts/generate_keypair.py --json
+    python scripts/generate_keypair.py --output keys.txt
+
+The output looks like::
+
+    DATATOOLS_LICENSE_PRIVKEY=<64 hex chars> # KEEP SECRET
+    DATATOOLS_LICENSE_PUBKEY=<64 hex chars> # BAKE INTO BUILD
+
+The private key never goes near the buyer-facing binary. Stash it in
+a password manager / KMS / hardware token; the only places it gets
+loaded are:
+
+- ``scripts/generate_license.py`` when minting a buyer's blob
+- Your CI's signing step, if you've automated blob minting
+
+The public key gets set as ``DATATOOLS_LICENSE_PUBKEY`` in the
+PyInstaller build env (so the shipped binary verifies against it),
+and the production-safe runtime check refuses to start any frozen
+build that's still using the in-source dev key.
+"""
+
+from __future__ import annotations
+
+import argparse
+import json
+import sys
+from pathlib import Path
+
+from cryptography.hazmat.primitives import serialization
+from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
+
+
+def generate() -> tuple[str, str]:
+    """Return ``(private_hex, public_hex)`` for a fresh keypair."""
+    priv = Ed25519PrivateKey.generate()
+    priv_hex = priv.private_bytes(
+        encoding=serialization.Encoding.Raw,
+        format=serialization.PrivateFormat.Raw,
+        encryption_algorithm=serialization.NoEncryption(),
+    ).hex()
+    pub_hex = priv.public_key().public_bytes(
+        encoding=serialization.Encoding.Raw,
+        format=serialization.PublicFormat.Raw,
+    ).hex()
+    return priv_hex, pub_hex
+
+
+def main(argv: list[str] | None = None) -> int:
+    p = argparse.ArgumentParser(description=__doc__.splitlines()[0])
+    p.add_argument("--json", action="store_true", help="Emit JSON instead of env-file format.")
+    p.add_argument("--output", "-o", type=Path, default=None, help="Write to this file instead of stdout.")
+    args = p.parse_args(argv)
+
+    priv_hex, pub_hex = generate()
+
+    if args.json:
+        payload = json.dumps(
+            {"private_key": priv_hex, "public_key": pub_hex},
+            indent=2,
+        )
+    else:
+        payload = (
+            f"# DataTools license keypair — generated by generate_keypair.py\n"
+            f"# KEEP THE PRIVATE KEY SECRET. Lose it and your existing\n"
+            f"# licenses can't be renewed (you'd have to ship a new build\n"
+            f"# with a new public key and re-issue every active license).\n"
+            f"\n"
+            f"DATATOOLS_LICENSE_PRIVKEY={priv_hex}\n"
+            f"DATATOOLS_LICENSE_PUBKEY={pub_hex}\n"
+        )
+
+    if args.output:
+        args.output.write_text(payload + "\n", encoding="utf-8")
+        # chmod 600 — best-effort; ignored on Windows.
+        try:
+            args.output.chmod(0o600)
+        except OSError:
+            pass
+        print(f"Wrote {args.output} (mode 600)", file=sys.stderr)
+    else:
+        print(payload)
+
+    print(
+        "\nNext steps:\n"
+        " 1. Store the private key in your password manager.\n"
+        " 2. Bake the public key into the PyInstaller build:\n"
+        " DATATOOLS_LICENSE_PUBKEY=<pubkey> pyinstaller ...\n"
+        " 3. Mint buyer licenses by setting the private key:\n"
+        " DATATOOLS_LICENSE_PRIVKEY=<privkey> "
+        "python scripts/generate_license.py --name 'Buyer' --email b@x.com\n",
+        file=sys.stderr,
+    )
+    return 0
+
+
+if __name__ == "__main__":
+    sys.exit(main())
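
The script above writes the key file first and tightens its permissions afterwards. One possible alternative, shown only as a sketch and not as what the commit does, is to create the file with owner-only permissions from the start. `write_secret_file` is a hypothetical helper name.

```python
import os
import stat
import tempfile
from pathlib import Path


def write_secret_file(path: Path, text: str) -> None:
    """Create *path* with owner-only permissions, then write *text* to it."""
    # The mode argument applies only when the file is newly created.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        try:
            # Tighten a pre-existing file before any secret bytes are written.
            os.chmod(path, 0o600)
        except OSError:
            pass  # best-effort, e.g. on filesystems without POSIX modes
        fh.write(text)


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "keys.env"
    write_secret_file(target, "DATATOOLS_LICENSE_PRIVKEY=<hex>\n")
    mode = stat.S_IMODE(target.stat().st_mode)
    content = target.read_text(encoding="utf-8")

print(oct(mode))
```

On POSIX systems this avoids the short window in which the private key sits in a file with default permissions; on Windows the mode bits have little effect either way.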

@@ -53,9 +53,14 @@ def guard(feature: str | None = None) -> None:
         InvalidLicenseError,
         LicenseError,
         UnsupportedFeatureError,
+        assert_production_safe,
         get_manager,
     )
 
+    # Refuse to run a misconfigured shipped build. No-op in
+    # development / pytest runs.
+    assert_production_safe()
+
     mgr = get_manager()
     if mgr.dev_mode:
         return

@@ -89,6 +89,12 @@ def hide_streamlit_chrome(*, gate_license: bool = True) -> None:
     can render its own form without recursion.
     """
     st.markdown(_HIDE_CHROME_CSS, unsafe_allow_html=True)
+    # Production-safe check runs first so a misconfigured shipped
+    # build refuses to render anything (rather than rendering a
+    # broken activation form that doesn't accept real blobs).
+    # No-op in source / pytest runs.
+    from src.license import assert_production_safe
+    assert_production_safe()
     # Imported lazily so this module stays importable in environments
     # where the i18n packs haven't been laid out (e.g. unit tests of
     # individual legacy helpers).

@@ -34,12 +34,21 @@ from .errors import (
     UnsupportedFeatureError,
 )
 from .features import FEATURES_BY_TIER, all_features_for_tier
-from .manager import LicenseManager, current_state, get_manager, require_feature
+from .manager import (
+    LicenseManager,
+    ProductionBuildError,
+    assert_production_safe,
+    current_state,
+    get_manager,
+    require_feature,
+)
 from .schema import FeatureFlag, License, Tier
 
 __all__ = [
     # Manager
     "LicenseManager",
+    "ProductionBuildError",
+    "assert_production_safe",
     "current_state",
     "get_manager",
     "require_feature",

@@ -0,0 +1,73 @@
+"""**Development-only** Ed25519 keypair embedded in the source tree.
+
+This pair lets developers run / test / sign locally without needing
+the production private key. Both values are deterministic from a
+seed string (``hashlib.sha256(SEED).digest()``) so any contributor
+checking out the source gets the same keys — which is exactly what
+makes this keypair unsafe for production.
+
+============================================================
+DO NOT SHIP THIS KEYPAIR.
+============================================================
+
+For shipped builds:
+
+1. Run ``scripts/generate_keypair.py`` to produce a fresh production
+   keypair.
+2. Stash the **private** key in your password manager / KMS.
+3. In the PyInstaller build pipeline, set the env var
+   ``DATATOOLS_LICENSE_PUBKEY=<production-pubkey-hex>`` so the
+   shipped binary verifies against the production key, not this dev
+   key.
+4. The production-safe runtime check (``assert_production_safe``)
+   refuses to start a frozen build that's still verifying against
+   this dev key — that's the tripwire that catches a missing build
+   step.
+
+The matching seed phrase below is in source on purpose; rotating
+the dev key means changing it here AND regenerating every test
+fixture that hard-codes a blob. The seed includes the words
+"DEV-seed-NOT-FOR-PRODUCTION" specifically so a string-grep against
+a shipped binary would flag a missing build override immediately.
+"""
+
+from __future__ import annotations
+
+import hashlib
+
+# The seed phrase. Hashed to 32 bytes → Ed25519 private-key seed.
+DEV_SEED_PHRASE: bytes = (
+    b"datatools-license-v2-DEV-seed-NOT-FOR-PRODUCTION"
+)
+
+# Derived constants. Hard-coded (precomputed from the seed) so that
+# importing this module does no crypto work;
+# ``test_dev_keypair_matches_seed`` in ``tests/test_license.py``
+# pins them to the seed.
+DEV_PRIVATE_KEY_HEX: str = (
+    "0bdc196f098b84ed155bacbd00061d4fff2cb68e10109f94332f1fc7de194cdb"
+)
+DEV_PUBLIC_KEY_HEX: str = (
+    "1cbef16b7826dd364ac0c7187d42c2ee00d76486e42389db05efa45dd1ade78a"
+)
+
+
+def _derive_from_seed() -> tuple[str, str]:
+    """Re-derive the dev keypair from the seed phrase. Used by the
+    unit test that pins the constants above to the seed."""
+    from cryptography.hazmat.primitives.asymmetric.ed25519 import (
+        Ed25519PrivateKey,
+    )
+    from cryptography.hazmat.primitives import serialization
+
+    seed = hashlib.sha256(DEV_SEED_PHRASE).digest()
+    priv = Ed25519PrivateKey.from_private_bytes(seed)
+    priv_hex = priv.private_bytes(
+        encoding=serialization.Encoding.Raw,
+        format=serialization.PrivateFormat.Raw,
+        encryption_algorithm=serialization.NoEncryption(),
+    ).hex()
+    pub_hex = priv.public_key().public_bytes(
+        encoding=serialization.Encoding.Raw,
+        format=serialization.PublicFormat.Raw,
+    ).hex()
+    return priv_hex, pub_hex

@@ -1,85 +1,150 @@
-"""HMAC sign/verify for license blobs.
+"""Ed25519 sign/verify for license blobs.
 
-The signing secret is read from ``$DATATOOLS_LICENSE_SECRET`` if
-present, otherwise from the build-time constant below. Replace the
-constant at build time (via PyInstaller hook or a sed step in the
-build pipeline) so the shipped binary has a different secret from
-this repo's source tree.
+Asymmetric model:
 
-Threat model: honor-system DRM. A motivated reverse engineer can pull
-the secret out of the binary, sign their own licenses, and bypass the
-check. That's expected for $49 desktop software — the goal is to
-discourage casual sharing, not stop targeted piracy. The 30-day
-refund policy and the personal-name embedded in every license cover
-the same gap from a different angle.
+- **Private key** (32 bytes) lives with the seller only. It signs the
+  buyer's name/email/tier/etc into a license blob via
+  ``scripts/generate_license.py``.
+- **Public key** (32 bytes) is embedded in every shipped binary. The
+  binary uses it to verify blobs at activation time.
+
+The split means a motivated reverse engineer who pulls everything out
+of the binary still can't sign new licenses — they'd need the private
+key, which never leaves the seller's environment. This is the key
+upgrade vs. the v1 HMAC scheme: HMAC's symmetric secret was trivially
+extractable, so anyone with the binary could mint blobs for any tier.
+
+Keys come from (in priority order):
+
+1. ``$DATATOOLS_LICENSE_PRIVKEY`` / ``$DATATOOLS_LICENSE_PUBKEY`` —
+   hex-encoded raw bytes. The build pipeline sets the pubkey here.
+2. The dev-only constants in ``_dev_keypair.py`` — deterministic from
+   a seed, embedded in the source tree for local development and
+   testing. **Never** ship a binary that still uses these.
+
+A frozen / shipped build verifying against the dev key is a build
+configuration error — ``assert_production_safe`` (see
+``.manager``) fires loudly on startup in that case.
+
+Blob format: ``DTLIC2:`` + base64-encoded JSON. The version prefix
+bumped from ``DTLIC1`` to ``DTLIC2`` when we switched from HMAC to
+Ed25519, so old v1 blobs surface a clear "old format" error rather
+than a confusing "signature mismatch".
 """
 from __future__ import annotations
 
 import base64
-import hashlib
-import hmac
 import json
 import os
 from typing import Any
 
-# Build-time default. Replace via env var in shipped builds; keep this
-# constant non-empty so unit tests have a stable verification key.
-_DEFAULT_SECRET = (
-    "datatools-license-v1-development-secret-"
-    "replace-at-build-time-via-DATATOOLS_LICENSE_SECRET"
+from cryptography.exceptions import InvalidSignature
+from cryptography.hazmat.primitives.asymmetric.ed25519 import (
+    Ed25519PrivateKey,
+    Ed25519PublicKey,
 )
 
+from ._dev_keypair import DEV_PRIVATE_KEY_HEX, DEV_PUBLIC_KEY_HEX
 
-def _secret_bytes() -> bytes:
-    """Return the active HMAC secret as bytes."""
-    return os.environ.get("DATATOOLS_LICENSE_SECRET", _DEFAULT_SECRET).encode("utf-8")
+
+# ---------------------------------------------------------------------------
+# Key material
+# ---------------------------------------------------------------------------
+
+
+def _privkey_hex() -> str:
+    """Hex-encoded raw Ed25519 private-key bytes.
+
+    Read from ``$DATATOOLS_LICENSE_PRIVKEY`` first (where the seller
+    stashes their real key), falling back to the dev seed-derived
+    constant. The dev fallback only matters during testing /
+    development; a shipped build calling :func:`sign` is a bug (only
+    the seller's key-gen script does that).
+    """
+    return os.environ.get("DATATOOLS_LICENSE_PRIVKEY") or DEV_PRIVATE_KEY_HEX
+
+
+def _pubkey_hex() -> str:
+    """Hex-encoded raw Ed25519 public-key bytes.
+
+    Read from ``$DATATOOLS_LICENSE_PUBKEY`` first (the build
+    pipeline sets this), falling back to the dev key.
+    """
+    return os.environ.get("DATATOOLS_LICENSE_PUBKEY") or DEV_PUBLIC_KEY_HEX
+
+
+def _privkey() -> Ed25519PrivateKey:
+    return Ed25519PrivateKey.from_private_bytes(bytes.fromhex(_privkey_hex()))
+
+
+def _pubkey() -> Ed25519PublicKey:
+    return Ed25519PublicKey.from_public_bytes(bytes.fromhex(_pubkey_hex()))
+
+
+def is_using_dev_key() -> bool:
+    """True when the active **public** key matches the embedded dev key.
+
+    Used by :func:`.manager.assert_production_safe` to catch frozen
+    builds whose pubkey wasn't overridden at build time.
+    """
+    return _pubkey_hex() == DEV_PUBLIC_KEY_HEX
+
+
+# ---------------------------------------------------------------------------
+# Canonical encoding (shared with v1 — same bytes, same hash, same sig)
+# ---------------------------------------------------------------------------
+
 def _canonical_bytes(payload: dict[str, Any]) -> bytes:
-    """Canonical JSON encoding for the HMAC input.
+    """Canonical JSON encoding for the signature input.
 
     ``sort_keys=True`` + ``separators=(",", ":")`` produce a byte-for-
     byte deterministic representation across Python versions and OS
-    locales. Without that, two structurally-identical dicts could hash
-    to different signatures.
+    locales. Without that, two structurally-identical dicts could
+    produce different signatures.
     """
     return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
 
+# ---------------------------------------------------------------------------
+# Sign / verify
+# ---------------------------------------------------------------------------
+
+
 def sign(payload: dict[str, Any]) -> str:
-    """Compute the HMAC-SHA256 hex digest over *payload*.
+    """Produce an Ed25519 signature over *payload*, hex-encoded.
 
-    *payload* MUST NOT contain a ``signature`` key — that's the field
-    we're computing. The caller is responsible for stripping it.
+    Caller must strip any existing ``signature`` field — the function
+    signs whatever it's given, including a stale signature, which
+    would never verify because verify recomputes from a fresh
+    no-``signature`` canonical form.
     """
-    digest = hmac.new(_secret_bytes(), _canonical_bytes(payload), hashlib.sha256)
-    return digest.hexdigest()
+    sig_bytes = _privkey().sign(_canonical_bytes(payload))
+    return sig_bytes.hex()
 
 
-def verify(payload: dict[str, Any], signature: str) -> bool:
-    """Constant-time compare between the recomputed HMAC and *signature*.
-
-    Returns ``True`` on a match. Uses :func:`hmac.compare_digest` so a
-    timing oracle can't be used to recover the secret one byte at a
-    time — overkill for honor-system DRM, but free.
-    """
-    expected = sign(payload)
-    return hmac.compare_digest(expected.encode("ascii"), signature.encode("ascii"))
+def verify(payload: dict[str, Any], signature_hex: str) -> bool:
+    """Verify *signature_hex* against *payload*. Returns True/False;
+    never raises (a missing or malformed signature is just False)."""
+    try:
+        sig_bytes = bytes.fromhex(signature_hex)
+    except ValueError:
+        return False
+    try:
+        _pubkey().verify(sig_bytes, _canonical_bytes(payload))
+        return True
+    except InvalidSignature:
+        return False
 
 # ---------------------------------------------------------------------------
 # Blob encoding / decoding
 # ---------------------------------------------------------------------------
 
-# A "license blob" is the artifact the buyer pastes into the activation
-# form. It's a base64-encoded JSON dict containing every license field
-# *plus* the signature. We choose base64 over raw JSON so the blob is
-# one paste-able token (no whitespace surprises) and so a typo
-# truncates the blob into an obviously-invalid form rather than a
-# subtly-mutated payload.
-_BLOB_PREFIX = "DTLIC1:"
+# Buyers paste this whole token into the activation page. The prefix
+# is the version marker:
+#   DTLIC1 — old HMAC scheme (no longer accepted)
+#   DTLIC2 — Ed25519 (current)
+_BLOB_PREFIX = "DTLIC2:"
+_OLD_PREFIX = "DTLIC1:"
 
 
 def encode_blob(payload_with_signature: dict[str, Any]) -> str:
@@ -92,10 +157,15 @@ def encode_blob(payload_with_signature: dict[str, Any]) -> str:
 def decode_blob(blob: str) -> dict[str, Any]:
     """Reverse of :func:`encode_blob`. Raises ``ValueError`` on a
-    blob that doesn't carry the expected prefix or doesn't decode
-    cleanly — both surface as :class:`InvalidLicenseError` at the
-    manager layer."""
+    blob that doesn't carry the expected prefix, doesn't decode
+    cleanly, or carries the v1 prefix (which we no longer accept)."""
     s = blob.strip()
+    if s.startswith(_OLD_PREFIX):
+        raise ValueError(
+            f"License blob is the old {_OLD_PREFIX!r} format. v1 blobs "
+            "used a symmetric secret that has since been retired — "
+            "request a new blob from support."
+        )
     if not s.startswith(_BLOB_PREFIX):
         raise ValueError(
             f"License blob missing {_BLOB_PREFIX!r} prefix. "
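
The versioned-prefix handling in this hunk can be tried in isolation with a small stand-alone sketch. It assumes standard (not URL-safe) base64 and simplified error messages; the real `encode_blob` body is not shown in this diff, so the version below is a guess at the shape rather than a copy.

```python
import base64
import json
from typing import Any

BLOB_PREFIX = "DTLIC2:"
OLD_PREFIX = "DTLIC1:"


def encode_blob(payload: dict[str, Any]) -> str:
    """Prefix + base64 of the canonical JSON (base64 variant is an assumption)."""
    raw = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return BLOB_PREFIX + base64.b64encode(raw).decode("ascii")


def decode_blob(blob: str) -> dict[str, Any]:
    """Reverse of encode_blob; raises ValueError with a specific reason."""
    s = blob.strip()
    # Check the retired prefix first so v1 blobs get a targeted message.
    if s.startswith(OLD_PREFIX):
        raise ValueError(f"old {OLD_PREFIX!r} format; request a new blob")
    if not s.startswith(BLOB_PREFIX):
        raise ValueError(f"missing {BLOB_PREFIX!r} prefix")
    try:
        decoded = json.loads(base64.b64decode(s[len(BLOB_PREFIX):], validate=True))
    except ValueError as exc:  # covers bad base64, bad UTF-8, and bad JSON
        raise ValueError("blob body is not valid base64-encoded JSON") from exc
    if not isinstance(decoded, dict):
        raise ValueError("blob body must decode to a JSON object")
    return decoded


def describe(blob: str) -> str:
    """Small helper for the demo: report acceptance or the rejection reason."""
    try:
        decode_blob(blob)
    except ValueError as exc:
        return f"rejected: {exc}"
    return "ok"


payload = {"name": "Ada", "tier": "lite"}  # illustrative fields only
blob = encode_blob(payload)
print(describe(blob))
print(describe("DTLIC1:abcd"))
print(describe("not-a-license"))
```

Checking the old prefix before the current one is what turns a v1 blob into an explicit "old format" message instead of a generic prefix or signature failure.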

@@ -6,6 +6,7 @@ constructor for full isolation.
 Lifecycle::
 
+    assert_production_safe()  # guard against build-config errors
     mgr = get_manager()
     if not mgr.is_activated():
         mgr.activate_from_blob(blob, name, email)
@@ -17,6 +18,7 @@ from __future__ import annotations
 import os
 import re
+import sys
 import uuid
 from dataclasses import dataclass
 from datetime import datetime, timezone
@@ -468,3 +470,69 @@ def current_state() -> LicenseState:
 def require_feature(feature: str | FeatureFlag) -> License:
     return get_manager().require_feature(feature)
+
+
+# ---------------------------------------------------------------------------
+# Production-build sanity check
+# ---------------------------------------------------------------------------
+
+class ProductionBuildError(RuntimeError):
+    """Raised when a frozen / shipped build is misconfigured in a way
+    that would defeat licensing. Always loud, always fatal — the
+    binary must not boot in this state."""
+
+
+def _is_shipped_build() -> bool:
+    """True when running from a PyInstaller bundle (``sys.frozen``).
+
+    Set automatically by PyInstaller; not set in source / pytest
+    runs. The whole purpose of the prod-safe check is to enforce
+    invariants that only matter in a shipped build, so the rest of
+    the codebase can stay flexible.
+    """
+    return getattr(sys, "frozen", False)
+
+
+def assert_production_safe() -> None:
+    """Fail loudly if a shipped build is misconfigured.
+
+    Two tripwires:
+
+    1. ``DATATOOLS_DEV_MODE`` is set in a frozen build. The dev-mode
+       env var unconditionally bypasses license verification — if a
+       buyer's installer somehow ships it enabled (build pipeline
+       bug, mis-set environment), every license check is a no-op.
+       Refuse to start instead.
+
+    2. The active verification key is still the dev key. The build
+       pipeline is supposed to override
+       ``DATATOOLS_LICENSE_PUBKEY`` with the production key; if it
+       didn't, the binary will reject every legitimate license
+       (signed with the prod private key) AND would *accept*
+       anything signed with the dev key (which is checked into the
+       source tree). Refuse to start.
+
+    No-ops in non-frozen runs (development, tests) so the dev key
+    + dev mode keep working in those contexts. Production builds
+    call this from :func:`src.cli_license_guard.guard` and
+    :func:`src.gui.components.hide_streamlit_chrome`.
+    """
+    if not _is_shipped_build():
+        return
+
+    if _truthy_env("DATATOOLS_DEV_MODE"):
+        raise ProductionBuildError(
+            "DATATOOLS_DEV_MODE is set in a shipped build. This env "
+            "var disables every license check and must never be set "
+            "on a buyer machine. If you see this message in a release "
+            "build, the install was misconfigured — contact support."
+        )
+
+    if crypto.is_using_dev_key():
+        raise ProductionBuildError(
+            "Shipped build is verifying against the development "
+            "license key. The build pipeline must set "
+            "DATATOOLS_LICENSE_PUBKEY to the production public key "
+            "before packaging. This binary will reject every real "
+            "license blob — re-download from the official channel."
+        )
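
Reviewer note: the ordering of the two tripwires can be exercised in isolation with the standard library alone. The sketch below is a simplified stand-in, not the code in this diff: `truthy_env` is a hypothetical replacement for the module's `_truthy_env` helper (whose body is not shown here, so the accepted values are an assumption), and the `using_dev_key` parameter stands in for `crypto.is_using_dev_key()`.

```python
import os
import sys


class ProductionBuildError(RuntimeError):
    """Raised when a frozen build is misconfigured."""


def truthy_env(name: str) -> bool:
    # Hypothetical stand-in for _truthy_env; the set of accepted
    # values is an assumption, not taken from the diff.
    return os.environ.get(name, "").strip().lower() in {"1", "true", "yes", "on"}


def check_production_safe(using_dev_key: bool) -> None:
    # using_dev_key stands in for crypto.is_using_dev_key().
    if not getattr(sys, "frozen", False):
        return  # source / pytest run: nothing to enforce
    if truthy_env("DATATOOLS_DEV_MODE"):
        raise ProductionBuildError("DATATOOLS_DEV_MODE is set in a shipped build")
    if using_dev_key:
        raise ProductionBuildError("shipped build is verifying against the dev key")
```

Because the frozen check comes first, calling `check_production_safe(using_dev_key=True)` from a plain interpreter returns silently; the same call raises once `sys.frozen` is truthy, which is the state the tests in this change simulate with `monkeypatch.setattr("sys.frozen", True, raising=False)`.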


@@ -35,9 +35,9 @@ from src.license import (
     UnsupportedFeatureError,
 )
 from src.license.crypto import (
-    _DEFAULT_SECRET,
     decode_blob,
     encode_blob,
+    is_using_dev_key,
     sign,
     verify,
 )
@@ -138,13 +138,26 @@ class TestSignAndVerify:
         bad = sig[:-1] + ("0" if sig[-1] != "0" else "1")
         assert verify(payload, bad) is False

-    def test_sign_respects_secret_env_override(self, monkeypatch):
+    def test_sign_respects_privkey_env_override(self, monkeypatch):
+        # Use a different valid Ed25519 private key (32 bytes hex).
+        # Picked arbitrarily; doesn't need to match the dev key.
+        alt_priv = "00" * 32
         payload = {"a": 1}
-        monkeypatch.setenv("DATATOOLS_LICENSE_SECRET", "alternate")
-        alt = sign(payload)
-        monkeypatch.delenv("DATATOOLS_LICENSE_SECRET", raising=False)
-        default = sign(payload)
-        assert alt != default
+        monkeypatch.setenv("DATATOOLS_LICENSE_PRIVKEY", alt_priv)
+        alt_sig = sign(payload)
+        monkeypatch.delenv("DATATOOLS_LICENSE_PRIVKEY", raising=False)
+        default_sig = sign(payload)
+        assert alt_sig != default_sig
+
+    def test_verify_with_wrong_pubkey_returns_false(self, monkeypatch):
+        # Sign with the dev key (default), then swap the pubkey and
+        # confirm verification fails.
+        payload = {"a": 1}
+        sig = sign(payload)
+        # 32-byte hex that isn't the matching dev pubkey.
+        wrong_pub = "11" * 32
+        monkeypatch.setenv("DATATOOLS_LICENSE_PUBKEY", wrong_pub)
+        assert verify(payload, sig) is False

     def test_canonical_form_is_key_order_invariant(self):
         a = {"x": 1, "y": 2}
@@ -159,17 +172,26 @@ class TestBlobEncodeDecode:
         again = decode_blob(blob)
         assert again == payload

-    def test_blob_has_human_readable_prefix(self):
+    def test_blob_uses_v2_prefix(self):
+        """v1.6 switched HMAC → Ed25519; blob version bumped to DTLIC2.
+        Pin the prefix so any future scheme change is intentional."""
         blob = encode_blob({"x": 1})
-        assert blob.startswith("DTLIC1:")
+        assert blob.startswith("DTLIC2:")

     def test_decode_rejects_missing_prefix(self):
-        with pytest.raises(ValueError, match="DTLIC1"):
+        with pytest.raises(ValueError, match="DTLIC2"):
             decode_blob("not-a-blob")

+    def test_decode_rejects_v1_blob_with_clear_message(self):
+        """A v1 (HMAC) blob must surface a clear 'old format' message
+        rather than 'signature mismatch' — buyers redeeming an old
+        delivery email need to know to request a new blob."""
+        with pytest.raises(ValueError, match="DTLIC1"):
+            decode_blob("DTLIC1:eyJhIjogMX0=")
+
     def test_decode_rejects_bad_base64(self):
         with pytest.raises(ValueError, match="base64"):
-            decode_blob("DTLIC1:!!!notbase64!!!")
+            decode_blob("DTLIC2:!!!notbase64!!!")

     def test_decode_rejects_truncated_blob(self):
         blob = encode_blob({"x": 1})
@@ -178,6 +200,72 @@ class TestBlobEncodeDecode:
             decode_blob(truncated)


+class TestDevKeypair:
+    """The embedded dev keypair must match the seed phrase so anyone
+    reproducing the build gets the same values. Catches a hand-edit
+    to ``_dev_keypair.py`` that drifts the constants from the seed."""
+
+    def test_dev_keypair_matches_seed(self):
+        from src.license._dev_keypair import (
+            DEV_PRIVATE_KEY_HEX,
+            DEV_PUBLIC_KEY_HEX,
+            _derive_from_seed,
+        )
+        derived_priv, derived_pub = _derive_from_seed()
+        assert derived_priv == DEV_PRIVATE_KEY_HEX
+        assert derived_pub == DEV_PUBLIC_KEY_HEX
+
+    def test_is_using_dev_key_true_by_default(self, monkeypatch):
+        monkeypatch.delenv("DATATOOLS_LICENSE_PUBKEY", raising=False)
+        assert is_using_dev_key() is True
+
+    def test_is_using_dev_key_false_when_overridden(self, monkeypatch):
+        monkeypatch.setenv("DATATOOLS_LICENSE_PUBKEY", "22" * 32)
+        assert is_using_dev_key() is False
+
+
+class TestProductionSafe:
+    """``assert_production_safe`` is a tripwire that fires only in
+    frozen / shipped builds. Tests simulate the frozen state via
+    monkeypatching ``sys.frozen``."""
+
+    def test_no_op_in_source_run(self):
+        # Default test run: sys.frozen is unset; nothing should raise.
+        from src.license import assert_production_safe
+        assert_production_safe()  # no exception
+
+    def test_raises_on_dev_mode_in_frozen_build(self, monkeypatch):
+        from src.license import (
+            ProductionBuildError,
+            assert_production_safe,
+        )
+        monkeypatch.setattr("sys.frozen", True, raising=False)
+        monkeypatch.setenv("DATATOOLS_DEV_MODE", "1")
+        # Override pubkey so the dev-key check doesn't fire first.
+        monkeypatch.setenv("DATATOOLS_LICENSE_PUBKEY", "22" * 32)
+        with pytest.raises(ProductionBuildError, match="DATATOOLS_DEV_MODE"):
+            assert_production_safe()
+
+    def test_raises_on_dev_key_in_frozen_build(self, monkeypatch):
+        from src.license import (
+            ProductionBuildError,
+            assert_production_safe,
+        )
+        monkeypatch.setattr("sys.frozen", True, raising=False)
+        monkeypatch.delenv("DATATOOLS_DEV_MODE", raising=False)
+        monkeypatch.delenv("DATATOOLS_LICENSE_PUBKEY", raising=False)
+        with pytest.raises(ProductionBuildError, match="development license key"):
+            assert_production_safe()
+
+    def test_passes_in_frozen_build_with_prod_pubkey(self, monkeypatch):
+        from src.license import assert_production_safe
+        monkeypatch.setattr("sys.frozen", True, raising=False)
+        monkeypatch.delenv("DATATOOLS_DEV_MODE", raising=False)
+        monkeypatch.setenv("DATATOOLS_LICENSE_PUBKEY", "22" * 32)
+        # Should not raise.
+        assert_production_safe()
+
+
 # ---------------------------------------------------------------------------
 # Features
 # ---------------------------------------------------------------------------