sec(license): Ed25519 sigs + production-safe tripwire

Two coupled hardening upgrades.

1. Asymmetric signatures (HMAC → Ed25519)

The previous HMAC scheme used a symmetric secret that any motivated
reverse engineer could pull out of the shipped binary and use to
mint blobs for any tier / name / email. With Ed25519, the binary
ships only the public verification key; the signing key never
leaves the seller's environment, so binary compromise no longer
yields forgery.
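
The split described above can be illustrated with a minimal sketch. It assumes the `cryptography` package is installed; the helper names (`canonical`, `is_valid`) and the license fields are hypothetical and are not the project's actual code.

```python
import json

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey


def canonical(payload: dict) -> bytes:
    # Deterministic JSON so signer and verifier see identical bytes.
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def is_valid(public_key, payload: dict, signature: bytes) -> bool:
    # Ed25519 verify() raises InvalidSignature on mismatch; map that to False.
    try:
        public_key.verify(signature, canonical(payload))
        return True
    except InvalidSignature:
        return False


seller_key = Ed25519PrivateKey.generate()  # stays in the seller's environment
shipped_key = seller_key.public_key()      # the only key material in the binary

# Hypothetical license fields, for illustration only.
fields = {"name": "Ada", "email": "ada@example.com", "tier": "lite"}
signature = seller_key.sign(canonical(fields))

genuine_ok = is_valid(shipped_key, fields, signature)
# Editing a field invalidates the seller's signature.
tampered_ok = is_valid(shipped_key, {**fields, "tier": "full"}, signature)
# A signature from any other private key fails against the shipped public key.
attacker_sig = Ed25519PrivateKey.generate().sign(canonical(fields))
attacker_ok = is_valid(shipped_key, fields, attacker_sig)
```

Holding only `shipped_key`, an attacker can check signatures but has no way to produce one that passes `is_valid`.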

- src/license/crypto.py rewritten around
  cryptography.hazmat.primitives.asymmetric.ed25519. Same public
  API surface (sign/verify/encode_blob/decode_blob), same canonical
  JSON encoding — drop-in for the manager / cli / GUI layers.
- DATATOOLS_LICENSE_PRIVKEY (seller-side) and
  DATATOOLS_LICENSE_PUBKEY (build-time) env vars supply the keys.
  When a variable is unset, the code falls back to the matching half
  of the in-source dev keypair (src/license/_dev_keypair.py), which
  is derived deterministically from a seed phrase so repro builds
  and tests work without secrets.
- Blob prefix bumped DTLIC1: → DTLIC2:. Decoding a DTLIC1 blob
  surfaces a clear "old format" error rather than a confusing
  signature mismatch.
- scripts/generate_keypair.py mints fresh production keypairs for
  the seller (run once, stash the private key offline).
- requirements.txt now declares cryptography>=41,<46 (previously an
  undeclared transitive dep).
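
The prefix bump can be sketched as below. This is an illustration only: the function names `encode_token` and `decode_token` and the exact error wording are assumptions, not the bodies of the project's `encode_blob` / `decode_blob`.

```python
import base64
import json

CURRENT_PREFIX = "DTLIC2:"  # Ed25519 blobs
RETIRED_PREFIX = "DTLIC1:"  # old HMAC blobs, no longer accepted


def encode_token(fields: dict) -> str:
    # Prefix + base64 of canonical JSON gives one paste-able token.
    raw = json.dumps(fields, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return CURRENT_PREFIX + base64.b64encode(raw).decode("ascii")


def decode_token(token: str) -> dict:
    s = token.strip()
    # Check the retired prefix first so the buyer gets a specific message
    # instead of a generic "missing prefix" or signature failure.
    if s.startswith(RETIRED_PREFIX):
        raise ValueError("License blob is the old DTLIC1 format; request a new blob.")
    if not s.startswith(CURRENT_PREFIX):
        raise ValueError("License blob is missing the DTLIC2: prefix.")
    try:
        raw = base64.b64decode(s[len(CURRENT_PREFIX):], validate=True)
    except ValueError as exc:  # binascii.Error is a ValueError subclass
        raise ValueError("License blob is not valid base64.") from exc
    return json.loads(raw)
```

Checking the retired prefix before anything else is what turns a would-be signature mismatch into an actionable "old format" error.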

2. Production-safe tripwire

assert_production_safe() refuses to boot a frozen / shipped build
when either:

- DATATOOLS_DEV_MODE=1 is set (would unconditionally bypass every
  license check — fine in source/test but catastrophic in a buyer
  install).
- The active verification key is still the embedded dev key (the
  build pipeline forgot to set DATATOOLS_LICENSE_PUBKEY).

No-op in source / pytest runs (sys.frozen is unset) so test
fixtures and dev workflows keep working without ceremony. Called
from src/cli_license_guard.guard() and from hide_streamlit_chrome
— so it fires on every CLI invocation and every GUI page load.
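
A minimal sketch of the tripwire logic follows. The names `check_build` and `MisconfiguredBuild`, the placeholder dev key, and the `== "1"` truthiness test are assumptions for illustration; the real `assert_production_safe` takes no arguments and reads its inputs from the process. The `frozen` and `env` parameters exist here only so the behaviour can be exercised without a PyInstaller bundle.

```python
import os
import sys
from typing import Mapping, Optional

DEV_PUBKEY_HEX = "aa" * 32  # placeholder, not the project's real dev key


class MisconfiguredBuild(RuntimeError):
    """Raised when a frozen build would silently defeat licensing."""


def check_build(
    active_pubkey_hex: str,
    *,
    frozen: Optional[bool] = None,
    env: Optional[Mapping[str, str]] = None,
) -> None:
    env = os.environ if env is None else env
    if frozen is None:
        # PyInstaller sets sys.frozen; plain source and pytest runs do not.
        frozen = bool(getattr(sys, "frozen", False))
    if not frozen:
        return  # no-op outside shipped builds
    if env.get("DATATOOLS_DEV_MODE") == "1":
        raise MisconfiguredBuild("DATATOOLS_DEV_MODE is set in a shipped build")
    if active_pubkey_hex == DEV_PUBKEY_HEX:
        raise MisconfiguredBuild("shipped build still verifies against the dev key")
```

Gating everything on the frozen flag is what lets the dev key and dev mode keep working in source checkouts while failing hard in a packaged binary.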

Tests: 49 license-layer unit tests (was 40); added Ed25519
wrong-key rejection, dev-keypair seed pin, blob v2 prefix, v1
rejection with clear message, and four production-safe scenarios
(no-op in source, fires on DEV_MODE in frozen, fires on dev key in
frozen, passes in frozen with prod pubkey). Total: 2024 → 2033.

Docs (REQUIREMENTS §17a, DEVELOPER licensing recipe, DECISIONS
§9b + decision log) updated with the new threat-model write-up,
key-storage workflow, and tripwire behaviour.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 17:34:48 +00:00
parent d32b58e61a
commit e534fb4989
12 changed files with 549 additions and 75 deletions


@@ -178,6 +178,8 @@ $49-79/bundle · $149 full suite (when 3+ exist).
| May 13 (v1.6) | Ship licensing: 1-year HMAC-signed blobs, name+email registration, offline verification, tier-scaffolded for future SKUs | Unlock the lifetime-update business model without recurring infra. Honor-system DRM (HMAC + 30-day refund) — sufficient at $49. See §9b below. |
| May 13 (v1.6) | Add Lite SKU (Dedup + Text Cleaner + Format Standardizer) | Lower-priced entry point for buyers who only need the three universal tools. Per-tool feature gating + lock badges on the home grid surface the upgrade path. See §9b. |
| May 13 (v1.6) | Remove user-facing free trial | A 1-year all-features trial undercut the paid Lite SKU. Paid-only keeps tier economics clean. Internal ``_mint`` API still exists for tests and the seller's key generator. See §9b. |
| May 13 (v1.6) | Upgrade license crypto: HMAC → Ed25519 (asymmetric) | HMAC's symmetric secret was extractable from the shipped binary — anyone with the binary could mint blobs. Ed25519 splits sign (seller) from verify (binary), so binary compromise doesn't let an attacker forge licenses. Blob prefix bumped DTLIC1 → DTLIC2. See §9b. |
| May 13 (v1.6) | Add ``assert_production_safe`` tripwire | A shipped build with ``DATATOOLS_DEV_MODE=1`` or the in-source dev pubkey would silently defeat licensing. The tripwire refuses to boot such a build. No-op in source / pytest runs. See §9b. |
## 9b. Licensing model
@@ -191,7 +193,13 @@ $49-79/bundle · $149 full suite (when 3+ exist).
| Time-bombed binary (PyInstaller --no-license) | Rejected. Can't deliver renewals without re-shipping the installer. |
| Hardware-locked license | Rejected. Friction on legitimate device-swaps; doesn't match the buyer persona's tolerance. |
**Threat model**: a motivated reverse engineer can pull the HMAC secret out of the binary, mint their own licenses, and bypass the check. That's acceptable — the goal is to discourage casual blob-sharing among non-technical buyers, not stop targeted piracy. The 30-day refund window covers the same gap from a different angle (anyone who shares their blob is implicitly authorizing the buyer to issue them a refund-on-demand).
**Threat model** (v1.6 — Ed25519): the binary ships only the public key. A motivated reverse engineer who pulls everything out of the binary has the verification key but not the signing key — they can't mint new licenses. The earlier HMAC scheme had this hole; the asymmetric upgrade closes it. The remaining attack surface is:
- Re-signing with a forked binary that ships an attacker-controlled pubkey + auto-grants licenses. Costs more effort than the price of a legitimate copy and the result is per-fork, not shareable.
- Hooking the verification call to always return True. Defeats DRM entirely but only on the attacker's own machine — they could just write down "I unlocked DataTools" and skip the work.
- Setting ``DATATOOLS_DEV_MODE=1`` to bypass checks. **Refused in shipped builds** by ``assert_production_safe``; works in source/test runs only.
The 30-day refund window covers casual blob sharing from a different angle (anyone who shares their blob is implicitly authorizing the buyer to issue them a refund-on-demand).
**What's enforced**:
- License blob signature must match (HMAC-SHA256 with the build secret).


@@ -143,9 +143,35 @@ require_feature(FeatureFlag.DEDUPLICATOR)
```
**Storage**: ``~/.datatools/license.json`` (override via
``DATATOOLS_LICENSE_PATH``). Signed locally with HMAC-SHA256 using a
secret read from ``DATATOOLS_LICENSE_SECRET`` (build-time replace; the
in-repo default is a development placeholder).
``DATATOOLS_LICENSE_PATH``). Signed with Ed25519 (asymmetric) — the
seller's private key signs; the buyer's binary verifies with the
embedded public key.
**Key material**:
| Variable | Who has it | Where it's used |
|---|---|---|
| ``DATATOOLS_LICENSE_PRIVKEY`` | Seller only | ``scripts/generate_license.py`` (mint a buyer's blob), ``scripts/generate_keypair.py`` writes a fresh one |
| ``DATATOOLS_LICENSE_PUBKEY`` | Every shipped binary | Verification at activation time; set at build time via PyInstaller env |
If neither env var is set, ``src.license.crypto`` falls back to the
deterministic dev keypair in ``src/license/_dev_keypair.py``. The
dev key is in source on purpose (so tests work without secrets),
but a frozen build that's still using it is a build-config bug —
:func:`assert_production_safe` refuses to start such a binary.
**First-time setup for shipped builds**:
1. ``python scripts/generate_keypair.py --output prod-keys.env`` —
creates a fresh keypair.
2. Stash ``DATATOOLS_LICENSE_PRIVKEY`` somewhere safe (password
manager / KMS). Lose it and you can't issue renewals without
reshipping a new build with a new public key.
3. Configure the PyInstaller build env with
``DATATOOLS_LICENSE_PUBKEY=<hex>`` so the shipped binary
verifies against the production key.
4. Mint buyer licenses with
``DATATOOLS_LICENSE_PRIVKEY=<hex> python scripts/generate_license.py ...``.
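
Before baking a public key into a build, it can be worth confirming that it really belongs to the stored private key. The sketch below is one way to do that with the `cryptography` package; `public_hex_for` and `keys_match` are hypothetical helpers, not part of the repository.

```python
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey


def public_hex_for(private_hex: str) -> str:
    """Derive the raw public key (hex) from a raw 32-byte private key (hex)."""
    private_key = Ed25519PrivateKey.from_private_bytes(bytes.fromhex(private_hex))
    return private_key.public_key().public_bytes(
        encoding=serialization.Encoding.Raw,
        format=serialization.PublicFormat.Raw,
    ).hex()


def keys_match(private_hex: str, public_hex: str) -> bool:
    """True when public_hex is the public half of private_hex."""
    return public_hex_for(private_hex.strip()) == public_hex.strip().lower()
```

Running `keys_match` on the two values from `prod-keys.env` catches a copy-paste mix-up before any buyer receives a blob the shipped binary cannot verify.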
**Dev bypass**: ``DATATOOLS_DEV_MODE=1`` short-circuits every check.
The test suite's autouse fixture sets this so existing tests don't


@@ -174,10 +174,11 @@ and proceeds.
- **Dev**: pytest, tox.
## 16. Test coverage
- 2,024 tests passing, 0 skipped, 0 xfailed.
- 1,859 core + CLI tests (run with `pytest -m 'not gui'` for a quick loop).
Includes 40 license-layer unit tests, 25 license-CLI tests, and
17 Lite-tier feature-map + guard tests.
- 2,033 tests passing, 0 skipped, 0 xfailed.
- 1,868 core + CLI tests (run with `pytest -m 'not gui'` for a quick loop).
Includes 49 license-layer unit tests (Ed25519 sign/verify, dev-key
derivation, production-safe tripwire, schema), 25 license-CLI
tests, and 17 Lite-tier feature-map + guard tests.
- 165 GUI tests under `tests/gui/` driving Streamlit pages via `AppTest`
(smoke + EN/ES localization, chrome, gate, workflows, dedup review,
advanced panels, error paths, findings panel, activation +
@@ -194,8 +195,14 @@ and proceeds.
## 17a. Licensing
- **Storage**: ``~/.datatools/license.json`` (or
``$DATATOOLS_LICENSE_PATH`` override). Signed locally with
HMAC-SHA256.
``$DATATOOLS_LICENSE_PATH`` override). Signed with Ed25519
(asymmetric).
- **Crypto**: Ed25519. The seller holds the private key; every
shipped binary embeds only the public key. A motivated reverse
engineer who pulls everything out of the binary still can't sign
new licenses. Keys are 32 bytes raw, exposed as hex via
``DATATOOLS_LICENSE_PRIVKEY`` (seller-side) and
``DATATOOLS_LICENSE_PUBKEY`` (build-time bake-in).
- **Activation**: buyer pastes a base64-encoded license blob
(``DTLIC1:...``) on first launch; app verifies the signature
offline + matches the buyer-entered name/email to the embedded
@@ -226,10 +233,17 @@ and proceeds.
- **Lock badge**: the home grid shows a red 🔒 Locked pill on tool
cards the current tier doesn't unlock.
- **Dev bypass**: ``DATATOOLS_DEV_MODE=1`` skips every check (used by
the test suite and during development).
the test suite and during development). **Refused in shipped
builds** by the production-safe tripwire.
- **Production-safe tripwire**: ``assert_production_safe()`` runs at
startup in every frozen build. Refuses to boot when
``DATATOOLS_DEV_MODE`` is set or the verification key is still the
embedded dev key (i.e., the build pipeline forgot to override
``DATATOOLS_LICENSE_PUBKEY``). No-op in source / pytest runs.
- **No internet**: signature verification is fully offline. The
shipped binary embeds the verification secret; see
``docs/DECISIONS.md`` for the threat-model discussion.
shipped binary embeds only the public key; the private key never
leaves the seller. See ``docs/DECISIONS.md`` for the threat-model
discussion.
## 18. Error handling
- Structured hierarchy: `DataToolsError` → `InputValidationError`, `ConfigError`, `FileFormatError`, `FileAccessError`.


@@ -8,3 +8,4 @@ tqdm>=4.66,<5
typer>=0.12,<1
phonenumbers>=8.13,<9
streamlit>=1.35,<2
cryptography>=41,<46

scripts/generate_keypair.py

@@ -0,0 +1,106 @@
#!/usr/bin/env python3
"""Generate a fresh Ed25519 keypair for production license signing.

**Creator-only.** Run once, write the private key somewhere safe,
configure the build pipeline with the public key.

Usage::

    python scripts/generate_keypair.py
    python scripts/generate_keypair.py --json
    python scripts/generate_keypair.py --output keys.txt

The output looks like::

    DATATOOLS_LICENSE_PRIVKEY=<64 hex chars> # KEEP SECRET
    DATATOOLS_LICENSE_PUBKEY=<64 hex chars> # BAKE INTO BUILD

The private key never goes near the buyer-facing binary. Stash it in
a password manager / KMS / hardware token; the only places it gets
loaded are:

- ``scripts/generate_license.py`` when minting a buyer's blob
- Your CI's signing step, if you've automated blob minting

The public key gets set as ``DATATOOLS_LICENSE_PUBKEY`` in the
PyInstaller build env (so the shipped binary verifies against it),
and the production-safe runtime check refuses to start any frozen
build that's still using the in-source dev key.
"""
from __future__ import annotations

import argparse
import json
import sys
from pathlib import Path

from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey


def generate() -> tuple[str, str]:
    """Return ``(private_hex, public_hex)`` for a fresh keypair."""
    priv = Ed25519PrivateKey.generate()
    priv_hex = priv.private_bytes(
        encoding=serialization.Encoding.Raw,
        format=serialization.PrivateFormat.Raw,
        encryption_algorithm=serialization.NoEncryption(),
    ).hex()
    pub_hex = priv.public_key().public_bytes(
        encoding=serialization.Encoding.Raw,
        format=serialization.PublicFormat.Raw,
    ).hex()
    return priv_hex, pub_hex


def main(argv: list[str] | None = None) -> int:
    p = argparse.ArgumentParser(description=__doc__.splitlines()[0])
    p.add_argument("--json", action="store_true", help="Emit JSON instead of env-file format.")
    p.add_argument("--output", "-o", type=Path, default=None, help="Write to this file instead of stdout.")
    args = p.parse_args(argv)

    priv_hex, pub_hex = generate()

    if args.json:
        payload = json.dumps(
            {"private_key": priv_hex, "public_key": pub_hex},
            indent=2,
        )
    else:
        payload = (
            f"# DataTools license keypair — generated by generate_keypair.py\n"
            f"# KEEP THE PRIVATE KEY SECRET. Lose it and your existing\n"
            f"# licenses can't be renewed (you'd have to ship a new build\n"
            f"# with a new public key and re-issue every active license).\n"
            f"\n"
            f"DATATOOLS_LICENSE_PRIVKEY={priv_hex}\n"
            f"DATATOOLS_LICENSE_PUBKEY={pub_hex}\n"
        )

    if args.output:
        args.output.write_text(payload + "\n", encoding="utf-8")
        # chmod 600: best-effort; ignored on Windows.
        try:
            args.output.chmod(0o600)
        except OSError:
            pass
        print(f"Wrote {args.output} (mode 600)", file=sys.stderr)
    else:
        print(payload)

    print(
        "\nNext steps:\n"
        " 1. Store the private key in your password manager.\n"
        " 2. Bake the public key into the PyInstaller build:\n"
        " DATATOOLS_LICENSE_PUBKEY=<pubkey> pyinstaller ...\n"
        " 3. Mint buyer licenses by setting the private key:\n"
        " DATATOOLS_LICENSE_PRIVKEY=<privkey> "
        "python scripts/generate_license.py --name 'Buyer' --email b@x.com\n",
        file=sys.stderr,
    )
    return 0


if __name__ == "__main__":
    sys.exit(main())


@@ -53,9 +53,14 @@ def guard(feature: str | None = None) -> None:
InvalidLicenseError,
LicenseError,
UnsupportedFeatureError,
assert_production_safe,
get_manager,
)
# Refuse to run a misconfigured shipped build. No-op in
# development / pytest runs.
assert_production_safe()
mgr = get_manager()
if mgr.dev_mode:
return


@@ -89,6 +89,12 @@ def hide_streamlit_chrome(*, gate_license: bool = True) -> None:
can render its own form without recursion.
"""
st.markdown(_HIDE_CHROME_CSS, unsafe_allow_html=True)
# Production-safe check runs first so a misconfigured shipped
# build refuses to render anything (rather than rendering a
# broken activation form that doesn't accept real blobs).
# No-op in source / pytest runs.
from src.license import assert_production_safe
assert_production_safe()
# Imported lazily so this module stays importable in environments
# where the i18n packs haven't been laid out (e.g. unit tests of
# individual legacy helpers).


@@ -34,12 +34,21 @@ from .errors import (
UnsupportedFeatureError,
)
from .features import FEATURES_BY_TIER, all_features_for_tier
from .manager import LicenseManager, current_state, get_manager, require_feature
from .manager import (
LicenseManager,
ProductionBuildError,
assert_production_safe,
current_state,
get_manager,
require_feature,
)
from .schema import FeatureFlag, License, Tier
__all__ = [
# Manager
"LicenseManager",
"ProductionBuildError",
"assert_production_safe",
"current_state",
"get_manager",
"require_feature",


@@ -0,0 +1,73 @@
"""**Development-only** Ed25519 keypair embedded in the source tree.

This pair lets developers run / test / sign locally without needing
the production private key. Both values are deterministic from a
seed string (``hashlib.sha256(SEED).digest()``) so any contributor
checking out the source gets the same keys, which is exactly what
makes this keypair unsafe for production.

============================================================
DO NOT SHIP THIS KEYPAIR.
============================================================

For shipped builds:

1. Run ``scripts/generate_keypair.py`` to produce a fresh production
   keypair.
2. Stash the **private** key in your password manager / KMS.
3. In the PyInstaller build pipeline, set the env var
   ``DATATOOLS_LICENSE_PUBKEY=<production-pubkey-hex>`` so the
   shipped binary verifies against the production key, not this dev
   key.
4. The production-safe runtime check (``assert_production_safe``)
   refuses to start a frozen build that's still verifying against
   this dev key; that's the tripwire that catches a missing build
   step.

The matching seed phrase below is in source on purpose; rotating
the dev key means changing it here AND regenerating every test
fixture that hard-codes a blob. The seed includes the words
"DEV-seed-NOT-FOR-PRODUCTION" specifically so a string-grep against
a shipped binary would flag a missing build override immediately.
"""
from __future__ import annotations

import hashlib

# The seed phrase. Hashed to 32 bytes → Ed25519 private-key seed.
DEV_SEED_PHRASE: bytes = (
    b"datatools-license-v2-DEV-seed-NOT-FOR-PRODUCTION"
)

# Hard-coded copies of the values derived from the seed, so importing
# this module does no crypto work. ``test_dev_keypair_matches_seed``
# in ``tests/test_license.py`` pins them to ``_derive_from_seed()``.
DEV_PRIVATE_KEY_HEX: str = (
    "0bdc196f098b84ed155bacbd00061d4fff2cb68e10109f94332f1fc7de194cdb"
)
DEV_PUBLIC_KEY_HEX: str = (
    "1cbef16b7826dd364ac0c7187d42c2ee00d76486e42389db05efa45dd1ade78a"
)


def _derive_from_seed() -> tuple[str, str]:
    """Re-derive the dev keypair from the seed phrase. Used by the
    unit test that pins the constants above to the seed."""
    from cryptography.hazmat.primitives.asymmetric.ed25519 import (
        Ed25519PrivateKey,
    )
    from cryptography.hazmat.primitives import serialization

    seed = hashlib.sha256(DEV_SEED_PHRASE).digest()
    priv = Ed25519PrivateKey.from_private_bytes(seed)
    priv_hex = priv.private_bytes(
        encoding=serialization.Encoding.Raw,
        format=serialization.PrivateFormat.Raw,
        encryption_algorithm=serialization.NoEncryption(),
    ).hex()
    pub_hex = priv.public_key().public_bytes(
        encoding=serialization.Encoding.Raw,
        format=serialization.PublicFormat.Raw,
    ).hex()
    return priv_hex, pub_hex


@@ -1,85 +1,150 @@
"""HMAC sign/verify for license blobs.
"""Ed25519 sign/verify for license blobs.
The signing secret is read from ``$DATATOOLS_LICENSE_SECRET`` if
present, otherwise from the build-time constant below. Replace the
constant at build time (via PyInstaller hook or a sed step in the
build pipeline) so the shipped binary has a different secret from
this repo's source tree.
Asymmetric model:
Threat model: honor-system DRM. A motivated reverse engineer can pull
the secret out of the binary, sign their own licenses, and bypass the
check. That's expected for $49 desktop software — the goal is to
discourage casual sharing, not stop targeted piracy. The 30-day
refund policy and the personal-name embedded in every license cover
the same gap from a different angle.
- **Private key** (32 bytes) lives with the seller only. It signs the
buyer's name/email/tier/etc into a license blob via
``scripts/generate_license.py``.
- **Public key** (32 bytes) is embedded in every shipped binary. The
binary uses it to verify blobs at activation time.
The split means a motivated reverse engineer who pulls everything out
of the binary still can't sign new licenses — they'd need the private
key, which never leaves the seller's environment. This is the key
upgrade vs. the v1 HMAC scheme: HMAC's symmetric secret was trivially
extractable, so anyone with the binary could mint blobs for any tier.
Keys come from (in priority order):
1. ``$DATATOOLS_LICENSE_PRIVKEY`` / ``$DATATOOLS_LICENSE_PUBKEY`` —
hex-encoded raw bytes. The build pipeline sets the pubkey here.
2. The dev-only constants in ``_dev_keypair.py`` — deterministic from
a seed, embedded in the source tree for local development and
testing. **Never** ship a binary that still uses these.
A frozen / shipped build verifying against the dev key is a build
configuration error — ``assert_production_safe`` (see
``.manager``) fires loudly on startup in that case.
Blob format: ``DTLIC2:`` + base64-encoded JSON. The version prefix
bumped from ``DTLIC1`` to ``DTLIC2`` when we switched from HMAC to
Ed25519, so old v1 blobs surface a clear "old format" error rather
than a confusing "signature mismatch".
"""
from __future__ import annotations
import base64
import hashlib
import hmac
import json
import os
from typing import Any
# Build-time default. Replace via env var in shipped builds; keep this
# constant non-empty so unit tests have a stable verification key.
_DEFAULT_SECRET = (
"datatools-license-v1-development-secret-"
"replace-at-build-time-via-DATATOOLS_LICENSE_SECRET"
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
Ed25519PrivateKey,
Ed25519PublicKey,
)
from ._dev_keypair import DEV_PRIVATE_KEY_HEX, DEV_PUBLIC_KEY_HEX
def _secret_bytes() -> bytes:
"""Return the active HMAC secret as bytes."""
return os.environ.get("DATATOOLS_LICENSE_SECRET", _DEFAULT_SECRET).encode("utf-8")
# ---------------------------------------------------------------------------
# Key material
# ---------------------------------------------------------------------------
def _privkey_hex() -> str:
"""Hex-encoded raw Ed25519 private-key bytes.
Read from ``$DATATOOLS_LICENSE_PRIVKEY`` first (where the seller
stashes their real key), falling back to the dev seed-derived
constant. The dev fallback only matters during testing /
development; a shipped build calling :func:`sign` is a bug (only
the seller's key-gen script does that).
"""
return os.environ.get("DATATOOLS_LICENSE_PRIVKEY") or DEV_PRIVATE_KEY_HEX
def _pubkey_hex() -> str:
"""Hex-encoded raw Ed25519 public-key bytes.
Read from ``$DATATOOLS_LICENSE_PUBKEY`` first (the build
pipeline sets this), falling back to the dev key.
"""
return os.environ.get("DATATOOLS_LICENSE_PUBKEY") or DEV_PUBLIC_KEY_HEX
def _privkey() -> Ed25519PrivateKey:
return Ed25519PrivateKey.from_private_bytes(bytes.fromhex(_privkey_hex()))
def _pubkey() -> Ed25519PublicKey:
return Ed25519PublicKey.from_public_bytes(bytes.fromhex(_pubkey_hex()))
def is_using_dev_key() -> bool:
"""True when the active **public** key matches the embedded dev key.
Used by :func:`.manager.assert_production_safe` to catch frozen
builds whose pubkey wasn't overridden at build time.
"""
return _pubkey_hex() == DEV_PUBLIC_KEY_HEX
# ---------------------------------------------------------------------------
# Canonical encoding (same canonical bytes as v1; only the signature scheme changed)
# ---------------------------------------------------------------------------
def _canonical_bytes(payload: dict[str, Any]) -> bytes:
"""Canonical JSON encoding for the HMAC input.
"""Canonical JSON encoding for the signature input.
``sort_keys=True`` + ``separators=(",", ":")`` produce a byte-for-
byte deterministic representation across Python versions and OS
locales. Without that, two structurally-identical dicts could hash
to different signatures.
locales. Without that, two structurally-identical dicts could
produce different signatures.
"""
return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
# ---------------------------------------------------------------------------
# Sign / verify
# ---------------------------------------------------------------------------
def sign(payload: dict[str, Any]) -> str:
"""Compute the HMAC-SHA256 hex digest over *payload*.
"""Produce an Ed25519 signature over *payload*, hex-encoded.
*payload* MUST NOT contain a ``signature`` key — that's the field
we're computing. The caller is responsible for stripping it.
Caller must strip any existing ``signature`` field — the function
signs whatever it's given, including a stale signature, which
would never verify because verify recomputes from a fresh
no-``signature`` canonical form.
"""
digest = hmac.new(_secret_bytes(), _canonical_bytes(payload), hashlib.sha256)
return digest.hexdigest()
sig_bytes = _privkey().sign(_canonical_bytes(payload))
return sig_bytes.hex()
def verify(payload: dict[str, Any], signature: str) -> bool:
"""Constant-time compare between the recomputed HMAC and *signature*.
Returns ``True`` on a match. Uses :func:`hmac.compare_digest` so a
timing oracle can't be used to recover the secret one byte at a
time — overkill for honor-system DRM, but free.
"""
expected = sign(payload)
return hmac.compare_digest(expected.encode("ascii"), signature.encode("ascii"))
def verify(payload: dict[str, Any], signature_hex: str) -> bool:
"""Verify *signature_hex* against *payload*. Returns True/False;
never raises (a missing or malformed signature is just False)."""
try:
sig_bytes = bytes.fromhex(signature_hex)
except (TypeError, ValueError):  # None or non-hex input
return False
try:
_pubkey().verify(sig_bytes, _canonical_bytes(payload))
return True
except InvalidSignature:
return False
# ---------------------------------------------------------------------------
# Blob encoding / decoding
# ---------------------------------------------------------------------------
# A "license blob" is the artifact the buyer pastes into the activation
# form. It's a base64-encoded JSON dict containing every license field
# *plus* the signature. We choose base64 over raw JSON so the blob is
# one paste-able token (no whitespace surprises) and so a typo
# truncates the blob into an obviously-invalid form rather than a
# subtly-mutated payload.
_BLOB_PREFIX = "DTLIC1:"
# Buyers paste this whole token into the activation page. The prefix
# is the version marker:
# DTLIC1 — old HMAC scheme (no longer accepted)
# DTLIC2 — Ed25519 (current)
_BLOB_PREFIX = "DTLIC2:"
_OLD_PREFIX = "DTLIC1:"
def encode_blob(payload_with_signature: dict[str, Any]) -> str:
@@ -92,10 +157,15 @@ def encode_blob(payload_with_signature: dict[str, Any]) -> str:
def decode_blob(blob: str) -> dict[str, Any]:
"""Reverse of :func:`encode_blob`. Raises ``ValueError`` on a
blob that doesn't carry the expected prefix or doesn't decode
cleanly — both surface as :class:`InvalidLicenseError` at the
manager layer."""
blob that doesn't carry the expected prefix, doesn't decode
cleanly, or carries the v1 prefix (which we no longer accept)."""
s = blob.strip()
if s.startswith(_OLD_PREFIX):
raise ValueError(
f"License blob is the old {_OLD_PREFIX!r} format. v1 blobs "
"used a symmetric secret that has since been retired — "
"request a new blob from support."
)
if not s.startswith(_BLOB_PREFIX):
raise ValueError(
f"License blob missing {_BLOB_PREFIX!r} prefix. "


@@ -6,6 +6,7 @@ constructor for full isolation.
Lifecycle::
assert_production_safe() # guard against build-config errors
mgr = get_manager()
if not mgr.is_activated():
mgr.activate_from_blob(blob, name, email)
@@ -17,6 +18,7 @@ from __future__ import annotations
import os
import re
import sys
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone
@@ -468,3 +470,69 @@ def current_state() -> LicenseState:
def require_feature(feature: str | FeatureFlag) -> License:
return get_manager().require_feature(feature)
# ---------------------------------------------------------------------------
# Production-build sanity check
# ---------------------------------------------------------------------------


class ProductionBuildError(RuntimeError):
    """Raised when a frozen / shipped build is misconfigured in a way
    that would defeat licensing. Always loud, always fatal: the
    binary must not boot in this state."""


def _is_shipped_build() -> bool:
    """True when running from a PyInstaller bundle (``sys.frozen``).

    Set automatically by PyInstaller; not set in source / pytest
    runs. The whole purpose of the prod-safe check is to enforce
    invariants that only matter in a shipped build, so the rest of
    the codebase can stay flexible.
    """
    return getattr(sys, "frozen", False)


def assert_production_safe() -> None:
    """Fail loudly if a shipped build is misconfigured.

    Two tripwires:

    1. ``DATATOOLS_DEV_MODE`` is set in a frozen build. The dev-mode
       env var unconditionally bypasses license verification; if a
       buyer's installer somehow ships it enabled (build pipeline
       bug, mis-set environment), every license check is a no-op.
       Refuse to start instead.
    2. The active verification key is still the dev key. The build
       pipeline is supposed to override
       ``DATATOOLS_LICENSE_PUBKEY`` with the production key; if it
       didn't, the binary will reject every legitimate license
       (signed with the prod private key) AND would *accept*
       anything signed with the dev key (which is checked into the
       source tree). Refuse to start.

    No-ops in non-frozen runs (development, tests) so the dev key
    + dev mode keep working in those contexts. Production builds
    call this from :func:`src.cli_license_guard.guard` and
    :func:`src.gui.components.hide_streamlit_chrome`.
    """
    if not _is_shipped_build():
        return
    if _truthy_env("DATATOOLS_DEV_MODE"):
        raise ProductionBuildError(
            "DATATOOLS_DEV_MODE is set in a shipped build. This env "
            "var disables every license check and must never be set "
            "on a buyer machine. If you see this message in a release "
            "build, the install was misconfigured — contact support."
        )
    if crypto.is_using_dev_key():
        raise ProductionBuildError(
            "Shipped build is verifying against the development "
            "license key. The build pipeline must set "
            "DATATOOLS_LICENSE_PUBKEY to the production public key "
            "before packaging. This binary will reject every real "
            "license blob — re-download from the official channel."
        )


@@ -35,9 +35,9 @@ from src.license import (
UnsupportedFeatureError,
)
from src.license.crypto import (
_DEFAULT_SECRET,
decode_blob,
encode_blob,
is_using_dev_key,
sign,
verify,
)
@@ -138,13 +138,26 @@ class TestSignAndVerify:
bad = sig[:-1] + ("0" if sig[-1] != "0" else "1")
assert verify(payload, bad) is False
def test_sign_respects_secret_env_override(self, monkeypatch):
def test_sign_respects_privkey_env_override(self, monkeypatch):
# Use a different valid Ed25519 private key (32 bytes hex).
# Picked arbitrarily; doesn't need to match the dev key.
alt_priv = "00" * 32
payload = {"a": 1}
monkeypatch.setenv("DATATOOLS_LICENSE_SECRET", "alternate")
alt = sign(payload)
monkeypatch.delenv("DATATOOLS_LICENSE_SECRET", raising=False)
default = sign(payload)
assert alt != default
monkeypatch.setenv("DATATOOLS_LICENSE_PRIVKEY", alt_priv)
alt_sig = sign(payload)
monkeypatch.delenv("DATATOOLS_LICENSE_PRIVKEY", raising=False)
default_sig = sign(payload)
assert alt_sig != default_sig
def test_verify_with_wrong_pubkey_returns_false(self, monkeypatch):
# Sign with the dev key (default), then swap the pubkey and
# confirm verification fails.
payload = {"a": 1}
sig = sign(payload)
# 32-byte hex that isn't the matching dev pubkey.
wrong_pub = "11" * 32
monkeypatch.setenv("DATATOOLS_LICENSE_PUBKEY", wrong_pub)
assert verify(payload, sig) is False
def test_canonical_form_is_key_order_invariant(self):
a = {"x": 1, "y": 2}
@@ -159,17 +172,26 @@ class TestBlobEncodeDecode:
again = decode_blob(blob)
assert again == payload
def test_blob_has_human_readable_prefix(self):
def test_blob_uses_v2_prefix(self):
"""v1.6 switched HMAC → Ed25519; blob version bumped to DTLIC2.
Pin the prefix so any future scheme change is intentional."""
blob = encode_blob({"x": 1})
assert blob.startswith("DTLIC1:")
assert blob.startswith("DTLIC2:")
def test_decode_rejects_missing_prefix(self):
with pytest.raises(ValueError, match="DTLIC1"):
with pytest.raises(ValueError, match="DTLIC2"):
decode_blob("not-a-blob")
def test_decode_rejects_v1_blob_with_clear_message(self):
"""A v1 (HMAC) blob must surface a clear 'old format' message
rather than 'signature mismatch' — buyers redeeming an old
delivery email need to know to request a new blob."""
with pytest.raises(ValueError, match="DTLIC1"):
decode_blob("DTLIC1:eyJhIjogMX0=")
def test_decode_rejects_bad_base64(self):
with pytest.raises(ValueError, match="base64"):
decode_blob("DTLIC1:!!!notbase64!!!")
decode_blob("DTLIC2:!!!notbase64!!!")
def test_decode_rejects_truncated_blob(self):
blob = encode_blob({"x": 1})
@@ -178,6 +200,72 @@ class TestBlobEncodeDecode:
decode_blob(truncated)
class TestDevKeypair:
"""The embedded dev keypair must match the seed phrase so anyone
reproducing the build gets the same values. Catches a hand-edit
to ``_dev_keypair.py`` that drifts the constants from the seed."""
def test_dev_keypair_matches_seed(self):
from src.license._dev_keypair import (
DEV_PRIVATE_KEY_HEX,
DEV_PUBLIC_KEY_HEX,
_derive_from_seed,
)
derived_priv, derived_pub = _derive_from_seed()
assert derived_priv == DEV_PRIVATE_KEY_HEX
assert derived_pub == DEV_PUBLIC_KEY_HEX
def test_is_using_dev_key_true_by_default(self, monkeypatch):
monkeypatch.delenv("DATATOOLS_LICENSE_PUBKEY", raising=False)
assert is_using_dev_key() is True
def test_is_using_dev_key_false_when_overridden(self, monkeypatch):
monkeypatch.setenv("DATATOOLS_LICENSE_PUBKEY", "22" * 32)
assert is_using_dev_key() is False
class TestProductionSafe:
"""``assert_production_safe`` is a tripwire that fires only in
frozen / shipped builds. Tests simulate the frozen state via
monkeypatching ``sys.frozen``."""
def test_no_op_in_source_run(self):
# Default test run: sys.frozen is unset; nothing should raise.
from src.license import assert_production_safe
assert_production_safe() # no exception
def test_raises_on_dev_mode_in_frozen_build(self, monkeypatch):
from src.license import (
ProductionBuildError,
assert_production_safe,
)
monkeypatch.setattr("sys.frozen", True, raising=False)
monkeypatch.setenv("DATATOOLS_DEV_MODE", "1")
# Override pubkey so the dev-key check doesn't fire first.
monkeypatch.setenv("DATATOOLS_LICENSE_PUBKEY", "22" * 32)
with pytest.raises(ProductionBuildError, match="DATATOOLS_DEV_MODE"):
assert_production_safe()
def test_raises_on_dev_key_in_frozen_build(self, monkeypatch):
from src.license import (
ProductionBuildError,
assert_production_safe,
)
monkeypatch.setattr("sys.frozen", True, raising=False)
monkeypatch.delenv("DATATOOLS_DEV_MODE", raising=False)
monkeypatch.delenv("DATATOOLS_LICENSE_PUBKEY", raising=False)
with pytest.raises(ProductionBuildError, match="development license key"):
assert_production_safe()
def test_passes_in_frozen_build_with_prod_pubkey(self, monkeypatch):
from src.license import assert_production_safe
monkeypatch.setattr("sys.frozen", True, raising=False)
monkeypatch.delenv("DATATOOLS_DEV_MODE", raising=False)
monkeypatch.setenv("DATATOOLS_LICENSE_PUBKEY", "22" * 32)
# Should not raise.
assert_production_safe()
# ---------------------------------------------------------------------------
# Features
# ---------------------------------------------------------------------------