sec(license): Ed25519 sigs + production-safe tripwire

Two coupled hardening upgrades.

1. Asymmetric signatures (HMAC → Ed25519)

The previous HMAC scheme used a symmetric secret that any motivated reverse engineer could pull out of the shipped binary and use to mint blobs for any tier / name / email. With Ed25519, the binary ships only the public verification key; the signing key never leaves the seller's environment, so binary compromise no longer yields forgery.

- src/license/crypto.py rewritten around cryptography.hazmat.primitives.asymmetric.ed25519. Same public API surface (sign/verify/encode_blob/decode_blob), same canonical JSON encoding; drop-in for the manager / cli / GUI layers.
- DATATOOLS_LICENSE_PRIVKEY (seller-side) and DATATOOLS_LICENSE_PUBKEY (build-time) env vars supply the keys; the in-source dev keypair (src/license/_dev_keypair.py) deterministically derives from a seed phrase for repro builds and tests.
- Blob prefix bumped DTLIC1: → DTLIC2:. Decoding a DTLIC1 blob surfaces a clear "old format" error rather than a confusing signature mismatch.
- scripts/generate_keypair.py mints fresh production keypairs for the seller (run once, stash the private key offline).

Adds cryptography>=41,<46 to requirements.txt (was an undeclared transitive dep).

2. Production-safe tripwire

assert_production_safe() refuses to boot a frozen / shipped build when either:

- DATATOOLS_DEV_MODE=1 is set (would unconditionally bypass every license check; fine in source/test but catastrophic in a buyer install).
- The active verification key is still the embedded dev key (the build pipeline forgot to set DATATOOLS_LICENSE_PUBKEY).

No-op in source / pytest runs (sys.frozen is unset) so test fixtures and dev workflows keep working without ceremony.

Called from src/cli_license_guard.guard() and from hide_streamlit_chrome, so it fires on every CLI invocation and every GUI page load.

Tests: 49 license-layer unit tests (was 40); added Ed25519 wrong-key rejection, dev-keypair seed pin, blob v2 prefix, v1 rejection with clear message, and four production-safe scenarios (no-op in source, fires on DEV_MODE in frozen, fires on dev key in frozen, passes in frozen with prod pubkey). Total: 2024 → 2033.

Docs (REQUIREMENTS §17a, DEVELOPER licensing recipe, DECISIONS §9b + decision log) updated with the new threat-model write-up, key-storage workflow, and tripwire behaviour.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
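
The key-selection rule described in the message (env var first, embedded dev key as fallback) can be sketched with the standard library alone. This is an illustrative reading of the diff below, not the shipped code: the `env` parameter and the un-prefixed function names are assumptions added so the logic can be exercised without touching the real process environment.

```python
from __future__ import annotations

import os
from collections.abc import Mapping

# Dev public key as it appears in src/license/_dev_keypair.py in this commit.
DEV_PUBLIC_KEY_HEX = "1cbef16b7826dd364ac0c7187d42c2ee00d76486e42389db05efa45dd1ade78a"


def pubkey_hex(env: Mapping[str, str] | None = None) -> str:
    """Return the active verification key as hex.

    `env` is a hypothetical injection point; by default the real
    process environment is consulted.
    """
    env = os.environ if env is None else env
    # `or` means an unset AND an empty variable both fall back to the dev key.
    return env.get("DATATOOLS_LICENSE_PUBKEY") or DEV_PUBLIC_KEY_HEX


def is_using_dev_key(env: Mapping[str, str] | None = None) -> bool:
    """True when the fallback dev key is the one in effect."""
    return pubkey_hex(env) == DEV_PUBLIC_KEY_HEX


# Hypothetical production key, used only for illustration.
prod_env = {"DATATOOLS_LICENSE_PUBKEY": "ab" * 32}
print(is_using_dev_key({}))        # True: nothing overrides the dev key
print(is_using_dev_key(prod_env))  # False: the override wins
```

Because the fallback uses `or`, a build pipeline that exports the variable as an empty string ends up on the dev key, which is the case the tripwire is meant to catch.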
@@ -34,12 +34,21 @@ from .errors import (
|
||||
UnsupportedFeatureError,
|
||||
)
|
||||
from .features import FEATURES_BY_TIER, all_features_for_tier
|
||||
from .manager import LicenseManager, current_state, get_manager, require_feature
|
||||
from .manager import (
|
||||
LicenseManager,
|
||||
ProductionBuildError,
|
||||
assert_production_safe,
|
||||
current_state,
|
||||
get_manager,
|
||||
require_feature,
|
||||
)
|
||||
from .schema import FeatureFlag, License, Tier
|
||||
|
||||
__all__ = [
|
||||
# Manager
|
||||
"LicenseManager",
|
||||
"ProductionBuildError",
|
||||
"assert_production_safe",
|
||||
"current_state",
|
||||
"get_manager",
|
||||
"require_feature",
|
||||
|
||||
73
src/license/_dev_keypair.py
Normal file
@@ -0,0 +1,73 @@
|
||||
"""**Development-only** Ed25519 keypair embedded in the source tree.
|
||||
|
||||
This pair lets developers run / test / sign locally without needing
|
||||
the production private key. Both values are deterministic from a
|
||||
seed string (``hashlib.sha256(SEED).digest()``) so any contributor
|
||||
checking out the source gets the same keys — which is exactly what
|
||||
makes this keypair unsafe for production.
|
||||
|
||||
============================================================
|
||||
DO NOT SHIP THIS KEYPAIR.
|
||||
============================================================
|
||||
|
||||
For shipped builds:
|
||||
|
||||
1. Run ``scripts/generate_keypair.py`` to produce a fresh production
|
||||
keypair.
|
||||
2. Stash the **private** key in your password manager / KMS.
|
||||
3. In the PyInstaller build pipeline, set the env var
|
||||
``DATATOOLS_LICENSE_PUBKEY=<production-pubkey-hex>`` so the
|
||||
shipped binary verifies against the production key, not this dev
|
||||
key.
|
||||
4. The production-safe runtime check (``assert_production_safe``)
|
||||
refuses to start a frozen build that's still verifying against
|
||||
this dev key — that's the tripwire that catches a missing build
|
||||
step.
|
||||
|
||||
The matching seed phrase below is in source on purpose; rotating
|
||||
the dev key means changing it here AND regenerating every test
|
||||
fixture that hard-codes a blob. The seed includes the words
|
||||
"DEV-seed-NOT-FOR-PRODUCTION" specifically so a string-grep against
|
||||
a shipped binary would flag a missing build override immediately.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import hashlib
|
||||
|
||||
# The seed phrase. Hashed to 32 bytes → Ed25519 private-key seed.
|
||||
DEV_SEED_PHRASE: bytes = (
|
||||
b"datatools-license-v2-DEV-seed-NOT-FOR-PRODUCTION"
|
||||
)
|
||||
|
||||
# Derived constants. Hard-coded rather than computed at import, so
# importing this module does no crypto work. The self-test
# (``test_dev_keypair_matches_seed`` in ``tests/test_license.py``)
# pins them to the seed phrase.
|
||||
DEV_PRIVATE_KEY_HEX: str = (
|
||||
"0bdc196f098b84ed155bacbd00061d4fff2cb68e10109f94332f1fc7de194cdb"
|
||||
)
|
||||
DEV_PUBLIC_KEY_HEX: str = (
|
||||
"1cbef16b7826dd364ac0c7187d42c2ee00d76486e42389db05efa45dd1ade78a"
|
||||
)
|
||||
|
||||
|
||||
def _derive_from_seed() -> tuple[str, str]:
|
||||
"""Re-derive the dev keypair from the seed phrase. Used by the
|
||||
unit test that pins the constants above to the seed."""
|
||||
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
|
||||
Ed25519PrivateKey,
|
||||
)
|
||||
from cryptography.hazmat.primitives import serialization
|
||||
|
||||
seed = hashlib.sha256(DEV_SEED_PHRASE).digest()
|
||||
priv = Ed25519PrivateKey.from_private_bytes(seed)
|
||||
priv_hex = priv.private_bytes(
|
||||
encoding=serialization.Encoding.Raw,
|
||||
format=serialization.PrivateFormat.Raw,
|
||||
encryption_algorithm=serialization.NoEncryption(),
|
||||
).hex()
|
||||
pub_hex = priv.public_key().public_bytes(
|
||||
encoding=serialization.Encoding.Raw,
|
||||
format=serialization.PublicFormat.Raw,
|
||||
).hex()
|
||||
return priv_hex, pub_hex
|
||||
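
A raw Ed25519 private key is a 32-byte seed, so the private half of the derivation in `_derive_from_seed` can be reproduced with the standard library alone. The following is a minimal sketch under that assumption; `derive_private_hex` is a hypothetical helper name, and the public half still needs the `cryptography` package as in the function above.

```python
import hashlib

# Seed phrase copied from _dev_keypair.py in this commit.
DEV_SEED_PHRASE = b"datatools-license-v2-DEV-seed-NOT-FOR-PRODUCTION"


def derive_private_hex(seed_phrase: bytes) -> str:
    """Hash the phrase to 32 bytes and hex-encode the result.

    Hypothetical helper: the SHA-256 digest is used directly as the
    Ed25519 private-key seed, mirroring the first step of
    _derive_from_seed.
    """
    return hashlib.sha256(seed_phrase).hexdigest()


private_hex = derive_private_hex(DEV_SEED_PHRASE)
print(len(bytes.fromhex(private_hex)))  # 32, the seed size Ed25519 expects
```

Comparing `private_hex` against `DEV_PRIVATE_KEY_HEX` is essentially what the seed-pin test does for the private half; the sketch deliberately leaves that comparison to the test suite.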
@@ -1,85 +1,150 @@
|
||||
"""HMAC sign/verify for license blobs.
|
||||
"""Ed25519 sign/verify for license blobs.
|
||||
|
||||
The signing secret is read from ``$DATATOOLS_LICENSE_SECRET`` if
|
||||
present, otherwise from the build-time constant below. Replace the
|
||||
constant at build time (via PyInstaller hook or a sed step in the
|
||||
build pipeline) so the shipped binary has a different secret from
|
||||
this repo's source tree.
|
||||
Asymmetric model:
|
||||
|
||||
Threat model: honor-system DRM. A motivated reverse engineer can pull
|
||||
the secret out of the binary, sign their own licenses, and bypass the
|
||||
check. That's expected for $49 desktop software — the goal is to
|
||||
discourage casual sharing, not stop targeted piracy. The 30-day
|
||||
refund policy and the personal-name embedded in every license cover
|
||||
the same gap from a different angle.
|
||||
- **Private key** (32 bytes) lives with the seller only. It signs the
|
||||
buyer's name/email/tier/etc into a license blob via
|
||||
``scripts/generate_license.py``.
|
||||
- **Public key** (32 bytes) is embedded in every shipped binary. The
|
||||
binary uses it to verify blobs at activation time.
|
||||
|
||||
The split means a motivated reverse engineer who pulls everything out
|
||||
of the binary still can't sign new licenses — they'd need the private
|
||||
key, which never leaves the seller's environment. This is the key
|
||||
upgrade vs. the v1 HMAC scheme: HMAC's symmetric secret was trivially
|
||||
extractable, so anyone with the binary could mint blobs for any tier.
|
||||
|
||||
Keys come from (in priority order):
|
||||
|
||||
1. ``$DATATOOLS_LICENSE_PRIVKEY`` / ``$DATATOOLS_LICENSE_PUBKEY`` —
|
||||
hex-encoded raw bytes. The build pipeline sets the pubkey here.
|
||||
2. The dev-only constants in ``_dev_keypair.py`` — deterministic from
|
||||
a seed, embedded in the source tree for local development and
|
||||
testing. **Never** ship a binary that still uses these.
|
||||
|
||||
A frozen / shipped build verifying against the dev key is a build
|
||||
configuration error — ``assert_production_safe`` (see
|
||||
``.manager``) fires loudly on startup in that case.
|
||||
|
||||
Blob format: ``DTLIC2:`` + base64-encoded JSON. The version prefix
|
||||
bumped from ``DTLIC1`` to ``DTLIC2`` when we switched from HMAC to
|
||||
Ed25519, so old v1 blobs surface a clear "old format" error rather
|
||||
than a confusing "signature mismatch".
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import base64
|
||||
import hashlib
|
||||
import hmac
|
||||
import json
|
||||
import os
|
||||
from typing import Any
|
||||
|
||||
# Build-time default. Replace via env var in shipped builds; keep this
|
||||
# constant non-empty so unit tests have a stable verification key.
|
||||
_DEFAULT_SECRET = (
|
||||
"datatools-license-v1-development-secret-"
|
||||
"replace-at-build-time-via-DATATOOLS_LICENSE_SECRET"
|
||||
from cryptography.exceptions import InvalidSignature
|
||||
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
|
||||
Ed25519PrivateKey,
|
||||
Ed25519PublicKey,
|
||||
)
|
||||
|
||||
from ._dev_keypair import DEV_PRIVATE_KEY_HEX, DEV_PUBLIC_KEY_HEX
|
||||
|
||||
def _secret_bytes() -> bytes:
|
||||
"""Return the active HMAC secret as bytes."""
|
||||
return os.environ.get("DATATOOLS_LICENSE_SECRET", _DEFAULT_SECRET).encode("utf-8")
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Key material
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def _privkey_hex() -> str:
|
||||
"""Hex-encoded raw Ed25519 private-key bytes.
|
||||
|
||||
Read from ``$DATATOOLS_LICENSE_PRIVKEY`` first (where the seller
|
||||
stashes their real key), falling back to the dev seed-derived
|
||||
constant. The dev fallback only matters during testing /
|
||||
development; a shipped build calling :func:`sign` is a bug (only
|
||||
the seller's key-gen script does that).
|
||||
"""
|
||||
return os.environ.get("DATATOOLS_LICENSE_PRIVKEY") or DEV_PRIVATE_KEY_HEX
|
||||
|
||||
|
||||
def _pubkey_hex() -> str:
|
||||
"""Hex-encoded raw Ed25519 public-key bytes.
|
||||
|
||||
Read from ``$DATATOOLS_LICENSE_PUBKEY`` first (the build
|
||||
pipeline sets this), falling back to the dev key.
|
||||
"""
|
||||
return os.environ.get("DATATOOLS_LICENSE_PUBKEY") or DEV_PUBLIC_KEY_HEX
|
||||
|
||||
|
||||
def _privkey() -> Ed25519PrivateKey:
|
||||
return Ed25519PrivateKey.from_private_bytes(bytes.fromhex(_privkey_hex()))
|
||||
|
||||
|
||||
def _pubkey() -> Ed25519PublicKey:
|
||||
return Ed25519PublicKey.from_public_bytes(bytes.fromhex(_pubkey_hex()))
|
||||
|
||||
|
||||
def is_using_dev_key() -> bool:
|
||||
"""True when the active **public** key matches the embedded dev key.
|
||||
|
||||
Used by :func:`.manager.assert_production_safe` to catch frozen
|
||||
builds whose pubkey wasn't overridden at build time.
|
||||
"""
|
||||
return _pubkey_hex() == DEV_PUBLIC_KEY_HEX
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Canonical encoding (unchanged from v1: the same payload bytes feed the signature)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def _canonical_bytes(payload: dict[str, Any]) -> bytes:
|
||||
"""Canonical JSON encoding for the HMAC input.
|
||||
"""Canonical JSON encoding for the signature input.
|
||||
|
||||
``sort_keys=True`` + ``separators=(",", ":")`` produce a byte-for-
|
||||
byte deterministic representation across Python versions and OS
|
||||
locales. Without that, two structurally-identical dicts could hash
|
||||
to different signatures.
|
||||
locales. Without that, two structurally-identical dicts could
|
||||
produce different signatures.
|
||||
"""
|
||||
return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Sign / verify
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def sign(payload: dict[str, Any]) -> str:
|
||||
"""Compute the HMAC-SHA256 hex digest over *payload*.
|
||||
"""Produce an Ed25519 signature over *payload*, hex-encoded.
|
||||
|
||||
*payload* MUST NOT contain a ``signature`` key — that's the field
|
||||
we're computing. The caller is responsible for stripping it.
|
||||
Caller must strip any existing ``signature`` field — the function
|
||||
signs whatever it's given, including a stale signature, which
|
||||
would never verify because verify recomputes from a fresh
|
||||
no-``signature`` canonical form.
|
||||
"""
|
||||
digest = hmac.new(_secret_bytes(), _canonical_bytes(payload), hashlib.sha256)
|
||||
return digest.hexdigest()
|
||||
sig_bytes = _privkey().sign(_canonical_bytes(payload))
|
||||
return sig_bytes.hex()
|
||||
|
||||
|
||||
def verify(payload: dict[str, Any], signature: str) -> bool:
|
||||
"""Constant-time compare between the recomputed HMAC and *signature*.
|
||||
|
||||
Returns ``True`` on a match. Uses :func:`hmac.compare_digest` so a
|
||||
timing oracle can't be used to recover the secret one byte at a
|
||||
time — overkill for honor-system DRM, but free.
|
||||
"""
|
||||
expected = sign(payload)
|
||||
return hmac.compare_digest(expected.encode("ascii"), signature.encode("ascii"))
|
||||
def verify(payload: dict[str, Any], signature_hex: str) -> bool:
|
||||
"""Verify *signature_hex* against *payload*. Returns True/False;
|
||||
never raises (a missing or malformed signature is just False)."""
|
||||
try:
|
||||
sig_bytes = bytes.fromhex(signature_hex)
|
||||
except (TypeError, ValueError):  # None or non-hex input counts as an invalid signature
|
||||
return False
|
||||
try:
|
||||
_pubkey().verify(sig_bytes, _canonical_bytes(payload))
|
||||
return True
|
||||
except InvalidSignature:
|
||||
return False
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Blob encoding / decoding
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
# A "license blob" is the artifact the buyer pastes into the activation
|
||||
# form. It's a base64-encoded JSON dict containing every license field
|
||||
# *plus* the signature. We choose base64 over raw JSON so the blob is
|
||||
# one paste-able token (no whitespace surprises) and so a typo
|
||||
# truncates the blob into an obviously-invalid form rather than a
|
||||
# subtly-mutated payload.
|
||||
|
||||
_BLOB_PREFIX = "DTLIC1:"
|
||||
# Buyers paste this whole token into the activation page. The prefix
|
||||
# is the version marker:
|
||||
# DTLIC1 — old HMAC scheme (no longer accepted)
|
||||
# DTLIC2 — Ed25519 (current)
|
||||
_BLOB_PREFIX = "DTLIC2:"
|
||||
_OLD_PREFIX = "DTLIC1:"
|
||||
|
||||
|
||||
def encode_blob(payload_with_signature: dict[str, Any]) -> str:
|
||||
@@ -92,10 +157,15 @@ def encode_blob(payload_with_signature: dict[str, Any]) -> str:
|
||||
|
||||
def decode_blob(blob: str) -> dict[str, Any]:
|
||||
"""Reverse of :func:`encode_blob`. Raises ``ValueError`` on a
|
||||
blob that doesn't carry the expected prefix or doesn't decode
|
||||
cleanly — both surface as :class:`InvalidLicenseError` at the
|
||||
manager layer."""
|
||||
blob that doesn't carry the expected prefix, doesn't decode
|
||||
cleanly, or carries the v1 prefix (which we no longer accept)."""
|
||||
s = blob.strip()
|
||||
if s.startswith(_OLD_PREFIX):
|
||||
raise ValueError(
|
||||
f"License blob is the old {_OLD_PREFIX!r} format. v1 blobs "
|
||||
"used a symmetric secret that has since been retired — "
|
||||
"request a new blob from support."
|
||||
)
|
||||
if not s.startswith(_BLOB_PREFIX):
|
||||
raise ValueError(
|
||||
f"License blob missing {_BLOB_PREFIX!r} prefix. "
|
||||
|
||||
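
The blob handling in this file (canonical JSON, base64, version prefix, explicit v1 rejection) can be sketched as follows. The body of `encode_blob` is not shown in this diff, so the base64 variant (URL-safe here), the use of canonical JSON inside the blob, the shortened error messages, and the sample field values are all assumptions made for illustration.

```python
from __future__ import annotations

import base64
import json
from typing import Any

BLOB_PREFIX = "DTLIC2:"  # Ed25519 (current)
OLD_PREFIX = "DTLIC1:"   # HMAC (no longer accepted)


def canonical_bytes(payload: dict[str, Any]) -> bytes:
    # Sorted keys and fixed separators give one byte string per payload.
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def encode_blob(payload_with_signature: dict[str, Any]) -> str:
    # Assumption: URL-safe base64 over the canonical JSON.
    token = base64.urlsafe_b64encode(canonical_bytes(payload_with_signature))
    return BLOB_PREFIX + token.decode("ascii")


def decode_blob(blob: str) -> dict[str, Any]:
    s = blob.strip()
    if s.startswith(OLD_PREFIX):
        raise ValueError(f"License blob is the old {OLD_PREFIX!r} format.")
    if not s.startswith(BLOB_PREFIX):
        raise ValueError(f"License blob missing {BLOB_PREFIX!r} prefix.")
    try:
        # binascii.Error, UnicodeDecodeError and JSONDecodeError all
        # subclass ValueError, so one handler covers every decode failure.
        data = json.loads(base64.urlsafe_b64decode(s[len(BLOB_PREFIX):]).decode("utf-8"))
    except ValueError as exc:
        raise ValueError("License blob does not decode cleanly.") from exc
    if not isinstance(data, dict):
        raise ValueError("License blob payload is not a JSON object.")
    return data


# Hypothetical field values; only the field names come from the docstrings.
payload = {"tier": "pro", "name": "Ada", "email": "ada@example.com", "signature": "00ff"}
blob = encode_blob(payload)
print(blob.startswith("DTLIC2:"))    # True
print(decode_blob(blob) == payload)  # True
```

Checking the old prefix before the current one is what turns a pasted v1 blob into a targeted "old format" message instead of a generic prefix or signature failure.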
@@ -6,6 +6,7 @@ constructor for full isolation.
|
||||
|
||||
Lifecycle::
|
||||
|
||||
assert_production_safe() # guard against build-config errors
|
||||
mgr = get_manager()
|
||||
if not mgr.is_activated():
|
||||
mgr.activate_from_blob(blob, name, email)
|
||||
@@ -17,6 +18,7 @@ from __future__ import annotations
|
||||
|
||||
import os
|
||||
import re
|
||||
import sys
|
||||
import uuid
|
||||
from dataclasses import dataclass
|
||||
from datetime import datetime, timezone
|
||||
@@ -468,3 +470,69 @@ def current_state() -> LicenseState:
|
||||
|
||||
def require_feature(feature: str | FeatureFlag) -> License:
|
||||
return get_manager().require_feature(feature)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Production-build sanity check
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
class ProductionBuildError(RuntimeError):
|
||||
"""Raised when a frozen / shipped build is misconfigured in a way
|
||||
that would defeat licensing. Always loud, always fatal — the
|
||||
binary must not boot in this state."""
|
||||
|
||||
|
||||
def _is_shipped_build() -> bool:
|
||||
"""True when running from a PyInstaller bundle (``sys.frozen``).
|
||||
|
||||
Set automatically by PyInstaller; not set in source / pytest
|
||||
runs. The whole purpose of the prod-safe check is to enforce
|
||||
invariants that only matter in a shipped build, so the rest of
|
||||
the codebase can stay flexible.
|
||||
"""
|
||||
return bool(getattr(sys, "frozen", False))
|
||||
|
||||
|
||||
def assert_production_safe() -> None:
|
||||
"""Fail loudly if a shipped build is misconfigured.
|
||||
|
||||
Two tripwires:
|
||||
|
||||
1. ``DATATOOLS_DEV_MODE`` is set in a frozen build. The dev-mode
|
||||
env var unconditionally bypasses license verification — if a
|
||||
buyer's installer somehow ships it enabled (build pipeline
|
||||
bug, mis-set environment), every license check is a no-op.
|
||||
Refuse to start instead.
|
||||
|
||||
2. The active verification key is still the dev key. The build
|
||||
pipeline is supposed to override
|
||||
``DATATOOLS_LICENSE_PUBKEY`` with the production key; if it
|
||||
didn't, the binary will reject every legitimate license
|
||||
(signed with the prod private key) AND would *accept*
|
||||
anything signed with the dev key (which is checked into the
|
||||
source tree). Refuse to start.
|
||||
|
||||
No-ops in non-frozen runs (development, tests) so the dev key
|
||||
+ dev mode keep working in those contexts. Production builds
|
||||
call this from :func:`src.cli_license_guard.guard` and
|
||||
:func:`src.gui.components.hide_streamlit_chrome`.
|
||||
"""
|
||||
if not _is_shipped_build():
|
||||
return
|
||||
|
||||
if _truthy_env("DATATOOLS_DEV_MODE"):
|
||||
raise ProductionBuildError(
|
||||
"DATATOOLS_DEV_MODE is set in a shipped build. This env "
|
||||
"var disables every license check and must never be set "
|
||||
"on a buyer machine. If you see this message in a release "
|
||||
"build, the install was misconfigured — contact support."
|
||||
)
|
||||
|
||||
if crypto.is_using_dev_key():
|
||||
raise ProductionBuildError(
|
||||
"Shipped build is verifying against the development "
|
||||
"license key. The build pipeline must set "
|
||||
"DATATOOLS_LICENSE_PUBKEY to the production public key "
|
||||
"before packaging. This binary will reject every real "
|
||||
"license blob — re-download from the official channel."
|
||||
)
|
||||
|
||||
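
The four tripwire scenarios listed in the commit message can be exercised with a small stand-alone sketch. It follows the control flow of `assert_production_safe` above, but the keyword parameters (`frozen`, `env`, `using_dev_key`) are hypothetical injection points added for illustration, and the set of values treated as truthy is an assumption because `_truthy_env` is not part of this diff.

```python
from __future__ import annotations

import os
import sys
from collections.abc import Mapping


class ProductionBuildError(RuntimeError):
    """Raised when a shipped build is misconfigured."""


def truthy_env(env: Mapping[str, str], name: str) -> bool:
    # Assumption: these spellings count as "set".
    return env.get(name, "").strip().lower() in {"1", "true", "yes", "on"}


def assert_production_safe(
    *,
    frozen: bool | None = None,
    env: Mapping[str, str] | None = None,
    using_dev_key: bool = False,
) -> None:
    # Defaults read the real process state; callers may pass explicit values.
    if frozen is None:
        frozen = bool(getattr(sys, "frozen", False))
    if env is None:
        env = os.environ
    if not frozen:
        return  # source checkout or pytest run: nothing to enforce
    if truthy_env(env, "DATATOOLS_DEV_MODE"):
        raise ProductionBuildError("DATATOOLS_DEV_MODE is set in a shipped build")
    if using_dev_key:
        raise ProductionBuildError("shipped build verifies against the dev key")


# Not frozen: returns quietly even with dev mode and the dev key active.
assert_production_safe(frozen=False, env={"DATATOOLS_DEV_MODE": "1"}, using_dev_key=True)

# Frozen with a production key and no dev mode: also passes.
assert_production_safe(frozen=True, env={}, using_dev_key=False)
```

Passing the state in as arguments keeps the sketch testable without monkeypatching `sys.frozen`; the real function reads `sys` and the crypto module directly.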