feat(license): registration + 1-year licenses + tier scaffolding

A complete offline licensing layer (no internet at any step):

Core
- src/license/ — schema (License, Tier, FeatureFlag), HMAC crypto,
  JSON storage, LicenseManager singleton with activate/renew/
  deactivate/issue_trial. Tier-scaffolded so future SKUs can carve
  per-tool feature sets without consumer-code edits.
- scripts/generate_license.py — creator-only key generator. Mints a
  DTLIC1: blob the buyer pastes into the activation page.

GUI
- New activation form component (src/gui/components/activation.py).
- hide_streamlit_chrome() now inline-renders the activation form when
  no valid license is present (every page short-circuits to the form
  until activated).
- Sidebar shows tier + days remaining; renewal warning under 30 days.
- New pages/_Activate.py for revisiting the form after activation.

CLI
- src/license_cli.py — activate / renew / status / trial / deactivate
  commands. Exempt from the guard.
- src/cli_license_guard.py — drop-in guard call added to every tool
  CLI's main(). Lets --help through; respects DATATOOLS_DEV_MODE.

i18n
- New activation.* and license.* keys in en.json + es.json
  (page title, form labels, status badges, renewal warnings, error
  messages). Pack parity test stays green.

Test infrastructure
- tests/conftest.py autouse fixture sets DATATOOLS_DEV_MODE=1 so the
  existing 1916 tests continue to pass.
- isolated_license_path / activated_license_manager /
  unactivated_license_manager fixtures for tests that want to drive
  the real check.

Tests (+79)
- tests/test_license.py (40): schema, crypto roundtrip, blob
  encode/decode, tier→feature mapping, activation flow, name/email
  mismatch rejection, tamper detection, expiration, renewal,
  dev-mode bypass.
- tests/test_license_cli.py (26): every license_cli command +
  subprocess tests confirming every tool CLI refuses to run without
  a license, --help always works, DEV_MODE bypasses.
- tests/gui/test_activation.py (13): gate blocks without license,
  passes with trial, activation form submission unlocks the gate,
  sidebar status, renewal warning, i18n.

Total: 1916 → 1995 tests. All pass under the strict warning filter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 16:54:23 +00:00
parent b2c7b94fe9
commit e435103113
27 changed files with 2798 additions and 6 deletions

59
src/license/__init__.py Normal file

@@ -0,0 +1,59 @@
"""License module: registration, activation, expiration, feature gating.

Public API the rest of the app uses:

- :func:`get_manager`: singleton :class:`LicenseManager` instance.
- :func:`current_state`: quick snapshot for status badges / tests.
- :func:`require_feature`: raise :class:`LicenseError` if a feature
  isn't unlocked by the active license.
- :class:`License`, :class:`Tier`, :class:`FeatureFlag`: schema.
- :class:`LicenseError` and subclasses: typed failures the UI can
  branch on (not yet activated vs. expired vs. tampered).

The license model is:

1. The seller (creator) runs ``scripts/generate_license.py`` to mint a
   signed **license blob** keyed to a buyer's name + email.
2. The buyer pastes the blob into the activation page on first launch.
3. The app verifies the HMAC signature locally (no internet), then
   writes a canonical ``~/.datatools/license.json`` and the app
   unlocks.

The signature is HMAC-SHA256 with a build-time secret. Combined with
the 30-day refund policy, this is honor-system DRM; see
``docs/DECISIONS.md`` for the trade-off discussion.
"""

from __future__ import annotations

from .errors import (
    ExpiredLicenseError,
    InvalidLicenseError,
    LicenseError,
    NotActivatedError,
    UnsupportedFeatureError,
)
from .features import FEATURES_BY_TIER, all_features_for_tier
from .manager import LicenseManager, current_state, get_manager, require_feature
from .schema import FeatureFlag, License, Tier

__all__ = [
    # Manager
    "LicenseManager",
    "current_state",
    "get_manager",
    "require_feature",
    # Schema
    "FeatureFlag",
    "License",
    "Tier",
    # Feature registry
    "FEATURES_BY_TIER",
    "all_features_for_tier",
    # Errors
    "LicenseError",
    "NotActivatedError",
    "ExpiredLicenseError",
    "InvalidLicenseError",
    "UnsupportedFeatureError",
]

112
src/license/crypto.py Normal file

@@ -0,0 +1,112 @@
"""HMAC sign/verify for license blobs.

The signing secret is read from ``$DATATOOLS_LICENSE_SECRET`` if
present, otherwise from the build-time constant below. Replace the
constant at build time (via PyInstaller hook or a sed step in the
build pipeline) so the shipped binary has a different secret from
this repo's source tree.

Threat model: honor-system DRM. A motivated reverse engineer can pull
the secret out of the binary, sign their own licenses, and bypass the
check. That's expected for $49 desktop software: the goal is to
discourage casual sharing, not stop targeted piracy. The 30-day
refund policy and the personal-name embedded in every license cover
the same gap from a different angle.
"""

from __future__ import annotations

import base64
import hashlib
import hmac
import json
import os
from typing import Any

# Build-time default. Replace via env var in shipped builds; keep this
# constant non-empty so unit tests have a stable verification key.
_DEFAULT_SECRET = (
    "datatools-license-v1-development-secret-"
    "replace-at-build-time-via-DATATOOLS_LICENSE_SECRET"
)


def _secret_bytes() -> bytes:
    """Return the active HMAC secret as bytes."""
    return os.environ.get("DATATOOLS_LICENSE_SECRET", _DEFAULT_SECRET).encode("utf-8")


def _canonical_bytes(payload: dict[str, Any]) -> bytes:
    """Canonical JSON encoding for the HMAC input.

    ``sort_keys=True`` + ``separators=(",", ":")`` produce a
    byte-for-byte deterministic representation across Python versions
    and OS locales. Without that, two structurally-identical dicts
    could hash to different signatures.
    """
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(payload: dict[str, Any]) -> str:
    """Compute the HMAC-SHA256 hex digest over *payload*.

    *payload* MUST NOT contain a ``signature`` key: that's the field
    we're computing. The caller is responsible for stripping it.
    """
    digest = hmac.new(_secret_bytes(), _canonical_bytes(payload), hashlib.sha256)
    return digest.hexdigest()


def verify(payload: dict[str, Any], signature: str) -> bool:
    """Constant-time compare between the recomputed HMAC and *signature*.

    Returns ``True`` on a match. Uses :func:`hmac.compare_digest` so a
    timing oracle can't be used to guess a valid signature one byte at
    a time. Overkill for honor-system DRM, but free. A non-string or
    non-ASCII *signature* can never match, so it returns ``False``
    instead of raising.
    """
    if not isinstance(signature, str) or not signature.isascii():
        return False
    expected = sign(payload)
    return hmac.compare_digest(expected.encode("ascii"), signature.encode("ascii"))


# ---------------------------------------------------------------------------
# Blob encoding / decoding
# ---------------------------------------------------------------------------
# A "license blob" is the artifact the buyer pastes into the activation
# form. It's a base64-encoded JSON dict containing every license field
# *plus* the signature. We choose base64 over raw JSON so the blob is
# one paste-able token (no whitespace surprises) and so a typo
# truncates the blob into an obviously-invalid form rather than a
# subtly-mutated payload.

_BLOB_PREFIX = "DTLIC1:"


def encode_blob(payload_with_signature: dict[str, Any]) -> str:
    """Wrap a signed payload into the buyer-facing blob form."""
    raw = json.dumps(
        payload_with_signature, sort_keys=True, separators=(",", ":"),
    ).encode("utf-8")
    return _BLOB_PREFIX + base64.urlsafe_b64encode(raw).decode("ascii")


def decode_blob(blob: str) -> dict[str, Any]:
    """Reverse of :func:`encode_blob`.

    Raises ``ValueError`` on a blob that doesn't carry the expected
    prefix, doesn't decode cleanly, or doesn't decode to a JSON
    object. All of these surface as :class:`InvalidLicenseError` at
    the manager layer.
    """
    s = blob.strip()
    if not s.startswith(_BLOB_PREFIX):
        raise ValueError(
            f"License blob missing {_BLOB_PREFIX!r} prefix. "
            "Did you paste the wrong text?"
        )
    encoded = s[len(_BLOB_PREFIX):]
    try:
        raw = base64.urlsafe_b64decode(encoded.encode("ascii"))
    except (ValueError, TypeError) as e:
        raise ValueError(f"License blob is not valid base64: {e}") from e
    try:
        data = json.loads(raw.decode("utf-8"))
    except (json.JSONDecodeError, UnicodeDecodeError) as e:
        raise ValueError(f"License blob contains invalid JSON: {e}") from e
    # Valid JSON that isn't an object (a list, a number) would break
    # the ``payload.get(...)`` calls in the manager.
    if not isinstance(data, dict):
        raise ValueError("License blob must decode to a JSON object.")
    return data
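
The sign / verify / blob round trip in crypto.py can be illustrated with a minimal standalone sketch. It does not import the module: `SECRET`, `make_blob`, `check_blob`, and `tamper` are hypothetical names chosen for illustration, and the secret is a placeholder rather than the project's real key handling.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"example-secret"  # placeholder; not the project's real secret
PREFIX = "DTLIC1:"


def canonical(payload: dict) -> bytes:
    # Sorted keys + compact separators: one byte string per logical dict.
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(payload: dict) -> str:
    return hmac.new(SECRET, canonical(payload), hashlib.sha256).hexdigest()


def make_blob(payload: dict) -> str:
    # Sign the payload, then embed the signature alongside the fields.
    signed = dict(payload, signature=sign(payload))
    return PREFIX + base64.urlsafe_b64encode(canonical(signed)).decode("ascii")


def check_blob(blob: str) -> bool:
    # Decode, split off the signature, and recompute the HMAC over the rest.
    data = json.loads(base64.urlsafe_b64decode(blob[len(PREFIX):]))
    signature = data.pop("signature", "")
    return hmac.compare_digest(sign(data), signature)


def tamper(blob: str) -> str:
    # Change one field but keep the old signature, as an attacker might.
    data = json.loads(base64.urlsafe_b64decode(blob[len(PREFIX):]))
    data["tier"] = "enterprise"
    return PREFIX + base64.urlsafe_b64encode(canonical(data)).decode("ascii")


payload = {"name": "Jane Doe", "email": "jane@example.com", "tier": "core"}
blob = make_blob(payload)
print(check_blob(blob), check_blob(tamper(blob)))
```

Because the HMAC input is the canonical JSON, key order in the dict does not affect the signature, while any change to a field value does.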

47
src/license/errors.py Normal file

@@ -0,0 +1,47 @@
"""Structured error hierarchy for the license layer.

Mirrors the ``src.core.errors`` pattern: every subclass extends a
stdlib base so existing ``except OSError`` / ``except ValueError``
handlers keep working. The UI / CLI branches on the subclass to render
the right next step (activate, renew, contact support).
"""

from __future__ import annotations


class LicenseError(ValueError):
    """Base class for every licensing failure. Subclass-only: callers
    should catch the specific failure mode they handle."""


class NotActivatedError(LicenseError):
    """No license file present, or file present but signature missing.

    Recovery: open the activation page (GUI) or run
    ``datatools-license activate <blob>`` (CLI).
    """


class InvalidLicenseError(LicenseError):
    """The license file is present but failed verification.

    Common causes: tampered signature, blob from a different build
    (different secret), corrupted JSON. Recovery: re-paste the blob
    from the original delivery email or contact support.
    """


class ExpiredLicenseError(LicenseError):
    """The license is structurally valid but past its expiration date.

    Recovery: renew via ``datatools-license renew <blob>`` or paste a
    new blob into the activation page.
    """


class UnsupportedFeatureError(LicenseError):
    """The active license's tier doesn't unlock a requested feature.

    Raised by :func:`require_feature` when, e.g., a TRIAL user tries
    to access an ENTERPRISE-only tool. Recovery: upgrade tier.
    """

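The error hierarchy in errors.py is designed so a UI or CLI can branch on the subclass. A minimal sketch of that branching follows; the classes are re-declared locally so the snippet runs on its own, and `next_step` with its string labels is a hypothetical helper, not something the commit defines.

```python
class LicenseError(ValueError):
    """Local stand-in for the base class in errors.py."""


class NotActivatedError(LicenseError):
    pass


class InvalidLicenseError(LicenseError):
    pass


class ExpiredLicenseError(LicenseError):
    pass


class UnsupportedFeatureError(LicenseError):
    pass


def next_step(err: LicenseError) -> str:
    """Map a failure to a recovery action (labels are illustrative)."""
    if isinstance(err, NotActivatedError):
        return "activate"
    if isinstance(err, ExpiredLicenseError):
        return "renew"
    if isinstance(err, UnsupportedFeatureError):
        return "upgrade"
    # InvalidLicenseError and anything unexpected fall through here.
    return "contact_support"


def caught_by_legacy_handler(err: Exception) -> bool:
    # Older call sites that only know ``except ValueError`` still work.
    try:
        raise err
    except ValueError:
        return True


print(next_step(ExpiredLicenseError("expired")))
```

Extending `ValueError` is what keeps pre-existing broad handlers working while newer code matches the specific subclass.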
49
src/license/features.py Normal file

@@ -0,0 +1,49 @@
"""Tier → feature mapping.

A tier unlocks every feature listed for it. Adding a new SKU means
adding a new row here and (if the SKU introduces new functionality)
adding feature flags to :class:`~src.license.schema.FeatureFlag`. No
consumer code changes.

The v1 product ships only :data:`Tier.CORE`, which unlocks every tool.
TRIAL exists so a buyer can register without a paid key and still get
a 1-year working license; the difference between TRIAL and CORE is
semantic (and the basis for showing "TRIAL" in the sidebar), not
functional.

PRO and ENTERPRISE are scaffolded for future SKUs. They currently
unlock the same feature set as CORE so the architecture is exercised
by tests without committing to a particular pricing structure.
"""

from __future__ import annotations

from typing import FrozenSet

from .schema import FeatureFlag, Tier


def _all() -> FrozenSet[FeatureFlag]:
    """Every feature flag; used as the default for the v1 SKU."""
    return frozenset(FeatureFlag)


FEATURES_BY_TIER: dict[Tier, FrozenSet[FeatureFlag]] = {
    Tier.TRIAL: _all(),
    Tier.CORE: _all(),
    # Pre-wired for future SKUs. Today they mirror CORE so the gating
    # tests exercise the lookup path without making a marketing claim.
    Tier.PRO: _all(),
    Tier.ENTERPRISE: _all(),
}


def all_features_for_tier(tier: Tier) -> tuple[str, ...]:
    """Return the canonical, sorted tuple of feature ids for *tier*.

    Used by the license generator to fill the ``features`` field on a
    new license, and by the manager when it mints a license or
    synthesizes the dev-mode license.
    """
    flags = FEATURES_BY_TIER[tier]
    return tuple(sorted(f.value for f in flags))
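
The module docstring says a future SKU is added as a new row in the mapping. A hedged sketch of what a differentiated mapping might look like is below. The split of flags between tiers is invented for illustration (v1 gives every tier every flag), and the enums are trimmed local copies so the snippet is self-contained.

```python
from enum import Enum


class Tier(str, Enum):
    TRIAL = "trial"
    CORE = "core"
    PRO = "pro"


class FeatureFlag(str, Enum):
    DEDUPLICATOR = "01_deduplicator"
    TEXT_CLEANER = "02_text_cleaner"
    PIPELINE_RUNNER = "09_pipeline_runner"


# Hypothetical split: PRO adds the pipeline runner on top of CORE.
CORE_FLAGS = frozenset({FeatureFlag.DEDUPLICATOR, FeatureFlag.TEXT_CLEANER})

FEATURES_BY_TIER = {
    Tier.TRIAL: CORE_FLAGS,
    Tier.CORE: CORE_FLAGS,
    Tier.PRO: CORE_FLAGS | {FeatureFlag.PIPELINE_RUNNER},
}


def all_features_for_tier(tier: Tier) -> tuple:
    # Sorted so the tuple is stable across runs and safe to sign.
    return tuple(sorted(f.value for f in FEATURES_BY_TIER[tier]))


print(all_features_for_tier(Tier.PRO))
```

Consumers keep calling the same lookup; only the table changes when a tier gains or loses a flag.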

470
src/license/manager.py Normal file

@@ -0,0 +1,470 @@
"""LicenseManager: the public face of the license layer.

Singleton-by-default (``get_manager()`` returns a process-wide
instance), but tests can construct standalone managers via the
constructor for full isolation.

Lifecycle::

    mgr = get_manager()
    if not mgr.is_activated():
        # ``name`` and ``email`` are keyword-only.
        mgr.activate_from_blob(blob, name=name, email=email)
    mgr.require_feature(FeatureFlag.DEDUPLICATOR)
    state = mgr.current_state()  # snapshot for the sidebar / CLI status
"""

from __future__ import annotations

import os
import re
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional

from . import crypto, storage
from .errors import (
    ExpiredLicenseError,
    InvalidLicenseError,
    LicenseError,
    NotActivatedError,
    UnsupportedFeatureError,
)
from .features import all_features_for_tier
from .schema import FeatureFlag, License, Tier, default_expiry_iso, _utcnow_iso


# ---------------------------------------------------------------------------
# State snapshot
# ---------------------------------------------------------------------------


@dataclass(frozen=True)
class LicenseState:
    """A read-only snapshot for status widgets / CLI ``status`` JSON.

    Always safe to render: even when no license is activated the
    dataclass is populated with explanatory defaults so the GUI never
    needs to None-check before formatting.
    """

    activated: bool
    valid: bool  # activated AND not expired AND signature OK
    name: str
    email: str
    tier: str
    license_key: str
    issued_at: str
    expires_at: str
    days_remaining: int
    features: tuple[str, ...]
    error_kind: str  # "", "not_activated", "expired", "invalid"
    error_message: str

    def as_dict(self) -> dict:
        from dataclasses import asdict

        d = asdict(self)
        d["features"] = list(self.features)
        return d


_EMPTY_STATE = LicenseState(
    activated=False, valid=False, name="", email="", tier="",
    license_key="", issued_at="", expires_at="", days_remaining=0,
    features=(),
    error_kind="not_activated",
    error_message="No license activated.",
)


# ---------------------------------------------------------------------------
# Manager
# ---------------------------------------------------------------------------

_EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


class LicenseManager:
    """Read/write license state. Cheap to construct; the singleton at
    module level just avoids reload churn.

    Storage path defaults to :func:`storage.default_license_path`;
    pass ``path=`` to override for tests.
    """

    def __init__(self, *, path: Optional[Path] = None) -> None:
        self._path = path
        self._cached: Optional[License] = None

    # --- Dev bypass ---------------------------------------------------------

    @property
    def dev_mode(self) -> bool:
        """``DATATOOLS_DEV_MODE=1`` short-circuits every check.

        Deliberately not cached: the env var is re-read on each access
        so a test that sets it after construction still picks it up.
        """
        return _truthy_env("DATATOOLS_DEV_MODE")

    # --- Load / save --------------------------------------------------------

    def load(self) -> Optional[License]:
        """Read + verify the on-disk license. Returns ``None`` when no
        file exists. Raises :class:`InvalidLicenseError` on a corrupt
        or malformed file and on signature mismatch / tampering."""
        try:
            raw = storage.read_raw(self._path)
        except ValueError as e:
            # Corrupt JSON on disk: surface it as a typed license error.
            raise InvalidLicenseError(str(e)) from e
        if raw is None:
            self._cached = None
            return None
        try:
            lic = License.from_dict(raw)
        except (KeyError, TypeError, ValueError) as e:
            # Missing required field, unknown tier, or non-object JSON.
            raise InvalidLicenseError(f"License file is malformed: {e!r}") from e
        # Verify signature against the canonical payload.
        if not crypto.verify(lic.to_canonical_dict(), lic.signature):
            raise InvalidLicenseError(
                "License signature does not verify. The file may have "
                "been tampered with, or it was issued by a different "
                "build. Re-paste the original license blob to recover."
            )
        self._cached = lic
        return lic

    def save(self, lic: License) -> Path:
        """Persist *lic* to the configured path. Caller is responsible
        for having signed the license already; this function does
        NOT re-sign."""
        path = storage.write_raw(lic.to_dict(), self._path)
        self._cached = lic
        return path

    def deactivate(self) -> bool:
        """Remove the on-disk license. Returns whether a file was
        removed (False if nothing was active)."""
        self._cached = None
        return storage.remove(self._path)

    # --- Activation ---------------------------------------------------------

    def activate_from_blob(
        self,
        blob: str,
        *,
        name: str,
        email: str,
    ) -> License:
        """Verify *blob* and write the activated license to disk.

        The buyer pastes the blob; the page collects their *name* and
        *email* separately. We require both registered values to
        match the values embedded in the signed blob, which defends
        against blob-sharing between buyers.
        """
        _validate_registration(name, email)
        try:
            payload = crypto.decode_blob(blob)
        except ValueError as e:
            raise InvalidLicenseError(str(e)) from e
        signature = payload.get("signature", "")
        if not signature:
            raise InvalidLicenseError(
                "License blob is missing the ``signature`` field. "
                "The blob may have been truncated when pasted."
            )
        canonical = {k: v for k, v in payload.items() if k != "signature"}
        if not crypto.verify(canonical, signature):
            raise InvalidLicenseError(
                "License blob signature did not verify. The blob may "
                "be corrupt, intended for a different product build, "
                "or modified after issue."
            )
        # Reconstruct the License dataclass after verification so the
        # canonical dict we hashed matches the on-disk JSON.
        lic = License.from_dict(payload)
        # Personal-name and email matching is a soft attestation. We
        # enforce case-insensitive equality after stripping whitespace,
        # so "  jane@Example.com " matches the embedded canonical
        # form without surprising the user about case.
        if name.strip().casefold() != lic.name.casefold() or (
            email.strip().casefold() != lic.email.casefold()
        ):
            raise InvalidLicenseError(
                "Registered name / email do not match the values "
                "embedded in the license blob. Contact support if you "
                "believe this is in error."
            )
        if lic.is_expired():
            raise ExpiredLicenseError(
                f"License expired on {lic.expires_at}. "
                "Paste a renewal blob to extend access."
            )
        self.save(lic)
        return lic

    def issue_trial(self, *, name: str, email: str, years: int = 1) -> License:
        """Self-sign a 1-year trial license. The seller's
        ``scripts/generate_license.py`` produces these for buyers; the
        same code path is reused at activation time as a fallback
        when a buyer wants to evaluate without a key.

        Trial licenses are functionally identical to CORE in v1; only
        the tier label differs (so the sidebar can say "TRIAL" if we
        ever want to nudge a conversion).
        """
        _validate_registration(name, email)
        return self._mint(name=name, email=email, tier=Tier.TRIAL, years=years)

    def renew(self, blob: str) -> License:
        """Renew an existing license using a fresh blob.

        Verification: the blob must verify, its name+email must match
        the currently-active license, and its expiry must be in the
        future. We allow tier changes during renewal (upgrade path).
        """
        current = self._cached or self.load()
        if current is None:
            raise NotActivatedError(
                "No active license to renew. Use ``activate`` instead "
                "of ``renew`` for first-time setup."
            )
        try:
            payload = crypto.decode_blob(blob)
        except ValueError as e:
            raise InvalidLicenseError(str(e)) from e
        signature = payload.get("signature", "")
        canonical = {k: v for k, v in payload.items() if k != "signature"}
        if not crypto.verify(canonical, signature):
            raise InvalidLicenseError("Renewal blob signature did not verify.")
        lic = License.from_dict(payload)
        if (
            lic.name.casefold() != current.name.casefold()
            or lic.email.casefold() != current.email.casefold()
        ):
            raise InvalidLicenseError(
                "Renewal blob is for a different name/email than the "
                "currently-active license."
            )
        if lic.is_expired():
            raise ExpiredLicenseError(
                "Renewal blob is itself expired. Generate a new one."
            )
        self.save(lic)
        return lic
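
    # --- Inspection ---------------------------------------------------------

    def is_activated(self) -> bool:
        if self._cached is not None:
            return True
        try:
            return storage.read_raw(self._path) is not None
        except ValueError:
            # A corrupt file still counts as "activated" (but invalid),
            # matching the ``error_kind="invalid"`` state.
            return True

    def is_valid(self) -> bool:
        if self.dev_mode:
            return True
        try:
            lic = self._cached or self.load()
        except (LicenseError, KeyError, TypeError, ValueError):
            # Any failure to read or parse the file means "not valid".
            return False
        if lic is None:
            return False
        return not lic.is_expired()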

    def current_state(self) -> LicenseState:
        if self.dev_mode:
            return LicenseState(
                activated=True, valid=True,
                name="dev", email="dev@local",
                tier=Tier.ENTERPRISE.value,
                license_key="DEV-BYPASS",
                issued_at=_utcnow_iso(),
                expires_at=default_expiry_iso(years=99),
                days_remaining=36500,
                features=all_features_for_tier(Tier.ENTERPRISE),
                error_kind="",
                error_message="",
            )
        try:
            lic = self._cached or self.load()
        except InvalidLicenseError as e:
            return LicenseState(
                activated=True, valid=False,
                name="", email="", tier="", license_key="",
                issued_at="", expires_at="", days_remaining=0,
                features=(),
                error_kind="invalid",
                error_message=str(e),
            )
        if lic is None:
            return _EMPTY_STATE
        if lic.is_expired():
            return LicenseState(
                activated=True, valid=False,
                name=lic.name, email=lic.email, tier=lic.tier.value,
                license_key=lic.license_key,
                issued_at=lic.issued_at, expires_at=lic.expires_at,
                days_remaining=lic.days_remaining(),
                features=lic.features,
                error_kind="expired",
                error_message=(
                    f"License expired on {lic.expires_at}. "
                    "Paste a renewal blob to extend access."
                ),
            )
        return LicenseState(
            activated=True, valid=True,
            name=lic.name, email=lic.email, tier=lic.tier.value,
            license_key=lic.license_key,
            issued_at=lic.issued_at, expires_at=lic.expires_at,
            days_remaining=max(lic.days_remaining(), 0),
            features=lic.features,
            error_kind="",
            error_message="",
        )

    def require_feature(self, feature: str | FeatureFlag) -> License:
        """Raise the right error if *feature* isn't accessible.

        Returns the active :class:`License` on success so callers can
        log the tier / days-remaining alongside their own work.
        """
        if self.dev_mode:
            # Synthesize a dev license so callers expecting a return
            # value don't blow up. The dev license unlocks every flag.
            return License(
                name="dev", email="dev@local",
                license_key="DEV-BYPASS",
                tier=Tier.ENTERPRISE,
                features=all_features_for_tier(Tier.ENTERPRISE),
                issued_at=_utcnow_iso(),
                expires_at=default_expiry_iso(years=99),
                signature="",
            )
        # ``InvalidLicenseError`` from ``load`` propagates to the caller.
        lic = self._cached or self.load()
        if lic is None:
            raise NotActivatedError(
                "DataTools is not activated. Run "
                "``datatools-license activate <blob>`` or use the "
                "Activate page in the GUI."
            )
        if lic.is_expired():
            raise ExpiredLicenseError(
                f"License expired on {lic.expires_at}. "
                "Renew before continuing."
            )
        if not lic.has_feature(feature):
            tier_name = lic.tier.value if isinstance(lic.tier, Tier) else lic.tier
            raise UnsupportedFeatureError(
                f"Feature {feature!r} is not enabled on the active "
                f"{tier_name!r} license."
            )
        return lic

    # --- Internals ---------------------------------------------------------

    def _mint(
        self,
        *,
        name: str,
        email: str,
        tier: Tier,
        years: int = 1,
        license_key: Optional[str] = None,
    ) -> License:
        """Self-sign a new license. Used by ``issue_trial`` and by
        the seller-side key generation utility (which calls the
        same code via the bare manager)."""
        now = _utcnow_iso()
        exp = default_expiry_iso(years=years)
        features = all_features_for_tier(tier)
        key = license_key or _generate_license_key(tier)
        unsigned = License(
            name=name, email=email, license_key=key, tier=tier,
            features=features, issued_at=now, expires_at=exp,
            signature="",
        )
        sig = crypto.sign(unsigned.to_canonical_dict())
        signed = License(
            name=unsigned.name, email=unsigned.email,
            license_key=unsigned.license_key, tier=unsigned.tier,
            features=unsigned.features, issued_at=unsigned.issued_at,
            expires_at=unsigned.expires_at, signature=sig,
        )
        self.save(signed)
        return signed


def _generate_license_key(tier: Tier) -> str:
    """Human-readable but unguessable key id.

    Format: ``DT1-{TIER}-{8 hex}-{8 hex}``. The two hex blocks are the
    first 16 hex digits of a single UUID4. One of those digits is the
    fixed version nibble (``4``), so the key carries 60 bits of
    entropy rather than 64. Not used as the cryptographic identity
    (that's the signature), but it's a stable handle for support
    emails.
    """
    rid = uuid.uuid4().hex
    return f"DT1-{tier.value.upper()}-{rid[:8]}-{rid[8:16]}"


def _validate_registration(name: str, email: str) -> None:
    """Reject obviously-bad inputs before touching crypto.

    The activation page should call this too so the error surfaces
    immediately instead of from inside the verifier.
    """
    if not name or not name.strip():
        raise InvalidLicenseError("Name is required for registration.")
    if not email or not _EMAIL_RE.match(email.strip()):
        raise InvalidLicenseError(
            f"{email!r} is not a valid email address. "
            "Expected: ``local@domain.tld``."
        )


def _truthy_env(name: str) -> bool:
    v = os.environ.get(name, "")
    return v.strip().lower() in {"1", "true", "yes", "on"}


# ---------------------------------------------------------------------------
# Singleton + module-level convenience
# ---------------------------------------------------------------------------

_singleton: Optional[LicenseManager] = None


def get_manager() -> LicenseManager:
    """Return the process-wide :class:`LicenseManager`.

    Re-uses the same instance across imports so the GUI's sidebar,
    the chrome gate, and the CLI guard share one cached license read.
    Tests that need isolation should construct their own manager
    instead.
    """
    global _singleton
    if _singleton is None:
        _singleton = LicenseManager()
    return _singleton


def reset_singleton_for_tests() -> None:
    """Drop the cached singleton. Used by the test fixture so each
    test session starts with a fresh manager pointed at its tmp
    license path."""
    global _singleton
    _singleton = None


def current_state() -> LicenseState:
    return get_manager().current_state()


def require_feature(feature: str | FeatureFlag) -> License:
    return get_manager().require_feature(feature)
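
The key-id format produced by `_generate_license_key` can be checked with a small standalone sketch. `generate_license_key` below is a local re-implementation that takes the tier as a plain string, an assumption made so the snippet needs no imports from the package.

```python
import re
import uuid


def generate_license_key(tier: str) -> str:
    # First 16 hex digits of a UUID4, split into two 8-digit blocks.
    rid = uuid.uuid4().hex
    return f"DT1-{tier.upper()}-{rid[:8]}-{rid[8:16]}"


KEY_RE = re.compile(r"^DT1-[A-Z]+-[0-9a-f]{8}-[0-9a-f]{8}$")

keys = [generate_license_key("core") for _ in range(200)]

# In UUID4 hex form, index 12 is the version nibble, which is always "4".
# It lands at position 4 of the second block, so 15 of the 16 hex digits
# are random: 15 * 4 = 60 bits.
version_digits = {key.split("-")[3][4] for key in keys}
print(keys[0], version_digits)
```

That is ample for a support handle, since the signature, not the key id, is what proves authenticity.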

181
src/license/schema.py Normal file

@@ -0,0 +1,181 @@
"""License schema: dataclasses + enums.

Wire format (the contents of ``~/.datatools/license.json`` AND the
base64-decoded activation blob)::

    {
      "name": "Jane Doe",
      "email": "jane@example.com",
      "license_key": "DT1-CORE-1a2b3c4d-5e6f4a8b",
      "tier": "core",
      "features": ["01_deduplicator", "02_text_cleaner", ...],
      "issued_at": "2026-05-13T00:00:00Z",
      "expires_at": "2027-05-13T00:00:00Z",
      "signature": "<hex hmac-sha256>"
    }

The signature is the HMAC over the canonical JSON of every field
*except* ``signature`` itself (see :mod:`.crypto`). Keeping the schema
strictly additive means future builds can verify older licenses as
long as they ship the same secret.
"""

from __future__ import annotations

from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Any


class Tier(str, Enum):
    """License tier. Drives the feature set the active license unlocks.

    Intended order: TRIAL < CORE < PRO < ENTERPRISE. A higher tier is
    meant to include every feature of the lower tiers. That ordering
    is expressed by :data:`.features.FEATURES_BY_TIER`, not by
    comparing enum members (which would compare their string values).
    """

    TRIAL = "trial"
    CORE = "core"
    PRO = "pro"
    ENTERPRISE = "enterprise"


class FeatureFlag(str, Enum):
    """Stable feature identifiers. Match the ``tool_id`` field in
    :mod:`src.gui.tools_registry` so the GUI's per-tool gating can
    share the same string keys.

    Future SKUs ship by adding new flags here and adding them to a
    new tier in ``FEATURES_BY_TIER``; no consumer code changes.
    """

    DEDUPLICATOR = "01_deduplicator"
    TEXT_CLEANER = "02_text_cleaner"
    FORMAT_STANDARDIZER = "03_format_standardizer"
    MISSING_HANDLER = "04_missing_handler"
    COLUMN_MAPPER = "05_column_mapper"
    OUTLIER_DETECTOR = "06_outlier_detector"
    MULTI_FILE_MERGER = "07_multi_file_merger"
    VALIDATOR_REPORTER = "08_validator_reporter"
    PIPELINE_RUNNER = "09_pipeline_runner"


def _utcnow_iso() -> str:
    """Return current UTC time in ISO-8601 with explicit ``Z`` suffix.

    ``datetime.utcnow`` is deprecated as of CPython 3.12; using a
    tz-aware UTC datetime and formatting it with a literal ``Z``
    (instead of the ``+00:00`` offset) keeps the serialized form short
    and human-readable.
    """
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def _parse_iso(s: str) -> datetime:
    """Parse one of our ISO strings into a tz-aware datetime."""
    # Accept both ``...Z`` and ``...+00:00`` so future format tweaks
    # don't break old files.
    if s.endswith("Z"):
        s = s[:-1] + "+00:00"
    dt = datetime.fromisoformat(s)
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt


@dataclass(frozen=True)
class License:
    """One activated license. Immutable: renew/upgrade produces a new
    instance, never mutates an existing one."""

    name: str
    email: str
    license_key: str
    tier: Tier
    features: tuple[str, ...]
    issued_at: str  # ISO-8601 UTC
    expires_at: str  # ISO-8601 UTC
    signature: str = ""  # populated by ``crypto.sign``

    # --- Convenience accessors ------------------------------------------------

    @property
    def issued_dt(self) -> datetime:
        return _parse_iso(self.issued_at)

    @property
    def expires_dt(self) -> datetime:
        return _parse_iso(self.expires_at)

    def is_expired(self, *, now: datetime | None = None) -> bool:
        ref = now or datetime.now(timezone.utc)
        return ref >= self.expires_dt

    def days_remaining(self, *, now: datetime | None = None) -> int:
        ref = now or datetime.now(timezone.utc)
        delta = self.expires_dt - ref
        # ``timedelta.days`` floors toward negative infinity, so a
        # license that expired one second ago reports -1. The display
        # path clamps with ``max(..., 0)``; tests use the raw value.
        # Callers wanting "expired by N days" should check
        # ``is_expired`` first.
        return delta.days

    def has_feature(self, feature: str | FeatureFlag) -> bool:
        key = feature.value if isinstance(feature, FeatureFlag) else feature
        return key in self.features

    # --- Serialization --------------------------------------------------------

    def to_canonical_dict(self) -> dict[str, Any]:
        """Return the JSON-canonical dict the HMAC is computed over.

        Excludes ``signature`` so signing and verifying both agree on
        the message bytes.
        """
        d = asdict(self)
        d.pop("signature", None)
        d["tier"] = self.tier.value if isinstance(self.tier, Tier) else self.tier
        d["features"] = list(self.features)
        return d

    def to_dict(self) -> dict[str, Any]:
        """Return the on-disk dict, signature included."""
        d = self.to_canonical_dict()
        d["signature"] = self.signature
        return d

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "License":
        """Inverse of :meth:`to_dict`. Tolerant of missing optional
        fields (defaults), strict on required ones (raises KeyError).
        """
        tier_raw = data["tier"]
        tier = tier_raw if isinstance(tier_raw, Tier) else Tier(tier_raw)
        return cls(
            name=str(data["name"]),
            email=str(data["email"]),
            license_key=str(data["license_key"]),
            tier=tier,
            features=tuple(data.get("features", ())),
            issued_at=str(data["issued_at"]),
            expires_at=str(data["expires_at"]),
            signature=str(data.get("signature", "")),
        )


# Public helper exposed for the activation flow (1-year default).
def default_expiry_iso(years: int = 1, *, now: datetime | None = None) -> str:
    """Return an ISO timestamp *years* from *now* (default: current UTC)."""
    ref = now or datetime.now(timezone.utc)
    # ``replace(year=...)`` raises ``ValueError`` only when ``ref`` is
    # Feb 29 and the target year is not a leap year; the ``except``
    # branch below handles that case.
    try:
        target = ref.replace(year=ref.year + years)
    except ValueError:
        # Feb 29 + N years where target year isn't a leap year: slide
        # to Feb 28. Acceptable; the buyer is one day short of an
        # exact year boundary on a date they almost certainly didn't
        # pick on purpose.
        target = ref.replace(year=ref.year + years, day=28)
    return target.strftime("%Y-%m-%dT%H:%M:%SZ")
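
Two date edge cases in schema.py are easy to verify in isolation: the Feb 29 fallback in the expiry helper, and the way `timedelta.days` behaves just after expiry. The sketch below re-implements the helper as `expiry_iso` (a hypothetical local name) with an explicit `now` argument so the result is deterministic.

```python
from datetime import datetime, timedelta, timezone


def expiry_iso(now: datetime, years: int = 1) -> str:
    try:
        target = now.replace(year=now.year + years)
    except ValueError:
        # Only reachable for Feb 29 when the target year is not a leap year.
        target = now.replace(year=now.year + years, day=28)
    return target.strftime("%Y-%m-%dT%H:%M:%SZ")


# 2028 is a leap year, 2029 is not, so the expiry slides to Feb 28.
leap_issue = datetime(2028, 2, 29, 12, 0, 0, tzinfo=timezone.utc)
leap_expiry = expiry_iso(leap_issue)

# An ordinary date keeps its month and day.
plain_expiry = expiry_iso(datetime(2026, 5, 13, tzinfo=timezone.utc))

# One second past expiry: timedelta normalizes to -1 day + 86399 seconds.
expires = datetime(2027, 5, 13, tzinfo=timezone.utc)
just_late = expires + timedelta(seconds=1)
days_after_expiry = (expires - just_late).days

print(leap_expiry, plain_expiry, days_after_expiry)
```

The `-1` result is why the display path clamps with `max(..., 0)` instead of showing the raw value.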

76
src/license/storage.py Normal file

@@ -0,0 +1,76 @@
"""Where the activated license lives on disk.

Default path: ``~/.datatools/license.json``. Overridable via
``$DATATOOLS_LICENSE_PATH`` for tests (the conftest fixture uses this
to point each test session at a tmp file).

The directory is created lazily on first write; we don't want to
create the user's config dir just for reading.
"""

from __future__ import annotations

import json
import os
from pathlib import Path
from typing import Any, Optional


def default_license_path() -> Path:
    """The resolved license file path for the current process.

    Order of resolution:

    1. ``$DATATOOLS_LICENSE_PATH`` (absolute path; used by tests).
    2. ``~/.datatools/license.json`` (everyone else).
    """
    override = os.environ.get("DATATOOLS_LICENSE_PATH")
    if override:
        return Path(override).expanduser().resolve()
    return Path.home() / ".datatools" / "license.json"


def read_raw(path: Optional[Path] = None) -> Optional[dict[str, Any]]:
    """Return the on-disk license dict, or ``None`` if no file exists.

    Anything else (truncated file, invalid JSON) raises ``ValueError``
    so the caller surfaces it as :class:`InvalidLicenseError`. We
    don't try to recover from a corrupt license file; a user that
    sees "invalid license" can paste their blob again.
    """
    p = path or default_license_path()
    if not p.exists():
        return None
    try:
        return json.loads(p.read_text(encoding="utf-8"))
    except (json.JSONDecodeError, UnicodeDecodeError) as e:
        raise ValueError(f"License file at {p} is corrupted: {e}") from e


def write_raw(data: dict[str, Any], path: Optional[Path] = None) -> Path:
    """Atomically write *data* to the license path.

    Atomic = write-to-temp-then-rename, so a crashed write doesn't
    leave a half-written license file that would fail verification on
    the next launch.
    """
    p = path or default_license_path()
    p.parent.mkdir(parents=True, exist_ok=True)
    tmp = p.with_suffix(p.suffix + ".tmp")
    tmp.write_text(
        json.dumps(data, indent=2, sort_keys=True), encoding="utf-8",
    )
    tmp.replace(p)
    return p


def remove(path: Optional[Path] = None) -> bool:
    """Delete the license file. Returns ``True`` if a file was
    removed, ``False`` if nothing was there. Used by the
    ``datatools-license deactivate`` command and by test cleanup."""
    p = path or default_license_path()
    try:
        p.unlink()
        return True
    except FileNotFoundError:
        return False
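
The write-to-temp-then-rename pattern used by `write_raw` can be exercised against a throwaway directory. This is a minimal sketch: `write_atomic` is a local stand-in with the path made a required argument, and the payload is an arbitrary example dict.

```python
import json
import tempfile
from pathlib import Path


def write_atomic(data: dict, path: Path) -> Path:
    # Create the parent lazily, write a sibling temp file, then swap it in.
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.with_suffix(path.suffix + ".tmp")
    tmp.write_text(json.dumps(data, indent=2, sort_keys=True), encoding="utf-8")
    # Path.replace overwrites the destination if it already exists.
    tmp.replace(path)
    return path


with tempfile.TemporaryDirectory() as workdir:
    target = Path(workdir) / "nested" / "license.json"
    write_atomic({"tier": "trial"}, target)
    write_atomic({"tier": "core"}, target)  # second write replaces the first
    loaded = json.loads(target.read_text(encoding="utf-8"))
    leftovers = sorted(p.name for p in target.parent.iterdir())

print(loaded, leftovers)
```

Keeping the temp file in the same directory as the target matters, because a rename is only atomic within a single filesystem.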