datatools-dev

Author	SHA1	Message	Date
Michael	673b902377	feat(license): datatools-admin CLI for the mint API New operator CLI at src/admin_cli.py: mint, list, revoke, ping — talks to the server's /internal/* endpoints over a local SSH tunnel. Stdlib-only on the desktop side (urllib + typer), no new top-level deps. Auth via $DATATOOLS_ADMIN_TOKEN. scripts/generate_license.py is now annotated as a break-glass tool for when the server is unreachable — routine work goes through the new CLI so the authoritative `licenses` row is created. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-14 00:47:01 +00:00
Michael	23c51fd759	feat(license): local issuance log for minted blobs generate_license.py now appends every minted license to ~/.datatools-creator/issued.jsonl (overridable via env). This is the creator-side system of record until the server-side flow lands. The full blob is stored alongside name/email/tier/expiry so buyers who lose their delivery email can be re-served without re-minting. File is created mode 600 and lives outside the buyer-facing ~/.datatools/ dir so it never gets bundled into a shipped install. Log failures are non-fatal (warning to stderr) — the mint already succeeded by the time we try to log, and forcing a re-mint after a log error would invalidate any device the buyer had activated. Pass --no-log for test mints. ADMIN.md adds a "Customer record-keeping" section with the path, schema, jq one-liners, and migration note pointing at the upcoming LICENSE-SERVER.md design doc. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-13 22:25:19 +00:00
Michael	e534fb4989	sec(license): Ed25519 sigs + production-safe tripwire Two coupled hardening upgrades. 1. Asymmetric signatures (HMAC → Ed25519) The previous HMAC scheme used a symmetric secret that any motivated reverse engineer could pull out of the shipped binary and use to mint blobs for any tier / name / email. With Ed25519, the binary ships only the public verification key; the signing key never leaves the seller's environment, so binary compromise no longer yields forgery. - src/license/crypto.py rewritten around cryptography.hazmat.primitives.asymmetric.ed25519. Same public API surface (sign/verify/encode_blob/decode_blob), same canonical JSON encoding — drop-in for the manager / cli / GUI layers. - DATATOOLS_LICENSE_PRIVKEY (seller-side) and DATATOOLS_LICENSE_PUBKEY (build-time) env vars supply the keys; the in-source dev keypair (src/license/_dev_keypair.py) deterministically derives from a seed phrase for repro builds and tests. - Blob prefix bumped DTLIC1: → DTLIC2:. Decoding a DTLIC1 blob surfaces a clear "old format" error rather than a confusing signature mismatch. - scripts/generate_keypair.py mints fresh production keypairs for the seller (run once, stash the private key offline). Adds cryptography>=41,<46 to requirements.txt (was an undeclared transitive dep). 2. Production-safe tripwire assert_production_safe() refuses to boot a frozen / shipped build when either: - DATATOOLS_DEV_MODE=1 is set (would unconditionally bypass every license check — fine in source/test but catastrophic in a buyer install). - The active verification key is still the embedded dev key (the build pipeline forgot to set DATATOOLS_LICENSE_PUBKEY). No-op in source / pytest runs (sys.frozen is unset) so test fixtures and dev workflows keep working without ceremony. Called from src/cli_license_guard.guard() and from hide_streamlit_chrome — so it fires on every CLI invocation and every GUI page load. 
Tests: 49 license-layer unit tests (was 40); added Ed25519 wrong-key rejection, dev-keypair seed pin, blob v2 prefix, v1 rejection with clear message, and four production-safe scenarios (no-op in source, fires on DEV_MODE in frozen, fires on dev key in frozen, passes in frozen with prod pubkey). Total: 2024 → 2033. Docs (REQUIREMENTS §17a, DEVELOPER licensing recipe, DECISIONS §9b + decision log) updated with the new threat-model write-up, key-storage workflow, and tripwire behaviour. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-13 17:34:48 +00:00
Michael	e435103113	feat(license): registration + 1-year licenses + tier scaffolding A complete offline licensing layer (no internet at any step): Core - src/license/ — schema (License, Tier, FeatureFlag), HMAC crypto, JSON storage, LicenseManager singleton with activate/renew/deactivate/issue_trial. Tier-scaffolded so future SKUs can carve per-tool feature sets without consumer-code edits. - scripts/generate_license.py — creator-only key generator. Mints a DTLIC1: blob the buyer pastes into the activation page. GUI - New activation form component (src/gui/components/activation.py). - hide_streamlit_chrome() now inline-renders the activation form when no valid license is present (every page short-circuits to the form until activated). - Sidebar shows tier + days remaining; renewal warning under 30 days. - New pages/_Activate.py for revisiting the form after activation. CLI - src/license_cli.py — activate / renew / status / trial / deactivate commands. Exempt from the guard. - src/cli_license_guard.py — drop-in guard call added to every tool CLI's main(). Lets --help through; respects DATATOOLS_DEV_MODE. i18n - New activation.* and license.* keys in en.json + es.json (page title, form labels, status badges, renewal warnings, error messages). Pack parity test stays green. Test infrastructure - tests/conftest.py autouse fixture sets DATATOOLS_DEV_MODE=1 so the existing 1916 tests continue to pass. - isolated_license_path / activated_license_manager / unactivated_license_manager fixtures for tests that want to drive the real check. Tests (+79) - tests/test_license.py (40): schema, crypto roundtrip, blob encode/decode, tier→feature mapping, activation flow, name/email mismatch rejection, tamper detection, expiration, renewal, dev-mode bypass. - tests/test_license_cli.py (26): every license_cli command + subprocess tests confirming every tool CLI refuses to run without a license, --help always works, DEV_MODE bypasses. 
- tests/gui/test_activation.py (13): gate blocks without license, passes with trial, activation form submission unlocks the gate, sidebar status, renewal warning, i18n. Total: 1916 → 1995 tests. All pass under the strict warning filter. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-13 16:54:23 +00:00
Michael	ec56b1994b	chore: remove one-time 1.25GB stress harness The stress benchmark served its purpose — perf findings shipped in `438bc0f` (1 GB-class file efficiency for the analyzer + gate pipeline). Removing the script and the (already auto-deleted) test fixture so the repo doesn't carry one-time scaffolding. Future ad-hoc benchmarks can resurrect this from git history. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-29 21:15:58 +00:00
Michael	70ed695027	test(scripts): one-shot 1.25GB stress harness for the gate pipeline Generates a synthetic messy CSV at the target size, then runs every pipeline stage end-to-end (detect_encoding, repair_bytes, analyze, auto_fix on sample + full file) capturing wall-clock and peak RSS at each stage. Not part of the automated suite — invoke directly via ``python scripts/stress_1_25gb.py``. ``--keep`` to preserve the file between runs, ``--target-gb`` to tune the size. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-29 20:52:27 +00:00
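
Note on 23c51fd759 (local issuance log): the behaviour that commit describes (append-only JSONL, file created mode 600, log failures reported as a warning rather than raised) could look roughly like the sketch below. This is not the code in scripts/generate_license.py. The function name append_issuance, the env var name DATATOOLS_ISSUED_LOG, and the record field names are assumptions made for illustration; the commit only states that the path is "overridable via env" and that the blob is stored alongside name/email/tier/expiry.

```python
import json
import os
import sys
from pathlib import Path
from typing import Optional

# Assumed env var name; the commit message does not name the override variable.
LOG_PATH_ENV = "DATATOOLS_ISSUED_LOG"
DEFAULT_LOG_PATH = Path.home() / ".datatools-creator" / "issued.jsonl"


def append_issuance(record: dict, log_path: Optional[Path] = None) -> bool:
    """Append one license record as a JSON line.

    Returns True on success. On any OS-level failure it prints a warning
    to stderr and returns False, so a mint that already succeeded is not
    turned into an error by the logging step.
    """
    path = Path(log_path or os.environ.get(LOG_PATH_ENV, DEFAULT_LOG_PATH))
    try:
        path.parent.mkdir(parents=True, exist_ok=True)
        # O_CREAT with 0o600 makes a newly created file owner-only.
        fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
        with os.fdopen(fd, "a", encoding="utf-8") as fh:
            fh.write(json.dumps(record, sort_keys=True) + "\n")
        return True
    except OSError as exc:
        print(f"warning: could not log issued license: {exc}", file=sys.stderr)
        return False
```

A caller would run the mint first and log afterwards, for example append_issuance({"name": ..., "email": ..., "tier": ..., "expiry": ..., "blob": ...}), ignoring a False return beyond the warning it prints.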

6 Commits
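
Note on e534fb4989 (production-safe tripwire): the two refusal conditions listed in that commit can be sketched as below. This is an illustration under stated assumptions, not the project's implementation. The real assert_production_safe() is described as taking no arguments; the active_pubkey, environ, and frozen parameters here are added only so the sketch can be exercised without a frozen build. DEV_PUBKEY is a placeholder string (the actual dev key is derived in src/license/_dev_keypair.py), and the exception class name is hypothetical.

```python
import os
import sys

# Placeholder standing in for the embedded dev verification key.
DEV_PUBKEY = "dev-public-key-placeholder"


class UnsafeProductionBuild(RuntimeError):
    """Hypothetical error type raised when a shipped build is misconfigured."""


def assert_production_safe(active_pubkey, environ=None, frozen=None):
    """Refuse to continue in a frozen build that is not production-safe.

    environ and frozen default to the real process state; passing them
    explicitly is a testing convenience assumed for this sketch.
    """
    environ = os.environ if environ is None else environ
    if frozen is None:
        # Bundlers such as PyInstaller set sys.frozen; source runs do not.
        frozen = bool(getattr(sys, "frozen", False))
    if not frozen:
        return  # source / pytest runs: no-op
    if environ.get("DATATOOLS_DEV_MODE") == "1":
        raise UnsafeProductionBuild("DATATOOLS_DEV_MODE=1 set in a frozen build")
    if active_pubkey == DEV_PUBKEY:
        raise UnsafeProductionBuild("frozen build still uses the dev verification key")
```

The four cases named in the commit's test list map directly onto this: no-op in source, raises on DEV_MODE in frozen, raises on the dev key in frozen, and passes in frozen with a production public key.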