datatools-dev

Author	SHA1	Message	Date
Michael	00d3f28865	feat(pipeline): plain-English per-step result summaries Replaces the raw-JSON summary column in the Results table with the mockup's plain-English phrasing: "312 duplicates removed across 147 groups (18,442 → 18,130 rows)", "1,204 cells cleaned in name & city", etc. (correct singular/plural via a small _n helper). Adds step_phrase() and step_status() to pipeline_modules.py. step_status derives the status pill (✓ ok / ⚠ ok · N skipped / ✗ error / ⏭ skipped) and, for warn/error steps (e.g. format_standardize unparseable cells, column_map coercion failures / missing required targets), an inline detail callout rendered directly below the results table — surfacing non-fatal issues in context without a dedicated always-empty column. Extends tests/gui/test_pipeline_builder.py with phrasing + status assertions. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-22 18:21:17 +00:00
Michael	837f4b88b5	feat(pipeline): visual module-card builder for Automated Workflows Replaces the raw options_json data-editor table with a per-step "module card" builder matching the locked design mockup (layout-review/09_pipeline_runner.html): each step shows a friendly name + caption, an enable toggle, ▲/▼/✕ reorder/remove controls, and a Configure expander that renders that tool's own controls in plain language. Raw JSON is demoted to an Advanced import/export section. New src/gui/components/pipeline_modules.py holds the adapter-key→tool_id friendly-name bridge, one plain-language config renderer per tool (text_clean, format_standardize, missing, column_map, dedup — emitting the exact JSON option shapes the core adapters accept), and render_step_card. Steps live in session state as an ordered list with stable ids so widget keys survive reorder/remove. Reorder is ▲/▼ buttons (no JS drag dependency). The on-disk/CLI pipeline JSON format is unchanged — CLI and src/core untouched. Adds tests/gui/test_pipeline_builder.py (AppTest) covering seed, configure panels, toggle/add/remove, and a full run. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-22 18:16:09 +00:00
Michael	4955fb239b	test: cover help_md keys, header smoke, and bilingual ES smoke Two stale Spanish smoke assertions still expected English page titles for PDF Extractor and Reconciler — the i18n work landed real translations ("PDF a CSV", "Reconciliar dos archivos"), so refresh the expected substrings and the surrounding comment. Add new coverage for the help-popover feature: - TestHelpPopoverKeys (test_lang_packs): every tool_id resolves a non-empty tools.<id>.help_md in BOTH packs; help.button_label and help.missing_body resolve in both. - TestDescriptionCopy (test_tools_registry): every Tool.description non-empty and under 120 chars — pins the post-jargon-scrub copy so future drift back into multi-clause prose is loud. - TestRenderToolHeaderSmoke: render_tool_header is callable, listed in components.__all__, and every i18n key it touches resolves in both packs. Runs without a Streamlit script context. Suite: 2427 passed (+9 new), 91 skipped. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-02 18:07:19 +00:00
Michael	6627895a10	test: fix v3 branding drift, add reconcile CLI + registry coverage GUI/lang-pack tests were asserting against pre-v3 strings ("Data Cleaning Mastery", "Maestría en limpieza…") that the brand refresh replaced with "UNALOGIX DataTools" + "Clean. Normalize. Transform." Updated assertions to the current copy and switched the findings panel tests to the redesigned flat-list layout (per-finding "Open Tool →" buttons instead of per-tool expanders). New coverage: - tests/test_cli_reconcile.py (13) — preview/apply, tolerance flags, sign inversion, key flags, error paths, Excel input. - tests/test_tools_registry.py (27) — unique tool_ids, page_slug → real file, valid sections/tiers, localized accessor fallbacks, explicit pins for PDF Extractor + Reconciler entries. - tests/test_reconcile.py — one-side-empty, key-pass tagging, additional validation cases, input-DataFrame immutability. - tests/gui/test_smoke.py — PAGE_SLUGS now includes 10_PDF_Extractor and 11_Reconciler in both en/es. - tests/gui/test_workflows.py — TestPdfExtractorWorkflow and TestReconcilerWorkflow render checks. Net: 2317 passed → 2418 passed, 0 failures. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 19:30:02 +00:00
Michael	7ad19ac7f4	feat(nav,i18n): sticky footer with Back-to-Home + localized tool headers Two unrelated UX issues addressed in one sweep across all nine tool pages because they share the same edit surface. (1) Sticky footer replaces the top + bottom back-link buttons. Reported: a big white empty footer space at the bottom of every page; the Back to Home button at the top scrolled out of view on long pages. New ``render_sticky_footer()`` helper in ``components/_legacy.py`` injects a fixed-position bar at ``bottom: 0`` of the viewport with: - A border-top so it visually reads as a non-movable bar. - A semi-transparent background (rgba 0.96 + ``backdrop-filter: blur``) so content underneath shows through faintly when the user scrolls. - A styled ``<a href="home">`` anchor (not an ``st.button``) because Streamlit widgets can't be CSS-positioned reliably — Streamlit owns the widget's DOM container and re-mounts it on every rerun. A real anchor sits exactly where the CSS puts it and triggers Streamlit's URL routing to the home page. - ``padding-bottom: 3.5rem`` on the main container so the last widget isn't hidden behind the bar. Called once per tool page, immediately after ``hide_streamlit_chrome()`` so it renders even on pages that ``st.stop()`` early before any other content runs. The old top-and-bottom ``back_to_home_link()`` calls are removed from every tool page; their entry/exit points were dropping the button when the script short-circuited. (2) Tool-page headers now localize. Reported: switching the sidebar language picker to Spanish left the tool page's title + caption in English. Root cause: every page had hard-coded ``st.title("✂️ Clean Text")`` / ``st.caption("Trim whitespace...")`` strings. Added per-tool ``tools.<id>.page_title`` and ``tools.<id>.page_caption`` keys to ``en.json`` and ``es.json`` for all nine tools. Routed each page's title/caption call through ``t()``. 
Verified: with ``ui_lang=es`` set, the Clean Text page now renders "✂️ Limpiar texto" + the Spanish caption. Updated ``tests/gui/test_smoke.py::EXPECTED_SUBSTRINGS`` so the ``es`` column for each tool page asserts the actual Spanish string (was a duplicate of the English string back when the page bodies were English-only). 2220 tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 00:42:45 +00:00
Michael	c568aec8a7	feat(gui): one-click Close in its own bottom sidebar section Close is now a direct shutdown trigger: visiting the Close page (the sidebar entry) fires shutdown_app() immediately — no confirm step, no intermediate body. The farewell overlay paints and os._exit(0) lands ~1s later from a daemon thread. Layout: Close moved into its own bottom-of-sidebar section so the destructive action is visually separated from Account/Activate. - New shutdown_app() in components/_legacy.py replaces quit_button. os._exit thread is skipped when "pytest" is in sys.modules so the test suite doesn't suicide on rendering 99_Close. - pages/99_Close.py shrinks to set_page_config + chrome + shutdown_app. - app.py nav grows a new "Close" section header (new nav.section_close key in en/es packs) pinned at the bottom of the navigation dict. Tests updated: - TestQuitButtonRenders → TestClosePageShutsDownImmediately. Assert the shutdown caption renders + no confirm button exists. - test_smoke EXPECTED_SUBSTRINGS["99_Close"] now pins "Shutting down" / "Cerrando" (the visible page body) instead of the removed page title. 2008 tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 20:17:14 +00:00
Michael	ff2eaeb6c4	feat(home): multi-file upload + per-file analysis, drop tool grid Home is now upload + analysis only. The page accepts multiple files in one go, analyzes each independently, and renders findings grouped by filename in bordered containers. The 3-section tool-card grid is gone — discovery happens via the sidebar now. Mechanics: - file_uploader uses accept_multiple_files=True. Each file's findings cache in session_state["home_findings_by_file"] keyed by filename so removing a file via Streamlit's "x" button drops its findings too, and re-clicking Run only re-analyzes pending files. - The first uploaded file is mirrored into the singular home_uploaded_{name,bytes,size} keys so tool pages continue to pick up an "active" upload through pickup_or_upload — no tool-page changes. - New i18n keys: upload.intro_multi, upload.uploader_label_multi, upload.clear_results, upload.empty_state. upload.heading text is updated to "Upload one or more files to start" (EN + ES). Dropped tests pinning the tool grid: - TestHomeToolGridLocalization (test_chrome.py) - test_home_tool_card_uses_es_name (test_smoke.py) - TestLiteHomeGridBadges (test_lite_tier.py — locked-card lock-badge assertions; locking is still enforced per-tool-page via require_feature_or_render_upgrade) 2009 tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 20:12:48 +00:00
Michael	dad744f17f	refactor(gui): drop Review page + normalization gate Home is now the only entry point: the "Run analysis" button on the upload section IS the review step (findings render inline via render_findings_panel). Tool pages no longer gate on a passed normalization — running the analyzer is sufficient context. Removed: - src/gui/pages/0_Review.py - src/gui/components/gate.py (re-export seam) - require_normalization_gate() in src/gui/components/_legacy.py - "review" section enum in tools_registry.py - Data Review entry in app.py navigation - require_normalization_gate() calls + imports in all nine tool pages - tests/gui/test_gate.py (whole file) - TestReviewWorkflow in tests/gui/test_workflows.py - 0_Review entry in tests/gui/test_smoke.py PAGE_SLUGS - stash_upload's normalization_result+normalization_for stashing - stash_upload_without_gate (was the gate's negative-path helper) 2017 tests pass (16 retired with the gate flow). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 20:04:33 +00:00
Michael	fc6c22c6a7	feat(review): inline file uploader instead of redirect home When a user lands on Review without an upload, show a file uploader on the page itself and auto-run the analyzer once a file is picked, rather than bouncing them to the home page with a "Back to home" button. Auto-analyze is the right default here: the user is already on the Review page, so they've implicitly committed to a scan. Stashing the bytes in the same session-state keys the home page uses keeps the rest of the flow (encoding picker, gate, tool pages) unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 19:57:01 +00:00
Michael	db5ec084da	docs+code: rename tool labels everywhere Sweep follow-up to `93e43fc`. Display labels now consistent across docs, landing pages, CLI output, code comments, docstrings, and test prose. Five parallel surfaces touched: - docs (EN + ES): README, USER-GUIDE, CLI-REFERENCE, and 11 internal design/planning docs - landing pages: index + bookkeeper/revops/shopify-pet - src: CLI module docstrings, _TOOL_DISPLAY dicts in cli_analyze.py and gui/components/_legacy.py, core module headers, every tool page's module docstring - tests: class/method/module docstrings and section-header comments - test-cases READMEs Page slugs (1_Deduplicator etc.), tool_id strings (01_deduplicator etc.), Python class names (TestDeduplicatorWorkflow, FeatureFlag.*), URL paths, anchor IDs, CSS classes, and asset filenames were left intact since they're code identifiers / structural references. All 2033 tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 19:50:09 +00:00
Michael	93e43fc0d9	feat(gui): sidebar sections + non-technical tool labels Sidebar nav now groups tools under Data Review / Data Cleaners / Transformations / Automations via st.navigation, replacing the flat auto-discovered list. Tool display names switch to action-first phrasing (Find Duplicates, Fix Missing Values, Find Unusual Values, Standardize Formats, Clean Text, Quality Check, Map Columns, Combine Files, Automated Workflows) in EN + ES packs and on each page's H1. The Data Cleaners section follows the requested order: Missing Values → Outliers → Text Cleaner → Format Standardizer → Deduplicator → Quality Check. (Text Cleaner kept inside cleaners since the request didn't list it but the tool still ships.) Registry now carries a section field; helpers added: tools_in_section(), section_label(). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 19:36:01 +00:00
Michael	d32b58e61a	feat(license): add Lite SKU; remove user-facing free trial Two coupled changes: 1. Lite tier - New Tier.LITE in src/license/schema.py. - FEATURES_BY_TIER[Tier.LITE] = {Deduplicator, Text Cleaner, Format Standardizer}. The three universally-useful tools that cover the most common bookkeeping / RevOps / Klaviyo prep workflows. Other six tools require Core. - i18n: license.tier_lite, license.feature_locked_title, license.feature_locked_body, license.upgrade_link, license.status_locked (en + es). - Per-tool feature gate at every GUI tool page (require_feature_or_render_upgrade) and every tool CLI (guard(feature=...)). A locked tool renders an upgrade prompt + Manage-license button (GUI) or exits with code 2 (CLI). - Home grid: tool cards the user's tier doesn't unlock get a red 🔒 Locked badge in place of green Ready. 2. Trial removed - Activation form's "Start 1-year trial" button removed. - license_cli's `trial` subcommand removed. - activation.trial_button / activation.trial_help i18n keys dropped (pack parity test stays green). - Tier.TRIAL stays in the enum (back-compat with any field-tested trial licenses); LicenseManager._mint stays internal for tests and the seller's key generator. - Decision logged in DECISIONS §9b: a 1-year all-features trial undercuts paid Lite; paid-only keeps tier economics clean. Tests (+29 net): +17 Lite-tier unit/guard tests + 13 Lite-tier GUI tests + 1 trial-absent assertion - 2 trial CLI tests - 1 trial GUI button test. Total: 1995 → 2024. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-13 17:19:30 +00:00
Michael	e435103113	feat(license): registration + 1-year licenses + tier scaffolding A complete offline licensing layer (no internet at any step): Core - src/license/ — schema (License, Tier, FeatureFlag), HMAC crypto, JSON storage, LicenseManager singleton with activate/renew/deactivate/issue_trial. Tier-scaffolded so future SKUs can carve per-tool feature sets without consumer-code edits. - scripts/generate_license.py — creator-only key generator. Mints a DTLIC1: blob the buyer pastes into the activation page. GUI - New activation form component (src/gui/components/activation.py). - hide_streamlit_chrome() now inline-renders the activation form when no valid license is present (every page short-circuits to the form until activated). - Sidebar shows tier + days remaining; renewal warning under 30 days. - New pages/_Activate.py for revisiting the form after activation. CLI - src/license_cli.py — activate / renew / status / trial / deactivate commands. Exempt from the guard. - src/cli_license_guard.py — drop-in guard call added to every tool CLI's main(). Lets --help through; respects DATATOOLS_DEV_MODE. i18n - New activation.* and license.* keys in en.json + es.json (page title, form labels, status badges, renewal warnings, error messages). Pack parity test stays green. Test infrastructure - tests/conftest.py autouse fixture sets DATATOOLS_DEV_MODE=1 so the existing 1916 tests continue to pass. - isolated_license_path / activated_license_manager / unactivated_license_manager fixtures for tests that want to drive the real check. Tests (+79) - tests/test_license.py (40): schema, crypto roundtrip, blob encode/decode, tier→feature mapping, activation flow, name/email mismatch rejection, tamper detection, expiration, renewal, dev-mode bypass. - tests/test_license_cli.py (26): every license_cli command + subprocess tests confirming every tool CLI refuses to run without a license, --help always works, DEV_MODE bypasses. 
- tests/gui/test_activation.py (13): gate blocks without license, passes with trial, activation form submission unlocks the gate, sidebar status, renewal warning, i18n. Total: 1916 → 1995 tests. All pass under the strict warning filter. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-13 16:54:23 +00:00
Michael	35d46a0c1a	test(gui): add Streamlit AppTest layer (139 tests) Until now every test ran against core or the CLI; the Streamlit GUI was verified by hand. This commit adds tests/gui/ — 139 AppTest-driven tests behind a 'gui' marker so the quick loop (``pytest -m 'not gui'``) stays at 1777 tests / ~10s while ``pytest`` runs everything (1916 / ~14s). Coverage: - test_smoke.py (59): every page renders in EN and ES, expected substring present, sidebar selector mounted. - test_chrome.py (18): language selector flips session state and re-renders; quit button + farewell strings localize; tool-card names use the active language. - test_gate.py (9): require_normalization_gate no-op / warning / short-circuit / hash-mismatch invariants; warning + button localized. - test_workflows.py (14): happy path per Ready tool — stash upload, render, find primary action, verify result lands in session state. - test_dedup_review.py (8): Accept All / Reject All / Clear Decisions wire through to review_decisions; apply_review_decisions semantics (keep-all, merge, column override). - test_advanced_panels.py (15): config_panel widget defaults and options (algorithm, threshold, survivor rule, merge, multiselects, config save/load). - test_errors.py (4): garbage / empty / single-column uploads don't crash; duplicate-target mapping raises InputValidationError. - test_findings_panel.py (12): driven via a small standalone harness page so we test the component without faking a file_uploader. EN + ES strings, per-tool grouping, open-tool button label, untargeted expander, severity summary. Shared infrastructure in tests/gui/conftest.py: - ``stash_upload`` / ``stash_upload_without_gate`` — populate session_state to pre-pass or block the gate. - ``with_language`` — set ``ui_lang`` before run(). - ``collected_text`` — flatten title/caption/markdown/etc. into one string for substring assertions. 
- Auto-marking: every test in tests/gui/ gets ``@pytest.mark.gui`` via ``pytest_collection_modifyitems``, so the marker isn't per-test boilerplate. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-13 16:13:40 +00:00
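
The auto-marking described in commit 35d46a0c1a can be sketched as a pytest collection hook. This is a minimal sketch under stated assumptions, not the repository's actual tests/gui/conftest.py: the helper name `_is_gui_test`, the `GUI_DIR_PARTS` constant, and the path check via `item.path` (available in pytest 7 and later) are illustrative choices.

```python
from pathlib import Path

GUI_DIR_PARTS = ("tests", "gui")  # assumed location of the GUI tests


def _is_gui_test(path: Path) -> bool:
    """Return True when the file lives under a tests/gui/ directory (hypothetical helper)."""
    parts = path.parts
    return any(parts[i : i + 2] == GUI_DIR_PARTS for i in range(len(parts) - 1))


def pytest_collection_modifyitems(config, items):
    """pytest hook: tag every collected test under tests/gui/ with the 'gui' marker."""
    for item in items:
        if _is_gui_test(Path(item.path)):
            # add_marker accepts a marker name as a plain string
            item.add_marker("gui")
```

With a hook like this in place, ``pytest -m 'not gui'`` deselects the GUI tests without any per-test decorator. Registering the `gui` marker in the pytest configuration would likely also be needed to avoid unknown-marker warnings.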

14 Commits