refactor(gui): tool registry + components package for per-tool builds

Two low-risk seam moves that enable selling per-tool subsets without
breaking the existing all-in-one bundle. Behaviour is identical: every
existing import still resolves, the full pytest suite passes, and every
page returns HTTP 200.

1. **Tool registry** (src/gui/tools_registry.py) — replaces the
   inline dict-of-dicts in app.py with a Tool dataclass and a TOOLS
   list. Adds a tier field ("core" today, "pro" / "enterprise" later)
   and tools_for_tier() / tool_by_id() / display_name() helpers. A
   per-tool build slices TOOLS at import time without code changes.
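
   A minimal sketch of the registry shape described above, not the
   actual contents of tools_registry.py. The field names other than
   tier, the example tool entries, and the rule that a higher tier
   includes the tiers below it are all assumptions for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    # Hypothetical fields; only ``tier`` is named in the commit message.
    id: str
    label: str
    tier: str = "core"


# Illustrative entries only, not the real tool list.
TOOLS: list[Tool] = [
    Tool(id="dedup", label="Deduplicator"),
    Tool(id="analyze", label="Analyzer"),
    Tool(id="bulk_export", label="Bulk Export", tier="pro"),
]

# Assumed ordering: each tier also unlocks every tier before it.
_TIER_ORDER = ("core", "pro", "enterprise")


def tools_for_tier(tier: str) -> list[Tool]:
    """Return the tools available at ``tier`` and below."""
    allowed = _TIER_ORDER[: _TIER_ORDER.index(tier) + 1]
    return [t for t in TOOLS if t.tier in allowed]


def tool_by_id(tool_id: str) -> Tool | None:
    """Look up a tool by id, or return None if it is not registered."""
    return next((t for t in TOOLS if t.id == tool_id), None)


def display_name(tool_id: str) -> str:
    """Human-readable label, falling back to the raw id."""
    tool = tool_by_id(tool_id)
    return tool.label if tool else tool_id
```

   Under these assumptions, a per-tool build could narrow the list once
   at import time (for example by filtering TOOLS on id) and every
   helper would follow without further changes.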

2. **components package** (src/gui/components/) — converts the former
   single components.py into a package with:
     _legacy.py        — original file, unchanged.
     __init__.py       — re-exports the legacy surface; existing
                         "from src.gui.components import …" calls
                         continue to work.
     shared.py         — hide_streamlit_chrome, pickup_or_upload
                         (every build needs these).
     gate.py           — require_normalization_gate (Pro / Suite SKUs).
     findings.py       — analyzer-finding widgets (drops out of a
                         standalone-Dedup build).
     dedup_review.py   — match-group cards + apply pipeline (drops out
                         of a non-dedup build).

   The seam modules are narrow re-exports today. As code migrates out
   of _legacy.py into the focused modules, the public import path
   stays stable via the shim.
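
   The shim pattern can be tried in isolation with a throwaway package.
   This is a sketch, not the project's code: components_demo is a
   hypothetical package written to a temp directory, with a one-helper
   _legacy.py, a star-import __init__.py, and a shared.py seam.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package that mirrors the layout with one helper.
root = Path(tempfile.mkdtemp())
pkg = root / "components_demo"  # hypothetical name, not the real package
pkg.mkdir()
(pkg / "_legacy.py").write_text(
    "def hide_streamlit_chrome():\n"
    "    return 'chrome hidden'\n"
)
# The shim: re-export everything public from the legacy module.
(pkg / "__init__.py").write_text("from ._legacy import *\n")
# A narrow seam module that re-exports a single helper.
(pkg / "shared.py").write_text(
    "from ._legacy import hide_streamlit_chrome\n"
    "__all__ = ['hide_streamlit_chrome']\n"
)

sys.path.insert(0, str(root))
legacy_path = importlib.import_module("components_demo")
seam_path = importlib.import_module("components_demo.shared")

# Both import paths hand back the very same function object.
same_object = legacy_path.hide_streamlit_chrome is seam_path.hide_streamlit_chrome
print(same_object)
```

   Because both paths resolve to one object, moving the function body
   from _legacy.py into the seam module later only changes where it is
   defined, not where callers import it from.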

E2E: 765 passed, 17 xfailed (unchanged); home page + all 9 tool pages
+ Review page render HTTP 200; full pipeline (analyze → auto_fix →
apply_decisions → output bytes) round-trips on the kitchen-sink
fixture with zero high-confidence findings remaining post-fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 20:56:21 +00:00
parent 70ed695027
commit f891c6116d
8 changed files with 294 additions and 79 deletions

@@ -0,0 +1,57 @@
"""Reusable Streamlit widgets for the DataTools GUI.

This package replaces the former single ``components.py`` module. Public
behaviour is identical: every name that used to be importable from
``src.gui.components`` is still importable from the same path because
this ``__init__`` re-exports the legacy surface in full.

The package layout exists so per-tool builds can ship only the seams
they need without dragging in the entire kitchen-sink module:

    components/
        __init__.py       ← compatibility shim (this file)
        _legacy.py        ← original components.py, unchanged
        gate.py           ← gate-only seam (require_normalization_gate)
        findings.py       ← analyzer-finding rendering seam
        dedup_review.py   ← dedup match-group cards + review pipeline
        shared.py         ← chrome / file-pickup helpers used by every tool

A standalone Deduplicator build, for example, can ship without
``findings.py`` and ``gate.py``; those modules import the analyzer /
gate code that the Lite SKU does not include.

Adding new tooling: drop new helpers into the appropriate seam module.
Add their names to its ``__all__`` and to this file's ``__all__`` if
they should remain importable from ``src.gui.components`` directly.
"""
from __future__ import annotations

# Re-export the full legacy surface so existing pages continue to
# import unchanged. Once individual tool packages start consuming
# the focused seam modules directly, names can migrate out of
# _legacy.py without breaking those imports; this shim is what
# absorbs the move.
from ._legacy import *  # noqa: F401,F403
from . import _legacy as _legacy  # noqa: F401 (keep for direct access)

# Names exported from _legacy.py that pages currently use. Kept here as
# the canonical public list so a removal from _legacy is a visible
# breaking change instead of a silent drop.
__all__ = [
    # Shared chrome / pickup / gate
    "hide_streamlit_chrome",
    "pickup_or_upload",
    "require_normalization_gate",
    # Dedup widgets
    "config_panel",
    "match_group_card",
    "results_summary",
    "apply_review_decisions",
    # Analyzer widgets
    "tool_display_name",
    "render_findings_panel",
    "render_hidden_aware_preview",
    "upload_and_analyze_section",
    "findings_count_for_tool",
]
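
One way to make a removal from `_legacy` "visible", as the comment above `__all__` intends, is a small check that every listed name still resolves. The sketch below is an assumption about how such a check could look, not something included in this commit; `missing_exports` and the stub module are hypothetical.

```python
from __future__ import annotations

import types


def missing_exports(module: types.ModuleType) -> list[str]:
    """Return names listed in ``module.__all__`` that the module lacks."""
    return [
        name
        for name in getattr(module, "__all__", [])
        if not hasattr(module, name)
    ]


# Stand-in for the package: one listed name exists, one was "removed".
shim = types.ModuleType("components_stub")
shim.__all__ = ["hide_streamlit_chrome", "pickup_or_upload"]
shim.hide_streamlit_chrome = lambda: None

print(missing_exports(shim))  # ['pickup_or_upload']
```

In a test suite, the same helper could be pointed at the real package and asserted to return an empty list.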

File diff suppressed because it is too large

@@ -0,0 +1,24 @@
"""Dedup match-group cards and review pipeline.

The interactive dedup-review surface: config panel, match-group cards,
results summary, and apply-decisions glue. This is the largest single
chunk of the GUI by line count; isolating it in a seam module means a
non-dedup SKU never has to import it (and never has to drag in
``src.core.dedup`` along the way).
"""
from __future__ import annotations

from ._legacy import (
    apply_review_decisions,
    config_panel,
    match_group_card,
    results_summary,
)

__all__ = [
    "apply_review_decisions",
    "config_panel",
    "match_group_card",
    "results_summary",
]

@@ -0,0 +1,25 @@
"""Analyzer-finding widgets.

Surfaces for the analyzer's home-page section, the per-tool findings
panel, and the hidden-character-aware preview table. A build that
doesn't ship the analyzer (e.g. standalone Dedup-only) does not need
this module; its import would drag ``src.core.analyze`` along with it.
"""
from __future__ import annotations

from ._legacy import (
    findings_count_for_tool,
    render_findings_panel,
    render_hidden_aware_preview,
    tool_display_name,
    upload_and_analyze_section,
)

__all__ = [
    "findings_count_for_tool",
    "render_findings_panel",
    "render_hidden_aware_preview",
    "tool_display_name",
    "upload_and_analyze_section",
]

@@ -0,0 +1,16 @@
"""Normalization-gate guard for tool pages.

``require_normalization_gate`` short-circuits a tool page when the
current upload has not yet passed the gate, redirecting the user to the
Review & Normalize page. Pulled into its own seam module so:

* A build that includes the gate (Pro / Suite SKUs) imports this.
* A standalone single-tool build that bypasses the gate can omit this
  module entirely without removing the helper from a shared file.
"""
from __future__ import annotations

from ._legacy import require_normalization_gate

__all__ = ["require_normalization_gate"]
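
If a build omits this module, a page that is shared across builds needs some way to tolerate its absence. The sketch below shows one possible approach, an optional import with a no-op fallback. It is an assumption, not part of this commit: `load_gate` is hypothetical, the module name is deliberately one that is not installed, and the real helper's signature and return value are not shown in this diff.

```python
import importlib
from typing import Callable


def load_gate(module_name: str) -> Callable[..., bool]:
    """Return the gate helper from ``module_name``, or a no-op if absent."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Seam module not shipped in this build: treat the gate as open.
        return lambda *args, **kwargs: True
    return module.require_normalization_gate


# Hypothetical module path that does not exist, to exercise the fallback.
gate = load_gate("datatools_demo_missing.gate")
print(gate())  # True
```

The alternative is to keep gate-aware pages out of gate-less builds entirely, which avoids the fallback at the cost of per-build page lists.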

@@ -0,0 +1,14 @@
"""Chrome and file-pickup helpers — every build needs these.

This is the smallest seam: any DataTools build, regardless of which
tools it bundles, needs ``hide_streamlit_chrome`` (the app-like UI
polish) and ``pickup_or_upload`` (lets a tool page reuse the home-page
upload). Importing from here instead of the kitchen-sink ``components``
package keeps a Lite build's dependency graph tight.
"""
from __future__ import annotations

from ._legacy import hide_streamlit_chrome, pickup_or_upload

__all__ = ["hide_streamlit_chrome", "pickup_or_upload"]