docs+code: rename tool labels everywhere

Sweep follow-up to 93e43fc. Display labels now consistent across docs,
landing pages, CLI output, code comments, docstrings, and test prose.

Five parallel surfaces touched:

- docs (EN + ES): README, USER-GUIDE, CLI-REFERENCE, and 11 internal
  design/planning docs
- landing pages: index + bookkeeper/revops/shopify-pet
- src: CLI module docstrings, _TOOL_DISPLAY dicts in cli_analyze.py and
  gui/components/_legacy.py, core module headers, every tool page's
  module docstring
- tests: class/method/module docstrings and section-header comments
- test-cases READMEs

Page slugs (1_Deduplicator etc.), tool_id strings (01_deduplicator etc.),
Python class names (TestDeduplicatorWorkflow, FeatureFlag.*), URL paths,
anchor IDs, CSS classes, and asset filenames were left intact since
they're code identifiers / structural references.

All 2033 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 19:50:09 +00:00
parent 93e43fc0d9
commit db5ec084da
57 changed files with 205 additions and 205 deletions
--- a/docs/REQUIREMENTS.md
+++ b/docs/REQUIREMENTS.md
@@ -76,7 +76,7 @@ Sample size: 1,000 rows (configurable).
 - Full-DataFrame `auto_fix`: ~5 min (~30 µs/cell).
 - Output write: ~10 s.
 - Recommended RAM: 3–4× input size for the full-Apply path.
- **Format standardizer** (`standardize_dataframe`): ~2.7M rows/sec on
+- **Standardize Formats** (`standardize_dataframe`): ~2.7M rows/sec on
  cache-warm repetition-heavy columns (synthetic 1M-row in-memory
  benchmark, 2 typed columns); the fused single-pass loop replaced a
  3-pass ``.tolist()`` cycle, so per-call overhead is now dominated by
@@ -87,20 +87,20 @@ Sample size: 1,000 rows (configurable).
  thread-pool scaffolding; on CPython 3.12 with the GIL it's
  roughly neutral, but the API is ready for the free-threaded
  (PEP 703) Python 3.13+ build where it will help.
- **Text cleaner** (`clean_dataframe`): ~1M rows/sec on
+- **Clean Text** (`clean_dataframe`): ~1M rows/sec on
  repetition-heavy columns (per-call string cache: the pipeline runs
  once per *unique* cell value, not once per row).
- **Missing handler** (`handle_missing`): lazy-copy — when sentinel
+- **Fix Missing Values** (`handle_missing`): lazy-copy — when sentinel
  standardization runs but finds nothing, AND no drops AND no fills
  apply, the input frame is returned as-is. On a clean 1 GB file this
  saves the 1 GB allocation that the unconditional upfront copy used
  to take.
- **Column mapper** (`map_columns`): rename + drop both already
+- **Map Columns** (`map_columns`): rename + drop both already
  return fresh frames; the explicit upfront `df.copy()` is now
  removed and downstream mutating steps (schema-add, coerce) copy on
  demand via `_ensure_owned()`. Rename-only and identity-mapping
  paths run with zero explicit copies.
- **Deduplicator**:
+- **Find Duplicates**:
  - **Exact-only strategies** (every column uses `Algorithm.EXACT` at
    threshold 100 — covers strong-key dedup like email/phone, the
    fallback drop-duplicates path, and explicit "match on this exact
@@ -117,19 +117,19 @@ Sample size: 1,000 rows (configurable).
    (the common dedup workload) skip re-parsing.

 ## 11. Tools
-1. Deduplicator — Ready
-2. Text Cleaner — Ready
-3. Format Standardizer — Ready
-4. Missing Value Handler — Ready
-5. Column Mapper — Ready
-6. Outlier Detector — Coming Soon
-7. Multi-File Merger — Coming Soon
-8. Validator & Reporter — Coming Soon
-9. Pipeline Runner — Ready
+1. Find Duplicates — Ready
+2. Clean Text — Ready
+3. Standardize Formats — Ready
+4. Fix Missing Values — Ready
+5. Map Columns — Ready
+6. Find Unusual Values — Coming Soon
+7. Combine Files — Coming Soon
+8. Quality Check — Coming Soon
+9. Automated Workflows — Ready

 ### 11.a Recommended pipeline order (soft, not enforced)

-The Pipeline Runner ships with a `SOFT_DEPENDENCIES` table; the
+Automated Workflows ships with a `SOFT_DEPENDENCIES` table; the
 following ordering is the default and the basis of the warning
 surface. Re-ordering is allowed; the runner emits a warning string
 and proceeds.
@@ -214,7 +214,7 @@ and proceeds.
  fresh blob without losing the embedded buyer identity. Tier may
  change during renewal (Lite → Core upgrade path).
 - **Tiers**:
-  - ``lite`` — Deduplicator + Text Cleaner + Format Standardizer.
+  - ``lite`` — Find Duplicates + Clean Text + Standardize Formats.
    Buyer pays once, gets the three universally-useful tools.
  - ``core`` — every Ready tool (all 9 in v1.6).
  - ``pro``, ``enterprise`` — scaffolded for future SKUs; currently