Sweep follow-up to 93e43fc. Display labels now consistent across docs,
landing pages, CLI output, code comments, docstrings, and test prose.
Five parallel surfaces touched:
- docs (EN + ES): README, USER-GUIDE, CLI-REFERENCE, and 11 internal
design/planning docs
- landing pages: index + bookkeeper/revops/shopify-pet
- src: CLI module docstrings, _TOOL_DISPLAY dicts in cli_analyze.py
and gui/components/_legacy.py, core module headers, every tool
page's module docstring
- tests: class/method/module docstrings and section-header comments
- test-cases READMEs
Page slugs (1_Deduplicator etc.), tool_id strings (01_deduplicator
etc.), Python class names (TestDeduplicatorWorkflow, FeatureFlag.*),
URL paths, anchor IDs, CSS classes, and asset filenames were left
intact since they're code identifiers / structural references.
All 2033 tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fix Missing Values — corpus
Acceptance fixtures for src/core/missing.py. Each .csv under
test_data/ is paired with assertions in tests/test_missing_corpus.py.
Add a new case by dropping a CSV here and adding a parametrize entry to
the runner.
Use cases (target client profiles)
| File | Buyer profile | Strategy under test |
|---|---|---|
uc01_shopify_export.csv |
SMB / Shopify operator | detect-only |
uc02_marketing_audience.csv |
Marketing / RevOps analyst | safe-fill |
uc03_consultant_intake.csv |
Analyst / consultant | drop-incomplete + threshold |
Edge cases
| File | What it stresses |
|---|---|
ec01_all_nan_column.csv |
column 100 % missing — fill must skip, drop_col must catch at threshold |
ec02_no_missing.csv |
clean file — must be a no-op |
ec03_zero_is_not_missing.csv |
numeric 0, boolean false, "0" must NOT be treated as missing |
ec04_excel_errors.csv |
#N/A, #NULL!, #VALUE! Excel error sentinels |
ec05_unicode_whitespace.csv |
NBSP, tab-only, ideographic-space cells treated as whitespace |
ec06_mixed_dtypes.csv |
mixed numeric/string in same column — graceful degrade to mode |
ec07_real_data_with_padding.csv |
leading/trailing whitespace around real data must NOT be dropped |
ec08_single_row.csv |
one-row file — every operation must still work |
ec09_single_column.csv |
one-column file with header-only line + sentinels |
ec10_all_sentinel_variants.csv |
every DEFAULT_SENTINELS entry exercised in one file |
ec11_constant_per_column.csv |
column_fill_values differs per column |
ec12_drop_threshold_boundary.csv |
boundary values for row_drop_threshold (0.5, 0.99, 1.0) |
ec13_ffill_leading_nan.csv |
leading-NaN run survives ffill (no fabrication) |
ec14_interpolate_fallback.csv |
numeric-only strategy on string column triggers fallback |
ec15_headers_only.csv |
empty body — must not crash |
ec16_idempotent_apply.csv |
running handle_missing twice yields the same DataFrame |