Three real issues surfaced when running the suite with strict warnings:
1. src/core/format_standardize.py: ``datetime.utcfromtimestamp`` is
deprecated in CPython 3.12 and slated for removal. Replace with
``datetime.fromtimestamp(ts, tz=timezone.utc)``. Output for the
date-only format codes we use is byte-identical.
2. src/core/io.py: ``list_sheets`` leaked the openpyxl file handle by
returning ``xl.sheet_names`` from an unclosed ``pd.ExcelFile``.
Wrap in a ``with`` block so the FD closes deterministically — also
prevents the Windows-only "file is locked" repro path.
3. tests/test_corpus.py: ``TestXlsxPollution.workbook`` fixture
returned the bare ``pd.ExcelFile`` instead of yielding + closing.
Convert to a yield-and-finally pattern so the class-scoped handle
isn't leaked across the whole test file.
Also harden pytest.ini's warning policy: escalate
``ResourceWarning`` from ``src`` to an error, alongside the existing
``DeprecationWarning`` rule. Third-party warnings stay filtered — we
can't fix pandas/openpyxl/streamlit churn from here.
All 1916 tests pass under the strict filter; full and split runs
(``pytest``, ``pytest -m 'not gui'``, ``pytest -m gui``) all clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Until now every test ran against core or the CLI; the Streamlit GUI
was verified by hand. This commit adds tests/gui/ — 139 AppTest-
driven tests behind a 'gui' marker so the quick loop
(``pytest -m 'not gui'``) stays at 1777 tests / ~10s while
``pytest`` runs everything (1916 / ~14s).
Coverage:
- test_smoke.py (59): every page renders in EN and ES, expected
substring present, sidebar selector mounted.
- test_chrome.py (18): language selector flips session state and
re-renders; quit button + farewell strings localize; tool-card
names use the active language.
- test_gate.py (9): require_normalization_gate no-op / warning /
short-circuit / hash-mismatch invariants; warning + button
localized.
- test_workflows.py (14): happy path per Ready tool — stash
upload, render, find primary action, verify result lands in
session state.
- test_dedup_review.py (8): Accept All / Reject All / Clear
Decisions wire through to review_decisions; apply_review_decisions
semantics (keep-all, merge, column override).
- test_advanced_panels.py (15): config_panel widget defaults and
options (algorithm, threshold, survivor rule, merge, multiselects,
config save/load).
- test_errors.py (4): garbage / empty / single-column uploads don't
crash; duplicate-target mapping raises InputValidationError.
- test_findings_panel.py (12): driven via a small standalone harness
page so we test the component without faking a file_uploader. EN
+ ES strings, per-tool grouping, open-tool button label, untargeted
expander, severity summary.
Shared infrastructure in tests/gui/conftest.py:
- ``stash_upload`` / ``stash_upload_without_gate`` — populate
session_state to pre-pass or block the gate.
- ``with_language`` — set ``ui_lang`` before run().
- ``collected_text`` — flatten title/caption/markdown/etc. into
one string for substring assertions.
- Auto-marking: every test in tests/gui/ gets ``@pytest.mark.gui``
via ``pytest_collection_modifyitems``, so the marker isn't
per-test boilerplate.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a top-level test infrastructure layer addressing four needs at once:
a single command to run anything, cross-platform automation, install/e2e
sanity, and zero-config pickup of new fixtures dropped into test-cases/.
Top-level runner — run_tests.py
python run_tests.py # everything (default)
python run_tests.py --tool dedup # one tool's tests
python run_tests.py --unit # category scopes
python run_tests.py --e2e # end-to-end CLI
python run_tests.py --install # import / dependency sanity
python run_tests.py --fixtures # corpus + dropped-file sweep
python run_tests.py --coverage # term-missing report
python run_tests.py --quick # skip @pytest.mark.slow
Tools: analyze, cli, config, dedup, io, normalizers, text_clean.
Cross-platform — tox.ini
Envs for py310-py313 plus install / e2e / fixtures / coverage / lint.
Forces UTF-8 (PYTHONUTF8=1, PYTHONIOENCODING=utf-8) so identical fixture
bytes parse the same on Linux/macOS/Windows.
Shared config — pytest.ini
testpaths, python_files conventions, custom markers (slow, e2e, install,
fixture_sweep), warning filters that fail on our own DeprecationWarnings
while tolerating third-party ones.
New test layers
tests/test_install.py — required deps import; project modules import;
src.core public API surface; CLI --help exits 0; streamlit app.py
parses as valid Python; run_tests.py --help works.
tests/test_e2e.py — CLI roundtrips: cli_analyze table + JSON, cli_text_clean
--apply writes a real file with NBSP/smart-quote folded, dedup CLI
removes duplicates, run_tests.py self-tests.
tests/test_fixtures_sweep.py — parametrizes over every CSV/TSV/XLSX
inside test-cases/ (excluding text-cleaner-corpus/, which has its own
suite). Each fixture must: load through repair_bytes, run analyze()
cleanly, and survive clean_dataframe() with row/col counts unchanged
plus idempotency. Drop a CSV in, re-run — no test code changes needed.
tests/test_gap_coverage.py — closes audit gaps: clean_headers=False
toggle, repair_bytes with tab/semicolon delimiters, BOM+NUL+smart-
quote combined-fix scenario, analyze() over an XLSX path, sample_rows
larger than the DataFrame, mid-cell BOM, findings_by_tool edges, plus
a strict xfail documenting the known §4.17 numeric/phone whitespace
heuristic gap.
Test count
Before: 288 passed + 1 xfailed
After: 475 passed + 2 xfailed (the second xfail is the documented
collapse_whitespace gap on phone-shaped cells; spec §4.17 calls
for a heuristic that hasn't been implemented yet).
Functional gaps surfaced (not fixed in this commit):
- Text cleaner: collapse_whitespace runs unconditionally on every string
cell, including phone/numeric/date-shaped ones. Spec §4.17 requires a
skip heuristic. Captured as strict xfail so the gap stays visible.
- io.read_file does not run pre-parse repair; only analyze() and direct
callers of read_csv_repaired() get it. CLI tool pages and the dedup
CLI miss the safety net.
- Analyzer has no mixed_line_endings detector or near_duplicate_rows
detector; both planned but require additional plumbing.
- GUI tool pages each have their own uploader instead of picking up the
home-page upload through session_state.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>