Commit Graph

230 Commits

Author SHA1 Message Date
b86828d791 feat(pdf): visual region picker on rendered sample page
Phase 5/6. Adds a "Visual picker" tab as the first stop in the
template-build flow. The sample PDF page is rasterized with
``pypdfium2`` (capped at ~900px wide for sensible display), and
``streamlit-drawable-canvas`` overlays drawing tools on top.

UX:

- **Line mode** — drag short (roughly vertical) strokes where you
  want columns to split. Each stroke's x-midpoint becomes one
  boundary in PDF point coordinates.
- **Rect mode** — drag a rectangle around the transactions
  table; bbox is preserved on the template as
  ``visual.table_bbox`` for round-trip and for future use as a
  hard crop region.
- **Transform mode** — move/resize already-drawn shapes after
  the fact.

Round-trip: re-entering Build mode with an existing template
seeds the canvas with full-height vertical lines for every
boundary already on the template, plus the saved bbox if any,
so editing-after-save matches the user's mental model.

Coordinate translation: the canvas reports pixel positions; we
divide by the renderer's pixels-per-PDF-point scale to get back
to PDF coordinates that ``apply_template`` already expects. No
template-schema change required — the boundaries the picker
writes are the same list the text-input editor wrote in
commit 3, just sourced visually.
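
The coordinate translation described above can be sketched as follows. This is a minimal illustration, not the shipped code: the function name ``strokes_to_boundaries`` and the ``(x1_px, x2_px)`` stroke shape are assumptions made for the example; only the "x-midpoint divided by pixels-per-point scale" rule comes from the commit message.

```python
def strokes_to_boundaries(strokes, scale):
    """Convert drawn line strokes (pixel coords) to PDF-point x-boundaries.

    strokes: iterable of (x1_px, x2_px) endpoint pairs (hypothetical shape).
    scale:   renderer pixels per PDF point, as returned next to the image.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    boundaries = []
    for x1_px, x2_px in strokes:
        mid_px = (x1_px + x2_px) / 2.0          # stroke's x-midpoint in pixels
        boundaries.append(round(mid_px / scale, 2))  # back to PDF points
    return sorted(boundaries)


# A page rendered at 2 pixels per PDF point, with two roughly vertical strokes.
print(strokes_to_boundaries([(100, 104), (40, 44)], scale=2.0))
```

Sorting the result keeps the list usable as ordered interior boundaries regardless of the order in which the strokes were drawn.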

New helper in the extraction module:

- ``render_page_image(pdf_bytes, page_no, target_width=900)`` —
  rasterize a single 1-indexed page to a PIL image; returns
  ``(image, scale)`` for coordinate translation.

The text-input boundary editor in the Columns tab remains as a
fallback for power users / keyboard-only workflows and for
copy-paste from spreadsheet-derived x-positions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 22:52:54 +00:00
5a8e2ec9e1 feat(pdf): batch extract polish — ZIP output, sort-by-date, status block
Phase 4/6. Polishes the batch workflow shipped in commit 3:

- **st.status progress block** replaces the simple progress bar.
  Each file appears as its own line as it's processed; the block
  auto-collapses on completion with a "12/13 extracted" summary
  and turns red if any file errored.
- **Sort combined output by date** checkbox (default ON) sorts
  the merged CSV ascending by date, with source_file as a stable
  secondary sort so multiple statements interleave by date but
  same-day rows from the same file stay together.
- **ZIP-of-per-PDF-CSVs output option** alongside the combined
  CSV. When the accountant has 12 statements from 12 different
  account periods and wants to feed them into 12 separate ledger
  imports, the ZIP keeps each file's rows in its own CSV named
  after the original PDF stem.
- **Per-file summary table** gets a ``status`` column ("ok" /
  "no rows" / "error: ExceptionName") so error grouping is
  obvious at a glance. The table itself shipped in commit 3;
  this commit adds the status field.
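
A rough sketch of the two output options above, using only the standard library. The helper names, the row-dict shape, and the ISO-formatted ``date`` strings are assumptions for illustration; the real tool works on DataFrames.

```python
import csv
import io
import zipfile
from pathlib import PurePath


def sort_rows(rows):
    """Sort by date ascending, with source_file as the secondary key.

    sorted() is stable, so same-day rows from one file keep their order.
    Assumes ISO dates (YYYY-MM-DD), which sort correctly as strings.
    """
    return sorted(rows, key=lambda r: (r["date"], r["source_file"]))


def zip_per_pdf_csvs(rows_by_pdf, fieldnames):
    """Return ZIP bytes holding one CSV per source PDF, named after its stem."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for pdf_name, rows in rows_by_pdf.items():
            text = io.StringIO()
            writer = csv.DictWriter(text, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
            zf.writestr(PurePath(pdf_name).stem + ".csv", text.getvalue())
    return buf.getvalue()
```

The returned bytes can be handed straight to a download widget; nothing is written to disk.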

Cancellation is intentionally not added — Streamlit's single-
thread rerun model has no clean way to interrupt a tool-run
mid-stream without architectural changes to extraction. If a
user mis-fires Extract on 50 PDFs they can refresh the browser
tab; the task will be killed when the next interaction comes in.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 22:51:05 +00:00
2f349e8191 feat(pdf): tool page with Extract / Build / Manage modes
Phase 3/6. Wires the PDF Extractor into the GUI as a new
"transformations" tool with three modes selected by a horizontal
radio at the top of the page:

**Extract** — pick a saved template, upload one or more
statement PDFs (single + batch shipping together to keep the
common case one-step), get a previewed DataFrame + CSV download.
Per-file row counts and warnings are surfaced; failures on one
file don't kill the whole batch. The combined CSV gets a
``source_file`` first column so the accountant can sort/filter
by statement.

**Build template** — load an existing template or start fresh,
upload a sample PDF, edit every schema field across four tabs
(Pages & table / Columns / Parsing / Save). A live preview below
re-runs ``apply_template`` against the sample on each re-render
so the user sees their changes hit rows immediately. The column-
boundary editor is text-input ("comma-separated x-positions") for
now — replaced by the drawable-canvas visual picker in commit 5.

**Manage templates** — list with rename / delete / export
(downloads the canonical JSON) / import (uploads someone else's
JSON, validated through ``template_from_json``).

Heavy work (``extract_pages_auto``) only runs on explicit user
action (Extract / a new sample upload), and the parsed Page list
is cached in ``st.session_state`` so widget-edit reruns don't
re-parse the PDF.

Logging: tool runs and template saves both hit the audit log via
``log_event("tool_run", …)``, matching every other tool's
instrumentation pattern.

Registered in ``tools_registry.py`` under ``transformations``
with status ``Ready`` and the picture-as-pdf Material icon. i18n
keys added for en + es ("PDF to CSV" / "PDF a CSV").

OCR is wired in this commit — ``extract_pages_auto`` already
falls back through ``pytesseract`` when the binary is available,
and the warning strings it returns surface as ``st.info`` /
``st.warning`` per-file. Commit 6 will polish the OCR UX with a
status row.

Next commits build on this page:
  4 — batch progress + cancellation + per-file error grouping
  5 — drawable-canvas visual picker replaces text x-positions
  6 — OCR availability banner + scanned-page indicators

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 22:49:44 +00:00
aea520d2f7 feat(pdf): template storage layer (load/save/list/import/export)
Phase 2/6. Persists "how to read this bank's statements" as JSON
files under ``~/.datatools/pdf_templates/<slug>.json`` so an
accountant can build one template per source and reuse it across
every statement that follows the same layout.

Public API:

- ``new_template(name)`` — blank with sensible defaults
- ``save_template(t)`` — validate + atomic write (temp + rename)
- ``load_template(slug)`` / ``delete_template(slug)``
- ``list_templates()`` — sorted summaries, skips corrupt files
- ``template_to_json`` / ``template_from_json`` — portability
- ``validate_template(t)`` — returns (ok, errors) list for GUI

Schema is documented in the module docstring. Versioned via
``schema_version: 1`` so future fields don't break saved files
silently — ``load_template`` refuses unknown versions instead of
limping along with missing keys.

Validation contract enforces:
- non-empty name + slug (lowercase alphanumeric + hyphens)
- at least two output columns
- at least one column mapped to ``date``
- either one ``amount`` column OR both ``amount_debit`` +
  ``amount_credit``
- column boundary count consistent with source-column count
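
The contract above could look roughly like this. The template keys used here (``columns``, ``maps_to``, ``boundaries``) are hypothetical, since the real schema lives in the module docstring; the "N boundaries for N+1 columns" rule is likewise an assumption carried over from the extraction commit.

```python
import re

SLUG_RE = re.compile(r"^[a-z0-9-]+$")  # lowercase alphanumeric + hyphens


def validate_template_sketch(t):
    """Return (ok, errors) for a template dict with an assumed shape."""
    errors = []
    if not str(t.get("name", "")).strip():
        errors.append("name is empty")
    if not SLUG_RE.match(str(t.get("slug", ""))):
        errors.append("slug must be lowercase alphanumeric + hyphens")

    columns = t.get("columns", [])
    roles = [c.get("maps_to") for c in columns]
    if len(columns) < 2:
        errors.append("need at least two output columns")
    if "date" not in roles:
        errors.append("no column mapped to date")

    has_single = roles.count("amount") == 1
    has_pair = "amount_debit" in roles and "amount_credit" in roles
    if not (has_single or has_pair):
        errors.append("need one amount column or both debit + credit")

    # Assumption: N interior boundaries split the row into N + 1 columns.
    if len(t.get("boundaries", [])) != max(len(columns) - 1, 0):
        errors.append("boundary count does not match column count")
    return (not errors, errors)
```

Returning every error at once, rather than raising on the first, lets a form show all problems in one pass.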

Storage is atomic: ``_atomic_write`` goes through a temp file +
``os.replace`` so a crashed save can't leave a half-written JSON
at the canonical path. The GUI's build flow saves on most
visual-picker changes, so this matters more here than for a
"save button" workflow.
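
For readers unfamiliar with the temp-file + ``os.replace`` pattern, a minimal sketch follows. The function name ``atomic_write_json`` is illustrative and is not the module's ``_atomic_write``; its actual body is not shown in this log.

```python
import json
import os
import tempfile


def atomic_write_json(path, payload):
    """Write JSON so readers see either the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)
    # The temp file must live in the same directory (same filesystem)
    # for os.replace to be an atomic rename.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(payload, fh, indent=2)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Roll back: drop the partial temp file, leave the target untouched.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```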

24 tests cover slugify, defaults, validation branches, round-trip
load/save, missing/corrupt file handling, delete, list (incl.
skipping corrupt files), atomic-write rollback, and import/export.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 22:46:44 +00:00
b8aff862ed feat(pdf): add pure PDF→DataFrame extraction module
Phase 1/6 of the PDF Extractor tool. Pure module — no Streamlit,
no user-config I/O — that turns a PDF blob plus a template dict
into a ``pandas.DataFrame`` of transaction rows. Primary use case
is accountant-style extraction of bank-statement transactions,
where each bank's format is encoded as a reusable template.

Pipeline:

1. ``extract_pages(pdf_bytes)`` reads with pdfplumber and surfaces
   words with bounding boxes.
2. ``cluster_rows(words)`` groups words into rows by ``top``
   tolerance — no reliance on PDF table-line detection (most bank
   statements have no visible cell borders).
3. ``assign_columns(row_words, boundaries)`` buckets each word by
   its horizontal midpoint into N+1 columns defined by N interior
   x-boundaries.
4. ``_within_table_window`` slices to the band between the header
   line and the end-marker (e.g. "Closing balance").
5. ``apply_template`` orchestrates the above, handling:
   - parens-style negative amounts, currency stripping, custom
     decimal/thousands separators
   - separate debit + credit columns combined into a single signed
     ``amount`` (credit positive, debit negative — accounting
     register convention; matches QuickBooks/Xero imports)
   - multi-line description wrapping (rows with empty date column
     attach to the previous row's description)
   - row-level regex skip filters (e.g., "Total", "Subtotal")
   - page-range filters ("all", "2-", "1,3-5")
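
Steps 2 and 3 and the parens-style amount handling can be sketched as below. This is an assumption-laden illustration: the word-dict keys (``text``, ``x0``, ``x1``, ``top``), the default tolerance, and the helper ``parse_amount`` are invented for the example and may differ from the module's ``WordBox`` type and internals.

```python
import re
from bisect import bisect_right


def cluster_rows(words, tolerance=3.0):
    """Group words into rows whose ``top`` is within tolerance of the row's first word."""
    rows = []
    for word in sorted(words, key=lambda w: (w["top"], w["x0"])):
        if rows and abs(word["top"] - rows[-1][0]["top"]) <= tolerance:
            rows[-1].append(word)
        else:
            rows.append([word])
    return rows


def assign_columns(row_words, boundaries):
    """Bucket words by horizontal midpoint into len(boundaries) + 1 cells.

    ``boundaries`` must be sorted ascending.
    """
    cells = [[] for _ in range(len(boundaries) + 1)]
    for word in sorted(row_words, key=lambda w: w["x0"]):
        mid = (word["x0"] + word["x1"]) / 2.0
        cells[bisect_right(boundaries, mid)].append(word["text"])
    return [" ".join(cell) for cell in cells]


def parse_amount(text, decimal=".", thousands=","):
    """Parse strings like '(1,234.50)' or '$12.00' into a signed float."""
    cleaned = text.strip()
    negative = cleaned.startswith("(") and cleaned.endswith(")")
    cleaned = cleaned.strip("()").replace(thousands, "").replace(decimal, ".")
    cleaned = re.sub(r"[^0-9.\-]", "", cleaned)  # drop currency symbols, spaces
    value = float(cleaned)
    return -abs(value) if negative else value
```

Bucketing by midpoint rather than by left edge means a word that slightly overhangs a boundary still lands in the column where most of it sits.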

Optional OCR fallback for scanned statements:

- ``page_has_extractable_text`` heuristic flags pages with <5
  words as likely-scanned.
- ``ocr_available()`` checks both the ``pytesseract`` Python
  binding and the Tesseract binary; surfaces a clear reason
  string when either is missing.
- ``extract_pages_auto`` does text-first, OCR-the-blanks, and
  returns warnings the UI can surface.
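
One way the two checks could be written, using only the standard library. The ``_sketch`` names and the reason strings are made up for illustration; the sketch also assumes the binary is named ``tesseract`` and is discoverable on ``PATH``, which the real ``ocr_available()`` may check differently.

```python
import importlib.util
import shutil


def ocr_available_sketch():
    """Return (available, reason); reason is empty when OCR can run."""
    if importlib.util.find_spec("pytesseract") is None:
        return False, "pytesseract Python package is not installed"
    if shutil.which("tesseract") is None:
        return False, "Tesseract binary not found on PATH"
    return True, ""


def page_has_extractable_text_sketch(words, min_words=5):
    """Pages with fewer than min_words words are treated as likely scanned."""
    return len(words) >= min_words
```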

29 unit tests cover the parsing pipeline against synthetic
WordBox/Page data — no fixture PDFs required, runs in 0.1s. Real
PDF extraction is exercised by hand on the user's statements.

Dependencies added:
- ``pdfplumber>=0.10,<1`` — text + position extraction
- ``pypdfium2>=4,<6`` — page rasterization for OCR + visual picker
- ``streamlit-drawable-canvas>=0.9,<1`` — visual region picker
  (used in commit 5)
- ``pytesseract>=0.3,<1`` — OCR (used in commit 6; system
  Tesseract binary required separately)
- ``cryptography>=41,<49`` — bumped upper bound; pdfminer.six
  transitively requires a recent release. Internal ed25519
  license-signing usage is API-stable across the bump.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 22:44:51 +00:00
c16e2a5e29 feat(audit): surface log path + /logs link in Help popover
Adds a "Log file" section to the sticky-footer Help popover with
two affordances:

1. The current audit-log path rendered as monospace text with
   ``user-select: all`` so a single click selects the whole path
   for copy-paste into a file manager. Works on every platform —
   no subprocess required.
2. A "View all logs →" link to the new ``/logs`` page (added in
   the previous commit) for download/inspection of today's and
   prior days' files.

i18n keys ``footer.help_logs_label`` + ``footer.help_logs_link``
added to en + es packs, matching the existing
``footer.help_*`` naming.

``audit_log_path()`` is wrapped in try/except because a broken
audit module MUST NOT take the footer down — falls back to "—".
Same defensive pattern the license section uses.
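
The defensive wrapper amounts to something like the sketch below. ``footer_log_label`` and its ``get_path`` parameter are hypothetical names; the fallback string is the dash character the commit mentions, written as an escape.

```python
def footer_log_label(get_path):
    """Return the log path as text, or a dash if the audit module misbehaves."""
    try:
        return str(get_path())
    except Exception:
        # The footer must keep rendering even if the audit module is broken.
        return "\u2014"
```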

Rollback: ``git revert HEAD`` removes the section; the popover
and its layout return to the prior shape with zero coupling to
the audit module.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 21:26:53 +00:00
7c9139f199 feat(audit): /logs page — view + download recent audit log files
Adds a Streamlit page at ``/logs`` listing every
``datatools-*.jsonl`` file in ``audit_log_dir()`` (7-day window
per the retention sweep in b3ae913). Each entry shows filename,
mtime, byte size, and a ``st.download_button``. Today's file
gets its own section at the top.

The page also surfaces both paths as copyable monospace text:
the active log path (so users can grep/cat it directly on their
machine) and the folder path (so they can paste into Explorer /
Finder).

Wired into navigation via ``st.Page("pages/_Logs.py", ...)`` with
``url_path="logs"``. The sidebar entry is hidden by the same
``hide_streamlit_chrome`` CSS rule that hides ``/activate`` and
``/close`` — same pattern, same ``:has()`` + plain-fallback
selectors so the LinkContainer collapses cleanly in modern
browsers and the anchor is at least un-clickable in older ones.

License gate is OFF for this page (``gate_license=False``) — if a
user's license expires they may need logs to file a support
request; locking them out of their own audit history would be
hostile.

Next commit will wire the popover link.

Rollback: ``git revert HEAD`` removes the page and its nav entry;
the audit log itself keeps working.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 21:24:46 +00:00
b3ae913bb9 feat(audit): daily filename + 7-day retention sweep
Replaces the per-session ``datatools-<ts>-<sid>.jsonl`` filename
with a single daily file ``datatools-YYYY-MM-DD.jsonl`` (local
date). Sessions on the same calendar day share a file via the
writer thread's per-batch open+append; multiple DataTools
instances running concurrently on the same day fan into the same
file (append-mode small writes are atomic on POSIX, safe-enough on
Windows under realistic load).

Drops the ``_LOG_PATH`` module global and the lock around it —
``audit_log_path()`` is now pure date math, recomputed on every
call so a session that crosses midnight follows the rollover into
the next day's file.
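
"Pure date math" here could be as small as the following sketch. ``daily_log_path`` and its ``log_dir`` / ``now`` parameters are illustrative names; the real ``audit_log_path()`` takes no such arguments as far as this log shows.

```python
from datetime import datetime
from pathlib import Path


def daily_log_path(log_dir, now=None):
    """Compute today's log file from the local wall-clock date."""
    now = now or datetime.now()  # local time on purpose, not UTC
    return Path(log_dir) / f"datatools-{now:%Y-%m-%d}.jsonl"
```

Because nothing is cached, a call made just after midnight naturally returns the next day's path.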

Adds ``_sweep_old_logs()`` invoked once per process at writer-
thread start. Deletes any ``datatools-*.jsonl`` whose mtime is
older than 7 days. The glob deliberately matches the legacy
per-session filename too, so users upgrading from the previous
build don't keep a permanent backlog of pre-retention files.
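
A minimal sketch of an mtime-based sweep with the same glob. The name ``sweep_old_logs``, the ``now`` parameter, and the returned list are assumptions added to make the example testable.

```python
import time
from pathlib import Path

RETENTION_SECONDS = 7 * 24 * 60 * 60  # 7 days


def sweep_old_logs(log_dir, now=None):
    """Delete datatools-*.jsonl files older than the retention window."""
    now = time.time() if now is None else now
    removed = []
    # The glob matches both daily and legacy per-session filenames.
    for path in Path(log_dir).glob("datatools-*.jsonl"):
        try:
            if now - path.stat().st_mtime > RETENTION_SECONDS:
                path.unlink()
                removed.append(path.name)
        except OSError:
            continue  # a locked or vanished file should not stop the sweep
    return sorted(removed)
```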

Event ``ts`` fields stay UTC; only the filename uses local date,
because users go looking for "today's log" on their wall clock.

Tests cover: daily filename shape, sweep removes stale files,
sweep keeps fresh files, sweep also clears legacy filenames.

Rollback: ``git revert HEAD`` restores the per-session filename
and removes the sweep. No data migration needed either way —
existing files keep working as JSONL.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 21:22:47 +00:00
ba07dcb6c7 feat(audit): re-enable audit log (kill switch off by default)
Phase 1 diagnostic build validated end-to-end on the user's machine:
session cf2ebbd5 (2026-05-19) produced session/upload/analyze/nav/
session-end events with no blank-pages regression. Root cause of the
original symptom was the audit_log_path/_session_id deadlock fixed in
a8ff8f4 — the kill switch is no longer load-bearing.

Flips ``_DISABLED: True`` → ``False`` so the default install writes a
log. The three env-var overrides (``DATATOOLS_AUDIT_ENABLED``,
``DATATOOLS_AUDIT_TRACE``, ``DATATOOLS_AUDIT_PROBE``) and the writer-
thread BaseException guard from 76c9f5a stay in place as escape
hatches if the symptom ever recurs.

TestKillSwitchContract continues to pass — it monkeypatches
``_DISABLED = True`` explicitly and doesn't rely on the module default.

Rollback: ``git revert HEAD`` flips the switch back without removing
the diagnostic instrumentation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 17:50:28 +00:00
76c9f5a679 feat(audit): diagnostic instrumentation env vars + writer-thread guard
Phase 1 of the audit-log re-enablement plan. Adds three opt-in env
vars that let us ship one instrumented build for the user to run,
without flipping the kill switch on for everybody. **Default
behaviour is byte-identical to today**: with no env vars set the
kill switch wins, no writer thread starts, no file is written, no
stderr line is printed.

Env vars (do NOT set in prod):

- ``DATATOOLS_AUDIT_ENABLED=1`` — bypass ``_DISABLED`` for one
  session. ``_DISABLED = True`` stays in the source so an upgrade
  with no env var is still safe.
- ``DATATOOLS_AUDIT_TRACE=1`` — print ``[audit] ...`` lines to
  stderr at module import, every writer-thread state change, and
  every producer entry point. Lets the user share a small log
  instead of attaching a debugger.
- ``DATATOOLS_AUDIT_PROBE=<value>`` — bisect the producer path
  for Phase 2. Values: ``full`` (default), ``noop``, ``no-events``,
  ``no-page-open``, ``no-session-start``. The named variants
  return early from the corresponding ``log_*`` function so we can
  isolate which call is implicated in the blank-pages symptom.
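
How the three variables might gate a producer, as a sketch. The helper names, the dict of flags, the ``"1"``-only truthiness, and the fallback to ``full`` for unknown probe values are all assumptions; the commit only lists the variable names and probe values.

```python
import os

PROBE_VALUES = {"full", "noop", "no-events", "no-page-open", "no-session-start"}


def read_audit_flags(env=None):
    """Read the diagnostic env vars into a plain dict."""
    env = os.environ if env is None else env
    probe = env.get("DATATOOLS_AUDIT_PROBE", "full").strip().lower()
    if probe not in PROBE_VALUES:
        probe = "full"  # assumption: unknown values behave like the default
    return {
        "enabled": env.get("DATATOOLS_AUDIT_ENABLED") == "1",
        "trace": env.get("DATATOOLS_AUDIT_TRACE") == "1",
        "probe": probe,
    }


def should_log_event(flags, disabled=True):
    """Decide whether a log_event-style producer should do any work."""
    if disabled and not flags["enabled"]:
        return False  # kill switch wins unless explicitly overridden
    return flags["probe"] not in {"noop", "no-events"}
```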

Also:

- ``_writer_loop`` gets an outer ``try/except BaseException`` so
  silent thread death now surfaces a ``"writer thread died: ..."``
  line in the launcher terminal instead of looking like a hang.
- Existing first-write-failure stderr print gets ``flush=True`` so
  the user actually sees it before the process is killed.
- Test fixture switches from the previous-commit ``_DISABLED = False``
  override to ``_ENABLE_OVERRIDE = True`` so tests exercise the same
  bypass path the diagnostic build uses.
- Two new tests pin the safety contract: with the kill switch on
  and no override, every producer is a true no-op (no writer
  thread, no file). And ``DATATOOLS_AUDIT_PROBE=no-events`` bypasses
  ``log_event`` even when the override is on — guards the bisect.
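
The writer-thread guard can be pictured roughly as follows. The queue-based loop, the ``None`` shutdown sentinel, and the ``write`` callback are assumptions for the sketch; only the outer ``try/except BaseException`` and the flushed stderr line reflect the commit message.

```python
import queue
import sys
import threading


def writer_loop(q, write, stderr=sys.stderr):
    """Drain q until a None sentinel; report any crash instead of dying silently."""
    try:
        while True:
            item = q.get()
            if item is None:
                return
            write(item)
    except BaseException as exc:
        # Without this, a dead writer thread looks like a hang to the user.
        print(f"[audit] writer thread died: {exc!r}", file=stderr, flush=True)
```

The thread would be started with ``threading.Thread(target=writer_loop, args=(q, write), daemon=True)`` in a setup like this one.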

Rollback: ``git revert HEAD`` removes Phase 1 cleanly. The deadlock
fix from the previous commit stays in place.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 14:46:27 +00:00
a8ff8f4bd0 fix(audit): break audit_log_path/_session_id deadlock
Pre-existing latent bug since d9e32e5: ``audit_log_path()`` acquires
the non-reentrant ``_LOCK`` and, while holding it, calls
``_session_id()`` which also takes ``_LOCK``. On a clean module state
(both ``_LOG_PATH`` and ``_SESSION_ID`` unset) the first caller
deadlocks.

``log_session_start`` triggers it in practice — it's the first GUI
call after import and the ``log_file=str(audit_log_path())`` arg is
evaluated before any ``log_event`` has had a chance to lazy-init the
session id. Strong candidate contributor to the blank-pages symptom
the kill switch was put back to mask: the writer thread (and any
producer reaching ``audit_log_path``) would freeze forever, and
Ctrl+C would not free the GIL — matches the launcher-can't-be-killed
behaviour reported in 1caedbb.

Fix: resolve the session id BEFORE acquiring ``_LOCK`` in
``audit_log_path``. ``_session_id`` already double-checks under its
own lock, so the call is safe and self-synchronising.
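
The shape of the fix, reduced to a self-contained sketch. The module globals mirror the names in the commit message, but the bodies are reconstructed for illustration (including the simplified filename and the ``log_dir`` parameter) and are not the project's code.

```python
import threading
import uuid
from pathlib import Path

_LOCK = threading.Lock()  # non-reentrant: re-acquiring in the same thread blocks
_SESSION_ID = None
_LOG_PATH = None


def _session_id():
    global _SESSION_ID
    if _SESSION_ID is None:          # fast path without the lock
        with _LOCK:
            if _SESSION_ID is None:  # double-check under the lock
                _SESSION_ID = uuid.uuid4().hex[:8]
    return _SESSION_ID


def audit_log_path(log_dir="logs"):
    global _LOG_PATH
    sid = _session_id()              # resolved BEFORE taking _LOCK (the fix)
    with _LOCK:
        if _LOG_PATH is None:
            _LOG_PATH = Path(log_dir) / f"datatools-{sid}.jsonl"
        return _LOG_PATH
```

Moving the ``_session_id()`` call inside the ``with _LOCK:`` block reproduces the original deadlock on a clean module state, because the second acquisition can never succeed.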

Test fixture in ``tests/test_audit.py`` now bypasses the kill switch
via ``monkeypatch.setattr(audit, "_DISABLED", False)`` — env vars are
captured at import time and ``monkeypatch.setenv`` won't reach the
module-level flag. With the fix in place, all 6 tests pass in 0.15s;
without it, ``test_session_start_renders`` (and any test exercising
the log_session_start path) hangs indefinitely.

Kill switch behaviour is unchanged in production (``_DISABLED = True``
in the shipped module); this is purely a correctness fix for the
code path that gets exercised when the switch is off.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 14:45:08 +00:00
4451f74895 fix(layout): bump bottom block-container padding 4rem → 7rem
Last lines on long tool pages were still grazing the fixed Help/Close
footer when scrolled all the way down. 4rem covered only the space
the footer itself claims and left no breathing room: the bottom button
or text sat visually flush against the footer's top edge. 7rem buys
~3rem of clear space on every page so the last content row reads
without obstruction.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 02:32:13 +00:00
a022059b1e chore: drop accidentally-tracked scratch screenshot
2026-05-19 02:30:01 +00:00
69240fc922 fix(home,close): tool-link preserves file context + drop close-page explanation
(1) ``[Tool] →`` action links inside per-file finding rows now
preserve the file that the card belongs to. Previously the home page
re-set ``home_uploaded_*`` to the FIRST imported file on every rerun
— so when a user with multiple imports clicked
``Clean Text →`` on file_B's findings card, the tool page loaded
file_A. The click handler in ``_render_finding_row_v2`` now looks
the file up in ``home_uploads`` by the findings-card filename and
writes ``home_uploaded_name / size / bytes`` BEFORE
``st.switch_page``, so the tool's ``pickup_or_upload`` reads the
right context.

The filename threads through ``render_findings_panel(..., header=)``
→ ``_render_finding_row_v2(..., filename=)``; ``header`` is already
the filename today, so no call-site change needed.

(2) Close screen "explanation" removed. The long browser-restriction
hint paragraph (``quit.close_hint``: "Browsers don't let JavaScript
close a tab you opened yourself …") is gone from the farewell overlay
— the auto-dismiss path lands the user on about:blank within ~1.5s
of the close click, so the explanation never had a chance to be
useful. ``autoDismiss`` simplified to "try close, else redirect"
without the hint-surface step. The i18n key is retained as a no-op
in case the hint comes back.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 02:29:49 +00:00
9a7d861903 fix(ui): bottom padding + close-screen button removed + sidebar collapse + quiet loguru
Four issues batched together since they all touch the GUI shell:

- ``stMainBlockContainer``'s ``padding-bottom`` bumped from 0.75rem
  → 4rem (~one button-height of free space above the fixed Help/Close
  footer). The last line of content on a page that fills the viewport
  was previously sitting flush against the footer's top border.

- Farewell overlay's "Close this window" button removed per UX
  request. The auto-dismiss path is now the only flow: try
  programmatic close (works in Chrome/Edge ``--app`` windows);
  failing that, surface the hint and redirect the parent window to
  ``about:blank`` after a short timeout. Previously the user had to
  click the button to get the same fallback. The
  ``quit.close_window_button`` i18n key is retained as a no-op for
  now in case the button comes back; nothing references it.

- Sidebar collapse → expand was broken: clicking « collapsed the
  sidebar but the » expand-back affordance was invisible. Two causes
  pulled apart:

   1. ``.dt-brand { flex: 1 }`` was eating the entire
      ``stSidebarHeader`` width, squeezing Streamlit's
      ``stSidebarCollapseButton`` off the right edge. Changed to
      ``margin: 0 auto 0 0`` so the brand keeps its natural width
      and the chevron has room to live next to it.

   2. The "hide Streamlit chrome" toolbar block was listing
      ``stToolbar`` and ``stToolbarActions`` for ``display: none``
      — but the post-collapse re-open button
      (``stExpandSidebarButton``) lives inside ``stToolbar``, so
      hiding the container killed the button too. Dropped both
      container testids from the hide list and kept the per-icon
      rules for ``stMainMenu`` / ``stAppDeployButton`` /
      ``stStatusWidget`` / ``stDecoration``.

- Loguru's stderr sink quieted in GUI mode. ``src/gui/app.py`` now
  runs ``logger.remove()`` + ``logger.add(sys.stderr, level="ERROR",
  …)`` at the top so internal ``logger.debug`` / ``logger.warning``
  breadcrumbs (e.g.
  ``standardize_dataframe: 7/31 cells were unparseable``) no longer
  print to the terminal when the user runs ``python -m src.gui``.
  CLI entry points already do the same configuration per-script.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 02:21:41 +00:00
1016a4d2c4 feat(home,sidebar): brand hero + sidebar = footer style + PNG icon
Bundles a handful of UX cleanups:

- Findings-card chevron moved to the LEFT side of the head. CSS still
  rotates it 90° between collapsed/expanded states.

- Tool-link buttons in findings rows (``Clean Text →`` etc.) are now
  left-justified against the icon column with minimal surrounding
  whitespace. Action column ratio dropped from 1.8 → 1.4 and the
  button switched from ``width="stretch"`` (centered text) to
  ``width="content"`` (shrinks to fit, left-aligned within column).

- Home-page hero now mirrors the sidebar brand block: 56px ink "D"
  chip on the left + "UNALOGIX" eyebrow stacked above "DataTools"
  wordmark, then the "Clean. Normalize. Transform." tagline beneath.
  New ``.dt-page-brand / -row / -words / -mark / -eyebrow /
  -wordmark`` rules in ``_DESIGN_TOKENS_CSS``. Streamlit wraps h1
  elements in an emotion-cache div with extra padding; a descendant
  flattener (``.dt-page-brand-words *`` margin:0 / padding:0) keeps
  the eyebrow + wordmark stack the same height as the chip so they
  center-align cleanly.

- Sidebar nav restyled to match the sticky-footer Help/Close buttons
  exactly: 13px / 500 / 1.3 line-height, 5×10px padding, 8px gap
  between icon and label, transparent background. Active item gets
  the same ``rgba(0,0,0,0.04)`` tint as the hover state (no white
  pill, no shadow), only the heavier weight + ink text distinguishes
  it.

- OS app icon (page_icon) switched from SVG to a Pillow-rendered
  ``datatools_icon_256.png`` so Windows / macOS taskbar+dock pick
  it up reliably (some OS shells fall back to a default icon for
  SVG favicons). Rounded-square ink ground with cream "D" centered —
  same mark as the sidebar chip + hero chip.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 02:04:53 +00:00
6c3939d21b feat(brand): "Letter D (sans)" app icon — favicon + sidebar chip
Implements ``Business/DataTools/app_icons.html`` §03 "Letter D (sans)"
as the canonical app mark.

- New ``src/gui/assets/datatools_icon.svg`` — 64×64 SVG, 14px corner
  radius, ink ground (#1c1917), cream "D" (#fef4ed) in
  Geist 700 / -0.04em tracking. Pure SVG so it renders sharp at
  every favicon size; font stack falls back through Geist →
  system sans where the webfont isn't installed (favicons can't load
  Google Fonts).

- ``_home.py``, ``_Activate.py``, ``99_Close.py``: page_icon now
  resolves the SVG path via ``Path(__file__).parent / "assets" /
  "datatools_icon.svg"`` instead of the broom 🧹 / 🔑 / 🛑
  emojis. Streamlit inlines it as a ``data:image/svg+xml;base64,...``
  link tag so the browser tab + OS app-icon for ``python -m src.gui``
  matches the sidebar chip.

- Sidebar ``.dt-brand-mark`` tightened to match the spec's "Letter D
  (sans)" rendering: ``font-weight: 700`` and
  ``letter-spacing: -0.04em`` (was 600 / -0.02em). The on-screen
  chip is now a scaled-up copy of the OS icon.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 01:50:18 +00:00
d436e34a45 feat(brand): rebrand to UNALOGIX DataTools + Clean. Normalize. Transform.
User-facing copy + brand updates landed together:

- Page H1 + browser-tab title: "DataTools — Data Cleaning Mastery"
  → "UNALOGIX DataTools". Same change in es.json (was "DataTools —
  Maestría en limpieza de datos").
- Hero subtitle: long descriptive caption replaced with the tagline
  "Clean. Normalize. Transform." (es: "Limpia. Normaliza.
  Transforma.").
- Sidebar brand block: wordmark is now two lines — UNALOGIX in tiny
  uppercase tracked eyebrow style on top, DataTools in the 15px
  semibold wordmark beneath. The 28px "D" chip stays as the
  recognizable mark. New ``.dt-brand-eyebrow`` rule in
  ``_DESIGN_TOKENS_CSS``.

Top-right Streamlit chrome cleanup — the user reported two stacked
icon buttons. ``.streamlit/config.toml`` bumped to
``toolbarMode = "viewer"`` (most aggressive — suppresses status
indicator + deploy button + running glyph). CSS belt-and-suspenders
hides ``stToolbar``, ``stToolbarActions``, ``stStatusWidget``,
``stDecoration`` for newer Streamlit releases that keep emitting
these with inline styles even under toolbarMode=viewer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 01:45:38 +00:00
0bb72ecd7e feat(home,sidebar): brand block + collapsible findings + many polish tweaks
Batch of UX tweaks the user asked for in quick succession:

- Sidebar brand block (mockup §brand) — 28px ink chip with a "D"
  wordmark plus the "DataTools" text — injected into
  ``stSidebarHeader`` by a small JS bundled into the iframe-mounted
  script that already runs from ``hide_streamlit_chrome``. The
  Streamlit ``stLogoSpacer`` is hidden when the brand block is
  present so it sits flush at the top of the sidebar.

- Findings cards are now collapsible. Each file's card head carries
  ``data-dt-collapsed="true"`` on first render; clicking the head
  flips the attribute via the new ``_WIRE_COLLAPSIBLE_FINDINGS_JS``
  (MutationObserver re-wires after reruns). A CSS rule
  ``[stElementContainer]:has(.dt-finding-group-head[data-dt-collapsed
  ="true"]) ~ *`` hides every later sibling of the head's element
  container — covers both ``stLayoutWrapper`` (the columns rows in
  this Streamlit release) and ``stElementContainer`` so the rule
  survives future Streamlit layout renames. A chevron icon
  (``chevron_right``) rotates 90° when expanded. The head itself
  gets ``cursor: pointer`` + an accent-fill hover.

- Tool-link buttons in finding rows dropped the leading ``Open`` —
  now read ``Clean Text →``, ``Standardize Formats →`` etc.

- Finding-row column order: action is now LEFT of the description,
  matching user feedback (``[icon] [Tool →] [description + meta]``).

- Head padding bumped to ``16px 22px`` so the filename has visible
  breathing room from the card's left edge (previously the mono
  filename felt like it was bleeding into the rounded corner).

- Head margin-bottom bumped to 1.5rem for breathing room before the
  first finding row when expanded; collapsed state tucks the head
  flush against the card bottom with full ``--r-lg`` corner radius
  and no visible bottom border.

- Files card row layout: ``✕`` button moved to the LEFT of the
  filename (``[✕] [chip + filename] [size]``).

- Sidebar nav rows tightened: link padding 7px → 4px, line-height
  1.25, 1px margin-bottom per li, section-header padding-top reduced.
  Plus a new ``--gap: 0.25rem`` rule for vertical blocks inside
  bordered containers so the Files card and findings card body have
  denser inter-row spacing.

- Sidebar Language selector restyled: widget labels render as the
  spec's "Eyebrow" row (11.5px / 500 / 0.08em uppercase, tertiary
  ink), selectbox combobox gets a paper surface + soft border that
  matches the rest of the sidebar chrome.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 01:40:22 +00:00
74d0ee270f chore(home): remove "Export report" button
The disabled "Export report" placeholder is gone — it wasn't tied to
a real feature and was just noise in the action bar. Action bar is
back to two buttons (Run analysis · Clear results) on a 1:1:4
column split. ``upload.export_report`` keys removed from en + es
i18n packs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 01:17:43 +00:00
06f1ea6cf7 fix(buttons,footer): unify disabled state + restyle Help/Close as nav links
(3) Disabled primary buttons no longer read as a "whited-out" dark
slab. Streamlit's primary-button selector
``button[data-testid="stBaseButton-primary"]`` has the same
specificity as our previous ``button:disabled`` selector, so the
primary background + cream text kept winning the cascade tie-break.
The disabled rule's selector list now explicitly matches both the
``kind="primary"``/``kind="secondary"`` shapes AND the
``stBaseButton-primary``/``-secondary`` testids, so disabled
buttons collapse to ``surface-hover`` background, ``ink-tertiary``
label, soft border — same look regardless of starting kind. A
follow-up rule re-asserts ``color: var(--ink-tertiary)`` on every
descendant of the disabled primary so the inner
``stMarkdownContainer > p`` doesn't keep the cream label from the
"all descendants get --bg" primary rule.

(4) The sticky-footer Help + Close buttons now match the sidebar
nav-item look. Old outlined-pill chrome is gone:
``.datatools-footer-btn`` is now display:inline-flex with a
Material-Symbols ligature icon + label, borderless, ``ink-secondary``
text on a transparent surface, ``rgba(0,0,0,0.04)`` hover background.
The Close button keeps a danger tint via ``.close`` so it still reads
as the shut-down action, with a soft ``--danger-fill`` hover. Help
uses the ``help_outline`` icon, Close uses ``power_settings_new``.
Built via a small ``makeFooterBtn`` helper in the iframe JS that
appends the icon span + label text node to the button — keeps the
existing soft-nav click handlers intact.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 01:12:03 +00:00
784695e3a7 fix(home,findings): reclaim top whitespace + add padding under finding head
Two visual cleanups:

1. The block-container "claim padding" rule was a no-op — it targets
   the legacy ``stAppViewBlockContainer`` testid; Streamlit renamed
   it to ``stMainBlockContainer`` in the current release. Updated the
   selector list to match both, so the page title now sits close to
   the top edge again (~0.5rem from the hidden header) instead of
   inheriting Streamlit's default ~6rem header reservation.

2. ``.dt-finding-group-head`` margin tightened to ``margin: -1rem
   -1rem 0.75rem``: -1rem on top/sides still bleeds the head to the
   card edges, but +0.75rem on the bottom is breathing room between
   the head's bottom border and the first finding row, which were
   abutting before.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 01:04:42 +00:00
4816da1ad6 fix(home): show file sizes in KB/MB/GB, never raw bytes
Per-row file sizes and the Files-card total-size meta both read as
human-readable units now. Smallest unit is KB even for sub-kilobyte
files (so ``538 B`` → ``0.5 KB``, ``4914 B`` → ``4.8 KB``), steps up
to MB at 1 MiB and GB at 1 GiB. Always one decimal place.

New module-level helper ``_format_size(int) -> str`` in ``_home.py``;
both the section meta (``1 file · 4.8 KB total``) and the per-row
``dt-file-size`` cell call it instead of the previous ad-hoc
``f"{n:,} B"`` formatter. Keeps the display consistent regardless of
file size — and keeps the GUI free of raw byte counts that nobody
needs to read.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 00:59:56 +00:00
6703e2c15c feat(home): in-card "+ Add more files" replaces Streamlit's dropzone
Mockup §file-add lands as the canonical import affordance:

- Streamlit's ``st.file_uploader`` widget is still mounted (only path
  that actually receives browser file events), but parked off-screen
  via a new ``[data-testid="stFileUploader"] { position:absolute;
  left:-10000px; … pointer-events:none }`` rule. Its hidden
  ``<input type="file">`` stays reachable to JavaScript.
- The Files card is now always rendered (header + bordered body).
  The bottom row of the card is a ``button.dt-file-add`` styled per
  mockup §file-add: dashed top border bleeding to the card edges,
  surface-hover background, ``+ Add more files`` text in
  ``--ink-secondary``, accent-fill on hover.
- A small ``<script>`` shipped through ``st.iframe`` wires the
  button: ``click → input.click()`` on the off-screen
  ``stFileUploaderDropzoneInput``. Streamlit's HTML sanitizer
  strips inline ``onclick`` from ``unsafe_allow_html`` content, so
  the binding has to come from a real script element — same pattern
  the sticky footer and Upload→Import rewriter use. A
  ``MutationObserver`` re-wires the button when Streamlit remounts
  it across reruns. The ``dataset.dtWired`` guard prevents double
  binding.

Section structure also tightened to match the mockup:

- Section heading is now ``<h2>Files</h2>`` (was ``### Import one
  or more files to start``) with the count + total size on the
  right of the same flex row. When no files: ``No files imported
  yet``. When files exist: ``1 file · 4.8 KB total``.
- Dropped the ``upload.intro_multi`` caption and the
  ``upload.empty_state`` info banner — the card itself plus the
  in-card Add button cover both prompts.
- Empty state now ends after the Files card (no stats / no action
  bar / no findings rendered) — matches mockup's single-section
  empty view.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 00:56:11 +00:00
a9788ba712 feat(ui): page header + files card + action bar + findings cards (mockup 2)
Closes the remaining gaps between the live home page and the
``datatools_layout_redesign2.html`` mockup. Four pieces land
together because they all consume the same new CSS scaffold:

1. Page header (§page-header)
   ``st.title`` + ``st.caption`` + ``st.divider`` collapse into one
   flex header: h1 + body subtitle on the left, ``Runs 100% locally``
   privacy pill (success-fill + lock SVG) on the right, soft border
   below. The "Runs 100% locally" phrase moved out of
   ``home.caption`` into the new ``home.privacy_pill`` i18n key
   (en + es).

2. Files card (§files-card)
   The "Imported files" list is now a single bordered card with a
   section head (count + KB total on the right, mockup §section-head).
   Each row renders a 28px accent-fill chip carrying the inline
   document SVG, a mono filename, a right-aligned mono size, and a
   compact ``✕`` button. The word-button ``Remove`` is gone —
   replaced by an icon-only tertiary button styled via a new CSS
   rule that goes transparent → danger-fill on hover (mockup
   §file-remove).

3. Action bar (§action-bar)
   Three buttons in one row: ``Run analysis`` (primary ink), a new
   disabled ``Export report`` (secondary; coming soon, tooltip), and
   ``Clear results``. New i18n key ``upload.export_report``.

4. Findings — per-file group cards (§finding-group)
   ``render_findings_panel`` rewritten end-to-end. Output is now:
     • A head row (``dt-finding-group-head``) bleeding to the card
       edges: worst-severity dot · mono filename · count pills
       enumerating non-zero severities (e.g. ``2 info`` blue,
       ``1 warning`` amber, ``1 error`` rose).
     • A flat list of finding rows sorted error → warn → info.
       Each row: tinted Material-icon chip + title (description
       with optional ``<code>`` column chip) + mono meta line
       (rows affected, samples captured) + tertiary
       ``Open <Tool> →`` action button that ``st.switch_page``s
       to the relevant tool.
   The previous tool-grouped expander stack is dropped — the new
   layout is denser and matches the mockup's single-card-per-file
   structure.

   ``_render_one_finding`` (the old per-finding helper that emitted
   markdown lines + sample tables) remains in the file but is no
   longer called from the home flow; left in place for any other
   surface that still depends on the markdown style.

   The "no issues" success state renders a green dot + mono
   filename + ``no issues`` success pill in the same card chrome,
   so empty-result files visually match the rest of the panel
   rather than getting a generic ``st.success`` callout.
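
The ordering and count-pill logic described in item 4 can be sketched
roughly as below. This is an illustration only: the dict-shaped
findings, the helper names, and the pill order are assumptions, not
the actual ``render_findings_panel`` internals.

```python
from collections import Counter

# Assumed ranking: lower rank sorts first (error, then warn, then info).
SEVERITY_RANK = {"error": 0, "warn": 1, "info": 2}
# Assumed pill wording, following the "1 warning" example in the message.
PILL_LABEL = {"error": "error", "warn": "warning", "info": "info"}


def sort_findings(findings):
    """Return findings ordered error, warn, info (stable within a severity)."""
    return sorted(findings, key=lambda f: SEVERITY_RANK[f["severity"]])


def worst_severity(findings):
    """Severity for the head-row dot; "success" when there are no findings."""
    if not findings:
        return "success"
    return min((f["severity"] for f in findings), key=SEVERITY_RANK.get)


def count_pills(findings):
    """Pill texts for non-zero severities only; zero counts are skipped."""
    counts = Counter(f["severity"] for f in findings)
    return [f"{counts[s]} {PILL_LABEL[s]}" for s in SEVERITY_RANK if counts[s]]
```

Because ``sorted`` is stable, findings of equal severity keep their
original relative order.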

CSS additions (``_DESIGN_TOKENS_CSS``):
  ``.dt-page-header / .dt-page-subtitle / .dt-privacy-pill``
  ``.dt-files-section-head / .dt-section-meta``
  ``.dt-file-row / .dt-file-icon-chip / .dt-file-name / .dt-file-size``
  ``.dt-finding-group-head / .dt-severity-dot{.warn,.info,.error,.success}``
  ``.dt-group-filename / .dt-group-counts``
  ``.dt-count-pill{.warn,.info,.error,.success}``
  ``.dt-finding-row / .dt-finding-icon{.warn,.info,.error}``
  ``.dt-finding-title / .dt-finding-meta``
  Tertiary button rule (transparent → danger-fill on hover) for
  the X button and the ``Open Tool →`` row action.

theme.py:
  Explicitly loads Material Symbols Outlined alongside Geist —
  the severity-chip ligatures (``info`` / ``warning`` / ``error``)
  need the font present even when no ``:material/`` token has been
  emitted yet on the page. Tightened ``.dt-finding-icon .dt-mui``
  selector with ``[data-testid="stMarkdownContainer"]``-scoped
  variant so the Material font wins over theme.py's base
  ``var(--font-sans) !important`` on markdown descendants.

Leading section-heading emojis stripped from i18n
(``upload.heading``) for parity with the mockup's clean ``Files``
/ ``Findings`` h2s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 00:43:42 +00:00
da7d86f457 feat(ui): Material icons in sidebar + stats overview on home
Two pieces of the mockup 2 layout that hadn't landed yet:

1. Sidebar nav icons — emoji glyphs (🧹 ✂️ 🔍 …) swapped for
   Streamlit's ``:material/<name>:`` syntax, picking the outline
   Material Symbol that best matches each mockup SVG:

       Home               → :material/home:
       Fix Missing Values → :material/help_outline:
       Find Unusual Vals  → :material/insights:
       Clean Text         → :material/text_format:
       Standardize Fmts   → :material/format_list_bulleted:
       Find Duplicates    → :material/search:
       Quality Check      → :material/check_circle:
       Map Columns        → :material/view_column:
       Combine Files      → :material/account_tree:
       Auto Workflows     → :material/auto_awesome:
       Activate           → :material/key:
       Close              → :material/close:

   Streamlit injects the icon name as a literal ligature inside a
   first-child ``<span>`` of the nav anchor, expected to render
   through the Material Symbols font. theme.py's base rule was
   forcing Geist on every span under ``stSidebarNav``, turning the
   ligatures back into plain text labels — added a structural
   exception that targets ``[data-testid="stSidebarNavLink"] >
   span:first-child`` (and any descendant), restoring the Material
   font family, neutralizing the inherited ``ss01/cv01/cv11``
   feature settings, and sizing to 18px.

   Also stripped the leading emojis from every page title in the
   en/es i18n packs (``home.title``, ``close_page.title``,
   ``activation.title``, ``tools.*.page_title``) — the icons live
   in the sidebar now, the page H1 no longer needs to carry one.

2. Stats overview on home — new ``_render_stats_overview`` in
   _home.py emits a 4-card grid above the per-file findings panels:
   Files analyzed, Total findings, Warnings (severity ``warn`` ∪
   ``error``), Info (severity ``info``). Card layout follows the
   mockup §stats verbatim — Geist 28px / 600 / -0.03em for the
   numeric value (the "Display number" row in spec §4), tiny
   uppercase tracked label, paper-surface card with the standard
   warm border + faint shadow. The Warnings / Info cards tint the
   number with ``--warn`` / ``--info`` when the count is non-zero.
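
A rough sketch of how the four card values could be derived, assuming
findings are stored per file as dicts with a ``severity`` key. That
data shape and the function name are hypothetical, not taken from
``_render_stats_overview``:

```python
def stats_overview(findings_by_file):
    """Compute the four stat-card values from a {filename: [finding, ...]} map."""
    all_findings = [f for fs in findings_by_file.values() for f in fs]
    return {
        "files_analyzed": len(findings_by_file),
        "total_findings": len(all_findings),
        # The Warnings card counts both "warn" and "error" severities.
        "warnings": sum(f["severity"] in {"warn", "error"} for f in all_findings),
        "info": sum(f["severity"] == "info" for f in all_findings),
    }
```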

CSS for ``.dt-stats / .dt-stat / .dt-stat-label / .dt-stat-value /
.dt-stat-unit`` added to ``_DESIGN_TOKENS_CSS``; falls to a
2-column grid below 900px viewport, matching the mockup's media
query.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 00:31:40 +00:00
2501119ac2 feat(ui): replace Fraunces with Geist per geist_spec.md
Switches the type system to the single-family Geist spec referenced
in ``Business/DataTools/geist_spec.md`` and the matching
``datatools_layout_redesign2.html`` mockup. Editorial-serif headings
are out; the product now reads as modern SaaS-tool typography per
the spec's positioning note (§10).

  src/gui/theme.py (new)
    Implements geist_spec.md §3 verbatim — preconnect + Google Fonts
    link for Geist (400/500/600/700) and Geist Mono (400/500), the
    canonical ``:root`` token table (§7) plus severity extensions,
    and the type scale (§4): h1 32/600/-0.035em, h2 22/600/-0.025em,
    h3 18/500/-0.018em, h4 15/500/-0.012em, body 14/400, caption
    12.5/400, mono 0.92× ss02. ``apply_theme()`` is the single entry
    point.

    Two deviations from the spec, both anticipated by spec §6.1:
    - ``font-family: var(--font-sans) !important`` on the base rule.
      Streamlit applies ``font-family: "Source Sans"`` directly to
      ``[data-testid="stMarkdownContainer"]`` and a few widget
      wrappers at equal-or-higher specificity than the spec's
      selector list, so plain inheritance loses the cascade.
    - The base selector list explicitly enumerates
      ``stSidebarNav``, ``stMarkdownContainer``, ``stVerticalBlock``
      and a few siblings so Streamlit's per-widget font reset
      doesn't reach descendant text.

  src/gui/components/_legacy.py
    - ``_DESIGN_TOKENS_CSS`` no longer redeclares fonts or the
      heading rules — those are theme.py's job (spec §9 says the
      spec is type-only; everything below is component chrome).
    - Token references switched from ``--dt-*`` to the spec names
      (``--ink``, ``--bg``, ``--surface``, ``--border``, ``--accent``,
      ``--font-sans``, ``--font-mono``, …).
    - Sidebar section-label rule tightened to 11.5px / 500 to match
      the "Eyebrow" row in spec §4.
    - Primary-button text color now also targets every descendant
      (``button[kind="primary"] *``) so the inner
      ``stMarkdownContainer > p`` doesn't pick up
      ``color: var(--ink)`` from the base rule and render
      near-invisible ink-on-ink.
    - ``hide_streamlit_chrome`` now calls ``apply_theme`` before
      injecting component CSS so the base tokens are defined first.

Acceptance criteria from spec §8 verified at 1920×1050:
  - h1 computes ``font-family: Geist``, ``font-weight: 600``,
    ``letter-spacing: -1.12px`` (= 32px × -0.035em), size ``32px``.
  - Body ``<p>`` inside ``stMarkdownContainer``: Geist 400 / 14px.
  - Caption: Geist 400 / 12.5px.
  - Inline mono filenames: Geist Mono in accent-fill chip.
  - No Source Sans Pro leaks into any text the user reads.
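
For reference, the computed letter-spacing for each heading follows
from size × tracking. A small worked check using the type-scale values
listed earlier in this message (the helper name is illustrative):

```python
# (font-size in px, letter-spacing in em) per heading level.
TYPE_SCALE = {
    "h1": (32, -0.035),
    "h2": (22, -0.025),
    "h3": (18, -0.018),
    "h4": (15, -0.012),
}


def letter_spacing_px(size_px, tracking_em):
    """Convert an em-based letter-spacing to the px value the browser computes."""
    return round(size_px * tracking_em, 3)


for level, (size, tracking) in TYPE_SCALE.items():
    print(level, letter_spacing_px(size, tracking))
```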

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 00:21:52 +00:00
444dffbc63 chore(ui): rename Upload → Import in user-facing strings
DataTools is local-first — "Upload" reads like "send data somewhere
remote", which contradicts the product positioning. Sweep replaces
the user-visible term throughout the UI:

- ``src/i18n/packs/en.json`` + ``es.json``: all ``upload.*`` strings
  (heading, intro, uploader labels, empty state, switch-back, etc.)
  and ``gate.default_name``. The ``intro_multi`` "no upload anywhere"
  phrasing dropped the verb entirely — now reads "nothing leaves
  this computer".
- All 9 tool pages: ``st.file_uploader(label="Upload …")`` →
  ``"Import …"``; matching ``st.info("Upload a …")`` empty-state
  banners; ``help="Upload …"`` strings on disabled uploaders.
- ``9_Pipeline_Runner`` + ``5_Column_Mapper``: radio-option text
  ``"Upload schema/pipeline JSON"`` → ``"Import …"`` plus the
  ``.startswith("Upload")`` branch guards that read those values.
- ``_home.py``: "**Uploaded files**" → "**Imported files**".
- ``app_demo.py``: "Uploaded file is …" → "Imported file is …".

Internal identifiers left untouched: function names
(``pickup_or_upload``, ``_StashedUpload``), session-state keys
(``home_upload``, ``home_uploads``, ``home_uploaded_*``,
``merger_file_upload``), audit-log event category (``"upload"``),
Streamlit testid CSS selectors. None of those are visible to the
user.

The file_uploader's dropzone button text is a baked-in React
literal that Streamlit's ``label=`` doesn't reach; rewritten at the
DOM level with a small ``_RENAME_UPLOAD_BUTTON_JS`` snippet shipped
through ``st.iframe`` (same pattern the sticky footer uses to mount
on ``<body>``). A ``MutationObserver`` on the parent document re-
applies the swap when Streamlit remounts the dropzone after file
add/remove or page navigation, throttled via ``requestAnimationFrame``.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 23:48:31 +00:00
3c4b80895e fix(home): hide Streamlit's chip row, keep only the canonical file list
After upload, two near-identical file lists were shown stacked:
Streamlit's built-in compact chip row inside the dropzone (icon +
``messy_sales.csv`` + size) and the home page's own "Uploaded files"
section beneath it (filename + Remove button). User flagged the
duplication.

Hide ``[data-testid="stFileChip"]`` and its first-child wrapper so
the chip row collapses; the dropzone's borderless ``+`` button is
preserved as the "add more files" affordance, and our "Uploaded
files" list is now the single source of truth visually.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 23:42:22 +00:00
b0ee65e922 feat(ui): warm editorial redesign — Fraunces + Geist + stone palette
Lifts ideas from the ``datatools_layout_redesign.html`` mockup
(artistic licence, not literal). Two changes:

1. ``.streamlit/config.toml`` ``[theme]`` block — cream paper bg
   (#fafaf7), warm sidebar (#f5f4ef), stone ink (#1c1917), burnt
   orange primary (#c2410c). Streamlit threads these through its
   chrome (focus rings, file-uploader accents, link colors).

2. ``_DESIGN_TOKENS_CSS`` injected by ``hide_streamlit_chrome`` on
   every page. Imports Fraunces (display serif), Geist (body sans),
   Geist Mono. Restyles, scoped through ``--dt-*`` custom properties:

   - Page surface + sidebar — warm cream backgrounds, soft warm
     borders, no harsh white.
   - Sidebar nav — section labels in tiny uppercase tracking, nav
     items with soft hover, active item as a white pill with subtle
     shadow.
   - Typography — H1/H2/H3 in Fraunces with tightened tracking;
     body Geist; inline code Geist Mono with orange-on-cream chip.
   - Buttons — primary = dark ink (``#1c1917``) with white text;
     secondary = paper surface with warm border; disabled = muted
     cream.
   - Containers / expanders — editorial cards: 14px radius, 1px
     warm border, faint shadow, warm-cream summary headers.
   - File uploader — cream dropzone with dashed border + per-file
     paper chips.
   - Alerts — soft tinted fills (info=sky, success=mint, warn=amber,
     error=rose) over the kind-specific palette.
   - Inputs, tabs, dataframes — paper surfaces with rounded warm
     borders.

Verified at 1920x1050 + 1400x900 on home page (empty + with file
uploaded + with findings rendered) and Clean Text tool page; no
regressions in the white-bar fix from 65b663b.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 23:36:24 +00:00
65b663be97 fix(footer): stretch .stApp + sidebar + main to compensate for zoom
User screenshot pinned the actual culprit: a horizontal white band
across the FULL viewport width (including over the sidebar) above
the Help/Close footer. Diagnosis:

  - ``.stApp`` carries ``zoom: 0.85``, so any descendant sized at
    ``100vh`` only renders at ~85vh visually.
  - At 1920x1050 the visual end of ``.stApp`` is around y=893; the
    fixed footer overlays y=1017..1050; the strip in between (124px
    at this resolution) is ``body`` painting white through, because
    ``.stApp``, ``stSidebar`` and ``stMain`` are all shorter than
    the viewport.
  - The previous "min-height: 100vh/0.85" rule targeted the legacy
    ``data-testid="stAppViewBlockContainer"``. The current Streamlit
    release renamed that testid to ``stMainBlockContainer`` — so the
    rule was a no-op for months. Verified the new testid by walking
    the live DOM.

Fix: stretch ``.stApp``, ``[data-testid="stSidebar"]`` and
``[data-testid="stMain"]`` with ``min-height: calc(100vh / 0.85)``
so they fill the visible viewport. Keep the block-container's 2rem
``padding-bottom`` (now matching both the new and legacy testids in
case Streamlit rolls it back).

Verified at 1920x1050: sidebar gray extends to y=1050, content area
extends to y=1050, footer overlays the bottom 33px, no white band
between content and footer.
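
The zoom arithmetic behind the fix, as a small worked sketch. The
helper names are illustrative; the 0.85 zoom, the 1050px viewport and
the footer top at y=1017 come from the diagnosis above.

```python
ZOOM = 0.85  # value of ``.stApp { zoom: ... }``


def rendered_height(css_px: float, zoom: float = ZOOM) -> float:
    """Visual height of an element sized ``css_px`` inside the zoomed app."""
    return css_px * zoom


def compensated_min_height(viewport_px: float, zoom: float = ZOOM) -> float:
    """CSS height needed so the zoomed element visually spans the viewport."""
    return viewport_px / zoom


viewport = 1050
footer_top = 1017

# Uncompensated: a 100vh element ends visually near y=893,
# leaving a strip of roughly 124px above the footer.
visual_end = rendered_height(viewport)
print(f"visual end: {visual_end:.1f}px, strip: {footer_top - visual_end:.1f}px")

# Compensated: ``calc(100vh / 0.85)`` renders back at the full viewport height.
print(f"{rendered_height(compensated_min_height(viewport)):.1f}px")
```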

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 23:22:11 +00:00
c942b8aa19 fix(footer): offset sticky-footer's left edge past the sidebar
The "white bar" was the footer's near-white background painting
over the bottom of the sidebar. The footer is fixed at body level
with ``left: 0; right: 0`` so it spans the full viewport — its
``rgba(255, 255, 255, 0.97)`` background renders as essentially
white over the sidebar's ``rgb(240, 242, 246)`` gray, producing a
visibly different strip at the bottom of the sidebar (this is what
the diagnostic GREEN tint marked as ``stAppViewContainer``-shaped
because that is the element directly behind it).

Pixel-sampled the bottom row to confirm:
  y=860 over sidebar  →  (240, 242, 246)  (gray)
  y=870 over sidebar  →  (255, 255, 255)  (footer-painted white)

Fix: in the iframe JS that mounts the footer on ``<body>``, measure
``[data-testid="stSidebar"].getBoundingClientRect().right`` and set
the footer's (and help popover's) ``left`` to that offset with
``setProperty(..., 'important')`` so it beats the ``left:0!important``
fallback in CSS. A ``ResizeObserver`` on the sidebar plus a
``window.resize`` listener keep the offset in sync when the sidebar
collapses or expands.

Sidebar collapsed (width 0 or off-screen) clamps to 0 → footer goes
flush-left as before. Also dropped the no-op ``min-height`` on the
view container from the previous attempt; ``stAppViewContainer`` is
transparent, so stretching it never painted anything.

Verified by injecting the same offset on the live page: bottom row
at y=890 is now ``(240,242,246)`` over the sidebar and only turns
white at x=255 where the content area begins.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 22:52:02 +00:00
61e63913cb chore: migrate use_container_width → width (Streamlit deprecation)
``use_container_width`` is being removed after 2025-12-31. Streamlit's
log was flooding the terminal with the deprecation notice on every
rerun. Mechanical sweep:

  use_container_width=True   →  width="stretch"
  use_container_width=False  →  width="content"

51 call sites across 11 page files + ``app_demo.py``. Also renamed
the ``local_download_button`` helper's ``use_container_width`` kwarg
to ``width`` (default ``"stretch"``); it has no external callers
passing the old name, so this is a safe rename.
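
A sweep like this could be scripted along the following lines. This is
a hypothetical sketch, not necessarily how the 51 call sites were
actually rewritten; it only handles literal ``True`` / ``False``
values, so any call site passing a variable would still need a manual
edit.

```python
import re

# Mapping from the old boolean literal to the new ``width`` value.
_NEW_VALUE = {"True": '"stretch"', "False": '"content"'}
_PATTERN = re.compile(r"\buse_container_width\s*=\s*(True|False)\b")


def migrate_width_kwarg(source: str) -> str:
    """Rewrite ``use_container_width=<bool>`` keyword arguments to ``width=...``."""
    return _PATTERN.sub(lambda m: f"width={_NEW_VALUE[m.group(1)]}", source)


print(migrate_width_kwarg('st.button("Go", use_container_width=True)'))
```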

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 22:43:52 +00:00
e011c0b6e6 fix(footer): close white gap by stretching stAppViewContainer
Color-tag diagnostic confirmed the bottom-of-viewport strip was
painted by ``stAppViewContainer`` (it showed GREEN), not by the
block container as the previous two attempts assumed. ``.stApp``
has ``zoom: 0.85`` so 100vh visually renders at 85% — apply
``min-height: calc(100vh / 0.85)`` to the view container itself so
it spans the full visible viewport and there is no gap for its own
background to leak through as a "white bar". Reverts the diagnostic
tints (RED/BLUE/GREEN/GOLD); keeps the 2rem block-container
padding-bottom that reserves room for the fixed footer overlay.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 22:36:41 +00:00
2fe324279e diag(footer): color-tag every candidate bottom-area container
Option 2 (stretching the block container with ``min-height``) did
not close the white gap. Either the rule isn't applying, or the
block container isn't the element that fills the visible bottom of
the page. Tint every plausible container so the eye can tell us
instantly which one paints the bar:

  - RED    ``stAppViewBlockContainer``   (still has min-height applied)
  - BLUE   ``stMain`` / ``section[stMain]``  (with its own min-height)
  - GREEN  ``stAppViewContainer``
  - GOLD   ``.stApp`` (zoomed)

The user reloads and reports which color shows where the "white bar"
previously was; that names the target.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 22:33:19 +00:00
04dc326020 fix(footer): stretch block container to full viewport to close white gap
Option 1 (tightening ``padding-bottom`` from 3rem to 2rem) did not
eliminate the gap. The remaining gap is ``.stApp``'s solid white
background showing through the area below the block container's
natural (content-sized) bottom edge — visible because the home
page's content is shorter than the viewport.

Stretch the block container with ``min-height: calc(100vh / 0.85)``
so the container itself fills the visible viewport. Now the area
between the last finding card and the fixed footer is the block
container's own background, not ``.stApp`` showing through —
visually continuous with the content above.

The ``/0.85`` compensates for ``.stApp { zoom: 0.85 }`` (defined in
``_HIDE_CHROME_CSS``): inside a zoomed container, ``100vh`` renders
at 85% of true viewport height, leaving a 15% gap if used raw.
``box-sizing: border-box`` keeps the 2rem padding part of the
total height instead of stacking onto it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 22:30:22 +00:00
d487a44170 fix(footer): tighten block-container `padding-bottom` to close white gap
Diagnostics confirmed the "white bar" the user has been describing is
not a separate element — it's ``[data-testid=stApp]``'s solid white
background (``rgb(255,255,255)``, viewport-locked) showing through the
gap between where page content ends and where the fixed Help/Close
footer overlay begins. ``stApp`` stays put while content scrolls
inside it, which is why the bar "doesn't change when scrolling".

The gap exists because ``render_sticky_footer`` overrides the block
container's ``padding-bottom`` to ``3rem`` (48px) to reserve clear
room for the fixed footer. The footer is only ~32-33px tall (min-
height 32px + 0.25rem top/bottom padding), so ~16px of that reserve
was pure visible white space sitting above the buttons.

Reduce ``padding-bottom`` to ``2rem`` (~32px) — just enough to
prevent content from rendering under the footer overlay, no more.
Eliminates the visible gap without exposing text to clipping.

Also remove the diagnostic banner + click-to-inspect iframe from
the home page now that the bar is identified.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 22:28:17 +00:00
f106275643 test(home): replace clutter outliner with click-to-inspect
User reported the previous diagnostic was too cluttered to read,
and the white bar showed no outline anyway — meaning the flat
``querySelectorAll('body *')`` walker missed it (likely inside an
iframe's contentDocument, which the script didn't recurse into).

New approach: a single red button "CLAUDE: click here, then click
the white bar" in the top-right. Clicking the button arms an
inspect handler. The next click anywhere on the page reports the
full element stack at that point via ``elementsFromPoint`` AND
recursively descends into any same-origin iframe at the click
location, so iframe contents are no longer invisible.

A black report panel lists every element in the stack with its
tag/id/testid/class, position, z-index, background color, and
bounding rect — TOP element highlighted in red. User clicks the
white bar exactly once and we know what it is.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 22:23:35 +00:00
8232ab1ca7 test(home): broader diagnostic — outline anything near viewport bottom
Previous diagnostic only outlined fixed/sticky elements; user
confirmed the offending white bar isn't one of those. Cast a much
wider net:

- Outline every element whose visible rect intersects the bottom
  200px of the viewport, regardless of position.
- Border style encodes position: solid=fixed, dashed=sticky,
  dotted=absolute, thin=static/relative.
- Render a readable list in a top-right panel showing each element's
  tag/id/testid/class, position, z-index, height, and background.
- Skip fully transparent + un-positioned elements (those can't
  actually overlay anything).

With this, scroll to the bottom and the panel + colored outlines
will identify exactly which element is the white bar — fixed or
not. The user can paste the panel list (or just name the colored
box) so we know what to remove.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 22:18:56 +00:00
4c8e1199a4 test(home): outline every fixed/sticky element to find the white bar
User reports: TEST #3 marker sits at the true bottom of the home
page's main content, but when scrolled the test text "goes behind"
an opaque white bar — confirming the bar is fixed/sticky (overlays
scrolling content). Our CSS only declares ONE fixed element near
the bottom (``#datatools-sticky-footer``), which the user already
ruled out. So something else — Streamlit native chrome, a third-
party widget, or a fixed element we haven't enumerated — is
overlaying the content.

Inject a small diagnostic iframe whose JS, running against the
parent document, walks every element on the page and outlines each
``position: fixed`` or ``position: sticky`` node with a distinct
color + a top-left label showing ``tagName#id[data-testid] pos=…
h=…px bg=…``. Re-runs after initial paint, on a couple of delays
(for late-mounting components), and on every scroll.

This is read-only — no DOM mutations beyond outline styles and
labels — so it's safe to ship even if I miss removing it.
The user can now visually identify which colored box is the
offending white bar and report its label.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 22:15:19 +00:00
e282f061dc test(home): move marker to true bottom of main content
User reported the previous TEST #2 banner appeared at the *top* of
the main content area instead of the bottom. Root cause: on the home
page, ``render_sticky_footer()`` is called at line 107 — before
``st.title()`` — so anything that function injects in document flow
lands at the top of ``stAppViewBlockContainer``. Other pages call
``render_sticky_footer()`` at the end of their script, so the flow
content lands at the bottom there.

Remove the marker from ``render_sticky_footer`` and add it directly
at the very end of ``_home._home_page()`` — after the findings
panels. If this banner lines up with the offending white strip when
scrolled to the bottom, the strip is something rendered at the tail
of the page (likely an iframe wrapper from ``render_findings_panel``
or the block container's ``padding-bottom``).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 22:11:24 +00:00
5daae9e5fa test(footer): move marker out of footer into main content flow
User confirmed the previous marker landed inside the Help/Close
sticky footer — which is NOT the offending white bar. They want the
sticky footer kept; the white strip they're trying to remove sits
*above* the footer in the main content area.

Move the marker out of ``#datatools-sticky-footer`` and render it
via ``st.markdown`` immediately before the ``st.iframe`` call that
injects the footer. That places it at the very bottom of
``stAppViewBlockContainer`` — exactly where the iframe wrapper
(``stElementContainer``) and the block container's
``padding-bottom: 3rem`` reservation live.

Styled as a red dashed banner so it's unmistakable. If it lines up
with the white strip clipping text on scroll, one of those two is
the culprit and the next commit can target it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 22:09:21 +00:00
48cb802dfb test(footer): inject visible marker into #datatools-sticky-footer
The user reports a "white bar/box" at the bottom of the main content
area that clips text when scrolling. The DOM inspector found only one
fixed-position white element near the viewport bottom —
``#datatools-sticky-footer`` (bg ``rgba(255,255,255,0.97)``,
~33px tall) — so this is my best candidate for what they're seeing.

Append a red marker span "◀ CLAUDE TEST: is this the white bar you
want removed? ▶" inside the footer div so the user can visually
confirm. If the text shows up where they see the offending white
bar, the footer is the right target; if the bar is somewhere else,
this confirms it's a different element.

Temporary — to be reverted in the next commit either way.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 22:06:56 +00:00
d022167ba2 fix(home): widget's "✕" Remove now actually removes the file
Reported: on the Home page after uploading data files, the Remove
buttons "on the right side" did nothing — the file kept showing up in
the list. That was the file_uploader widget's BUILT-IN ✕ icons (the
ones inside the uploader's chrome, on the right of each file row),
not our custom "Remove" buttons further down — the custom ones have
worked correctly since 84e4665.

Cause: ``_home_page`` deliberately treated the widget as add-only and
never honored widget-side removals. The reasoning, per the prior
comment, was that navigation can remount the widget with value ``[]``,
in which case a render-time sync would wipe ``home_uploads``. That
risk is real, but the side effect was that the widget's own ✕
appeared to do nothing: the file vanished from the widget chrome,
stayed in ``home_uploads``, and re-rendered immediately in the custom
list below.

Fix: hook the file_uploader's ``on_change`` callback to reconcile
``home_uploads`` against the widget's current value. Streamlit's
``on_change`` fires ONLY on user-initiated value changes; the
remount-induced ``[]`` reset doesn't trigger it, so the stash still
survives navigation. Removals from the callback also drop the file's
findings entry and clear the singular ``home_uploaded_*`` keys when
the active upload was removed — matching the custom-button path.
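
A minimal sketch of the reconcile step, assuming the stash and the
findings are plain dicts keyed by file name. ``reconcile_uploads`` and
its arguments are hypothetical names for illustration, not the app's
actual code, and the real keys may differ:

```python
from typing import Dict, Iterable, List


def reconcile_uploads(
    stash: Dict[str, bytes],
    findings: Dict[str, object],
    widget_names: Iterable[str],
) -> List[str]:
    """Drop stashed files the widget no longer lists; return removed names."""
    current = set(widget_names)
    removed = [name for name in stash if name not in current]
    for name in removed:
        del stash[name]
        findings.pop(name, None)  # also drop that file's findings entry
    return removed


# Hypothetical state: two stashed uploads; the user clicked the widget's
# built-in remove icon on b.csv, so the widget now lists only a.csv.
stash = {"a.csv": b"1,2", "b.csv": b"3,4"}
findings = {"a.csv": ["ok"], "b.csv": ["warn"]}
removed = reconcile_uploads(stash, findings, ["a.csv"])
```

The intent is that this runs only from the ``on_change`` callback and
never during a plain render, so a remounted widget reporting ``[]``
cannot empty the stash.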

The custom "Remove" buttons further down keep working unchanged; the
existing AppTest path through ``_home_remove_<sha1>`` still removes
exactly the file clicked. 2220 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 20:52:20 +00:00
24ee021314 fix(footer): hide the helper page_link row that was leaking into pages
Same wrong-testid bug as the Close click handler: the CSS rule
that's supposed to position the hidden ``st.page_link`` off-screen
was selecting ``a[data-testid="stPageLink"]``, but the bare
``stPageLink`` testid is on the OUTER wrapper div — the anchor
uses ``stPageLink-NavLink``. ``:has(a[data-testid="stPageLink"]...)``
matched nothing, so the helper rendered as a full-size visible
row at the bottom of every page (the "large white bar blocking
content" the user reported).

Fix: switch both the ``:has()`` rule and the no-:has() fallback
to ``a[data-testid="stPageLink-NavLink"][href*="close"]``. The
``href*="close"`` form also works for base-path deployments
(``/myapp/close``), matching the click handler's selector.
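
To illustrate why the old selector matched nothing, here is a small
sketch that emulates the attribute match with Python's standard
``html.parser``. The markup in ``SAMPLE`` is an assumption modeled on
the description above, not captured Streamlit output, and both lookups
use the substring ``href`` test so that only the testid differs:

```python
from html.parser import HTMLParser
from typing import List

# Assumed markup: the wrapper div carries "stPageLink", the anchor
# carries "stPageLink-NavLink" (base-path deployment shown).
SAMPLE = (
    '<div data-testid="stPageLink">'
    '<a data-testid="stPageLink-NavLink" href="/myapp/close">Close</a>'
    "</div>"
)


class AnchorFinder(HTMLParser):
    """Collect hrefs of <a> tags with a given testid and 'close' in href."""

    def __init__(self, testid: str) -> None:
        super().__init__()
        self.testid = testid
        self.hits: List[str] = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        href = attr_map.get("href") or ""
        if (
            tag == "a"
            and attr_map.get("data-testid") == self.testid
            and "close" in href
        ):
            self.hits.append(href)


def match(testid: str, html: str = SAMPLE) -> List[str]:
    """Return hrefs of anchors that the emulated selector would match."""
    finder = AnchorFinder(testid)
    finder.feed(html)
    return finder.hits


old_hits = match("stPageLink")          # testid used by the old rule
new_hits = match("stPageLink-NavLink")  # testid used by the fixed rule
```

Under that assumed markup the old testid finds no anchor at all, while
the fixed one finds the base-path ``/myapp/close`` link.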

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 16:07:07 +00:00
add3b866ee fix(footer): Close button now actually fires — wrong testid + bad fallback
Two bugs combined to make the footer Close a no-op:

1. The helper page_link's anchor carries
   ``data-testid="stPageLink-NavLink"`` — the bare
   ``stPageLink`` testid is on the OUTER WRAPPER div, not the
   anchor. The old selector ``a[data-testid="stPageLink"]``
   matched nothing, so ``helper`` was always ``null``.
2. The fallback ``window.location.href = './close'`` ran inside
   the component iframe, so it only navigated the (invisible)
   srcdoc iframe. The main app stayed put.

End result: click → nothing visible → shutdown_app never runs →
farewell-script's ``window.close()`` attempt never happens →
user sees the Close button as broken.

Fixes:
- Selector → ``a[data-testid="stPageLink-NavLink"][href*="close"]``.
  ``href*="close"`` covers both root (/close) and base-path
  (/myapp/close) deployments.
- Fallback → resolve the parent window via
  ``doc.defaultView`` (the parent doc's window) with a
  ``window.top`` fallback, so the hard-nav navigates the whole
  app instead of just the iframe.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 16:02:46 +00:00
b568773a1f chore(streamlit): migrate components.v1.html → st.iframe (deprecation)
Streamlit logs a deprecation notice on every render:

  Please replace ``st.components.v1.html`` with ``st.iframe``.
  ``st.components.v1.html`` will be removed after 2026-06-01.

Replace all 9 call sites (6 tool pages + 3 in ``_legacy.py``).
Both APIs feed ``srcdoc`` to the underlying iframe so the
HTML/JS payload and the cross-frame DOM access pattern
(``window.parent.document``) are unchanged.

``st.iframe`` rejects ``height=0`` (raises
``StreamlitInvalidHeightError``), so bump every zero-height call to
``height=1``. These are script-only iframes with no visible payload,
so 1px is effectively invisible and passes the height validation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 15:57:40 +00:00
4a7f99f0ec fix(footer): restore soft-nav for Close (no page reload on shutdown)
Footer Close was using ``<a href="./close">`` which triggers a
browser hard-nav. That's a visible page-reload flash, websocket
churn, and slower shutdown than the previous sidebar Close —
which used ``st.navigation``'s soft nav.

Restore the soft-nav path:

- ``render_sticky_footer`` now renders a hidden ``st.page_link``
  pointing at ``pages/99_Close.py``. Positioned off-screen via CSS
  (``stElementContainer:has(a[data-testid=stPageLink][href$=/close])``)
  so it occupies no layout space but stays in the DOM, reachable and
  clickable.
- Footer's Close <button> click handler now dispatches a
  programmatic click on that hidden page_link. Streamlit's React
  handler picks it up and runs the soft nav (same code path the
  old sidebar entry used). Falls back to ``window.location.href``
  if the helper link hasn't rendered yet so the button is never
  a no-op.
- The page_link call is wrapped in try/except: under ``AppTest`` the
  page-nav session keys it needs are not populated, so it raises
  ``KeyError('url_pathname')``. Failure costs only the soft-nav
  optimization; Close still works via the hard-nav fallback.
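
The guarded call could look roughly like the sketch below. The
renderer is passed in as a callable so the example runs without
Streamlit; ``render_helper_link`` and the two stand-in functions are
hypothetical, and catching ``KeyError`` specifically is an assumption
(the commit only says try/except):

```python
from typing import Callable


def render_helper_link(page_link: Callable[..., None]) -> bool:
    """Try to render the hidden helper link; report whether it rendered."""
    try:
        page_link("pages/99_Close.py", label="Close")
    except KeyError:
        # Page-nav session keys are missing (as described for AppTest);
        # the footer then relies on the hard-nav fallback instead.
        return False
    return True


def fake_page_link_under_apptest(page: str, **kwargs) -> None:
    """Stand-in that fails the way the commit describes for AppTest."""
    raise KeyError("url_pathname")


def fake_page_link_ok(page: str, **kwargs) -> None:
    """Stand-in for a normal run where the link renders."""


rendered_in_test = render_helper_link(fake_page_link_under_apptest)
rendered_in_app = render_helper_link(fake_page_link_ok)
```

In the app itself the callable would presumably be ``st.page_link``;
the boolean return is only there to make the two outcomes easy to
check.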

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 15:52:00 +00:00
b2449d3139 fix(nav,footer): drop orphan _hidden section header, show footer on Activate
Two follow-ups to the prior sidebar/footer cleanup:

- The "_hidden" section header was still visible in the sidebar
  because Streamlit renders ``stNavSectionHeader`` as a sibling of
  ``stNavSection``, not a child — so the ``:has()`` rule on the
  section was hiding the items list but leaving the header
  (and its collapse/drilldown marker) behind. Move Activate +
  Close into the unlabeled section (key ``""``) alongside Home so
  there is no header to leak in the first place, then hide just
  the two links via ``stSidebarNavLinkContainer:has(...)`` (with
  a defensive ``a[href$=...]`` fallback for browsers without
  ``:has()`` support).
- The sticky footer was missing on ``pages/_Activate.py`` because
  the page never called ``render_sticky_footer`` — added the
  call so the Help / Close bar persists when the user follows
  the popover's Activate / Manage link.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 15:45:22 +00:00
d840230e48 fix(nav,footer): hide Activate from sidebar, surface it in Help popover
- Collapse the Account section: Activate now lives in the same
  hidden sidebar section as Close (single ``_hidden`` group). Both
  pages stay registered with ``st.navigation`` so /activate and
  /close remain URL-routable for the Help-popover / Close-button
  links — only the sidebar entries + their section header are
  hidden via CSS.
- Help popover always exposes a license-management link now:
  ``Activate now →`` when the license is inactive, ``Manage
  license →`` when it is active and valid. Both point at
  ``./activate``.
- Extend the sidebar-hide CSS to also match ``a[href$="/activate"]``
  and the section that contains it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 15:39:14 +00:00