Same wrong-testid bug as the Close click handler: the CSS rule
that's supposed to position the hidden ``st.page_link`` off-screen
was selecting ``a[data-testid="stPageLink"]``, but the bare
``stPageLink`` testid is on the OUTER wrapper div — the anchor
uses ``stPageLink-NavLink``. ``:has(a[data-testid="stPageLink"]...)``
matched nothing, so the helper rendered as a full-size visible
row at the bottom of every page (the "large white bar blocking
content" the user reported).
Fix: switch both the ``:has()`` rule and the no-:has() fallback
to ``a[data-testid="stPageLink-NavLink"][href*="close"]``. The
``href*="close"`` form also works for base-path deployments
(``/myapp/close``), matching the click handler's selector.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two bugs combined to make the footer Close a no-op:
1. The helper page_link's anchor carries
``data-testid="stPageLink-NavLink"`` — the bare
``stPageLink`` testid is on the OUTER WRAPPER div, not the
anchor. The old selector ``a[data-testid="stPageLink"]``
matched nothing, so ``helper`` was always ``null``.
2. The fallback ``window.location.href = './close'`` ran inside
the component iframe, so it only navigated the (invisible)
srcdoc iframe. The main app stayed put.
End result: click → nothing visible → shutdown_app never runs →
farewell-script's ``window.close()`` attempt never happens →
user sees the Close button as broken.
Fixes:
- Selector → ``a[data-testid="stPageLink-NavLink"][href*="close"]``.
``href*="close"`` covers both root (/close) and base-path
(/myapp/close) deployments.
- Fallback → resolve the parent window via
``doc.defaultView`` (the parent doc's window) with a
``window.top`` fallback, so the hard-nav navigates the whole
app instead of just the iframe.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Streamlit logs a deprecation notice on every render:
Please replace ``st.components.v1.html`` with ``st.iframe``.
``st.components.v1.html`` will be removed after 2026-06-01.
Replace all 9 call sites (6 tool pages + 3 in ``_legacy.py``).
Both APIs feed ``srcdoc`` to the underlying iframe so the
HTML/JS payload and the cross-frame DOM access pattern
(``window.parent.document``) are unchanged.
``st.iframe`` rejects ``height=0`` (raises
``StreamlitInvalidHeightError``), so bump every zero-height call to
``height=1``. 1px is effectively invisible (these are script-only
iframes with no visible payload) and passes the height validator.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Footer Close was using ``<a href="./close">``, which triggers a
browser hard-nav. That means a visible page-reload flash, websocket
churn, and a slower shutdown than the previous sidebar Close, which
used ``st.navigation``'s soft nav.
Restore the soft-nav path:
- ``render_sticky_footer`` now renders a hidden ``st.page_link``
pointing at ``pages/99_Close.py``. Positioned off-screen via
CSS (``stElementContainer:has(a[data-testid=stPageLink]
[href$=/close])``) so it occupies no layout space but stays in
the DOM, reachable + clickable.
- Footer's Close <button> click handler now dispatches a
programmatic click on that hidden page_link. Streamlit's React
handler picks it up and runs the soft nav (same code path the
old sidebar entry used). Falls back to ``window.location.href``
if the helper link hasn't rendered yet so the button is never
a no-op.
- The page_link call is wrapped in try/except: ``AppTest`` doesn't
populate the page-nav session keys it needs and raises
``KeyError('url_pathname')``. Failure costs only the soft-nav
optimization — Close still works via the hard-nav fallback.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two follow-ups to the prior sidebar/footer cleanup:
- The "_hidden" section header was still visible in the sidebar
because Streamlit renders ``stNavSectionHeader`` as a sibling of
``stNavSection``, not a child — so the ``:has()`` rule on the
section was hiding the items list but leaving the header
(and its collapse/drilldown marker) behind. Move Activate +
Close into the unlabeled section (key ``""``) alongside Home so
there is no header to leak in the first place, then hide just
the two links via ``stSidebarNavLinkContainer:has(...)`` (with
a defensive ``a[href$=...]`` fallback for browsers without
``:has()`` support).
- The sticky footer was missing on ``pages/_Activate.py`` because
the page never called ``render_sticky_footer`` — added the
call so the Help / Close bar persists when the user follows
the popover's Activate / Manage link.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Collapse the Account section: Activate now lives in the same
hidden sidebar section as Close (single ``_hidden`` group). Both
pages stay registered with ``st.navigation`` so /activate and
/close remain URL-routable for the Help-popover / Close-button
links — only the sidebar entries + their section header are
hidden via CSS.
- Help popover always exposes a license-management link now:
``Activate now →`` when the license is inactive, ``Manage
license →`` when it is active and valid. Both point at
``./activate``.
- Extend the sidebar-hide CSS to also match ``a[href$="/activate"]``
and the section that contains it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Bump version to 3.0 (src/__init__.py).
- Switch support address to support@unalogix.com.
- Help popover now includes a License section that reads
``src.license.current_state()``:
* When activated + valid: name + expiry date + days remaining.
* Otherwise: "Not activated" + an ``Activate now →`` link
pointing at ``./activate``.
License-state queries are wrapped so a corrupted license file
can't take the footer down — it falls through to the inactive
branch.
- Popover HTML is now built in Python (so the license branch
lives in one place) and passed to the JS as a single string.
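A minimal sketch of the "wrapped license query" idea, for illustration only. The shape of the state object (a dict with ``valid``, ``name``, ``expires``) and the helper name ``build_license_html`` are assumptions, not the real ``src.license`` API; only the fall-through-to-inactive behavior is taken from this message.

```python
import html
from datetime import date


def build_license_html(current_state, today: date) -> str:
    """Build the popover's License section; never raises.

    ``current_state`` is any zero-argument callable. The dict keys used
    here ("valid", "name", "expires") are hypothetical.
    """
    try:
        state = current_state()
        if state and state.get("valid"):
            days_left = (state["expires"] - today).days
            return (
                f"<div>{html.escape(state['name'])}: expires "
                f"{state['expires'].isoformat()} "
                f"({days_left} days remaining)</div>"
            )
    except Exception:
        # Corrupted or unreadable license file: fall through to inactive.
        pass
    return '<div>Not activated. <a href="./activate">Activate now →</a></div>'
```

Any failure inside the query lands in the same branch as "no license", so the footer has exactly one degraded state to render.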
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three small follow-ups to the sticky-footer rework:
- Left-justify the footer buttons (and reposition the Help popover
to anchor at the left edge so it lines up with its trigger).
- Remove the per-page ``st.divider() + st.caption("Runs locally…")``
trailing block from all 9 tool pages. The new sticky footer
covers that text, so it was rendering as an empty white bar at
the bottom of each tool page.
- Hide the Close entry from the sidebar nav via CSS. The page stays
registered with st.navigation so /close is still routable for the
sticky-footer Close button — only the sidebar link + its section
header are hidden (via :has() on stNavSection).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The duplicate full-width Back-to-Home button at the bottom of every
tool page was reading as a "huge footer." Replace it with a real
slim sticky footer holding two controls:
- Close: <a href="./close"> to the Close page (which shuts down).
Full-page nav is fine here — the process is terminating, so the
session-state-loss concern that retired the previous sticky
footer doesn't apply.
- Help: JS-toggled popover showing version + support@datatools.app.
No navigation, no state loss.
Top-of-page Back-to-Home stays (uses st.switch_page, preserves
state). Add footer.* i18n keys for en + es.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
User pulled d9e32e5 (async-writer audit log + re-enabled diagnostics
sidebar) and still sees blank pages. The synchronous-write theory
from the previous round was at most a partial explanation; something
ELSE in the audit-log code path is also taking the page render down
on the user's machine.
Restore the kill switch so the user has a working app while we
diagnose:
- ``src/audit.py``: ``_DISABLED = True`` re-introduced at module
top, each of ``log_event`` / ``log_session_start`` /
``log_page_open`` / ``flush_audit_log`` early-returns. The async
writer thread is never started.
- ``hide_streamlit_chrome``: ``_render_diagnostics_sidebar()`` call
re-gated behind ``if False:``.
The async writer code stays in place — easier to flip the flag back
when we identify the real cause than to rewrite a third time. The
shutdown-flush call in ``shutdown_app`` also stays; it early-returns
on the kill switch and is harmless.
Diagnostic plan for the next session: ask the user for the launcher
terminal output (the new stderr "DataTools audit: writes failing..."
message would tell us if the writer thread DID start and DID fail),
and whether ``~/.datatools/logs/`` is being created at all.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reported earlier: synchronous file writes in ``log_event`` blocked
the GUI render thread on hostile filesystems (Windows antivirus on
``~/.datatools/logs/`` is the prime suspect). A blocking ``open``
call doesn't raise — try/except can't recover from it — so the
only safe re-enable is to take file I/O off the render path.
Refactor:
- ``log_event`` and friends push events onto a ``deque(maxlen=5000)``
with a non-blocking append and return in microseconds.
- A single daemon thread (``datatools-audit-writer``) drains the
queue and writes batches. Holds the queue lock only long enough to
snapshot + clear, then does I/O outside the lock so producers can
keep enqueueing.
- ``audit_log_path()`` is now pure path arithmetic: no ``mkdir``,
no ``open``. The writer thread does the directory creation off
the request path, so any hang there only affects the writer.
- Bounded queue means an unwritable disk can't grow memory without
bound; the queue caps at 5000 and overflow drops the OLDEST events
so the most-recent (most-diagnostic) ones survive.
- First write failure prints once to stderr; subsequent failures
are silent so logs don't drown the launcher terminal.
- ``flush_audit_log(timeout_s=0.5)`` drains the queue and signals
the writer to exit; bounded so a stuck disk can't delay shutdown.
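A compact sketch of the design described above, under stated assumptions: the module-level names (``_QUEUE``, ``_WAKE``, ``_STOP``, ``start_writer``) are illustrative, and ``flush_audit_log`` takes the thread as an argument here, which the real function does not. Only the deque bound, the thread name, and the lock-then-I/O ordering come from this message.

```python
import json
import threading
from collections import deque
from pathlib import Path

# Hypothetical names; the real module's internals may differ.
_QUEUE: deque = deque(maxlen=5000)  # when full, append drops the OLDEST item
_LOCK = threading.Lock()
_WAKE = threading.Event()
_STOP = threading.Event()


def log_event(category: str, message: str, **extra) -> None:
    """Producer side: enqueue only, never touch the filesystem."""
    with _LOCK:
        _QUEUE.append({"category": category, "message": message, **extra})
    _WAKE.set()


def _drain(path: Path) -> None:
    with _LOCK:  # hold the lock only long enough to snapshot + clear
        batch = list(_QUEUE)
        _QUEUE.clear()
    if not batch:
        return
    try:  # all filesystem work happens outside the lock
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("a", encoding="utf-8") as fh:
            for event in batch:
                fh.write(json.dumps(event, default=str) + "\n")
    except OSError:
        pass  # best effort: an unwritable disk must not raise


def _writer_loop(path: Path) -> None:
    while not _STOP.is_set():
        _WAKE.wait(timeout=0.1)
        _WAKE.clear()
        _drain(path)
    _drain(path)  # final drain after the stop signal


def start_writer(path: Path) -> threading.Thread:
    thread = threading.Thread(
        target=_writer_loop, args=(path,),
        name="datatools-audit-writer", daemon=True,
    )
    thread.start()
    return thread


def flush_audit_log(thread: threading.Thread, timeout_s: float = 0.5) -> None:
    """Signal the writer to exit and wait a bounded time for it."""
    _STOP.set()
    _WAKE.set()
    thread.join(timeout=timeout_s)
```

Because the join is bounded, a writer stuck in ``open`` delays shutdown by at most ``timeout_s``; the daemon flag lets the process exit regardless.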
Other changes in this commit:
- ``shutdown_app`` now emits a "Session ending" event and calls
``flush_audit_log`` before kicking the os._exit timer, so the
closing session's events make it to disk.
- The Diagnostics sidebar in ``hide_streamlit_chrome`` is
re-enabled (the ``if False:`` gate is removed). Wrapped in
try/except defensively — render errors print to stderr, never
blank the page.
- ``_DISABLED`` kill-switch is gone. The async design IS the
safety mechanism now.
Tests in ``tests/test_audit.py``:
- log_event burst of 1000 events completes in well under 1s
(proves non-blocking).
- Events queued before flush land on disk with the expected JSON
shape; session_start renders; idempotent.
- Pointing the audit dir at a file (so mkdir fails) doesn't hang
or crash the producer.
- Non-JSON extras are str()-coerced rather than dropped.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reported: after the sticky-footer href fix (be7191a) the back-to-home
click worked but the home-page upload list disappeared. Full-page
navigation via ``<a href>`` doesn't preserve ``st.session_state`` on
the user's Streamlit build.
Trade-off forced: pick visible-from-anywhere sticky footer OR state
preservation. Can't have both because ``st.switch_page`` (soft nav,
preserves state) needs a real Streamlit button widget, and Streamlit
widgets can't be reliably CSS-positioned to the viewport bottom —
Streamlit owns the widget DOM and remounts it on every rerun.
State preservation wins. Going back to the pre-sticky design:
- ``render_sticky_footer()`` becomes a no-op shim. Kept as a callable
so the call sites in every tool page don't have to be touched in
this commit; the original implementation is preserved as
``_render_sticky_footer_DISABLED`` if we ever decide to revisit.
- Every Ready/Coming-Soon tool page (1-9) gets ``back_to_home_link()``
reinstated near the top of the page (visible at scroll-top) AND
``back_to_home_link(key="_back_to_home_link_bottom")`` reinstated
near the bottom of the page (visible at scroll-bottom). Both
instances call ``st.switch_page`` via the existing helper — soft
nav, no full reload, ``st.session_state["home_uploads"]`` and
every other session-state key survive.
User trades the "always-visible while scrolling" sticky behavior for
the upload-list-survives-navigation behavior. The two-button pattern
(top + bottom) was what we had before the sticky-footer experiment;
on short pages both are visible at once, on long pages the user has
one in reach at either end.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reported: clicking Back to Home in the sticky footer surfaced
Streamlit's "Page not found — Running the app's main page" message
in the user's build.
Root cause: ``url_path="home"`` on the home page's ``st.Page``
registration is treated as an alias for the default page in some
Streamlit minor versions, but the user's build doesn't honour the
alias for the page that ALSO has ``default=True``. The default page
is served at the root URL ``/``; ``/home`` is treated as a missing
page on that build.
Switch the footer anchor's href from ``"home"`` (which resolved to
``/home`` from any tool-page URL) to ``"./"`` (resolves to the
current document's directory, which on a single-segment URL is the
server root → default page → Home). Robust across Streamlit minor
versions regardless of how the url_path alias is interpreted.
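The relative-URL resolution can be checked with the standard library. The host, port, and page slug below are assumptions for illustration; ``urljoin`` follows the same RFC 3986 rules a browser applies to an ``href``.

```python
from urllib.parse import urljoin

# Hypothetical local URL of a tool page.
tool_page = "http://localhost:8501/text_cleaner"

print(urljoin(tool_page, "home"))  # http://localhost:8501/home
print(urljoin(tool_page, "./"))    # http://localhost:8501/

# Under a base path, "./" stays inside the deployment prefix.
print(urljoin("http://example.com/myapp/text_cleaner", "./"))
# http://example.com/myapp/
```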
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
User confirmed: with the audit-log kill switch (1caedbb) in place,
pages render. So the hang was 100% in the audit-log file writes —
``open()`` blocking on Windows somewhere — not in the chrome
additions disabled during bisection.
Two of those three additions are pure UI and have no filesystem
exposure, so they're safe to re-enable now:
- **Sticky footer**: pure CSS + a components-html iframe whose JS
appends a div to ``parent.document.body``. No disk touch. The
user just reported losing the Back-to-Home button to the
bisection commit — restoring this brings it back.
- **Compact-spacing CSS layer**: gap reductions on stVerticalBlock
/ stHorizontalBlock, slim heading margins, slim hr / caption /
expander / button / metric padding. Pure CSS.
What stays disabled:
- **Audit-log writes** (``src/audit.py:_DISABLED = True``). Any
resumption needs an async-write design with a hard timeout so a
stuck filesystem can't hang the GUI render.
- **Diagnostics sidebar**: it calls ``audit_log_path()`` which
itself does a ``mkdir()`` — and a hanging mkdir would re-introduce
the same blank-pages symptom. Will re-enable once the audit log
is rewritten not to block.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reported: uploading multiple files on the home page and clicking Run
analysis blew up with
StreamlitDuplicateElementKey: key='_findings_open_02_text_cleaner'
when two uploaded files both had Clean Text findings.
Root cause: ``render_findings_panel`` is invoked once per uploaded
file from ``_home.py``, but the per-tool jump button used a
filename-agnostic key:
key=f"_findings_open_{tool_id}"
Two files both flagging Clean Text → two buttons with identical keys
→ Streamlit rejects the second one.
Fix:
- Add ``key_namespace: str = ""`` to ``render_findings_panel``. The
helper hashes it (sha1 truncated to 8 chars) and appends to every
button key, so different namespaces produce different keys but the
same namespace stays stable across reruns.
- The home page now passes the filename:
``render_findings_panel(findings, header=f"📄 {name}", key_namespace=name)``.
- The single-call site in ``upload_and_analyze_section`` (the legacy
helper, only used outside the new home-page path) keeps the default
empty namespace, which is fine because that path renders findings
for ONE file at a time.
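A sketch of the key derivation, assuming the hash is appended with an underscore separator and that the empty namespace is hashed like any other (both are assumptions; ``namespaced_key`` is a hypothetical helper name).

```python
import hashlib


def namespaced_key(base_key: str, key_namespace: str = "") -> str:
    """Append a short, stable sha1 digest of the namespace to a widget key."""
    digest = hashlib.sha1(key_namespace.encode("utf-8")).hexdigest()[:8]
    return f"{base_key}_{digest}"


# Two files flagging the same tool now get distinct button keys.
key_a = namespaced_key("_findings_open_02_text_cleaner", "customers.csv")
key_b = namespaced_key("_findings_open_02_text_cleaner", "orders.csv")
```

Hashing rather than appending the raw filename keeps keys short and free of characters that are awkward in widget ids, while staying deterministic across reruns.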
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reported: every page renders empty in the main body even after the
audit-log defensive-wrap commit (59c6d0f). Close button also doesn't
trigger shutdown — that page is blank too. Sidebar nav still renders,
so the chrome path that runs on every page is the suspect.
Three chrome additions landed all at once and are temporarily turned
off so the user can see whether bare chrome restores rendering:
1. **Sticky footer (``render_sticky_footer``)**: short-circuited with
``return`` at the top of the function. The CSS-injection +
components-html iframe mechanic is the highest-suspicion item —
if the iframe script throws or the CSS interacts badly with the
user's Streamlit / Python build, the side effects can be
page-killing on theirs while invisible on ours. The original body
is preserved as ``_render_sticky_footer_DISABLED`` so re-enabling
is a one-line change.
2. **Diagnostics sidebar (``_render_diagnostics_sidebar``)**: call
site in ``hide_streamlit_chrome`` is gated by ``if False:``.
Wrapping in try/except (the previous commit) caught exceptions
but didn't help — silent partial renders inside
``with st.sidebar: with st.expander: ...`` can still leave the
render stack in a bad state on some Streamlit versions.
3. **Compact-spacing CSS layer**: the ``gap: 0.5rem !important;`` on
``stVerticalBlock`` / ``stHorizontalBlock``, the slim heading
margins, the slim hr / caption / expander / button / metric
rules — all stripped back to the pre-compact ``_HIDE_CHROME_CSS``.
The ``gap`` rule in particular is a suspect: if the user's
Streamlit version doesn't render stVerticalBlock as a flex
container, the rule is harmless; if it does and interacts badly
with overflow, content could be clipped.
What's deliberately KEPT enabled:
- The audit-log calls (already wrapped from 59c6d0f).
- ``log_page_open`` calls in tool pages (already wrapped internally).
- All UI changes pre-compact (the unified tool-page layout, the
download-button helper, etc.).
If pages render after this commit, we know it's one of the three
disabled items above and can bisect further. If they still don't
render, the cause is in code that pre-dated the audit-log work and
the bisection has to keep going.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reported: after pulling commit c73d716 (audit log) the main body of
every page showed empty. Sidebar nav still worked.
Diagnosis: the most likely path is that something inside the audit
calls — ``_render_diagnostics_sidebar()`` calling ``audit_log_path()``,
or ``log_session_start()`` itself — raises during ``hide_streamlit_chrome``
on the user's environment (Python 3.14 on Windows, a less-tested
combo than the test environment). Streamlit's script runner sees the
exception, and on some chrome paths it eats it without surfacing an
error block, leaving the page body empty.
The audit log is best-effort by design. Make that contract real:
1. ``hide_streamlit_chrome`` now wraps both ``log_session_start()``
and ``_render_diagnostics_sidebar()`` in try/except. Errors print
to stderr (so the developer running ``python -m src.gui`` sees
them in the launcher's console) but never bubble up to kill the
page render.
2. ``audit_log_path()`` already had a tempdir fallback for the
primary mkdir failure, but the SECOND mkdir wasn't protected.
Restructured to a two-level fallback: configured dir →
tempdir → ``/dev/null`` (or ``NUL`` on Windows). The last fallback
ensures the function never raises; ``log_event``'s own try/except
handles the eventual unwritable-file case.
3. ``log_page_open(slug)`` now has an outer try/except so it cannot
raise either — protecting every tool page's render path.
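The fallback chain in item 2 can be sketched as follows. The log filename and the tempdir subfolder name are placeholders; ``os.devnull`` is the standard-library spelling of the null device on each platform.

```python
import os
import tempfile
from pathlib import Path


def audit_log_path(filename: str = "datatools-session.jsonl") -> Path:
    """Return a log path without ever raising.

    Order: configured dir, then tempdir, then the null device.
    The filename and "datatools-logs" folder are hypothetical.
    """
    configured = os.environ.get("DATATOOLS_AUDIT_DIR")
    candidates = (
        lambda: Path(configured) if configured
        else Path.home() / ".datatools" / "logs",
        lambda: Path(tempfile.gettempdir()) / "datatools-logs",
    )
    for make_dir in candidates:
        try:
            directory = make_dir()
            directory.mkdir(parents=True, exist_ok=True)
            return directory / filename
        except (OSError, RuntimeError):
            continue  # this level failed; try the next fallback
    return Path(os.devnull)
```

``RuntimeError`` is caught alongside ``OSError`` because ``Path.home()`` raises it when the home directory cannot be resolved.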
If a user reports the same symptom again, the launcher terminal will
now show a real traceback explaining what's wrong, and the GUI will
still render normally.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New ``src/audit.py`` module records GUI actions to a per-session
JSONL file under ``~/.datatools/logs/`` (overrideable via
``DATATOOLS_AUDIT_DIR``). The file is human-readable (one JSON
object per line, each with a ``message`` field) AND trivially
machine-parseable — the support flow is "client mails the file,
we read it and explain what went wrong."
Format example::
{"ts":"2026-05-17T05:30:00.123+00:00","level":"info","category":"session",
"session":"a1b2c3d4","message":"Session started",
"platform":"Windows 11","python":"3.14.0","user":"Michael Dombaugh",
"log_file":"C:\\Users\\Michael Dombaugh\\.datatools\\logs\\datatools-...jsonl"}
{"ts":"...","category":"upload","message":"Uploaded customers.csv",
"filename":"customers.csv","bytes":24813}
{"ts":"...","category":"analyze","message":"Analyzed customers.csv (3 findings)",
"filename":"customers.csv","findings":3,"rows":120,"cols":8}
{"ts":"...","category":"tool_run","message":"Clean Text run",
"page":"2_Text_Cleaner"}
{"ts":"...","category":"error","level":"error",
"message":"analyze(weird.csv): EmptyDataError: No columns to parse",
"filename":"weird.csv","outcome":"empty_after_repair"}
Public API:
- ``log_event(category, message, **extra)``
- ``log_session_start()`` — idempotent banner with platform info
- ``log_page_open(slug)`` — emit a ``nav`` event, deduplicated per
Streamlit session so reruns don't spam the log
- ``log_exception(where, exc, **extra)`` — convenience wrapper
- ``audit_log_path()`` / ``audit_log_dir()`` — for the UI
Wired in at:
- ``hide_streamlit_chrome``: stamps session start, mounts a small
"🩺 Diagnostics" expander in the sidebar with the log path and
an "Open log folder" button so the user can grab the file to
attach to a support email.
- Home page: ``upload`` event on every new file, ``upload`` event
on per-file remove, ``analyze`` event with file count when
Run-analysis fires.
- ``_run_analysis_on_upload``: ``analyze`` event with rows / cols /
findings count per file, plus ``error`` events on every caught
exception (empty upload, empty after repair, pandas EmptyDataError,
generic Exception).
- Every Ready tool page (1, 2, 3, 4, 5, 9): ``tool_run`` event
immediately after the primary action stashes its result.
- Every tool page (1-9): ``log_page_open(slug)`` on render — deduped
via session state so we don't get one event per Streamlit rerun.
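The per-session dedup can be sketched like this. The explicit ``session_state`` and ``log`` parameters and the ``_audit_pages_seen`` key are assumptions made so the example runs without Streamlit; the real ``log_page_open(slug)`` takes only the slug.

```python
def log_page_open(slug: str, session_state: dict, log) -> bool:
    """Emit one nav event per page slug per session.

    Returns True if an event was emitted, False if it was deduplicated.
    """
    seen = session_state.setdefault("_audit_pages_seen", set())
    if slug in seen:
        return False  # a rerun of the same page: stay quiet
    seen.add(slug)
    log("nav", f"Opened page {slug}", page=slug)
    return True
```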
Safety:
- ``log_event`` wraps every write in try/except. A broken audit
log must NOT crash the GUI.
- Non-JSON-serializable extras are ``str()``-coerced before writing.
- File CONTENTS are never logged. We capture filename, byte count,
and (in the analyzer) a 12-char sha1 fingerprint of the bytes so
the same file re-uploaded gets the same trace.
- License keys, session cookies, etc. are not logged.
- ``DATATOOLS_AUDIT_DIR`` env var lets tests redirect writes into a
tmp dir.
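Two of the safety rules above, sketched with the standard library. ``render_line`` and ``fingerprint`` are hypothetical helper names; ``json.dumps(default=str)`` is one way to get the ``str()`` coercion, not necessarily the one the module uses.

```python
import hashlib
import json
from datetime import datetime, timezone


def fingerprint(data: bytes) -> str:
    """12-char sha1 prefix: identifies the bytes without logging them."""
    return hashlib.sha1(data).hexdigest()[:12]


def render_line(category: str, message: str, **extra) -> str:
    """Render one JSONL record; non-JSON extras are str()-coerced."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(timespec="milliseconds"),
        "level": "info",
        "category": category,
        "message": message,
        **extra,  # a caller-supplied "level" overrides the default
    }
    return json.dumps(record, default=str, ensure_ascii=False)
```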
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two reported issues addressed together because they're the same UX
flow (home findings panel → jump to relevant tool).
(1) Format-Standardizer recommendations weren't firing.
Reported: uploading a file from the format-cleaner test corpus
(``24_format_dates.csv``, ``25_format_phones.csv``,
``29_format_currencies.csv``, ``30_format_integration.csv``) showed
zero "Standardize Formats" recommendations even though the columns
clearly mixed multiple date / phone / currency formats.
Two underlying causes:
- ``_detect_inconsistent_date_format`` required two MATCHES per
distinct format. A test column with N rows each in a different
format had ≤1 match per format and was silently passed over.
Loosened to "≥1 match per format" — the inconsistency signal is
the presence of ≥2 distinct formats, not their volume.
- Only date inconsistency was detected. Phones, currency, and
booleans (the other format-standardizer fix categories) had no
detector at all.
Added three new detectors:
- ``_detect_inconsistent_phone_format``: nine phone-format regexes
(plain-10, US paren / dash / dot / space, +country, extension,
intl plus). Fires when a column is ≥35% phone-shaped AND mixes
≥2 formats.
- ``_detect_inconsistent_currency_format``: thirteen currency regexes
covering US ($1,234.56 / $1234.56), EU (€1.234,56), India lakh
notation, Swiss apostrophe, trailing-symbol, parens-negative,
prefix-currency-code, suffix-currency-code, and negative variants.
Same fire criteria as phone.
- ``_detect_inconsistent_boolean_format``: column is ≥80% boolean
tokens (yes/no/y/n/true/false/1/0) AND uses ≥3 distinct surface
forms (e.g. yes / Y / true / 1 mixed together).
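The shared fire criteria for the phone and currency detectors can be sketched as below. The four regexes are illustrative stand-ins, not the real nine phone patterns, and ``detect_inconsistent_format`` is a hypothetical name.

```python
import re

# Illustrative patterns only; the real detector has more.
_PHONE_FORMATS = {
    "plain-10": re.compile(r"\d{10}"),
    "us-paren": re.compile(r"\(\d{3}\) \d{3}-\d{4}"),
    "us-dash": re.compile(r"\d{3}-\d{3}-\d{4}"),
    "us-dot": re.compile(r"\d{3}\.\d{3}\.\d{4}"),
}


def detect_inconsistent_format(values, formats=_PHONE_FORMATS,
                               min_share=0.35, min_formats=2):
    """Return the set of formats seen if the column mixes them, else None."""
    cells = [str(v).strip() for v in values
             if v is not None and str(v).strip()]
    if not cells:
        return None
    seen = set()
    matched = 0
    for cell in cells:
        for name, pattern in formats.items():
            if pattern.fullmatch(cell):
                seen.add(name)   # one match is enough to count a format
                matched += 1
                break
    if matched / len(cells) >= min_share and len(seen) >= min_formats:
        return seen
    return None
```

A column where every phone uses the same format returns ``None``: the signal is the number of distinct formats, not how many cells match.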
Verified on every file in ``test-cases/format-cleaner-corpus/``:
24_format_dates, 25_format_phones, 29_format_currencies all now
produce a format-standardizer Finding. The integration test file
flags all three.
The threshold loosening (from 50% to 35% of values format-shaped) is
still strict enough to avoid false positives on free-text comment
columns where a few cells happen to look phone- or date-shaped.
(2) The "Open <Tool>" jump links blended into the page.
Reported: the per-tool jump links inside the home findings panel
were too subtle to notice.
Replaced ``st.page_link`` with ``st.button(type="primary")`` so the
buttons render in Streamlit's primary-action red colour, matching the
"Clean Text" / "Find Duplicates" / etc. run buttons. Click handler
delegates to ``st.switch_page(page_slug)`` so it's still a soft
in-app navigation (no full reload).
2220 tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reported: the sticky footer rendered, but the Back to Home button
inside it wasn't visible.
Likely cause: ``st.markdown`` inserts the footer div inside Streamlit's
content tree, which sits under ``.stApp { zoom: 0.85 }`` (our compact
scaler) and several nested padding/positioning contexts. Streamlit's
own ``<a>`` styling rules can also colour-collide with our anchor.
Switch the mount strategy. Two passes:
1. CSS rules go to the parent document via ``st.markdown`` as before,
but every property carries ``!important`` and the selectors key on
``#datatools-sticky-footer`` (id, not class) plus a dedicated
``.datatools-sticky-footer-link`` class on the anchor — so
Streamlit's default ``<a>`` styles can't override colour or
padding. ``z-index: 2147483646`` keeps the footer above
anything else in the page.
2. The footer DOM node itself is created by a script inside a
zero-height ``streamlit.components.v1.html`` iframe. The script
does ``window.parent.document.body.appendChild(...)`` so the div
lives as a direct child of ``<body>`` — outside ``.stApp``,
outside every Streamlit container, free of every parent's
``zoom`` / ``transform`` / ``overflow`` rules.
If the cross-frame access ever fails (Streamlit sandbox config
change), the script falls through to appending inside the
iframe's own document — degraded but still visible.
Each rerun replaces any prior ``#datatools-sticky-footer`` so we
don't accumulate stacked footers on every script pass.
2220 tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two unrelated UX issues addressed in one sweep across all nine tool
pages because they share the same edit surface.
(1) Sticky footer replaces the top + bottom back-link buttons.
Reported: a big white empty footer space at the bottom of every page;
the Back to Home button at the top scrolled out of view on long pages.
New ``render_sticky_footer()`` helper in ``components/_legacy.py``
injects a fixed-position bar at ``bottom: 0`` of the viewport with:
- A border-top so it visually reads as a non-movable bar.
- A semi-transparent background (rgba 0.96 + ``backdrop-filter: blur``)
so content underneath shows through faintly when the user scrolls.
- A styled ``<a href="home">`` anchor (not an ``st.button``) because
Streamlit widgets can't be CSS-positioned reliably — Streamlit owns
the widget's DOM container and re-mounts it on every rerun. A real
anchor sits exactly where the CSS puts it and triggers Streamlit's
URL routing to the home page.
- ``padding-bottom: 3.5rem`` on the main container so the last widget
isn't hidden behind the bar.
Called once per tool page, immediately after ``hide_streamlit_chrome()``
so it renders even on pages that ``st.stop()`` early before any other
content runs. The old top-and-bottom ``back_to_home_link()`` calls are
removed from every tool page; their entry/exit points were dropping
the button when the script short-circuited.
(2) Tool-page headers now localize.
Reported: switching the sidebar language picker to Spanish left the
tool page's title + caption in English. Root cause: every page had
hard-coded ``st.title("✂️ Clean Text")`` / ``st.caption("Trim
whitespace...")`` strings.
Added per-tool ``tools.<id>.page_title`` and
``tools.<id>.page_caption`` keys to ``en.json`` and ``es.json`` for
all nine tools. Routed each page's title/caption call through ``t()``.
Verified: with ``ui_lang=es`` set, the Clean Text page now renders
"✂️ Limpiar texto" + the Spanish caption.
Updated ``tests/gui/test_smoke.py::EXPECTED_SUBSTRINGS`` so the
``es`` column for each tool page asserts the actual Spanish string
(was a duplicate of the English string back when the page bodies
were English-only).
2220 tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reported: too much whitespace between widgets, dividers, and headings.
Compact-spacing CSS layer added to ``_HIDE_CHROME_CSS`` (so it applies
on every page that calls ``hide_streamlit_chrome``):
- ``[data-testid="stVerticalBlock"]`` and ``stHorizontalBlock`` gap
trimmed from Streamlit's default ~1rem to 0.5rem.
- Heading margins (h1-h4) tightened — h1/h2/h3 used to leave 1-1.5rem
above; now 0.25-0.5rem.
- ``hr`` (``st.divider()``) drops from 1rem above+below to 0.4rem.
- Markdown paragraphs and captions: 0.25rem bottom margin instead of
the default 1rem.
- Expander summary padding reduced (0.35rem top/bottom).
- File-uploader, button, and metric tiles: trimmed internal padding.
Also slimmed the main-container padding from 1rem top / Streamlit
default bottom (~6rem) to 0.5rem top / 0.75rem bottom.
The existing ``zoom: 0.85`` on ``.stApp`` is kept — the user wanted
*less white space*, not *smaller content*, and dropping zoom would
shrink type alongside everything else.
2220 tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reported: user asked whether we can send Alt+F4 / Ctrl+W to the
browser from JavaScript to force-close a tab.
Honest answer that's now baked into the hint message: NO. Synthesized
keyboard events from page JS only reach DOM event listeners, not the
browser chrome or the OS. There is no flag, API, or trick that lets
a page close a tab the user opened themselves. The page CAN close a
window it opened (window.opener trail) or one whose display-mode is
``standalone`` (Chrome/Edge ``--app=URL``) — that's what
``python -m src.gui`` arranges, and that's the path that actually
closes the window without a manual Ctrl+W.
Improvements landed:
1. ``isStandalone(win)`` detects Chrome --app windows up front
(``matchMedia('(display-mode: standalone)').matches``). In a
regular tab the manual hint surfaces immediately on the
"Close this window" click; in --app mode we only show it if the
close attempt actually fails.
2. ``fallbackToBlank(win)`` navigates the tab to ``about:blank``
via ``location.replace`` (no history pollution) so the user
sees a clean empty tab instead of the farewell overlay frozen
over Streamlit's connection-error banner. They still have to
Ctrl+W the blank tab, but the screen is no longer a misleading
"did it close or not?" mess. Fires 250 ms after a failed close
in --app mode (very rare path), or 1.5 s in a regular tab so
the user has time to read the hint.
3. Hint message rewritten in en + es to explain WHY the close is
blocked (browser security — not something we can override), to
acknowledge the Alt+F4 / Ctrl+W question directly (those don't
work either, for the same reason), and to point at
``python -m src.gui`` as the path that gives a clean auto-close.
2220 tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two issues, same fix surface.
(1) Reported crash on Back-to-Home:
StreamlitAPIException: Could not find page: app.py.
``st.switch_page("app.py")`` doesn't work under ``st.navigation`` —
the entry script is the nav manager itself and is not a registered
page. The fix needs to pass an ``st.Page`` object whose script
identity matches one registered in the nav.
First-pass attempt (``from src.gui.app import _home_page``) hit a
worse failure: importing ``app.py`` from inside a tool-page render
re-executes the nav setup with the WRONG "main script" context, so
every ``st.Page("pages/N_foo.py", ...)`` call in ``_build_navigation``
fails with "file could not be found".
Final fix: extract the home renderer into its own module,
``src/gui/_home.py``, which has no top-level Streamlit side effects.
Both the nav manager and the back-link helper import ``_home_page``
from there. The Page object built at click time has the same callable
identity as the one registered, so ``st.switch_page`` resolves it.
(2) Reported UX: the back button scrolled out of view on long pages.
Add a second ``back_to_home_link(key="_back_to_home_link_bottom")``
call near the footer of every tool page (1-9). The unique key avoids
widget-id collision with the top instance. Coming-Soon stubs get it
unconditionally; Ready tools render it only after a result exists
because the page short-circuits with ``st.stop()`` before then —
when no result is on screen the page is short enough that the top
link is sufficient.
2220 tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reported: clicking "Open Downloads folder" was opening the Documents
folder instead of Downloads. Root cause is the classic Windows
gotcha: when the path contains a space (e.g.
``C:\Users\Michael Dombaugh\Downloads``), Python's
``subprocess.Popen`` packs the ``/select,...`` argument into a single
quoted token, and Explorer's ``/select`` argument parser does NOT
accept that form — it silently falls back to whatever the user's
default Explorer view is (typically Documents).
Resolution paths considered:
- ``shell=True`` with a hand-built command string — works but opens
the door to shell-injection if a file_name ever contained a quote
or special char.
- ``cmd /c start "" explorer /select,...`` — same parsing issue.
- ctypes ShellExecuteW — pulls in a Windows-only dependency.
- **Skip /select. Open the folder directly.** ✓
Going with the last. ``explorer <folder>`` reliably opens the folder
regardless of spaces in the path; the user finds the freshly-saved
file by its name. The previous "highlight the file" nicety wasn't
worth the path-parsing fragility: user folders on Windows live under
``C:\Users\<name>``, and any Windows username can contain a space.
macOS keeps the ``open -R <file>`` reveal-in-Finder path because
macOS argument parsing is sane and that's a strict UX win.
2220 tests pass.
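A minimal sketch of the per-platform command selection described above. The function name ``build_open_command`` and its parameters are hypothetical (the real helper is not shown in this log); the sketch only illustrates passing the folder as a single argv entry with no ``/select`` argument.

```python
from typing import List, Optional


def build_open_command(folder: str, platform: str,
                       saved_file: Optional[str] = None) -> List[str]:
    """Return the argv that opens ``folder`` in the OS file manager.

    ``platform`` takes the same values as ``sys.platform``.
    """
    if platform.startswith("win"):
        # Plain ``explorer <folder>``, with no /select argument for
        # Explorer to mis-parse when the path contains a space.
        return ["explorer", folder]
    if platform == "darwin":
        # macOS keeps reveal-in-Finder when the saved file is known.
        return ["open", "-R", saved_file] if saved_file else ["open", folder]
    return ["xdg-open", folder]


argv = build_open_command(r"C:\Users\Jane Doe\Downloads", "win32")
# argv is meant to be handed to subprocess.Popen(argv) as a list,
# so shell=True and hand-built command strings are not needed.
```

Returning a list rather than a string keeps the spawn step free of shell quoting, which is the injection concern raised for the ``shell=True`` option.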
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reported: clicking "Open Downloads folder" did nothing visible. The
previous implementation called ``os.startfile(folder)`` on Windows,
which is known to silently no-op or open Explorer behind the active
window in some configurations (Streamlit running headless, no
foreground rights inherited by the click handler thread, etc.).
Switch to the more reliable ``explorer /select,<file>`` form:
- Opens Explorer with the just-saved file pre-highlighted instead of
just navigating to the folder — better UX than the old behavior.
- explorer.exe is a real GUI process that's spawned in the user's
session with foreground rights, so it shows up on top.
- Fallback chain on Windows: ``/select`` first, then plain
``explorer <folder>``, then ``os.startfile`` as a last resort.
macOS upgraded the same way: ``open -R <file>`` reveals in Finder
rather than opening the directory.
Linux: no reliable cross-distro reveal, so ``xdg-open <folder>``.
Plus user feedback at the call site:
- On successful dispatch: ``st.toast("Opening <folder>", icon="📂")``
— confirms we tried, in case the window comes up behind the
browser.
- On dispatch failure: ``st.warning`` with the full path the user
can copy/paste into their file manager manually.
2220 tests pass.
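The fallback chain can be sketched as "try each launcher in order, stop at the first that does not raise". The name ``dispatch_first`` and the launcher tuples are hypothetical; the real call sites would wrap ``subprocess.Popen`` and ``os.startfile``.

```python
from typing import Callable, Iterable, Optional, Tuple

Launcher = Tuple[str, Callable[[], None]]


def dispatch_first(launchers: Iterable[Launcher]) -> Optional[str]:
    """Run launchers in order; return the name of the first that succeeds."""
    for name, launch in launchers:
        try:
            launch()
        except OSError:
            continue  # fall through to the next, less precise, strategy
        return name
    return None  # every strategy failed; caller shows the path to copy
```

A non-``None`` result only means a process was dispatched, not that a window is visible, which is why the call site confirms with a toast and falls back to a warning with the full path when the result is ``None``.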
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Switch the download mechanic from "browser <a download> with a data:
URL" to "write the bytes directly to the user's Downloads folder and
show them the exact path". DataTools runs as a local Streamlit app,
so the "server" IS the user's machine — there's no reason to go
through the browser save dialog at all.
Flow:
1. Click "Download <something>" button (rendered as a regular
``st.button``, so no widget-collision issues).
2. Bytes are written to ``Path.home() / "Downloads" / file_name``
(overwriting any same-named file).
3. The page reruns and renders a success caption with the absolute
path the file landed at.
4. An "📂 Open Downloads folder" button appears. Clicking it pops the
OS file manager via ``os.startfile`` (Windows), ``open`` (macOS),
or ``xdg-open`` (Linux).
Why this is better than the previous HTML-data-URL helper:
- Unambiguous about where the file went — user sees the full path,
not "wherever your browser was configured to save".
- The data: URL approach base64-inflated the page payload by about
33%, which bloated the page for large outputs; a server-side write is
byte-for-byte.
- No more browser-side widget collision class of bug.
- The save action is a real Streamlit button, so the existing widget
semantics (disabled, help tooltip, key isolation) work without
workarounds.
API surface unchanged. New canonical name ``local_download_button``;
``html_download_button`` is kept as a back-compat alias that points
at the same implementation — every existing call site continues to
work without edits.
Tests are protected from polluting the developer's home dir via a
``DATATOOLS_DOWNLOADS_DIR`` env var override returned by the new
``_downloads_dir()`` helper. Smoke verified end-to-end via AppTest:
click → file appears in tmp dir → success banner shows path →
open-folder button renders.
2220 tests pass, 91 skipped, 35 s.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reported: uploading 13_non_latin_scripts.csv made the home page bubble
a ``pandas.errors.EmptyDataError`` traceback up through the page
chrome instead of surfacing as a per-file error. In a multi-file
analysis run that kills every other file's results too, which is
worse than the symptom itself.
Wrap ``_run_analysis_on_upload`` in proper error handling:
- Empty bytes ``getvalue() == b""`` short-circuits with a synthetic
error Finding telling the user the upload was zero-byte and to
re-upload.
- Empty ``repair.repaired_bytes`` (file was all NULs / BOM / stripped
to nothing) likewise surfaces as a synthetic Finding rather than
reaching pd.read_csv.
- ``pd.errors.EmptyDataError`` from pandas is caught and rendered as
a Finding that names the file and its byte size, and suggests opening
it in a text editor to verify that the header row uses the same
delimiter as the data rows.
- Any other exception during read/analyze is caught and surfaces as
a Finding via ``format_for_user`` so the user gets a clean message,
not a Python traceback.
Each file in a multi-file run now stands alone: a bad file produces
one red banner in its own card, every other file analyzes normally.
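The isolation pattern can be sketched without pandas. Everything here is illustrative: ``Finding`` is a simplified stand-in for the project's type, ``EmptyDataError`` stands in for ``pandas.errors.EmptyDataError``, and ``analyze_upload`` is a hypothetical name for the wrapped step.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Finding:
    file_name: str
    severity: str
    message: str


class EmptyDataError(Exception):
    """Stand-in for pandas.errors.EmptyDataError so the sketch needs no pandas."""


def analyze_upload(name: str, data: bytes,
                   analyze: Callable[[bytes], List[Finding]]) -> List[Finding]:
    """Analyze one upload; any failure becomes a Finding instead of propagating."""
    if data == b"":
        return [Finding(name, "error", "Upload was 0 bytes; please re-upload.")]
    try:
        return analyze(data)
    except EmptyDataError:
        return [Finding(name, "error",
                        f"No parsable rows in {name} ({len(data)} bytes).")]
    except Exception as exc:  # last-resort net so one file cannot sink the run
        return [Finding(name, "error", f"Could not analyze {name}: {exc}")]
```

Because every branch returns a list of findings, a caller looping over several uploads never sees an exception from a single bad file.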
The 13_non_latin_scripts.csv corpus file is 249 bytes of valid UTF-8
on disk and parses cleanly under the same code path locally, so the
user's specific symptom is likely a zero-byte upload (browser /
network / Python 3.14 + Streamlit edge case). The new ``empty_upload``
finding reports the byte count so they can confirm.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There is no JavaScript override for browser tab-close security:
``window.close()`` only succeeds on windows JS opened (Chrome --app
windows qualify; a regular browser tab does not). What we can do is
make the --app path easier to hit and the failure case more
actionable.
Three changes:
1. ``src/gui/__main__.py`` — extend browser detection. PATH lookup
now also looks for ``msedge`` / ``microsoft-edge``; Windows install
candidates include the Edge install path; macOS candidates include
Edge and Chromium. Edge is Chromium-based, supports ``--app``, and
ships on every Windows 10+ machine — so users without Chrome no
longer fall through to the regular browser tab. When the fallback
IS hit, print a warning to stderr explaining why Close-from-page
will require Ctrl+W. Renamed ``_find_chrome`` to
``_find_app_browser`` to reflect the broader scope.
2. ``_FAREWELL_SCRIPT_TEMPLATE`` in ``components/_legacy.py`` —
factor close attempts into a ``tryClose`` helper that runs three
escalating tries: standard ``win.close()``, the
``win.open('', '_self')`` history-rewrite trick (no-op in modern
Chrome but free), and ``win.top.close()``. Auto-close on paint AND
the manual button now both call this helper. Skip the manual hint
if the close eventually succeeded between the click and the 250 ms
timeout.
3. ``quit.close_hint`` in en/es i18n packs — rewrite the message to
tell the user honestly that this is a browser security restriction,
tell them the Ctrl+W keystroke that works, and point them at
``python -m src.gui`` for the auto-closing app-mode experience.
2008 tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reported symptom: only the FIRST download button in a multi-button
row pops the browser save dialog. The second and third do nothing on
click. Affects every tool page that exposes (cleaned + audit + config)
downloads.
Root cause is ``st.download_button`` itself — when several render in
the same script pass, the click-to-bytes wiring on the browser side
mis-routes and only one button's data is actually exposed. Explicit
``key`` arguments don't fix it; ``use_container_width=True`` doesn't
help either; we confirmed this in the Text Cleaner reverts.
Replace the widget with a real ``<a download="file" href="data:...">``
anchor rendered via ``st.markdown(..., unsafe_allow_html=True)``.
Bypasses Streamlit's widget machinery entirely; behaves identically to
a native browser download. Side benefit: clicking it does NOT trigger
a script rerun, so other in-flight UI state survives.
New helper ``html_download_button`` lives in
``src/gui/components/_legacy.py`` (exported from ``components``). API:
    html_download_button(
        label, data,
        *, file_name, mime="application/octet-stream",
        disabled=False, help=None, use_container_width=True,
    )
Translation pattern applied across every tool page (and shared
``results_summary`` / ``config_panel`` widgets in ``_legacy.py``):
- ``st.download_button(`` -> ``html_download_button(``
- ``data=foo_bytes`` kwarg -> positional second arg
- ``key="..."`` -> dropped (helper has no widget identity)
- ``use_container_width=True`` -> dropped (default)
- ``disabled=`` and ``help=`` pass through unchanged
- Pre-computed byte buffers kept where they were
Total: 17 page-level sites replaced (3 in Text Cleaner, 3 in Format
Standardizer, 3 in Fix Missing Values, 3 in Map Columns, 3 in
Automated Workflows, 2 in the Find Duplicates page), plus 4 in the
shared _legacy.py widgets used by Find Duplicates.
Caveat: data: URLs balloon by 33% (base64). Fine for tool output
sizes we ship; if a future result topped a few hundred MB we'd want a
Blob-URL fallback.
The marketing demo at src/gui/app_demo.py keeps its single
st.download_button — single button, no collision, no need to switch.
2008 tests pass.
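For illustration, a stripped-down version of the anchor markup and the base64 overhead behind the 33% caveat. ``data_url_anchor`` is a hypothetical name; the real ``html_download_button`` also handles ``disabled``, ``help`` and styling, none of which is sketched here.

```python
import base64
import html


def data_url_anchor(label: str, data: bytes, *, file_name: str,
                    mime: str = "application/octet-stream") -> str:
    """Build an <a download> element whose href embeds ``data`` as base64."""
    payload = base64.b64encode(data).decode("ascii")
    return (
        f'<a download="{html.escape(file_name, quote=True)}" '
        f'href="data:{mime};base64,{payload}">{html.escape(label)}</a>'
    )


raw = b"x" * 3000
encoded = base64.b64encode(raw)
# base64 emits 4 output characters per 3 input bytes, hence the ~33% growth.
overhead = len(encoded) / len(raw)
```

The resulting string is what would be passed to ``st.markdown(..., unsafe_allow_html=True)``; escaping the label and file name keeps user-supplied text from breaking out of the attribute or element.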
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The farewell overlay already attempted ``window.top.close()`` after a
Close click — but browsers only honour that for tabs that JS opened
(Chrome --app windows qualify; a regular browser tab does not). For
users whose Chrome wasn't auto-detected and who fall back to
``webbrowser.open``, the overlay stays put and they had no in-page
way to close.
Add to the overlay HTML:
- A "Close this window" button (uses the user-gesture path, which has
slightly looser browser rules than auto-close).
- A hidden hint paragraph that reveals itself 250 ms after the
button is clicked IF the window is still here, telling the user to
press Ctrl+W (⌘W on Mac).
Wired through the existing _farewell_script template + ``_js_html_safe``
escaping so neither label can break out of the JS string literal.
New i18n keys (en + es): ``quit.close_window_button`` and
``quit.close_hint``.
The existing auto-close attempt remains — Chrome --app users still get
their window closed without touching the button.
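The implementation of ``_js_html_safe`` is not shown in this log. One common way to get the same guarantee, sketched here under the hypothetical name ``js_string_literal``, is to serialize the label with ``json.dumps`` and then escape the characters that matter to the HTML parser.

```python
import json


def js_string_literal(text: str) -> str:
    """Return ``text`` as a quoted JS string literal safe inside <script>."""
    # json.dumps handles quotes, backslashes, newlines and non-ASCII.
    literal = json.dumps(text)
    # Escape characters that could close the <script> element or open a comment.
    return (literal.replace("<", "\\u003c")
                   .replace(">", "\\u003e")
                   .replace("&", "\\u0026"))
```

The output still evaluates to the original string in JavaScript, because ``\u003c`` and friends are ordinary string escapes there.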
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Multi-file workflow: a user uploads several files on Home, clicks
"Open <Tool>" on one file's findings, lands on a tool page. The
sidebar lets them get back to Home, but a top-of-page back affordance
is more discoverable and keeps the hand in the same screen region as
the upload list they're working through.
- New ``back_to_home_link()`` helper in components/_legacy.py renders
a secondary button that calls ``st.switch_page("app.py")`` — under
``st.navigation`` that routes to the default (Home) page.
- Wired into every tool page (1-9) directly after
``hide_streamlit_chrome()`` and BEFORE the license gate so a Lite
user who lands on a locked tool can navigate away without paying.
- New i18n key ``nav.back_to_home`` ("← Back to Home" /
"← Volver al inicio") in en/es packs.
2008 tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Close is now a direct shutdown trigger: visiting the Close page (the
sidebar entry) fires shutdown_app() immediately — no confirm step, no
intermediate body. The farewell overlay paints and os._exit(0) lands
~1s later from a daemon thread.
Layout: Close moved into its own bottom-of-sidebar section so the
destructive action is visually separated from Account/Activate.
- New shutdown_app() in components/_legacy.py replaces quit_button.
os._exit thread is skipped when "pytest" is in sys.modules so the
test suite doesn't kill itself when rendering 99_Close.
- pages/99_Close.py shrinks to set_page_config + chrome + shutdown_app.
- app.py nav grows a new "Close" section header (new
nav.section_close key in en/es packs) pinned at the bottom of the
navigation dict.
Tests updated:
- TestQuitButtonRenders → TestClosePageShutsDownImmediately.
Assert the shutdown caption renders + no confirm button exists.
- test_smoke EXPECTED_SUBSTRINGS["99_Close"] now pins
"Shutting down" / "Cerrando" (the visible page body) instead of
the removed page title.
2008 tests pass.
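The delayed hard-exit with its pytest guard can be sketched roughly as follows. ``schedule_exit`` and its injectable ``exit_fn`` / ``modules`` parameters are hypothetical and exist only to make the sketch testable; the real ``shutdown_app()`` also renders the farewell overlay.

```python
import os
import sys
import threading
from typing import Callable, Mapping, Optional


def schedule_exit(delay_s: float = 1.0,
                  exit_fn: Callable[[int], None] = os._exit,
                  modules: Optional[Mapping[str, object]] = None,
                  ) -> Optional[threading.Timer]:
    """Hard-exit after ``delay_s`` seconds unless running under pytest."""
    modules = sys.modules if modules is None else modules
    if "pytest" in modules:
        return None  # never kill the test runner
    timer = threading.Timer(delay_s, exit_fn, args=(0,))
    timer.daemon = True  # do not keep the process alive on the timer's account
    timer.start()
    return timer
```

The delay gives the current rerun time to paint the "shutting down" body before the process goes away.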
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Home is now the only entry point: the "Run analysis" button on the
upload section IS the review step (findings render inline via
render_findings_panel). Tool pages no longer gate on a passed
normalization — running the analyzer is sufficient context.
Removed:
- src/gui/pages/0_Review.py
- src/gui/components/gate.py (re-export seam)
- require_normalization_gate() in src/gui/components/_legacy.py
- "review" section enum in tools_registry.py
- Data Review entry in app.py navigation
- require_normalization_gate() calls + imports in all nine tool pages
- tests/gui/test_gate.py (whole file)
- TestReviewWorkflow in tests/gui/test_workflows.py
- 0_Review entry in tests/gui/test_smoke.py PAGE_SLUGS
- stash_upload's normalization_result+normalization_for stashing
- stash_upload_without_gate (was the gate's negative-path helper)
2017 tests pass (16 retired with the gate flow).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two coupled hardening upgrades.
1. Asymmetric signatures (HMAC → Ed25519)
The previous HMAC scheme used a symmetric secret that any motivated
reverse engineer could pull out of the shipped binary and use to
mint blobs for any tier / name / email. With Ed25519, the binary
ships only the public verification key; the signing key never
leaves the seller's environment, so binary compromise no longer
yields forgery.
- src/license/crypto.py rewritten around
cryptography.hazmat.primitives.asymmetric.ed25519. Same public
API surface (sign/verify/encode_blob/decode_blob), same canonical
JSON encoding — drop-in for the manager / cli / GUI layers.
- DATATOOLS_LICENSE_PRIVKEY (seller-side) and
DATATOOLS_LICENSE_PUBKEY (build-time) env vars supply the keys;
the in-source dev keypair (src/license/_dev_keypair.py)
is derived deterministically from a seed phrase for reproducible
builds and tests.
- Blob prefix bumped DTLIC1: → DTLIC2:. Decoding a DTLIC1 blob
surfaces a clear "old format" error rather than a confusing
signature mismatch.
- scripts/generate_keypair.py mints fresh production keypairs for
the seller (run once, stash the private key offline). Adds
cryptography>=41,<46 to requirements.txt (was an undeclared
transitive dep).
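A sketch of how a versioned prefix lets the decoder reject v1 blobs with a clear message. The signature step is left out entirely, and the payload layout, the exact canonical-JSON settings and the ``BlobFormatError`` name are assumptions; only the ``DTLIC1:`` / ``DTLIC2:`` prefixes come from the commit.

```python
import base64
import json

CURRENT_PREFIX = "DTLIC2:"
LEGACY_PREFIX = "DTLIC1:"


class BlobFormatError(ValueError):
    """Raised when a pasted key is not a current-format blob."""


def canonical_json(payload: dict) -> bytes:
    # Assumption: sorted keys and compact separators give a stable byte form.
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def encode_blob(payload: dict) -> str:
    body = base64.urlsafe_b64encode(canonical_json(payload)).decode("ascii")
    return CURRENT_PREFIX + body


def decode_blob(blob: str) -> dict:
    blob = blob.strip()
    if blob.startswith(LEGACY_PREFIX):
        # Checked first so the user sees "old format", not a signature mismatch.
        raise BlobFormatError("This key uses the old DTLIC1 format; request a new key.")
    if not blob.startswith(CURRENT_PREFIX):
        raise BlobFormatError("Not a recognised license key.")
    return json.loads(base64.urlsafe_b64decode(blob[len(CURRENT_PREFIX):]))
```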
2. Production-safe tripwire
assert_production_safe() refuses to boot a frozen / shipped build
when either:
- DATATOOLS_DEV_MODE=1 is set (would unconditionally bypass every
license check — fine in source/test but catastrophic in a buyer
install).
- The active verification key is still the embedded dev key (the
build pipeline forgot to set DATATOOLS_LICENSE_PUBKEY).
No-op in source / pytest runs (sys.frozen is unset) so test
fixtures and dev workflows keep working without ceremony. Called
from src/cli_license_guard.guard() and from hide_streamlit_chrome
— so it fires on every CLI invocation and every GUI page load.
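The tripwire's decision logic, sketched with injectable inputs so the four scenarios can be exercised without freezing a build. The parameter list and the ``UnsafeBuildError`` name are assumptions; the real ``assert_production_safe()`` presumably reads the active key and environment itself.

```python
import os
import sys
from typing import Mapping, Optional


class UnsafeBuildError(RuntimeError):
    """Raised when a frozen build would run with dev-only settings."""


def assert_production_safe(active_pubkey: bytes, dev_pubkey: bytes,
                           env: Optional[Mapping[str, str]] = None,
                           frozen: Optional[bool] = None) -> None:
    env = os.environ if env is None else env
    if frozen is None:
        frozen = bool(getattr(sys, "frozen", False))
    if not frozen:
        return  # source checkout or pytest run: no-op
    if env.get("DATATOOLS_DEV_MODE") == "1":
        raise UnsafeBuildError("DATATOOLS_DEV_MODE=1 is set in a shipped build.")
    if active_pubkey == dev_pubkey:
        raise UnsafeBuildError("Shipped build still uses the embedded dev key.")
```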
Tests: 49 license-layer unit tests (was 40); added Ed25519
wrong-key rejection, dev-keypair seed pin, blob v2 prefix, v1
rejection with clear message, and four production-safe scenarios
(no-op in source, fires on DEV_MODE in frozen, fires on dev key in
frozen, passes in frozen with prod pubkey). Total: 2024 → 2033.
Docs (REQUIREMENTS §17a, DEVELOPER licensing recipe, DECISIONS
§9b + decision log) updated with the new threat-model write-up,
key-storage workflow, and tripwire behaviour.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A complete offline licensing layer (no internet at any step):
Core
- src/license/ — schema (License, Tier, FeatureFlag), HMAC crypto,
JSON storage, LicenseManager singleton with activate/renew/
deactivate/issue_trial. Tier-scaffolded so future SKUs can carve
per-tool feature sets without consumer-code edits.
- scripts/generate_license.py — creator-only key generator. Mints a
DTLIC1: blob the buyer pastes into the activation page.
GUI
- New activation form component (src/gui/components/activation.py).
- hide_streamlit_chrome() now inline-renders the activation form when
no valid license is present (every page short-circuits to the form
until activated).
- Sidebar shows tier + days remaining; renewal warning under 30 days.
- New pages/_Activate.py for revisiting the form after activation.
CLI
- src/license_cli.py — activate / renew / status / trial / deactivate
commands. Exempt from the guard.
- src/cli_license_guard.py — drop-in guard call added to every tool
CLI's main(). Lets --help through; respects DATATOOLS_DEV_MODE.
i18n
- New activation.* and license.* keys in en.json + es.json
(page title, form labels, status badges, renewal warnings, error
messages). Pack parity test stays green.
Test infrastructure
- tests/conftest.py autouse fixture sets DATATOOLS_DEV_MODE=1 so the
existing 1916 tests continue to pass.
- isolated_license_path / activated_license_manager /
unactivated_license_manager fixtures for tests that want to drive
the real check.
Tests (+79)
- tests/test_license.py (40): schema, crypto roundtrip, blob
encode/decode, tier→feature mapping, activation flow, name/email
mismatch rejection, tamper detection, expiration, renewal,
dev-mode bypass.
- tests/test_license_cli.py (26): every license_cli command +
subprocess tests confirming every tool CLI refuses to run without
a license, --help always works, DEV_MODE bypasses.
- tests/gui/test_activation.py (13): gate blocks without license,
passes with trial, activation form submission unlocks the gate,
sidebar status, renewal warning, i18n.
Total: 1916 → 1995 tests. All pass under the strict warning filter.
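For reference, the general shape of an HMAC sign/verify pair over canonical JSON, which is what makes tamper detection work in this scheme. SHA-256, the hex encoding and the JSON settings are assumptions; the actual ``src/license`` crypto module is not reproduced here.

```python
import hashlib
import hmac
import json


def sign(payload: dict, secret: bytes) -> str:
    """Return a hex HMAC over the canonical JSON form of ``payload``."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(payload: dict, signature: str, secret: bytes) -> bool:
    """Constant-time check that ``signature`` matches ``payload``."""
    return hmac.compare_digest(sign(payload, secret), signature)
```

Because the same ``secret`` both signs and verifies, any build that can verify a license can also mint one.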
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Introduces ``src/i18n`` with a tiny JSON-backed t() lookup, an in-session
language preference, and a sidebar selector wired through
``hide_streamlit_chrome`` so every page picks up the same picker. Covers
home, tool cards, findings panel, gate, shutdown, and pickup banner
strings. Tests pin pack parity and the farewell-overlay JS escape so
future packs can't silently regress.
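A toy version of the lookup and the pack-parity check. The in-memory ``PACKS`` dict stands in for the JSON pack files, the sample key is borrowed from a later commit purely as sample data, and the fall-back-to-English-then-key behaviour is an assumption rather than something the commit states.

```python
import json
from typing import Dict, Set

# Stand-in for the en / es JSON pack files loaded from disk.
PACKS: Dict[str, Dict[str, str]] = {
    "en": json.loads('{"nav.back_to_home": "← Back to Home"}'),
    "es": json.loads('{"nav.back_to_home": "← Volver al inicio"}'),
}


def t(key: str, lang: str = "en") -> str:
    """Look up ``key`` in ``lang``; fall back to English, then to the key itself."""
    pack = PACKS.get(lang, PACKS["en"])
    return pack.get(key, PACKS["en"].get(key, key))


def parity_gaps(reference: str = "en") -> Dict[str, Set[str]]:
    """Return, per language, the keys that differ from the reference pack."""
    ref = set(PACKS[reference])
    return {lang: ref ^ set(pack) for lang, pack in PACKS.items() if ref ^ set(pack)}
```

A parity test then reduces to asserting that ``parity_gaps()`` is empty.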
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Update the Close page intro, the shutdown overlay, and the toast so
they all read "you can close this window" — clearer for users running
the app in a dedicated browser window rather than a tab.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the data:-URL navigation (blocked by Chrome since v60 for
top-frame navigation) with a direct DOM-append of a full-screen
overlay onto the parent document. Uses z-index 2147483647 so it sits
above Streamlit's connection-error banner when the websocket drops.
Note: still doesn't fully suppress the connection-error banner in
testing — the next iteration will render the overlay through
Streamlit's own page rather than via a component iframe.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move the shutdown control out of the inline sidebar widget and into
its own page (pages/99_Close.py), so it appears in the sidebar nav
alongside the tool pages. An explicit confirm button on the page
prevents accidental nav clicks from killing a live session.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signalling the process with SIGTERM/SIGINT didn't reliably shut Streamlit
down — its tornado/asyncio loop swallowed or deferred the signal, so the
browser saw the websocket drop ("Connection error") while the python
process kept running. Replace the signal with a daemon-thread
``os._exit(0)`` after a short delay so the current rerun can paint the
"shutting down" message before the process is hard-killed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The footer placement was easy to miss (below all tool cards) and only
rendered on the home page. Hook the button into hide_streamlit_chrome()
so every page that hides default chrome — home + all 9 tool pages — gets
the Quit button at the bottom of the sidebar without per-page edits.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The chrome-hiding CSS was removing the Streamlit header wholesale,
which also took the sidebar's expand chevron with it — a collapsed
sidebar became unreopenable. Make the header transparent instead and
explicitly preserve the sidebar collapsed-control.
Also add a Quit button in the app footer that signals the Streamlit
server (SIGTERM, falling back to SIGINT) so closing the GUI returns
the shell prompt cleanly instead of leaving Python hung.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Streamlit's default file_uploader footer reads "Limit 200MB per file —
CSV, TSV, XLSX, XLS" which contradicts the 1 GB efficiency target shipped
in 438bc0f and codified in docs/REQUIREMENTS.md §1.1.
Three changes:
1. .streamlit/config.toml — set [server] maxUploadSize = 1024. Footer
now reads "Limit 1024MB per file".
2. upload_and_analyze_section (home page) — adds an explicit caption
above the uploader stating size limit, supported formats, the four
auto-detected delimiters, and the 13 auto-detected encodings (with
the Review-page override as the safety net).
3. pickup_or_upload (every tool page that falls back to its own
uploader when no home-page upload is present) — same caption,
only rendered when the upload accepts CSV/TSV/XLSX/XLS so JSON
schema / config uploaders aren't decorated.
Test suite: 765 passed, 17 xfailed (no regressions). Home + Review +
Deduplicator pages all serve HTTP 200 under the new config.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two low-risk seam moves to enable selling per-tool subsets without
breaking the existing all-in-one bundle. Behaviour is identical: every
existing import still resolves, the full pytest suite passes, and
every page returns HTTP 200.
1. **Tool registry** (src/gui/tools_registry.py) — replaces the
inline dict-of-dicts in app.py with a Tool dataclass and a TOOLS
list. Adds a tier field ("core" today, "pro" / "enterprise" later)
and tools_for_tier() / tool_by_id() / display_name() helpers. A
per-tool build slices TOOLS at import time without code changes.
2. **components package** (src/gui/components/) — converts the former
single components.py into a package with:
    _legacy.py       original file, unchanged.
    __init__.py      re-exports the legacy surface; existing
                     "from src.gui.components import …" calls
                     continue to work.
    shared.py        hide_streamlit_chrome, pickup_or_upload
                     (every build needs these).
    gate.py          require_normalization_gate (Pro / Suite SKUs).
    findings.py      analyzer-finding widgets (drops out of a
                     standalone-Dedup build).
    dedup_review.py  match-group cards + apply pipeline (drops out
                     of a non-dedup build).
The seam modules are narrow re-exports today. As code migrates out
of _legacy.py into the focused modules, the public import path
stays stable via the shim.
E2E: 765 passed, 17 xfailed (unchanged); home page + all 9 tool pages
+ Review page render HTTP 200; full pipeline (analyze → auto_fix →
apply_decisions → output bytes) round-trips on the kitchen-sink
fixture with zero high-confidence findings remaining post-fix.
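A compact sketch of what a dataclass-based registry with tier filtering might look like. The field names, page paths, the ``bulk_export`` entry and the "higher tiers include lower tiers" rule are all assumptions; only ``Tool``, ``TOOLS``, ``tools_for_tier`` and ``tool_by_id`` are names taken from the commit.

```python
from dataclasses import dataclass
from typing import List, Optional

TIER_ORDER = ("core", "pro", "enterprise")


@dataclass(frozen=True)
class Tool:
    id: str
    name: str
    page: str
    tier: str = "core"


TOOLS: List[Tool] = [
    Tool("text_cleaner", "Text Cleaner", "pages/1_text_cleaner.py"),
    Tool("find_duplicates", "Find Duplicates", "pages/2_find_duplicates.py"),
    # Hypothetical higher-tier entry, present only to exercise the filter.
    Tool("bulk_export", "Bulk Export", "pages/x_bulk_export.py", tier="pro"),
]


def tools_for_tier(tier: str) -> List[Tool]:
    """Return the tools available at ``tier``, including lower tiers."""
    allowed = TIER_ORDER[: TIER_ORDER.index(tier) + 1]
    return [tool for tool in TOOLS if tool.tier in allowed]


def tool_by_id(tool_id: str) -> Optional[Tool]:
    return next((tool for tool in TOOLS if tool.id == tool_id), None)
```

A per-tool build could then slice ``TOOLS`` once at import time and leave every consumer of the helpers unchanged.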
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>