feat(gui): hidden-char-aware preview tables in Text Cleaner

The Text Cleaner had two st.dataframe previews: the initial upload
preview ("Preview: filename") and the post-clean "Cleaned preview"
table. Both rendered cells with the same problem the analyzer findings
panel had before commit 1049c03: the browser collapses whitespace and
invisible characters stay hidden.

components.render_hidden_aware_preview(df, n_rows, caption) renders a
DataFrame as an HTML table where:
- every cell uses visualize_hidden_html(mark_outer_whitespace=True),
  so leading/trailing ASCII spaces appear as per-character "·" badges
- white-space: pre-wrap on every cell preserves internal multi-space
  runs and embedded newlines visually
- headers route through the same visualizer so dirty column names
  (NBSP padding, ZWSP, smart quotes) show their badges too
- NaN cells render as a faint "NaN" placeholder
- the header row is sticky and the body scrolls inside a container
  capped at 26rem, so a 10-row preview doesn't push the rest of the UI
  off screen
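
The helper's implementation is not part of the hunks shown on this page. The following is only a minimal sketch of the behavior listed above, not the project's code. The names visualize_hidden_sketch, badge, build_preview_html, INVISIBLES and the hc-badge class are hypothetical stand-ins, and the sketch takes plain headers and rows instead of a DataFrame so that it runs with the standard library alone.

```python
import html
import math

# Hypothetical badge labels; the project's real visualizer may use others.
INVISIBLES = {
    "\u00a0": "NBSP",
    "\u200b": "ZWSP",
    "\u200c": "ZWNJ",
    "\u200d": "ZWJ",
    "\ufeff": "BOM",
}


def badge(label):
    """Wrap a label in a span that CSS can style as a small badge."""
    return f'<span class="hc-badge">{html.escape(label)}</span>'


def visualize_hidden_sketch(text, mark_outer_whitespace=True):
    """Escape text for HTML and replace invisible characters with badges."""
    lead = len(text) - len(text.lstrip(" "))
    trail = len(text) - len(text.rstrip(" "))
    out = []
    for i, ch in enumerate(text):
        outer = i < lead or i >= len(text) - trail
        if ch in INVISIBLES:
            out.append(badge(INVISIBLES[ch]))
        elif ch == " " and outer and mark_outer_whitespace:
            # One badge per leading/trailing ASCII space.
            out.append(badge("·"))
        else:
            out.append(html.escape(ch))
    return "".join(out)


def is_missing(value):
    """Treat None and float NaN as missing cells."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def build_preview_html(headers, rows, n_rows=10):
    """Return an HTML table for the first n_rows rows."""
    cell_style = "white-space: pre-wrap;"  # keep inner space runs and newlines
    head = "".join(
        f'<th style="position: sticky; top: 0; {cell_style}">'
        f"{visualize_hidden_sketch(str(h))}</th>"
        for h in headers
    )
    body = []
    for row in rows[:n_rows]:
        cells = []
        for value in row:
            if is_missing(value):
                inner = '<span style="opacity: 0.4;">NaN</span>'
            else:
                inner = visualize_hidden_sketch(str(value))
            cells.append(f'<td style="{cell_style}">{inner}</td>')
        body.append("<tr>" + "".join(cells) + "</tr>")
    # The capped, scrollable wrapper keeps the preview from growing unbounded.
    return (
        '<div style="max-height: 26rem; overflow: auto;">'
        f"<table><thead><tr>{head}</tr></thead>"
        f"<tbody>{''.join(body)}</tbody></table></div>"
    )
```

In a Streamlit page, a string like this could be passed to st.markdown(..., unsafe_allow_html=True). Whether the real helper does that is an assumption.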

2_Text_Cleaner.py wires it into both previews:
- The upload preview gains its own "Show hidden characters in preview"
  toggle (default on).
- The cleaned preview reuses the existing show_hidden toggle that
  already governs the Examples changes table, so one switch controls
  the whole results section.

With either toggle off, the corresponding preview falls back to the
original st.dataframe view.
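
Both previews use the same toggle-then-fallback branch. As an illustration only, that branch could be factored into one function. show_preview is a hypothetical name that does not exist in this commit, and the two renderers are passed in as arguments so the sketch does not depend on Streamlit.

```python
def show_preview(df, show_hidden, render_rich, render_plain, n_rows=10):
    """Render df with the hidden-aware table or the plain fallback."""
    if show_hidden:
        # The rich renderer slices the frame itself via n_rows.
        render_rich(df, n_rows=n_rows)
    else:
        # The plain renderer receives the already-sliced frame.
        render_plain(df.head(n_rows), use_container_width=True)
```

On the page this would be called as show_preview(df, preview_show_hidden, render_hidden_aware_preview, st.dataframe) for the upload preview, and with result.cleaned_df and show_hidden for the cleaned preview.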

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -14,7 +14,11 @@ _project_root = Path(__file__).resolve().parent.parent.parent.parent
 if str(_project_root) not in sys.path:
     sys.path.insert(0, str(_project_root))
 
-from src.gui.components import hide_streamlit_chrome, pickup_or_upload
+from src.gui.components import (
+    hide_streamlit_chrome,
+    pickup_or_upload,
+    render_hidden_aware_preview,
+)
 from src.core.text_clean import (
     PRESETS,
     CleanOptions,
@@ -81,7 +85,16 @@ except Exception as e:
 
 st.subheader(f"Preview: {uploaded.name}")
 st.caption(f"{len(df)} rows, {len(df.columns)} columns")
-st.dataframe(df.head(10), use_container_width=True)
+preview_show_hidden = st.toggle(
+    "Show hidden characters in preview",
+    value=True,
+    help="Highlights NBSP, zero-width chars, smart quotes, and leading/trailing whitespace.",
+    key="textclean_preview_show_hidden",
+)
+if preview_show_hidden:
+    render_hidden_aware_preview(df, n_rows=10)
+else:
+    st.dataframe(df.head(10), use_container_width=True)
 
 st.divider()
 
@@ -257,7 +270,12 @@ if result.cells_changed:
 st.dataframe(examples, use_container_width=True, hide_index=True)
 
 st.markdown("**Cleaned preview (first 10 rows)**")
-st.dataframe(result.cleaned_df.head(10), use_container_width=True)
+# Reuse the same toggle the Examples table uses so the user controls both
+# the changes audit and the cleaned preview with one switch.
+if show_hidden:
+    render_hidden_aware_preview(result.cleaned_df, n_rows=10)
+else:
+    st.dataframe(result.cleaned_df.head(10), use_container_width=True)
 
 # ---------------------------------------------------------------------------
 # Downloads