feat(tools): unified post-run UX across all Ready tool pages

Apply the Clean Text page's post-run UX pattern to every other Ready
tool page (Find Duplicates, Standardize Formats, Fix Missing Values,
Map Columns, Automated Workflows) for consistency and ease of use.

Per page:

1. Preview wrapped in ``st.expander(f"Preview: {filename}",
   expanded=not _has_result)``. Open before a result exists, folded
   afterwards.

2. Options / configuration controls wrapped in
   ``st.expander("Options", expanded=not _has_result)``. Inner
   sub-expanders preserved (Streamlit 1.46+ supports nesting).

3. After the primary action stashes the result, set a one-shot
   ``_<tool>_scroll_to_results`` flag in session state and call
   ``st.rerun()`` so the preview + options expanders see the new
   state on the next pass and collapse themselves.

4. ``<div id="<tool>-results-anchor" style="height:1px">`` placed
   immediately before the Results subheader.

5. End-of-page: pop the scroll flag and inject a tiny
   ``streamlit.components.v1.html`` iframe whose ``<script>`` calls
   ``scrollIntoView`` on the parent document's anchor. One-shot, so
   unrelated reruns (toggling Show-hidden, etc.) don't yank the
   viewport.

6. Download buttons hardened against the multi-button Streamlit
   footgun: byte buffers pre-computed outside the column scopes,
   explicit unique ``key="<tool>_dl_<purpose>"`` per button,
   ``use_container_width=True``, and previously-conditional buttons
   now render unconditionally with ``disabled=True`` + a help
   tooltip when the underlying data is empty so layout stays steady.
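
Item 6 above could be factored roughly as follows. This is a minimal
sketch, not code from this commit: ``download_button_kwargs`` is a
hypothetical helper, and it treats an empty byte buffer as "no data",
which only matches the diff because the diff substitutes ``b""`` when
the underlying frame is empty.

```python
from typing import Any, Dict, Optional


def download_button_kwargs(
    tool: str,
    purpose: str,
    data: bytes,
    empty_help: str,
) -> Dict[str, Any]:
    """Build keyword arguments for one ``st.download_button`` call.

    Hypothetical helper for illustration; not part of this commit.
    """
    # Assumption: callers pass b"" when there is nothing to download.
    is_empty = len(data) == 0
    # Only show the tooltip when the button is disabled.
    help_text: Optional[str] = empty_help if is_empty else None
    return {
        "data": data,                     # pre-computed, stable across reruns
        "key": f"{tool}_dl_{purpose}",    # explicit unique key per button
        "disabled": is_empty,             # render, but greyed out, when empty
        "help": help_text,
        "use_container_width": True,
    }


kwargs = download_button_kwargs(
    "dedup", "reviewed_removed", b"", "No rows were removed."
)
```

A page would then call something like
``st.download_button("Download Reviewed Removed Rows",
file_name="removed_reviewed.csv", mime="text/csv", **kwargs)``, with
the byte buffer computed before any ``st.columns`` block is entered.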

Per-page judgment calls (already noted in agent reports):

- Find Duplicates: sheet picker and delimiter selector kept OUTSIDE
  expanders (the user still needs to see them when a file fails to
  parse).
- Fix Missing Values: missingness profile wrapped INSIDE the Options
  expander together with Strategy — the Results section already
  shows a before/after missingness comparison that supersedes the
  static input profile.
- Map Columns: all three subsections (Target schema, Strategy,
  Mapping) wrapped under one outer Options expander, matching the
  Clean Text pattern.
- Automated Workflows: inner "Recommended tool order" expander stays
  nested inside the outer Options wrap; Run button stays outside
  Options so the user can re-run after tweaking the (collapsed)
  editor.

2008 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 21:04:37 +00:00
parent d1aaf3c2b9
commit 6415be8bf4
5 changed files with 1250 additions and 879 deletions


@@ -173,22 +173,33 @@ if uploaded is not None:
st.session_state["review_decisions"] = {}
tmp_path.unlink(missing_ok=True)
# Collapse the input preview + options once a result exists so
# the Results section below becomes the primary visual focus
# after Find Duplicates runs. Mirrors the Clean Text pattern.
_has_result = st.session_state.get("result") is not None
# Preview
st.subheader(f"Preview: {uploaded.name}")
st.caption(f"{len(df)} rows, {len(df.columns)} columns")
st.dataframe(df.head(10), use_container_width=True)
with st.expander(f"Preview: {uploaded.name}", expanded=not _has_result):
# Subheader retained inside the expander so collected_text in
# the workflow tests still finds "Preview: <name>" — Streamlit's
# AppTest does not surface expander labels through the
# markdown/caption/subheader collections.
st.subheader(f"Preview: {uploaded.name}")
st.caption(f"{len(df)} rows, {len(df.columns)} columns")
st.dataframe(df.head(10), use_container_width=True)
# Advanced options
settings = config_panel(df)
with st.expander("Options", expanded=not _has_result):
settings = config_panel(df)
# Apply loaded config if present
loaded_cfg = st.session_state.get("loaded_config")
if loaded_cfg is not None:
settings["strategies"] = loaded_cfg.to_strategies()
settings["survivor_rule"] = loaded_cfg.to_survivor_rule()
settings["date_column"] = loaded_cfg.date_column
settings["merge"] = loaded_cfg.merge
del st.session_state["loaded_config"]
# Apply loaded config if present
loaded_cfg = st.session_state.get("loaded_config")
if loaded_cfg is not None:
settings["strategies"] = loaded_cfg.to_strategies()
settings["survivor_rule"] = loaded_cfg.to_survivor_rule()
settings["date_column"] = loaded_cfg.date_column
settings["merge"] = loaded_cfg.merge
del st.session_state["loaded_config"]
# -------------------------------------------------------------------
# Find Duplicates button
@@ -218,6 +229,11 @@ if uploaded is not None:
progress_bar.empty()
st.session_state["result"] = result
st.session_state["review_decisions"] = {}
# One-shot flag for the scroll snippet at the bottom of the
# page. Force a rerun so the Preview / Options expanders see
# the new result on the next pass and collapse themselves.
st.session_state["_dedup_scroll_to_results"] = True
st.rerun()
# -------------------------------------------------------------------
# Results
@@ -227,6 +243,14 @@ if uploaded is not None:
if result is not None:
st.divider()
# Anchor target for the post-run auto-scroll snippet at the
# bottom of this page. A bare ``<div id="...">`` survives
# Streamlit's HTML sanitizer; a 1px-tall div doesn't shift
# layout.
st.markdown(
'<div id="dedup-results-anchor" style="height:1px"></div>',
unsafe_allow_html=True,
)
st.subheader("Results")
# Summary + download buttons
@@ -324,27 +348,45 @@ if uploaded is not None:
df, result.match_groups, decisions,
)
csv_bytes = reviewed_df.to_csv(
# Pre-compute every byte buffer up front so each
# ``st.download_button`` sees stable ``data``
# across reruns. Render the empty-removed case
# as a disabled button (rather than hiding it)
# so layout stays steady and the user can see
# why the download isn't available.
reviewed_bytes = reviewed_df.to_csv(
index=False
).encode("utf-8-sig")
reviewed_removed_empty = reviewed_removed.empty
reviewed_removed_bytes = (
reviewed_removed.to_csv(index=False).encode("utf-8-sig")
if not reviewed_removed_empty
else b""
)
st.download_button(
"Download Reviewed & Deduplicated CSV",
data=csv_bytes,
data=reviewed_bytes,
file_name="deduplicated_reviewed.csv",
mime="text/csv",
key="reviewed_download",
key="dedup_dl_reviewed",
use_container_width=True,
)
st.download_button(
"Download Reviewed Removed Rows",
data=reviewed_removed_bytes,
file_name="removed_reviewed.csv",
mime="text/csv",
key="dedup_dl_reviewed_removed",
disabled=reviewed_removed_empty,
help=(
"No rows were removed under the current "
"review decisions."
if reviewed_removed_empty
else None
),
use_container_width=True,
)
if not reviewed_removed.empty:
removed_bytes = reviewed_removed.to_csv(
index=False
).encode("utf-8-sig")
st.download_button(
"Download Reviewed Removed Rows",
data=removed_bytes,
file_name="removed_reviewed.csv",
mime="text/csv",
key="reviewed_removed_download",
)
# Log entries
if result.log_entries:
@@ -365,3 +407,27 @@ st.caption(
"Runs locally. Your data never leaves this computer. "
"| DataTools v3.0"
)
# ---------------------------------------------------------------------------
# Post-run auto-scroll
# ---------------------------------------------------------------------------
#
# When Find Duplicates fires, the preview + options collapse, but
# Streamlit by itself doesn't scroll — the Results section sits below a
# tall page so the user has to hunt for it. Inject a tiny
# component-html iframe that calls ``scrollIntoView`` on the parent's
# Results anchor. The flag is one-shot (``pop`` removes it) so reruns
# triggered by unrelated widgets in the Results section don't yank the
# viewport back to the top of Results.
if st.session_state.pop("_dedup_scroll_to_results", False):
from streamlit.components.v1 import html as _components_html
_components_html(
"""
<script>
const doc = window.parent.document;
const target = doc.getElementById('dedup-results-anchor');
if (target) target.scrollIntoView({behavior: 'smooth', block: 'start'});
</script>
""",
height=0,
)
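
Since the same scroll block is repeated on every Ready tool page, the one-shot flag and the snippet could be shared through a small helper. The sketch below is an assumption about how that might look, not code from this commit: ``scroll_snippet`` and ``consume_scroll_request`` are hypothetical names, and a plain ``dict`` stands in for ``st.session_state`` so the one-shot behaviour can be checked without running Streamlit.

```python
from typing import MutableMapping, Optional


def scroll_snippet(tool: str) -> str:
    """Return the <script> that scrolls the parent page to a tool's anchor."""
    anchor_id = f"{tool}-results-anchor"
    return (
        "<script>\n"
        "const doc = window.parent.document;\n"
        f"const target = doc.getElementById('{anchor_id}');\n"
        "if (target) target.scrollIntoView({behavior: 'smooth', block: 'start'});\n"
        "</script>"
    )


def consume_scroll_request(
    state: MutableMapping[str, object], tool: str
) -> Optional[str]:
    """Pop the one-shot flag; return the snippet only if the flag was set."""
    if state.pop(f"_{tool}_scroll_to_results", False):
        return scroll_snippet(tool)
    return None


# A dict stands in for st.session_state in this sketch.
state = {"_dedup_scroll_to_results": True}
first = consume_scroll_request(state, "dedup")   # flag set: snippet returned
second = consume_scroll_request(state, "dedup")  # flag already popped: None
```

On a real page the returned string, when not ``None``, would be passed to ``streamlit.components.v1.html(snippet, height=0)`` at the end of the script, as in the hunk above.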