feat(tools): unified post-run UX across all Ready tool pages

Apply the Clean Text page's post-run UX pattern to every other Ready
tool page (Find Duplicates, Standardize Formats, Fix Missing Values,
Map Columns, Automated Workflows) for consistency and ease of use.

Per page:

1. Preview wrapped in ``st.expander(f"Preview: {filename}",
   expanded=not _has_result)``. Open before a result exists, folded
   afterwards.
2. Options / configuration controls wrapped in
   ``st.expander("Options", expanded=not _has_result)``. Inner
   sub-expanders preserved (Streamlit 1.36+ supports nesting).
3. After the primary action stashes the result, set a one-shot
   ``_<tool>_scroll_to_results`` flag in session state and call
   ``st.rerun()`` so the preview + options expanders see the new
   state on the next pass and collapse themselves.
4. ``<div id="<tool>-results-anchor" style="height:1px">`` placed
   immediately before the Results subheader.
5. End-of-page: pop the scroll flag and inject a tiny
   ``streamlit.components.v1.html`` iframe whose ``<script>`` calls
   ``scrollIntoView`` on the parent document's anchor. One-shot, so
   unrelated reruns (toggling Show-hidden, etc.) don't yank the
   viewport.
6. Download buttons hardened against the multi-button Streamlit
   footgun: byte buffers pre-computed outside the column scopes,
   explicit unique ``key="<tool>_dl_<purpose>"`` per button,
   ``use_container_width=True``, and previously-conditional buttons
   now render unconditionally with ``disabled=True`` + a help
   tooltip when the underlying data is empty so layout stays steady.
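
The per-page steps above can be sketched without Streamlit. This is a
minimal sketch, not the code in this commit: a plain dict stands in for
``st.session_state``, every helper name (``expanders_open``,
``stash_result``, ``consume_scroll_flag``, ``scroll_snippet``,
``download_kwargs``) is hypothetical, and the ``<tool>_result`` key is
assumed to follow the ``missing_result`` key used on the Fix Missing
Values page.

```python
# Sketch only: a plain dict stands in for st.session_state so the logic
# runs without Streamlit. All helper names here are hypothetical.


def expanders_open(state: dict, tool: str) -> bool:
    """Value for ``expanded=``: open before a result exists, folded after."""
    return state.get(f"{tool}_result") is None


def stash_result(state: dict, tool: str, result: object) -> None:
    """Store the result and arm the one-shot scroll flag.

    On a real page, ``st.rerun()`` would be called right after this.
    """
    state[f"{tool}_result"] = result
    state[f"_{tool}_scroll_to_results"] = True


def consume_scroll_flag(state: dict, tool: str) -> bool:
    """Pop the flag so only the pass right after a run scrolls."""
    return state.pop(f"_{tool}_scroll_to_results", False)


def scroll_snippet(tool: str) -> str:
    """HTML body for the end-of-page iframe that scrolls to the anchor."""
    return (
        "<script>\n"
        "const doc = window.parent.document;\n"
        f"const target = doc.getElementById('{tool}-results-anchor');\n"
        "if (target) target.scrollIntoView({behavior: 'smooth', block: 'start'});\n"
        "</script>"
    )


def download_kwargs(tool: str, purpose: str, data: bytes, empty_help: str) -> dict:
    """Keyword arguments shared by every download button on a page."""
    is_empty = len(data) == 0
    return {
        "data": data,
        "key": f"{tool}_dl_{purpose}",
        "disabled": is_empty,
        "help": empty_help if is_empty else None,
        "use_container_width": True,
    }


if __name__ == "__main__":
    state: dict = {}
    print(expanders_open(state, "missing"))       # True: nothing stashed yet
    stash_result(state, "missing", {"rows": 3})
    print(expanders_open(state, "missing"))       # False: result exists
    print(consume_scroll_flag(state, "missing"))  # True on the first pass
    print(consume_scroll_flag(state, "missing"))  # False afterwards
```

On a real page the string from ``scroll_snippet`` would be passed to
``streamlit.components.v1.html(..., height=0)`` only when
``consume_scroll_flag`` returns True, and the dict from
``download_kwargs`` would be splatted into ``st.download_button``
alongside the label, ``file_name`` and ``mime``.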

Per-page judgment calls (already noted in agent reports):

- Find Duplicates: sheet picker and delimiter selector kept OUTSIDE
  expanders (the user still needs to see them when a file fails to
  parse).
- Fix Missing Values: missingness profile wrapped INSIDE the Options
  expander together with Strategy, because the Results section already
  shows a before/after missingness comparison that supersedes the
  static input profile.
- Map Columns: all three subsections (Target schema, Strategy,
  Mapping) wrapped under one outer Options expander, matching the
  Text Cleaner pattern.
- Automated Workflows: inner "Recommended tool order" expander stays
  nested inside the outer Options wrap; Run button stays outside
  Options so the user can re-run after tweaking the (collapsed)
  editor.

2008 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -95,175 +95,186 @@ except Exception as e:
     )
     st.stop()

-st.subheader(f"Preview: {uploaded.name}")
-st.caption(f"{len(df)} rows, {len(df.columns)} columns")
-st.dataframe(df.head(10), use_container_width=True)
+# Collapse the input preview + options once the user has clicked
+# Handle Missing Values so the Results section below is the primary
+# visual focus. The user can re-expand to re-inspect the source rows
+# or tweak strategy and rerun.
+_has_result = st.session_state.get("missing_result") is not None
+
+with st.expander(f"Preview: {uploaded.name}", expanded=not _has_result):
+    st.caption(f"{len(df)} rows, {len(df.columns)} columns")
+    st.dataframe(df.head(10), use_container_width=True)

 st.divider()

 # ---------------------------------------------------------------------------
-# Initial profile (read-only)
+# Options (Missingness profile + Strategy)
 # ---------------------------------------------------------------------------
+#
+# Wrapped in an outer expander whose default state mirrors the preview
+# expander above: open before a result exists, folded once the user has
+# clicked Handle Missing Values. The Missingness profile lives inside
+# this expander too — after a run the Results section shows a richer
+# before-vs-after comparison that supersedes the static input profile,
+# so keeping it tucked away with the controls cleanly pushes Results
+# to the top of the visible area.

-st.subheader("Missingness profile")
+with st.expander("Options", expanded=not _has_result):
+    st.subheader("Missingness profile")

-initial_profile = profile_missing(df, MissingOptions())
-prof_df = initial_profile.to_dataframe()
+    initial_profile = profile_missing(df, MissingOptions())
+    prof_df = initial_profile.to_dataframe()

-m1, m2, m3, m4 = st.columns(4)
-m1.metric("Rows", initial_profile.rows_total)
-m2.metric("Cells missing", initial_profile.cells_missing)
-m3.metric("% cells missing", f"{initial_profile.cells_missing_pct:.1f}%")
-m4.metric("Complete rows", initial_profile.rows_complete)
+    m1, m2, m3, m4 = st.columns(4)
+    m1.metric("Rows", initial_profile.rows_total)
+    m2.metric("Cells missing", initial_profile.cells_missing)
+    m3.metric("% cells missing", f"{initial_profile.cells_missing_pct:.1f}%")
+    m4.metric("Complete rows", initial_profile.rows_complete)

-st.dataframe(prof_df, use_container_width=True, hide_index=True)
+    st.dataframe(prof_df, use_container_width=True, hide_index=True)

-if initial_profile.cells_missing == 0:
-    st.success("No missing values or disguised nulls detected. Nothing to handle.")
+    if initial_profile.cells_missing == 0:
+        st.success("No missing values or disguised nulls detected. Nothing to handle.")

-st.divider()
+    st.divider()

-# ---------------------------------------------------------------------------
-# Options
-# ---------------------------------------------------------------------------
+    st.subheader("Strategy")

-st.subheader("Strategy")
+    preset_label = st.radio(
+        "Preset",
+        [
+            "detect-only (standardize sentinels to NaN, no fill or drop)",
+            "safe-fill (numeric → median, categorical → mode)",
+            "drop-incomplete (drop any row with missing)",
+        ],
+        index=0,
+        help=(
+            "detect-only: replace 'N/A', '-', 'NULL', etc. with real NaN, then stop. "
+            "safe-fill: also fill — numeric columns with median, others with mode. "
+            "drop-incomplete: also drop every row that has any missing cell."
+        ),
+    )
+    preset_key = preset_label.split(" ", 1)[0]
+    options = MissingOptions.from_preset(preset_key)

-preset_label = st.radio(
-    "Preset",
-    [
-        "detect-only (standardize sentinels to NaN, no fill or drop)",
-        "safe-fill (numeric → median, categorical → mode)",
-        "drop-incomplete (drop any row with missing)",
-    ],
-    index=0,
-    help=(
-        "detect-only: replace 'N/A', '-', 'NULL', etc. with real NaN, then stop. "
-        "safe-fill: also fill — numeric columns with median, others with mode. "
-        "drop-incomplete: also drop every row that has any missing cell."
-    ),
-)
-preset_key = preset_label.split(" ", 1)[0]
-options = MissingOptions.from_preset(preset_key)
+    with st.expander("Advanced options"):
+        col_a, col_b = st.columns(2)

-with st.expander("Advanced options"):
-    col_a, col_b = st.columns(2)
-
-    with col_a:
-        st.markdown("**Detection**")
-        options.standardize_sentinels = st.checkbox(
-            "Standardize disguised nulls to NaN",
-            value=options.standardize_sentinels,
-            help="Replace 'N/A', '-', 'NULL', whitespace-only cells, etc. with real NaN.",
-        )
-        sentinels_text = st.text_input(
-            "Sentinel values (comma-separated)",
-            value=", ".join(options.sentinels),
-            disabled=not options.standardize_sentinels,
-            help="Matched case-insensitively after stripping whitespace.",
-        )
-        options.sentinels = [
-            s.strip() for s in sentinels_text.split(",") if s.strip()
-        ]
-
-    with col_b:
-        st.markdown("**Strategy override**")
-        strat_options = [
-            "(use preset)",
-            "none", "drop_row", "drop_col", "drop_both",
-            "mean", "median", "mode", "constant",
-            "ffill", "bfill", "interpolate",
-        ]
-        strat_choice = st.selectbox(
-            "Global strategy",
-            strat_options,
-            index=0,
-            help=(
-                "drop_row / drop_col use the thresholds below. "
-                "mean / median / interpolate are numeric only — non-numeric "
-                "columns fall back to the categorical strategy."
-            ),
-        )
-        if strat_choice != "(use preset)":
-            options.strategy = strat_choice  # type: ignore[assignment]
-
-        cat_strat = st.selectbox(
-            "Categorical fallback (for non-numeric columns)",
-            ["mode", "constant", "ffill", "bfill", "none"],
-            index=0,
-        )
-        options.categorical_strategy = cat_strat  # type: ignore[assignment]
-
-        if options.strategy == "constant" or cat_strat == "constant":
-            fill_val = st.text_input(
-                "Constant fill value",
-                value="",
-                help="Used when strategy = constant. Leave blank to fill with empty string.",
+        with col_a:
+            st.markdown("**Detection**")
+            options.standardize_sentinels = st.checkbox(
+                "Standardize disguised nulls to NaN",
+                value=options.standardize_sentinels,
+                help="Replace 'N/A', '-', 'NULL', whitespace-only cells, etc. with real NaN.",
             )
-            options.fill_value = fill_val
+            sentinels_text = st.text_input(
+                "Sentinel values (comma-separated)",
+                value=", ".join(options.sentinels),
+                disabled=not options.standardize_sentinels,
+                help="Matched case-insensitively after stripping whitespace.",
+            )
+            options.sentinels = [
+                s.strip() for s in sentinels_text.split(",") if s.strip()
+            ]

-    st.markdown("**Drop thresholds**")
-    col_c, col_d = st.columns(2)
-    with col_c:
-        options.row_drop_threshold = st.slider(
-            "Row drop threshold (drop rows with ≥ this fraction missing across selected cols)",
-            0.0, 1.0, options.row_drop_threshold, 0.05,
-        )
-    with col_d:
-        options.col_drop_threshold = st.slider(
-            "Column drop threshold (drop columns with ≥ this fraction missing)",
-            0.0, 1.0, options.col_drop_threshold, 0.05,
-        )
-
-    st.markdown("**Scope**")
-    selected_cols = st.multiselect(
-        "Columns to handle (default: all)",
-        options=list(df.columns),
-        default=list(df.columns),
-    )
-    skip_cols = st.multiselect(
-        "Columns to skip",
-        options=list(df.columns),
-        default=[],
-    )
-    options.columns = selected_cols if selected_cols else None
-    options.skip_columns = list(skip_cols)
-
-    st.markdown("**Per-column strategy overrides** (optional)")
-    st.caption(
-        "Set a different strategy for specific columns. Leave any row blank to "
-        "use the global strategy."
-    )
-    per_col_overrides: dict[str, str] = {}
-    only_missing_cols = [
-        r.column for r in initial_profile.columns if r.has_missing
-    ]
-    if only_missing_cols:
-        edit_df = pd.DataFrame({
-            "column": only_missing_cols,
-            "strategy": ["" for _ in only_missing_cols],
-        })
-        edited = st.data_editor(
-            edit_df,
-            use_container_width=True,
-            hide_index=True,
-            column_config={
-                "column": st.column_config.TextColumn("Column", disabled=True),
-                "strategy": st.column_config.SelectboxColumn(
-                    "Override",
-                    options=[
-                        "", "drop_row", "drop_col",
-                        "mean", "median", "mode", "constant",
-                        "ffill", "bfill", "interpolate",
-                    ],
+        with col_b:
+            st.markdown("**Strategy override**")
+            strat_options = [
+                "(use preset)",
+                "none", "drop_row", "drop_col", "drop_both",
+                "mean", "median", "mode", "constant",
+                "ffill", "bfill", "interpolate",
+            ]
+            strat_choice = st.selectbox(
+                "Global strategy",
+                strat_options,
+                index=0,
+                help=(
+                    "drop_row / drop_col use the thresholds below. "
+                    "mean / median / interpolate are numeric only — non-numeric "
+                    "columns fall back to the categorical strategy."
                 ),
-            },
-            key="missing_per_col_editor",
+            )
+            if strat_choice != "(use preset)":
+                options.strategy = strat_choice  # type: ignore[assignment]
+
+            cat_strat = st.selectbox(
+                "Categorical fallback (for non-numeric columns)",
+                ["mode", "constant", "ffill", "bfill", "none"],
+                index=0,
+            )
+            options.categorical_strategy = cat_strat  # type: ignore[assignment]
+
+            if options.strategy == "constant" or cat_strat == "constant":
+                fill_val = st.text_input(
+                    "Constant fill value",
+                    value="",
+                    help="Used when strategy = constant. Leave blank to fill with empty string.",
+                )
+                options.fill_value = fill_val
+
+        st.markdown("**Drop thresholds**")
+        col_c, col_d = st.columns(2)
+        with col_c:
+            options.row_drop_threshold = st.slider(
+                "Row drop threshold (drop rows with ≥ this fraction missing across selected cols)",
+                0.0, 1.0, options.row_drop_threshold, 0.05,
+            )
+        with col_d:
+            options.col_drop_threshold = st.slider(
+                "Column drop threshold (drop columns with ≥ this fraction missing)",
+                0.0, 1.0, options.col_drop_threshold, 0.05,
+            )
+
+        st.markdown("**Scope**")
+        selected_cols = st.multiselect(
+            "Columns to handle (default: all)",
+            options=list(df.columns),
+            default=list(df.columns),
         )
-        for _, row in edited.iterrows():
-            if row["strategy"]:
-                per_col_overrides[row["column"]] = row["strategy"]
-    options.column_strategies = per_col_overrides  # type: ignore[assignment]
+        skip_cols = st.multiselect(
+            "Columns to skip",
+            options=list(df.columns),
+            default=[],
+        )
+        options.columns = selected_cols if selected_cols else None
+        options.skip_columns = list(skip_cols)
+
+        st.markdown("**Per-column strategy overrides** (optional)")
+        st.caption(
+            "Set a different strategy for specific columns. Leave any row blank to "
+            "use the global strategy."
+        )
+        per_col_overrides: dict[str, str] = {}
+        only_missing_cols = [
+            r.column for r in initial_profile.columns if r.has_missing
+        ]
+        if only_missing_cols:
+            edit_df = pd.DataFrame({
+                "column": only_missing_cols,
+                "strategy": ["" for _ in only_missing_cols],
+            })
+            edited = st.data_editor(
+                edit_df,
+                use_container_width=True,
+                hide_index=True,
+                column_config={
+                    "column": st.column_config.TextColumn("Column", disabled=True),
+                    "strategy": st.column_config.SelectboxColumn(
+                        "Override",
+                        options=[
+                            "", "drop_row", "drop_col",
+                            "mean", "median", "mode", "constant",
+                            "ffill", "bfill", "interpolate",
+                        ],
+                    ),
+                },
+                key="missing_per_col_editor",
+            )
+            for _, row in edited.iterrows():
+                if row["strategy"]:
+                    per_col_overrides[row["column"]] = row["strategy"]
+        options.column_strategies = per_col_overrides  # type: ignore[assignment]

 # ---------------------------------------------------------------------------
 # Run
@@ -282,6 +293,14 @@ if st.button("Handle Missing Values", type="primary", use_container_width=True):
     st.session_state["missing_result"] = result
     st.session_state["missing_input_name"] = uploaded.name
     st.session_state["missing_options"] = options.to_dict()
+    # One-shot flag picked up on the next pass to scroll the parent
+    # document to the Results anchor (see scroll snippet below).
+    st.session_state["_missing_scroll_to_results"] = True
+    # Force a second rerun so the preview and options expanders see
+    # the new result on the NEXT script pass and collapse themselves.
+    # Without this they stay expanded until the user touches any
+    # other widget.
+    st.rerun()

 result = st.session_state.get("missing_result")
 if result is None:
@@ -292,6 +311,16 @@ if result is None:
 # Results
 # ---------------------------------------------------------------------------

+# Anchor target for the auto-scroll snippet at the end of this block.
+# A bare ``<div id="...">`` survives Streamlit's HTML sanitizer (only
+# ``<script>`` is stripped), and a 1px-tall div doesn't visually shift
+# anything. Placed before the subheader so the scrolled-to viewport
+# starts a few pixels above the section heading rather than below it.
+st.markdown(
+    '<div id="missing-results-anchor" style="height:1px"></div>',
+    unsafe_allow_html=True,
+)
+
 st.subheader("Results")

 m1, m2, m3, m4 = st.columns(4)
@@ -334,38 +363,85 @@ st.dataframe(result.handled_df.head(10), use_container_width=True)
 # ---------------------------------------------------------------------------
 # Downloads
 # ---------------------------------------------------------------------------
+#
+# All three byte buffers are prepared up front (outside the columns) so
+# each ``st.download_button`` sees stable ``data`` across reruns and an
+# explicit ``key`` — without those, Streamlit auto-derived widget IDs
+# can collide for multiple download_buttons in adjacent columns and
+# only the first one actually fires on click. The empty-changes case
+# now renders a disabled button (rather than vanishing) so the layout
+# stays steady and the user understands why nothing's available.

 st.divider()
 stem = Path(st.session_state.get("missing_input_name", "input")).stem

+handled_bytes = result.handled_df.to_csv(index=False).encode("utf-8-sig")
+changes_bytes = (
+    result.changes.to_csv(index=False).encode("utf-8-sig")
+    if not result.changes.empty
+    else b""
+)
+config_bytes = json.dumps(
+    st.session_state.get("missing_options", {}), indent=2, default=str,
+).encode("utf-8")
+
 dl_a, dl_b, dl_c = st.columns(3)
 with dl_a:
-    handled_bytes = result.handled_df.to_csv(index=False).encode("utf-8-sig")
     st.download_button(
         "Download handled CSV",
         data=handled_bytes,
         file_name=f"{stem}_missing.csv",
         mime="text/csv",
+        key="missing_dl_handled",
+        use_container_width=True,
     )
 with dl_b:
-    if not result.changes.empty:
-        changes_bytes = result.changes.to_csv(index=False).encode("utf-8-sig")
-        st.download_button(
-            "Download changes audit",
-            data=changes_bytes,
-            file_name=f"{stem}_missing_changes.csv",
-            mime="text/csv",
-        )
+    st.download_button(
+        "Download changes audit",
+        data=changes_bytes,
+        file_name=f"{stem}_missing_changes.csv",
+        mime="text/csv",
+        key="missing_dl_changes",
+        disabled=result.changes.empty,
+        help="No changes to audit." if result.changes.empty else None,
+        use_container_width=True,
+    )
 with dl_c:
-    config_bytes = json.dumps(
-        st.session_state.get("missing_options", {}), indent=2, default=str,
-    ).encode("utf-8")
     st.download_button(
         "Download config JSON",
         data=config_bytes,
         file_name="missing_config.json",
         mime="application/json",
+        key="missing_dl_config",
+        use_container_width=True,
     )

 st.divider()
 st.caption("Runs locally. Your data never leaves this computer. | DataTools v3.0")
+
+# ---------------------------------------------------------------------------
+# Post-run auto-scroll
+# ---------------------------------------------------------------------------
+#
+# When the user clicks Handle Missing Values, the preview + options
+# collapse but Streamlit by itself doesn't scroll — the Results section
+# is at the bottom of a tall script so the user has to find it. Inject
+# a tiny component-html iframe that calls ``scrollIntoView`` on the
+# parent's Results anchor. Streamlit's main page is same-origin with
+# component iframes so ``window.parent.document`` access is allowed.
+#
+# The flag is one-shot (``pop`` removes it) so re-renders triggered by
+# unrelated widgets in the Results section don't yank the viewport
+# back to the top of Results.
+if st.session_state.pop("_missing_scroll_to_results", False):
+    from streamlit.components.v1 import html as _components_html
+    _components_html(
+        """
+        <script>
+        const doc = window.parent.document;
+        const target = doc.getElementById('missing-results-anchor');
+        if (target) target.scrollIntoView({behavior: 'smooth', block: 'start'});
+        </script>
+        """,
+        height=0,
+    )