feat(tools): unified post-run UX across all Ready tool pages

Apply the Clean Text page's post-run UX pattern to every other Ready tool page (Find Duplicates, Standardize Formats, Fix Missing Values, Map Columns, Automated Workflows) for consistency and ease of use.

Per page:

1. Preview wrapped in ``st.expander(f"Preview: {filename}", expanded=not _has_result)``. Open before a result exists, folded afterwards.
2. Options / configuration controls wrapped in ``st.expander("Options", expanded=not _has_result)``. Inner sub-expanders preserved (Streamlit 1.36+ supports nesting).
3. After the primary action stashes the result, set a one-shot ``_<tool>_scroll_to_results`` flag in session state and call ``st.rerun()`` so the preview + options expanders see the new state on the next pass and collapse themselves.
4. ``<div id="<tool>-results-anchor" style="height:1px">`` placed immediately before the Results subheader.
5. End-of-page: pop the scroll flag and inject a tiny ``streamlit.components.v1.html`` iframe whose ``<script>`` calls ``scrollIntoView`` on the parent document's anchor. One-shot, so unrelated reruns (toggling Show-hidden, etc.) don't yank the viewport.
6. Download buttons hardened against the multi-button Streamlit footgun: byte buffers pre-computed outside the column scopes, explicit unique ``key="<tool>_dl_<purpose>"`` per button, ``use_container_width=True``, and previously-conditional buttons now render unconditionally with ``disabled=True`` + a help tooltip when the underlying data is empty so layout stays steady.

Per-page judgment calls (already noted in agent reports):

- Find Duplicates: sheet picker and delimiter selector kept OUTSIDE expanders (the user still needs to see them when a file fails to parse).
- Fix Missing Values: missingness profile wrapped INSIDE the Options expander together with Strategy; the Results section already shows a before/after missingness comparison that supersedes the static input profile.
- Map Columns: all three subsections (Target schema, Strategy, Mapping) wrapped under one outer Options expander, matching the Text Cleaner pattern.
- Automated Workflows: inner "Recommended tool order" expander stays nested inside the outer Options wrap; Run button stays outside Options so the user can re-run after tweaking the (collapsed) editor.

2008 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
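
Item 6 of the message above (explicit key, full width, always rendered but disabled when empty) can be sketched as a small keyword-argument builder. This is only an illustration: ``download_button_kwargs`` is a hypothetical helper that does not appear in the commit, and it assumes the caller passes the resulting dict to ``st.download_button`` together with a label, ``file_name`` and ``mime``.

```python
from typing import Any


def download_button_kwargs(
    tool: str, purpose: str, data: bytes, *, empty: bool, empty_help: str
) -> dict[str, Any]:
    """Build keyword arguments for one st.download_button call.

    Hypothetical helper, not part of the commit: it gathers the explicit
    key, the full-width layout, and the disabled-plus-tooltip rule in one
    place so every button is rendered on every pass.
    """
    return {
        "data": data,                          # bytes prepared by the caller, outside the columns
        "key": f"{tool}_dl_{purpose}",         # unique per button on the page
        "use_container_width": True,
        "disabled": empty,                     # rendered either way, so the layout stays steady
        "help": empty_help if empty else None,
    }


audit = download_button_kwargs(
    "colmap", "audit", b"{}", empty=True, empty_help="No mapping was applied."
)
mapped = download_button_kwargs(
    "colmap", "mapped", b"a,b\n1,2\n", empty=False, empty_help="No rows to download."
)
```

On a page this would be used as ``st.download_button("Download mapping audit", file_name=..., mime=..., **audit)``; the diff below writes the same arguments out inline for each button instead.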
2026-05-16 21:04:37 +00:00
parent d1aaf3c2b9
commit 6415be8bf4
5 changed files with 1250 additions and 879 deletions
--- a/src/gui/pages/5_Column_Mapper.py
+++ b/src/gui/pages/5_Column_Mapper.py
@@ -88,224 +88,240 @@ except Exception as e:
    )
    st.stop()

-st.subheader(f"Preview: {uploaded.name}")
-st.caption(f"{len(df)} rows, {len(df.columns)} columns")
-st.dataframe(df.head(10), use_container_width=True)
+# Collapse the input preview once the user has clicked Apply Column
+# Mapping so the Results section below is the primary visual focus.
+# The user can re-expand the expander to re-inspect the source rows.
+_has_result = st.session_state.get("colmap_result") is not None
+
+with st.expander(f"Preview: {uploaded.name}", expanded=not _has_result):
+    st.caption(f"{len(df)} rows, {len(df.columns)} columns")
+    st.dataframe(df.head(10), use_container_width=True)
 st.divider()

 # ---------------------------------------------------------------------------
-# Schema input
+# Options (Target schema + Strategy + Mapping)
 # ---------------------------------------------------------------------------
+#
+# Wrapped in an outer expander whose default state mirrors the preview
+# expander above: open before a result exists, folded once the user has
+# clicked Apply Column Mapping. The Mapping editor is the heart of the
+# tool, but per the Text Cleaner pattern we still collapse everything
+# post-run — the user can re-expand to tweak any of the three sections.

-st.subheader("Target schema")
+with st.expander("Options", expanded=not _has_result):
+    # -----------------------------------------------------------------------
+    # Schema input
+    # -----------------------------------------------------------------------

-schema_mode = st.radio(
-    "How would you like to define the target schema?",
-    [
-        "Build interactively (start from current columns)",
-        "Upload schema JSON",
-        "Skip (rename / coerce only — no schema)",
-    ],
-    index=0,
-    help=(
-        "An interactive build is fastest for one-off cleanup. Upload a JSON "
-        "when you have a fixed contract (a CRM import format, db schema). "
-        "Skip when you only want to rename or coerce specific columns."
-    ),
-)
+    st.subheader("Target schema")

-schema: TargetSchema | None = None
-
-if schema_mode.startswith("Upload"):
-    schema_file = st.file_uploader(
-        "Schema JSON",
-        type=["json"],
-        key="colmap_schema_upload",
-        help='Format: {"fields": [{"name": "email", "dtype": "string", "required": true, "aliases": ["EmailAddr"]}, ...]}',
+    schema_mode = st.radio(
+        "How would you like to define the target schema?",
+        [
+            "Build interactively (start from current columns)",
+            "Upload schema JSON",
+            "Skip (rename / coerce only — no schema)",
+        ],
+        index=0,
+        help=(
+            "An interactive build is fastest for one-off cleanup. Upload a JSON "
+            "when you have a fixed contract (a CRM import format, db schema). "
+            "Skip when you only want to rename or coerce specific columns."
+        ),
    )
-    if schema_file is not None:
-        try:
-            schema = TargetSchema.from_dict(json.loads(schema_file.getvalue()))
-            st.success(f"Loaded {len(schema.fields)} target field(s).")
-        except Exception as e:
-            from src.core.errors import format_for_user
-            st.error(f"**Could not parse schema**\n\n```\n{format_for_user(e)}\n```")

-elif schema_mode.startswith("Build"):
-    st.caption(
-        "Edit the table to define your target schema. Add rows for fields the "
-        "input doesn't have yet (with a default), or remove rows for columns "
-        "you want to drop."
+    schema: TargetSchema | None = None
+
+    if schema_mode.startswith("Upload"):
+        schema_file = st.file_uploader(
+            "Schema JSON",
+            type=["json"],
+            key="colmap_schema_upload",
+            help='Format: {"fields": [{"name": "email", "dtype": "string", "required": true, "aliases": ["EmailAddr"]}, ...]}',
+        )
+        if schema_file is not None:
+            try:
+                schema = TargetSchema.from_dict(json.loads(schema_file.getvalue()))
+                st.success(f"Loaded {len(schema.fields)} target field(s).")
+            except Exception as e:
+                from src.core.errors import format_for_user
+                st.error(f"**Could not parse schema**\n\n```\n{format_for_user(e)}\n```")
+
+    elif schema_mode.startswith("Build"):
+        st.caption(
+            "Edit the table to define your target schema. Add rows for fields the "
+            "input doesn't have yet (with a default), or remove rows for columns "
+            "you want to drop."
+        )
+        initial = pd.DataFrame({
+            "name": list(df.columns),
+            "dtype": ["auto"] * len(df.columns),
+            "required": [False] * len(df.columns),
+            "default": [""] * len(df.columns),
+            "aliases": [""] * len(df.columns),
+        })
+        edited = st.data_editor(
+            initial,
+            use_container_width=True,
+            num_rows="dynamic",
+            column_config={
+                "name": st.column_config.TextColumn("Target name"),
+                "dtype": st.column_config.SelectboxColumn(
+                    "Type",
+                    options=[
+                        "auto", "string", "integer", "float",
+                        "boolean", "date", "datetime", "category",
+                    ],
+                ),
+                "required": st.column_config.CheckboxColumn("Required"),
+                "default": st.column_config.TextColumn("Default (for added cols)"),
+                "aliases": st.column_config.TextColumn(
+                    "Aliases (comma-sep, helps fuzzy-match)",
+                ),
+            },
+            key="colmap_schema_editor",
+        )
+        fields: list[TargetField] = []
+        for _, row in edited.iterrows():
+            name = str(row.get("name", "")).strip()
+            if not name:
+                continue
+            aliases = [
+                a.strip() for a in str(row.get("aliases", "") or "").split(",")
+                if a.strip()
+            ]
+            default_raw = row.get("default")
+            default_val = (
+                default_raw if (default_raw not in (None, "", float("nan")))
+                else None
+            )
+            try:
+                if isinstance(default_val, float) and pd.isna(default_val):
+                    default_val = None
+            except TypeError:
+                pass
+            fields.append(TargetField(
+                name=name,
+                dtype=str(row.get("dtype", "auto")),  # type: ignore[arg-type]
+                required=bool(row.get("required", False)),
+                aliases=aliases,
+                default=default_val,
+            ))
+        if fields:
+            schema = TargetSchema(fields=fields)
+
+    st.divider()
+
+    # -----------------------------------------------------------------------
+    # Strategy
+    # -----------------------------------------------------------------------
+
+    st.subheader("Strategy")
+
+    preset_label = st.radio(
+        "Preset",
+        [
+            "rename-only (just rename, leave types alone, keep extras)",
+            "lenient-schema (rename + coerce + reorder, keep extras)",
+            "strict-schema (rename + coerce + reorder, drop extras)",
+        ],
+        index=0,
    )
-    initial = pd.DataFrame({
-        "name": list(df.columns),
-        "dtype": ["auto"] * len(df.columns),
-        "required": [False] * len(df.columns),
-        "default": [""] * len(df.columns),
-        "aliases": [""] * len(df.columns),
-    })
-    edited = st.data_editor(
-        initial,
-        use_container_width=True,
-        num_rows="dynamic",
-        column_config={
-            "name": st.column_config.TextColumn("Target name"),
-            "dtype": st.column_config.SelectboxColumn(
-                "Type",
-                options=[
-                    "auto", "string", "integer", "float",
-                    "boolean", "date", "datetime", "category",
-                ],
-            ),
-            "required": st.column_config.CheckboxColumn("Required"),
-            "default": st.column_config.TextColumn("Default (for added cols)"),
-            "aliases": st.column_config.TextColumn(
-                "Aliases (comma-sep, helps fuzzy-match)",
-            ),
-        },
-        key="colmap_schema_editor",
-    )
-    fields: list[TargetField] = []
-    for _, row in edited.iterrows():
-        name = str(row.get("name", "")).strip()
-        if not name:
-            continue
-        aliases = [
-            a.strip() for a in str(row.get("aliases", "") or "").split(",")
-            if a.strip()
-        ]
-        default_raw = row.get("default")
-        default_val = (
-            default_raw if (default_raw not in (None, "", float("nan")))
-            else None
+    preset_key = preset_label.split(" ", 1)[0]
+    options = MapOptions.from_preset(preset_key)
+    options.schema = schema
+
+    with st.expander("Advanced options"):
+        col_a, col_b = st.columns(2)
+        with col_a:
+            options.unmapped = st.selectbox(  # type: ignore[assignment]
+                "Unmapped source columns",
+                ["keep", "drop", "error"],
+                index=["keep", "drop", "error"].index(options.unmapped),
+            )
+            options.coerce_types = st.checkbox(
+                "Coerce types per schema", value=options.coerce_types,
+            )
+            options.reorder_to_schema = st.checkbox(
+                "Reorder to schema order", value=options.reorder_to_schema,
+            )
+        with col_b:
+            options.auto_infer = st.checkbox(
+                "Auto-infer mapping (fuzzy match)", value=options.auto_infer,
+            )
+            options.fuzzy_threshold = st.slider(
+                "Fuzzy match threshold", 0.0, 1.0, options.fuzzy_threshold, 0.05,
+            )
+            options.enforce_required = st.checkbox(
+                "Enforce required fields", value=options.enforce_required,
+            )
+
+    # -----------------------------------------------------------------------
+    # Mapping editor — show inferred and let user override
+    # -----------------------------------------------------------------------
+
+    st.subheader("Mapping")
+
+    if schema is None:
+        st.caption(
+            "No schema — define explicit renames below (left blank means keep "
+            "the source name)."
        )
-        try:
-            if isinstance(default_val, float) and pd.isna(default_val):
-                default_val = None
-        except TypeError:
-            pass
-        fields.append(TargetField(
-            name=name,
-            dtype=str(row.get("dtype", "auto")),  # type: ignore[arg-type]
-            required=bool(row.get("required", False)),
-            aliases=aliases,
-            default=default_val,
-        ))
-    if fields:
-        schema = TargetSchema(fields=fields)
-
-st.divider()
-
-# ---------------------------------------------------------------------------
-# Strategy
-# ---------------------------------------------------------------------------
-
-st.subheader("Strategy")
-
-preset_label = st.radio(
-    "Preset",
-    [
-        "rename-only (just rename, leave types alone, keep extras)",
-        "lenient-schema (rename + coerce + reorder, keep extras)",
-        "strict-schema (rename + coerce + reorder, drop extras)",
-    ],
-    index=0,
-)
-preset_key = preset_label.split(" ", 1)[0]
-options = MapOptions.from_preset(preset_key)
-options.schema = schema
-
-with st.expander("Advanced options"):
-    col_a, col_b = st.columns(2)
-    with col_a:
-        options.unmapped = st.selectbox(  # type: ignore[assignment]
-            "Unmapped source columns",
-            ["keep", "drop", "error"],
-            index=["keep", "drop", "error"].index(options.unmapped),
+        rename_initial = pd.DataFrame({
+            "source": list(df.columns),
+            "target": list(df.columns),
+        })
+        rename_edited = st.data_editor(
+            rename_initial,
+            use_container_width=True,
+            column_config={
+                "source": st.column_config.TextColumn("Source", disabled=True),
+                "target": st.column_config.TextColumn("Target"),
+            },
+            hide_index=True,
+            key="colmap_rename_only_editor",
        )
-        options.coerce_types = st.checkbox(
-            "Coerce types per schema", value=options.coerce_types,
+        explicit_mapping: dict[str, str] = {}
+        for _, row in rename_edited.iterrows():
+            src = str(row["source"])
+            tgt = str(row["target"]).strip()
+            if tgt and tgt != src:
+                explicit_mapping[src] = tgt
+        options.mapping = explicit_mapping
+    else:
+        inferred = (
+            infer_mapping(df, schema, threshold=options.fuzzy_threshold)
+            if options.auto_infer else {}
        )
-        options.reorder_to_schema = st.checkbox(
-            "Reorder to schema order", value=options.reorder_to_schema,
+        target_options = ["(unmapped)"] + schema.field_names()
+        map_initial = pd.DataFrame({
+            "source": list(df.columns),
+            "target": [inferred.get(c, "(unmapped)") for c in df.columns],
+            "auto": [c in inferred for c in df.columns],
+        })
+        map_edited = st.data_editor(
+            map_initial,
+            use_container_width=True,
+            column_config={
+                "source": st.column_config.TextColumn("Source", disabled=True),
+                "target": st.column_config.SelectboxColumn(
+                    "Target", options=target_options,
+                ),
+                "auto": st.column_config.CheckboxColumn("Auto-suggested", disabled=True),
+            },
+            hide_index=True,
+            key="colmap_schema_mapping_editor",
        )
-    with col_b:
-        options.auto_infer = st.checkbox(
-            "Auto-infer mapping (fuzzy match)", value=options.auto_infer,
-        )
-        options.fuzzy_threshold = st.slider(
-            "Fuzzy match threshold", 0.0, 1.0, options.fuzzy_threshold, 0.05,
-        )
-        options.enforce_required = st.checkbox(
-            "Enforce required fields", value=options.enforce_required,
-        )
-
-# ---------------------------------------------------------------------------
-# Mapping editor — show inferred and let user override
-# ---------------------------------------------------------------------------
-
-st.subheader("Mapping")
-
-if schema is None:
-    st.caption(
-        "No schema — define explicit renames below (left blank means keep "
-        "the source name)."
-    )
-    rename_initial = pd.DataFrame({
-        "source": list(df.columns),
-        "target": list(df.columns),
-    })
-    rename_edited = st.data_editor(
-        rename_initial,
-        use_container_width=True,
-        column_config={
-            "source": st.column_config.TextColumn("Source", disabled=True),
-            "target": st.column_config.TextColumn("Target"),
-        },
-        hide_index=True,
-        key="colmap_rename_only_editor",
-    )
-    explicit_mapping: dict[str, str] = {}
-    for _, row in rename_edited.iterrows():
-        src = str(row["source"])
-        tgt = str(row["target"]).strip()
-        if tgt and tgt != src:
-            explicit_mapping[src] = tgt
-    options.mapping = explicit_mapping
-else:
-    inferred = (
-        infer_mapping(df, schema, threshold=options.fuzzy_threshold)
-        if options.auto_infer else {}
-    )
-    target_options = ["(unmapped)"] + schema.field_names()
-    map_initial = pd.DataFrame({
-        "source": list(df.columns),
-        "target": [inferred.get(c, "(unmapped)") for c in df.columns],
-        "auto": [c in inferred for c in df.columns],
-    })
-    map_edited = st.data_editor(
-        map_initial,
-        use_container_width=True,
-        column_config={
-            "source": st.column_config.TextColumn("Source", disabled=True),
-            "target": st.column_config.SelectboxColumn(
-                "Target", options=target_options,
-            ),
-            "auto": st.column_config.CheckboxColumn("Auto-suggested", disabled=True),
-        },
-        hide_index=True,
-        key="colmap_schema_mapping_editor",
-    )
-    explicit_mapping = {}
-    for _, row in map_edited.iterrows():
-        src = str(row["source"])
-        tgt = str(row["target"])
-        if tgt and tgt != "(unmapped)":
-            explicit_mapping[src] = tgt
-    options.mapping = explicit_mapping
-    # Disable auto-infer for the actual run since the editor already shows
-    # the user's resolved choices (they can manually re-select to add).
-    options.auto_infer = False
+        explicit_mapping = {}
+        for _, row in map_edited.iterrows():
+            src = str(row["source"])
+            tgt = str(row["target"])
+            if tgt and tgt != "(unmapped)":
+                explicit_mapping[src] = tgt
+        options.mapping = explicit_mapping
+        # Disable auto-infer for the actual run since the editor already shows
+        # the user's resolved choices (they can manually re-select to add).
+        options.auto_infer = False

 # ---------------------------------------------------------------------------
 # Run
@@ -324,6 +340,12 @@ if st.button("Apply Column Mapping", type="primary", use_container_width=True):
    st.session_state["colmap_result"] = result
    st.session_state["colmap_input_name"] = uploaded.name
    st.session_state["colmap_options"] = options.to_dict()
+    # One-shot flag picked up on the next pass to scroll the parent
+    # document to the Results anchor (see scroll snippet below).
+    st.session_state["_colmap_scroll_to_results"] = True
+    # Force a second rerun so the preview and options expanders see
+    # the new result on the NEXT script pass and collapse themselves.
+    st.rerun()

 result = st.session_state.get("colmap_result")
 if result is None:
@@ -334,6 +356,16 @@ if result is None:
 # Results
 # ---------------------------------------------------------------------------

+# Anchor target for the auto-scroll snippet at the end of this block.
+# A bare ``<div id="...">`` survives Streamlit's HTML sanitizer (only
+# ``<script>`` is stripped), and a 1px-tall div doesn't visually shift
+# anything. Placed before the subheader so the scrolled-to viewport
+# starts a few pixels above the section heading rather than below it.
+st.markdown(
+    '<div id="colmap-results-anchor" style="height:1px"></div>',
+    unsafe_allow_html=True,
+)
+
 st.subheader("Results")

 m1, m2, m3, m4 = st.columns(4)
@@ -371,46 +403,90 @@ st.dataframe(result.mapped_df.head(10), use_container_width=True)
 # ---------------------------------------------------------------------------
 # Downloads
 # ---------------------------------------------------------------------------
+#
+# All three byte buffers are prepared up front (outside the columns) so
+# each ``st.download_button`` sees stable ``data`` across reruns and an
+# explicit ``key`` — without those, Streamlit auto-derived widget IDs
+# can collide for multiple download_buttons in adjacent columns and
+# only the first one actually fires on click.

 st.divider()
 stem = Path(st.session_state.get("colmap_input_name", "input")).stem

+mapped_bytes = result.mapped_df.to_csv(index=False).encode("utf-8-sig")
+audit_bytes = json.dumps({
+    "mapping": result.mapping,
+    "inferred_pairs": result.inferred_pairs,
+    "columns_renamed": result.columns_renamed,
+    "columns_dropped": result.columns_dropped,
+    "columns_added": result.columns_added,
+    "coercion_failures": result.coercion_failures,
+    "unmapped_kept": result.unmapped_kept,
+    "missing_required_targets": result.missing_required_targets,
+}, indent=2, default=str).encode("utf-8")
+config_bytes = json.dumps(
+    st.session_state.get("colmap_options", {}), indent=2, default=str,
+).encode("utf-8")
+
+_no_mapping = not result.mapping
+
 dl_a, dl_b, dl_c = st.columns(3)
 with dl_a:
-    mapped_bytes = result.mapped_df.to_csv(index=False).encode("utf-8-sig")
    st.download_button(
        "Download mapped CSV",
        data=mapped_bytes,
        file_name=f"{stem}_mapped.csv",
        mime="text/csv",
+        key="colmap_dl_mapped",
+        use_container_width=True,
    )
 with dl_b:
-    audit_bytes = json.dumps({
-        "mapping": result.mapping,
-        "inferred_pairs": result.inferred_pairs,
-        "columns_renamed": result.columns_renamed,
-        "columns_dropped": result.columns_dropped,
-        "columns_added": result.columns_added,
-        "coercion_failures": result.coercion_failures,
-        "unmapped_kept": result.unmapped_kept,
-        "missing_required_targets": result.missing_required_targets,
-    }, indent=2, default=str).encode("utf-8")
    st.download_button(
        "Download mapping audit",
        data=audit_bytes,
        file_name=f"{stem}_mapping.json",
        mime="application/json",
+        key="colmap_dl_audit",
+        disabled=_no_mapping,
+        help="No mapping was applied." if _no_mapping else None,
+        use_container_width=True,
    )
 with dl_c:
-    config_bytes = json.dumps(
-        st.session_state.get("colmap_options", {}), indent=2, default=str,
-    ).encode("utf-8")
    st.download_button(
        "Download config JSON",
        data=config_bytes,
        file_name="column_map_config.json",
        mime="application/json",
+        key="colmap_dl_config",
+        use_container_width=True,
    )

 st.divider()
 st.caption("Runs locally. Your data never leaves this computer. | DataTools v3.0")
+
+# ---------------------------------------------------------------------------
+# Post-run auto-scroll
+# ---------------------------------------------------------------------------
+#
+# When the user clicks Apply Column Mapping, the preview + options
+# collapse but Streamlit by itself doesn't scroll — the Results section
+# is at the bottom of a tall script so the user has to find it. Inject
+# a tiny component-html iframe that calls ``scrollIntoView`` on the
+# parent's Results anchor. Streamlit's main page is same-origin with
+# component iframes so ``window.parent.document`` access is allowed.
+#
+# The flag is one-shot (``pop`` removes it) so re-renders triggered by
+# unrelated widgets in the Results section don't yank the viewport back
+# to the top of Results.
+if st.session_state.pop("_colmap_scroll_to_results", False):
+    from streamlit.components.v1 import html as _components_html
+    _components_html(
+        """
+        <script>
+          const doc = window.parent.document;
+          const target = doc.getElementById('colmap-results-anchor');
+          if (target) target.scrollIntoView({behavior: 'smooth', block: 'start'});
+        </script>
+        """,
+        height=0,
+    )
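
The lifecycle of the one-shot scroll flag can be followed without running Streamlit. The sketch below is a simulation under stated assumptions: a plain dict stands in for ``st.session_state``, ``script_pass`` is a hypothetical function modelling one top-to-bottom run of a tool page, and the ``tool_result`` / ``_tool_scroll_to_results`` keys are placeholders for the per-page names such as ``colmap_result``.

```python
# Hypothetical stand-in for st.session_state; a plain dict is enough to
# show how the flag is set, consumed, and then absent.
session_state: dict[str, object] = {}


def script_pass(state: dict[str, object], run_clicked: bool) -> dict[str, bool]:
    """Simulate one top-to-bottom pass of a tool page.

    Returns what the pass would render: whether the preview/options
    expanders are open and whether the scroll snippet is injected.
    """
    has_result = state.get("tool_result") is not None
    expanders_open = not has_result  # same rule as expanded=not _has_result

    if run_clicked:
        state["tool_result"] = "result"          # stash the result
        state["_tool_scroll_to_results"] = True  # one-shot flag
        # The real page calls st.rerun() here, which stops this pass and
        # starts a new one; the simulation models that with a fresh call.
        return script_pass(state, run_clicked=False)

    # End of page: pop() reads and clears the flag in a single step.
    scroll = bool(state.pop("_tool_scroll_to_results", False))
    return {"expanders_open": expanders_open, "scroll": scroll}


first = script_pass(session_state, run_clicked=False)     # before any run
after_run = script_pass(session_state, run_clicked=True)  # click on the primary action
unrelated = script_pass(session_state, run_clicked=False) # e.g. a toggle in Results
```

In this model only the pass that follows the click injects the scroll snippet; later passes find no flag, which is why unrelated reruns leave the viewport alone.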