feat(layout-review): address review findings on pages 7-12

Find Duplicates (01_deduplicator):
- Delete the redundant outer Options wrapper; surface threshold + survivor
  rule directly, push the rest behind a single Advanced pane.
- Disambiguate competing primaries: top result is an auto-resolved preview
  (secondary download), review decisions are the single primary.
- Plain-English match labels (exact / approximate); clarify the third.
- Lift the match-card caption to a one-time instruction; note delimiter is
  delimited-text-only.

Quality Check (08_validator_reporter) — stub:
- Remove the dead disabled "Load rules file (JSON)" uploader so the stub
  invites a single action; keep the informative feature list.

Map Columns (05_column_mapper):
- Regroup schema -> mapping -> strategy/advanced (core task contiguous).
- Make preset-vs-Advanced precedence legible (Custom + modified marker).
- Adopt the compact file-intake banner; drop the duplicate resolved-mapping
  table; fix the add-row gutter style.

Combine Files (07_multi_file_merger) — stub:
- Actually disable the Merge CTA (add the disabled attribute).

PDF to CSV (10_pdf_extractor):
- Drop page/raw from the default preview to match export + fix the
  horizontal clip; surface raw via per-row affordance + overflow-x.
- Move the column selector above the download button; give auto-excluded
  rows a reason; align the files card to Home; de-dupe the row count.

Automated Workflows (09_pipeline_runner):
- Replace hand-edited JSON step config with per-step control expanders;
  JSON moved behind Advanced import/export.
- Editing the table marks the mode modified; fold the empty error column
  into the status pill; render summaries as plain English; collapse the
  explainer by default.

Cross-cutting items (stub standardization on page 10, shared disabled-field
token, remaining intake rollout) deferred to a holistic pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
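The "render summaries as plain English" item for Automated Workflows can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: Python is assumed from the page naming, and `summarize_step`, the helper functions, the `_RENDERERS` registry, and the fallback format are all hypothetical names. The summary field names and the expected phrases are taken from the sample rows in the per-step summary table in the diff.

```python
from typing import Any, Callable

Summary = dict[str, Any]


def _join_names(names: list[str]) -> str:
    """Join names for display: 'a', 'a & b', 'a, b & c'."""
    if len(names) <= 1:
        return "".join(names)
    return ", ".join(names[:-1]) + " & " + names[-1]


def _text_clean(s: Summary) -> str:
    return f"{s['cells_changed']:,} cells changed in {_join_names(s['columns'])}"


def _missing(s: Summary) -> str:
    phrase = f"{s['flagged_cells']:,} blank cells flagged"
    found = s.get("sentinels_found", [])
    if found:
        # Mention which sentinel values were actually seen in the data.
        label = "sentinel" if len(found) == 1 else "sentinels"
        phrase += f" ({label} " + ", ".join(f"“{v}”" for v in found) + ")"
    return phrase


def _dedup(s: Summary) -> str:
    return (
        f"{s['duplicates_removed']:,} duplicates removed across "
        f"{s['groups']:,} groups "
        f"({s['input_rows']:,} → {s['output_rows']:,} rows)"
    )


# Hypothetical registry: tool name -> function that builds the phrase.
_RENDERERS: dict[str, Callable[[Summary], str]] = {
    "text_clean": _text_clean,
    "missing": _missing,
    "dedup": _dedup,
}


def summarize_step(tool: str, summary: Summary) -> str:
    """Turn a raw per-step summary dict into a short plain-English phrase."""
    renderer = _RENDERERS.get(tool)
    if renderer is None:
        # Assumed fallback for tools without a dedicated phrase.
        return ", ".join(f"{key}: {value}" for key, value in summary.items())
    return renderer(summary)


print(summarize_step("text_clean", {"cells_changed": 1204, "columns": ["name", "city"]}))
```

Keeping one small renderer per tool means a new tool only needs a new registry entry, and tools without one still show something readable instead of raw JSON.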
@@ -67,69 +67,192 @@
<summary>Options</summary>
<div class="dt-expander-body">

<!-- Mode radio -->
<!-- Mode radio. Editing the steps below auto-switches the mode from the
recommended default to "Build interactively" (same precedence-visibility
pattern as Fix Missing Values: the active state is made legible, and the
default it superseded is marked "· modified"). -->
<div class="dt-field">
<label class="dt-label">How would you like to define the pipeline?</label>
<div class="dt-radio-row" style="flex-direction:column;gap:9px">
<span class="dt-radio on"><span class="dot"></span> Use the recommended default (text-clean → format → missing → dedup)</span>
<span class="dt-radio"><span class="dot"></span> Build interactively</span>
<span class="dt-radio"><span class="dot"></span> Use the recommended default (text-clean → format → missing → dedup) <span class="dt-count-pill warn" style="margin-left:4px">· modified</span></span>
<span class="dt-radio on"><span class="dot"></span> Build interactively</span>
<span class="dt-radio"><span class="dot"></span> Import a saved pipeline JSON</span>
</div>
</div>

<div class="dt-precedence">
<span class="dt-mi">edit</span>
<span>You started from the recommended default and edited a step, so the mode switched to <strong>Build interactively</strong>. The steps below are now yours to change — pick <strong>recommended default</strong> again to discard your edits and restore the suggested order.</span>
</div>

<p class="dt-caption" style="margin:10px 0">
Edit the table to add, remove, reorder (drag the row index), enable, or configure each step.
Add, remove, reorder (drag the row index), enable, or configure each step.
Open a step's <strong>Configure</strong> panel to set its options in plain language.
Tool order is recommended, not enforced — violations surface as warnings below the table.
</p>

<!-- Pipeline editor (st.data_editor: Tool selectbox · Enabled checkbox · Options JSON) -->
<!-- Pipeline editor. Each step row carries an enable toggle + a "Configure"
expander that reveals that tool's OWN controls as the editing surface
(built from .dt-* form classes). Raw per-row JSON has been removed;
JSON survives only as import/export under "Advanced" below. -->
<div class="dt-table-wrap">
<table class="dt-table">
<thead>
<tr>
<th class="idx"></th>
<th>Tool</th>
<th>Enabled</th>
<th>Options (JSON)</th>
<th>Step</th>
<th style="text-align:center">Enabled</th>
<th style="text-align:right">Configure</th>
</tr>
</thead>
<tbody>
<tr>
<td class="idx">≡ 0</td>
<td>text_clean <span class="dt-mi" style="font-size:14px;vertical-align:-2px;color:var(--ink-tertiary)">expand_more</span></td>
<td>text_clean</td>
<td><span class="dt-check on" style="margin:0;justify-content:center"><span class="box"><span class="dt-mi">check</span></span></span></td>
<td>{"trim": true, "collapse_whitespace": true}</td>
</tr>
<tr>
<td class="idx">≡ 1</td>
<td>format_standardize <span class="dt-mi" style="font-size:14px;vertical-align:-2px;color:var(--ink-tertiary)">expand_more</span></td>
<td><span class="dt-check on" style="margin:0;justify-content:center"><span class="box"><span class="dt-mi">check</span></span></span></td>
<td>{"column_types": {"phone": "phone", "signup_date": "date"}}</td>
</tr>
<tr>
<td class="idx">≡ 2</td>
<td>missing <span class="dt-mi" style="font-size:14px;vertical-align:-2px;color:var(--ink-tertiary)">expand_more</span></td>
<td><span class="dt-check on" style="margin:0;justify-content:center"><span class="box"><span class="dt-mi">check</span></span></span></td>
<td>{"strategy": "flag", "sentinels": ["N/A", "—"]}</td>
</tr>
<tr>
<td class="idx">≡ 3</td>
<td>dedup <span class="dt-mi" style="font-size:14px;vertical-align:-2px;color:var(--ink-tertiary)">expand_more</span></td>
<td><span class="dt-check on" style="margin:0;justify-content:center"><span class="box"><span class="dt-mi">check</span></span></span></td>
<td>{"survivor_rule": "most_complete", "merge": true}</td>
</tr>
<tr>
<td class="idx" style="color:var(--ink-tertiary)">+</td>
<td colspan="3" style="color:var(--ink-tertiary);font-family:var(--font-sans)">Add row</td>
<td style="text-align:right;color:var(--ink-tertiary)"><span class="dt-mi" style="font-size:16px;vertical-align:-3px">tune</span> Configure <span class="dt-mi" style="font-size:14px;vertical-align:-2px">expand_more</span></td>
</tr>
</tbody>
</table>
</div>
<!-- text_clean config panel (open to show the per-step editing surface) -->
<details class="dt-expander" open style="margin:6px 0 10px">
<summary>Configure: text_clean</summary>
<div class="dt-expander-body">
<div class="dt-check on"><span class="box"><span class="dt-mi">check</span></span> Trim leading & trailing whitespace</div>
<div class="dt-check on"><span class="box"><span class="dt-mi">check</span></span> Collapse repeated spaces to one</div>
<div class="dt-check"><span class="box"></span> Normalize smart quotes & dashes to plain ASCII</div>
<div class="dt-field">
<label class="dt-label">Letter case</label>
<div class="dt-select">Leave as-is</div>
</div>
</div>
</details>

<div class="dt-table-wrap">
<table class="dt-table">
<tbody>
<tr>
<td class="idx">≡ 1</td>
<td>format_standardize</td>
<td><span class="dt-check on" style="margin:0;justify-content:center"><span class="box"><span class="dt-mi">check</span></span></span></td>
<td style="text-align:right;color:var(--ink-tertiary)"><span class="dt-mi" style="font-size:16px;vertical-align:-3px">tune</span> Configure <span class="dt-mi" style="font-size:14px;vertical-align:-2px">chevron_right</span></td>
</tr>
</tbody>
</table>
</div>
<!-- format_standardize config panel (collapsed) -->
<details class="dt-expander" style="margin:6px 0 10px">
<summary>Configure: format_standardize</summary>
<div class="dt-expander-body">
<p class="dt-caption" style="margin-bottom:8px">Choose a target format for each column. Columns left as “Leave as-is” are untouched.</p>
<div class="dt-table-wrap">
<table class="dt-table">
<thead><tr><th>Column</th><th>Format as</th></tr></thead>
<tbody>
<tr><td>name</td><td><span class="dt-select" style="display:inline-block;min-width:150px;padding:4px 24px 4px 10px;color:var(--ink-tertiary)">Leave as-is</span></td></tr>
<tr><td>email</td><td><span class="dt-select" style="display:inline-block;min-width:150px;padding:4px 24px 4px 10px;color:var(--ink-tertiary)">Leave as-is</span></td></tr>
<tr><td>phone</td><td><span class="dt-select" style="display:inline-block;min-width:150px;padding:4px 24px 4px 10px">Phone number</span></td></tr>
<tr><td>signup_date</td><td><span class="dt-select" style="display:inline-block;min-width:150px;padding:4px 24px 4px 10px">Date</span></td></tr>
</tbody>
</table>
</div>
</div>
</details>

<div class="dt-table-wrap">
<table class="dt-table">
<tbody>
<tr>
<td class="idx">≡ 2</td>
<td>missing</td>
<td><span class="dt-check on" style="margin:0;justify-content:center"><span class="box"><span class="dt-mi">check</span></span></span></td>
<td style="text-align:right;color:var(--ink-tertiary)"><span class="dt-mi" style="font-size:16px;vertical-align:-3px">tune</span> Configure <span class="dt-mi" style="font-size:14px;vertical-align:-2px">chevron_right</span></td>
</tr>
</tbody>
</table>
</div>
<!-- missing config panel (collapsed) -->
<details class="dt-expander" style="margin:6px 0 10px">
<summary>Configure: missing</summary>
<div class="dt-expander-body">
<div class="dt-field">
<label class="dt-label">What should happen to blank cells?</label>
<div class="dt-radio-row" style="flex-direction:column;gap:8px">
<span class="dt-radio on"><span class="dot"></span> Flag them (mark blanks, change nothing)</span>
<span class="dt-radio"><span class="dot"></span> Fill them in (numbers → median, text → most common)</span>
<span class="dt-radio"><span class="dot"></span> Drop rows that have any blank</span>
</div>
</div>
<div class="dt-field">
<label class="dt-label">Treat these as blank (comma-separated)</label>
<div class="dt-input">N/A, —</div>
<div class="dt-help-text">Matched case-insensitively after stripping whitespace.</div>
</div>
</div>
</details>

<div class="dt-table-wrap">
<table class="dt-table">
<tbody>
<tr>
<td class="idx">≡ 3</td>
<td>dedup</td>
<td><span class="dt-check on" style="margin:0;justify-content:center"><span class="box"><span class="dt-mi">check</span></span></span></td>
<td style="text-align:right;color:var(--ink-tertiary)"><span class="dt-mi" style="font-size:16px;vertical-align:-3px">tune</span> Configure <span class="dt-mi" style="font-size:14px;vertical-align:-2px">chevron_right</span></td>
</tr>
<tr>
<td class="idx" style="color:var(--ink-tertiary)">+</td>
<td colspan="3" style="color:var(--ink-tertiary);font-family:var(--font-sans)">Add step</td>
</tr>
</tbody>
</table>
</div>
<!-- dedup config panel (collapsed) -->
<details class="dt-expander" style="margin:6px 0 10px">
<summary>Configure: dedup</summary>
<div class="dt-expander-body">
<div class="dt-field">
<label class="dt-label">When rows match, which one survives?</label>
<div class="dt-select">Keep the most complete row</div>
<div class="dt-help-text">Other options: keep the first seen, keep the last seen.</div>
</div>
<div class="dt-check on"><span class="box"><span class="dt-mi">check</span></span> Merge matched rows (fill each survivor's blanks from its duplicates)</div>
<div class="dt-field">
<label class="dt-label">Match on these columns</label>
<div class="dt-multiselect">
<span class="dt-ms-chip">email <span class="x">✕</span></span>
<span class="dt-ms-chip">phone <span class="x">✕</span></span>
</div>
</div>
</div>
</details>

<!-- Validation: pipeline is in recommended order, so no warning shown (warning block omitted) -->

<!-- Advanced: JSON is import/export only, never the per-step editing surface -->
<details class="dt-expander" style="margin-top:14px">
<summary>Advanced — import / export pipeline as JSON</summary>
<div class="dt-expander-body">
<p class="dt-caption" style="margin-bottom:8px">For sharing or version control. Editing is done in the step panels above — this is just the saved form of the same settings.</p>
<div class="dt-code">{
"version": 1,
"steps": [
{"tool": "text_clean", "enabled": true, "options": {"trim": true, "collapse_whitespace": true}},
{"tool": "format_standardize", "enabled": true, "options": {"column_types": {"phone": "phone", "signup_date": "date"}}},
{"tool": "missing", "enabled": true, "options": {"strategy": "flag", "sentinels": ["N/A", "—"]}},
{"tool": "dedup", "enabled": true, "options": {"survivor_rule": "most_complete", "merge": true, "keys": ["email", "phone"]}}
]
}</div>
<div class="dt-btn-row" style="margin-top:10px">
<button class="dt-btn"><span class="dt-mi">upload</span> Import JSON</button>
<button class="dt-btn"><span class="dt-mi">download</span> Export JSON</button>
</div>
</div>
</details>

<!-- Nested explainer expander -->
<details class="dt-expander" open style="margin-top:14px">
<details class="dt-expander" style="margin-top:14px">
<summary>Recommended tool order — why each step belongs where it does</summary>
<div class="dt-expander-body">
<p><strong>text_clean</strong> before <strong>format_standardize</strong> — format parsers (phone / currency / date) fail on smart-quote-contaminated or NBSP-padded input — clean text first</p>
@@ -161,39 +284,49 @@
</div>

<h4>Per-step summary</h4>
<!-- Standalone error column removed: status is one pill per step. A failed step
turns the pill danger and surfaces its message in a detail row directly below
that step (shown only on failure); successful steps just show a green pill.
Summaries are plain-English phrases, not raw JSON. Demo: this run completed
cleanly (all four ok, matching the metrics above) — the format_standardize
row carries a warn pill + detail row to illustrate how a non-fatal step issue
surfaces inline without a dedicated always-empty column. -->
<div class="dt-table-wrap">
<table class="dt-table">
<thead>
<tr><th>step</th><th>status</th><th>elapsed_ms</th><th>summary</th><th>error</th></tr>
<tr><th>step</th><th>status</th><th>elapsed</th><th>summary</th></tr>
</thead>
<tbody>
<tr>
<td>text_clean</td>
<td><span class="dt-count-pill success">ok</span></td>
<td>214</td>
<td>{"cells_changed": 1204, "columns": ["name", "city"]}</td>
<td></td>
<td>214 ms</td>
<td style="font-family:var(--font-sans)">1,204 cells changed in name & city</td>
</tr>
<tr>
<td>format_standardize</td>
<td><span class="dt-count-pill success">ok</span></td>
<td>388</td>
<td>{"phone": 18301, "signup_date": 17996}</td>
<td><span class="dt-count-pill warn"><span class="dt-mi" style="font-size:13px;margin-right:3px">warning</span> ok · 141 skipped</span></td>
<td>388 ms</td>
<td style="font-family:var(--font-sans)">18,301 phones and 17,996 dates standardized</td>
</tr>
<tr style="background:var(--warn-fill)">
<td></td>
<td colspan="3" style="font-family:var(--font-sans);color:var(--warn);white-space:normal">
<span class="dt-mi" style="font-size:15px;vertical-align:-3px;margin-right:4px">info</span>
141 phone values didn't match any known pattern and were left unchanged. The step still completed — review them in the output preview if needed.
</td>
</tr>
<tr>
<td>missing</td>
<td><span class="dt-count-pill success">ok</span></td>
<td>121</td>
<td>{"flagged_cells": 642, "sentinels_found": ["—"]}</td>
<td></td>
<td>121 ms</td>
<td style="font-family:var(--font-sans)">642 blank cells flagged (sentinel “—”)</td>
</tr>
<tr>
<td>dedup</td>
<td><span class="dt-count-pill success">ok</span></td>
<td>911</td>
<td>{"input_rows": 18442, "output_rows": 18130, "duplicates_removed": 312, "groups": 147}</td>
<td></td>
<td>911 ms</td>
<td style="font-family:var(--font-sans)">312 duplicates removed across 147 groups (18,442 → 18,130 rows)</td>
</tr>
</tbody>
</table>