From 2592604067da9ab7666c35acffde09117f1f7e7a Mon Sep 17 00:00:00 2001
From: Michael
Date: Mon, 8 Jun 2026 16:14:04 +0000
Subject: [PATCH 01/10] feat(layout-review): address Home page review findings
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Findings card no longer truncates silently: panel #1 gains a
  .dt-finding-more overflow control ("Show all 8 findings · 5 more").
- Replace the dead "Files analyzed: 3" stat (restated the section meta +
  visible rows) with "Rows scanned" — info not already on screen.
- Collapsed findings panels use a real .is-collapsed state variant
  instead of inline margin-bottom:-16px hacks, so states can't drift.
- Action bar buttons are content-sized; drop the 340px island that
  jarred against the full-width divider/stats below it.

Branding kept as deliberate landing-style treatment on Home (per review
decision); interior tool pages remain title-only.

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 layout-review/app.css   | 19 +++++++++++++++++++
 layout-review/home.html | 22 +++++++++++++---------
 2 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/layout-review/app.css b/layout-review/app.css
index b363eaa..11b3a04 100644
--- a/layout-review/app.css
+++ b/layout-review/app.css
@@ -445,6 +445,25 @@ table.dt-table td.idx { color: var(--ink-tertiary); background: var(--surface-ho
 .dt-finding-title strong { font-weight: 500; }
 .dt-finding-meta { font-family: var(--font-mono); font-size: 12px; color: var(--ink-tertiary); line-height: 1.4; margin: 0; font-feature-settings: "ss02"; }

+/* Overflow control — sits at the foot of a findings card when rows are hidden.
+   Bleeds to the card edges (cancels the .dt-card 16px padding) like .dt-file-add. */
+.dt-finding-more {
+  display: flex; align-items: center; justify-content: center; gap: 6px;
+  width: calc(100% + 32px); margin: 4px -16px -16px;
+  padding: 11px 16px; background: var(--surface-hover);
+  border: none; border-top: 1px solid var(--border);
+  border-radius: 0 0 var(--r-lg) var(--r-lg); cursor: pointer;
+  font-family: var(--font-sans); font-size: 12.5px; font-weight: 500; color: var(--ink-secondary);
+}
+.dt-finding-more:hover { background: var(--accent-fill); color: var(--accent); }
+.dt-finding-more .dt-mi { font-family: "Material Symbols Outlined"; font-size: 18px; }
+
+/* Collapsed findings panel — the group head fills the whole card (head only,
+   no body). Proper state variant so the two states don't drift; replaces the
+   per-instance inline margin-bottom:-16px hack. */
+.dt-card.is-collapsed { padding: 0; }
+.dt-finding-group-head.is-collapsed { margin: 0; border-bottom: none; border-radius: var(--r-lg); }
+
 /* Match-group review card (dedup) */
 .dt-match-card { background: var(--surface); border: 1px solid var(--border); border-radius: var(--r-lg); box-shadow: 0 1px 2px rgba(28,25,23,0.03); margin: 12px 0; overflow: hidden; }
 .dt-match-head { background: var(--surface-hover); border-bottom: 1px solid var(--border); padding: 12px 16px; display: flex; align-items: center; gap: 12px; }
diff --git a/layout-review/home.html b/layout-review/home.html
index 5c4d3ca..38557da 100644
--- a/layout-review/home.html
+++ b/layout-review/home.html
@@ -69,9 +69,9 @@
-
- - +
+ +

@@ -79,8 +79,8 @@
-
Files analyzed
-
3
+
Rows scanned
+
48,210 rows
Total findings
@@ -129,11 +129,15 @@

3 formats detected · Standardize Formats →

+ +
-
-
+
@@ -150,56 +151,34 @@

Results

-
Matched
1,173
Review
9
Unmatched left
22
Unmatched right
16
+
Matched
1,173

Coverage: 97.4% of the larger side

- +
- Matched (1,173) - Review (9) + Review (9) Unmatched left (22) Unmatched right (16) + Matched (1,173)
- -

Preview of first 25 of 1,173 rows — download the CSV below for the full set.

+ +

Pairs flagged because the algorithm couldn't pick a single best match (e.g. multiple equally-good candidates). Use the left/right indices to disambiguate manually.

- - - - + - - - - - + +
left_posted_dateleft_descriptionleft_amountright_txn_dateright_memoright_valueamount_diff
left_idxleft_amountright_idxright_valuecandidates
2026-05-01ACME SUPPLIES-1240.002026-05-01Acme Supplies Inc-1240.000.00
2026-05-02PAYROLL RUN-8800.002026-05-02Monthly payroll-8800.000.00
2026-05-03CLIENT GLOBEX5200.002026-05-03Globex retainer5200.000.00
2026-05-04UTILITY CO-318.422026-05-04City Utilities-318.400.02
2026-05-06OFFICE DEPOT-89.152026-05-07Office supplies-89.150.00
118-450.00121, 209-450.002 equal
2031000.00198, 2441000.002 equal
-
- Review (9) — ambiguous candidates -
-

Pairs flagged because the algorithm couldn't pick a single best match (e.g. multiple equally-good candidates). Use the left/right indices to disambiguate manually.

-
- - - - - - -
left_idxleft_amountright_idxright_valuecandidates
118-450.00121, 209-450.002 equal
2031000.00198, 2441000.002 equal
-
-
-
-
Unmatched left (22) — only in bank_feed_may.csv
@@ -232,6 +211,28 @@
+
+ Matched (1,173) — cleanly reconciled +
+

Preview of first 25 of 1,173 rows — download the CSV below for the full set.

+
+ + + + + + + + + + + + +
left_posted_dateleft_descriptionleft_amountright_txn_dateright_memoright_valueamount_diff
2026-05-01ACME SUPPLIES-1240.002026-05-01Acme Supplies Inc-1240.000.00
2026-05-02PAYROLL RUN-8800.002026-05-02Monthly payroll-8800.000.00
2026-05-03CLIENT GLOBEX5200.002026-05-03Globex retainer5200.000.00
2026-05-04UTILITY CO-318.422026-05-04City Utilities-318.400.02
2026-05-06OFFICE DEPOT-89.152026-05-07Office supplies-89.150.00
+
+
+
+
From be1e263223aa6586b91bb38a4aad0eda3aa88cfc Mon Sep 17 00:00:00 2001
From: Michael
Date: Mon, 8 Jun 2026 16:23:32 +0000
Subject: [PATCH 03/10] feat(layout-review): address Fix Missing Values review findings
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Pin down strategy precedence: add a resolution-order legend
  (per-column -> global -> preset), dim/strike the preset radios when a
  global strategy overrides them, and add a "Resolves to" column to the
  per-column override table so the winning value is legible.
- Make the demo state honest: Global strategy = median is what drives
  the 1,043 fills, resolving the detect-only contradiction.
- Surface the missingness profile as an always-visible block above the
  (now-open) Options expander — diagnostic before configuration.
- Stop highlighting unchanged before/after cells (respondent_id 0->0);
  show "(global)" placeholders in unset per-column override cells.
- Fold the standalone "Strategy applied per column" table into the
  before/after table as a strategy column; inset maxed slider knobs.

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 layout-review/04_missing_handler.html | 105 ++++++++++++--------------
 layout-review/app.css                 |  14 ++++
 2 files changed, 62 insertions(+), 57 deletions(-)

diff --git a/layout-review/04_missing_handler.html b/layout-review/04_missing_handler.html
index 8475a01..53aca8e 100644
--- a/layout-review/04_missing_handler.html
+++ b/layout-review/04_missing_handler.html
@@ -63,39 +63,44 @@
- -
+ +

Missingness profile

+
+
Rows
2,150
+
Cells missing
1,043
+
% cells missing
8.1%
+
Complete rows
1,388
+
+
+ + + + + + + + + + +
columndtypemissingmissing_pctdisguisedhas_missing
respondent_idobject00.0%0False
agefloat641878.7%61True
regionobject1426.6%142True
incomefloat6432915.3%118True
satisfactionfloat64954.4%40True
commentsobject29013.5%290True
+
+ +
+ + +
Options
-

Missingness profile

-
-
Rows
2,150
-
Cells missing
1,043
-
% cells missing
8.1%
-
Complete rows
1,388
-
- -
- - - - - - - - - - -
columndtypemissingmissing_pctdisguisedhas_missing
respondent_idobject00.0%0False
agefloat641878.7%61True
regionobject1426.6%142True
incomefloat6432915.3%118True
satisfactionfloat64954.4%40True
commentsobject29013.5%290True
-
- -
-

Strategy

+
+ layers + Resolution order: per-column overrideglobal strategypreset. The most specific setting wins; layers it overrides are dimmed. +
-
+
info Overridden by Global strategy → median (set under Advanced options). Presets apply only when global is “(use preset)”.
+
detect-only (standardize sentinels to NaN, no fill or drop) safe-fill (numeric → median, categorical → mode) drop-incomplete (drop any row with missing) @@ -121,7 +126,7 @@

Strategy override

-
(use preset)
+
median
drop_row / drop_col use the thresholds below. mean / median / interpolate are numeric only — non-numeric columns fall back to the categorical strategy.
@@ -135,11 +140,11 @@
-
1.00
+
1.00
-
1.00
+
1.00
@@ -164,13 +169,13 @@

Set a different strategy for specific columns. Leave any row blank to use the global strategy.

- + - - - - - + + + + +
ColumnOverride
ColumnOverrideResolves to
agemedian
regionmode
income
satisfaction
commentsconstant
age(global)median · global
region(global)mode · global → categorical fallback
income(global)median · global
satisfaction(global)median · global
commentsconstantconstant · this column
@@ -198,28 +203,14 @@

Missingness — before vs. after

- + - - - - - - - -
columnbefore_missingbefore_pctafter_missingafter_pct
columnbefore_missingbefore_pctafter_missingafter_pctstrategy
respondent_id00.000.0
age1878.700.0
region1426.600.0
income32915.300.0
satisfaction954.400.0
comments29013.500.0
-
- -

Strategy applied per column

-
- - - - - - - - + + + + + +
columnstrategy
agemedian
regionmode
incomemedian
satisfactionmedian
commentsconstant
respondent_id00.000.0
age1878.700.0median
region1426.600.0mode
income32915.300.0median
satisfaction954.400.0median
comments29013.500.0constant
diff --git a/layout-review/app.css b/layout-review/app.css
index 11b3a04..9ce8263 100644
--- a/layout-review/app.css
+++ b/layout-review/app.css
@@ -330,6 +330,20 @@ code, .dt-mono { font-family: var(--font-mono); font-size: 0.92em; font-feature-
 .dt-radio .dot { width: 16px; height: 16px; border-radius: 50%; border: 1px solid var(--border-strong); display: inline-block; flex-shrink: 0; }
 .dt-radio.on .dot { border: 5px solid var(--ink); }

+/* Strategy precedence legend + overridden state (Fix Missing Values).
+   Makes the preset -> global -> per-column resolution order legible and
+   visibly dims a layer when a more specific layer wins. */
+.dt-precedence {
+  display: flex; align-items: center; gap: 8px;
+  background: var(--surface-hover); border: 1px solid var(--border);
+  border-radius: var(--r-md); padding: 9px 13px; margin: 0 0 14px;
+  font-size: 12.5px; color: var(--ink-secondary); line-height: 1.4;
+}
+.dt-precedence .dt-mi { font-family: "Material Symbols Outlined"; font-size: 18px; color: var(--ink-tertiary); flex-shrink: 0; }
+.dt-precedence strong { color: var(--ink); font-weight: 600; }
+.dt-radio-row.is-overridden { opacity: 0.5; }
+.dt-radio-row.is-overridden .dt-radio { text-decoration: line-through; text-decoration-color: var(--ink-tertiary); }
+
 /* Slider */
 .dt-slider { margin: 14px 0 6px; }
 .dt-slider .track { position: relative; height: 4px; background: var(--border-strong); border-radius: 2px; }
From 563d845b7010fb4dc26ce037c1ce43ba1e2bdb38 Mon Sep 17 00:00:00 2001
From: Michael
Date: Mon, 8 Jun 2026 16:27:42 +0000
Subject: [PATCH 04/10] feat(layout-review): address review findings on pages 4-6
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Find Unusual Values (06_outlier_detector) — coming-soon stub:
- Anchor the disabled Method on IQR (multiplier 1.5), not Z-score, per
  the logged robustness decision.
- Drop the redundant feature bullet list (kept alert + greyed controls +
  disabled button); also fixes the MAD-only-in-bullets mismatch.
- Remove the live uploader that dead-ended into disabled controls.

Clean Text (02_text_cleaner):
- Add an inline hidden-character legend (3 swatches reusing the actual
  badge classes) beside the canonical "Show hidden characters" toggle.
- Unify the two hidden-char toggles: preview one is canonical; the
  Results bare checkbox is wrapped in a field + bound note.
- Describe all three presets (minimal / excel-hygiene / paranoid).
- Give "Changes by column" a real "column" header instead of the grey
  index-gutter style.

Standardize Formats (03_format_standardizer):
- Make preset-vs-control precedence legible: preset shows Custom with a
  "modified" marker + base tag, diverging controls flag the winning
  value (same pattern as Fix Missing Values).
- Replace the dead-end unparseable alert with a real "Unparseable cells
  (47)" expander the alert now points to.
- Honest preview caption: "5 of 6 columns (notes skipped)".

Intake pattern (the cross-page reference) left untouched.

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 layout-review/02_text_cleaner.html        | 28 ++++++++++----
 layout-review/03_format_standardizer.html | 45 +++++++++++++++++++----
 layout-review/06_outlier_detector.html    | 30 +--------------
 3 files changed, 59 insertions(+), 44 deletions(-)

diff --git a/layout-review/02_text_cleaner.html b/layout-review/02_text_cleaner.html
index 6f49c95..3473806 100644
--- a/layout-review/02_text_cleaner.html
+++ b/layout-review/02_text_cleaner.html
@@ -54,7 +54,12 @@
 Preview: contacts_messy.csv

4,120 rows, 4 columns

-
check Show hidden characters in preview
+
check Show hidden characters
+
+ · Whitespace + Smart / special + Control +
@@ -82,7 +87,11 @@ minimal paranoid -
excel-hygiene: trim, collapse whitespace, fold smart quotes, strip invisible chars, normalize line endings, NFC.
+
+ minimal: trim and collapse whitespace only — no character substitutions.
+ excel-hygiene: trim, collapse whitespace, fold smart quotes, strip invisible chars, normalize line endings, NFC.
+ paranoid: everything in excel-hygiene plus strip control characters, strip BOM, and NFKC compatibility fold (lossy). +
@@ -143,17 +152,20 @@
Columns processed
4
-
check Show hidden characters (NBSP, ZWSP, smart quotes, control chars…)
+
+
check Show hidden characters (NBSP, ZWSP, smart quotes, control chars…)
+
Same setting as “Show hidden characters” in the preview above — toggling either updates both.
+

Changes by column

nameemailcompanynotes
- + - - - - + + + +
cells_changed
columncells_changed
company1,604
name1,210
notes982
email151
company1,604
name1,210
notes982
email151
diff --git a/layout-review/03_format_standardizer.html b/layout-review/03_format_standardizer.html index a0bff95..8faacb3 100644 --- a/layout-review/03_format_standardizer.html +++ b/layout-review/03_format_standardizer.html @@ -76,18 +76,23 @@

Format options

- +
- US (default) — ISO 8601 dates · E.164 phones · USD - European — DMY input · INTL phones · EUR comma decimal + US (default) — ISO 8601 dates · E.164 phones · USD + European — DMY input · INTL phones · EUR comma decimal base UK — DD/MM/YYYY · GB phones · Yes/No booleans ISO Strict — ISO 8601 · bare-number currency · true/false Legacy US — MM/DD/YYYY · National phones · Yes/No - Custom — keep current settings + Custom — based on European, 2 controls changed modified
-
Pick a published standard or regional convention as the baseline. Every option below is still individually overridable.
+
+ rule + Individual controls win over the preset. You started from European, then changed Ambiguous input order and Decimal separator below — so the preset is now Custom. The controls' current values are what actually run. +
+
Pick a published standard or regional convention as the baseline. Every option below is still individually overridable; overriding any one switches the preset to Custom.
@@ -97,11 +102,12 @@

Dates

YYYY-MM-DD (ISO)
- +
MDY (US) DMY (EU)
+
Winning value: MDY. Overrides the European base (DMY) — 01/02/2024 reads as 2024-01-02.

Phones

@@ -117,11 +123,12 @@

Currency

- +
dot (1,234.56) comma (1.234,56)
+
Winning value: dot. Overrides the European base (comma) — $1,234.5 reads as 1234.50.
2
Preserve original precision (don't round)
@@ -154,9 +161,30 @@
info - 47 cell(s) in typed columns didn't match a recognizable shape and were left as-is. Check the changes audit below to find them, or re-classify the column to (skip). + 47 cell(s) in typed columns didn't match a recognizable shape and were left as-is. See Unparseable cells below to review them, or re-classify the column to (skip). (They aren't in the changes audit — nothing was changed.)
+ +
+ Unparseable cells (47) +
+

Cells in typed columns that didn't match a recognizable shape and were left unchanged.

+
+ + + + + + + + + +
rowcolumnfield_typevalue (left as-is)
318signup_datedatesoon
902phonephoneext. 4471
1,544amountcurrencyTBD
2,087activebooleanmaybe
3,610signup_datedate00/00/0000
+
+

… and 42 more.

+
+
+

Changes by column

@@ -194,6 +222,7 @@

Standardized preview (first 10 rows)

+

Showing 5 of 6 columns — notes is set to (skip), so it's omitted here.

diff --git a/layout-review/06_outlier_detector.html b/layout-review/06_outlier_detector.html index 546a847..4d73b4c 100644 --- a/layout-review/06_outlier_detector.html +++ b/layout-review/06_outlier_detector.html @@ -12,7 +12,7 @@
visibility - Static layout preview of Find Unusual Values — a Coming Soon tool. The page is a stub/teaser: an "under development" notice, a list of planned features, and disabled placeholder controls (only the file uploader is live). All pages → + Static layout preview of Find Unusual Values — a Coming Soon tool. The page is a stub/teaser: an "under development" notice and disabled placeholder controls. All pages →
@@ -31,40 +31,14 @@ This tool is under development.
- -

Features:

-
    -
  • Z-score detection (configurable threshold)
  • -
  • IQR (interquartile range) detection
  • -
  • MAD (median absolute deviation) detection
  • -
  • Domain-rule violations (e.g., age < 0, price > $1M)
  • -
  • Visual outlier highlighting in data preview
  • -
  • Handling: flag only, remove, cap/winsorize to bounds
  • -
-
- - -
-
- upload_file Drag and drop file here - CSV, TSV, XLSX, XLS · Import a file to preview. Processing is not yet available. -
- -
-

Detection Method

-
Z-Score
-
- -
- -
3.0
+
IQR (interquartile range)
From cf31d9ef141d4471f2f99c2abc4a12d7ccbe2190 Mon Sep 17 00:00:00 2001
From: Michael
Date: Mon, 8 Jun 2026 16:35:46 +0000
Subject: [PATCH 05/10] feat(layout-review): address review findings on pages 7-12
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Find Duplicates (01_deduplicator):
- Delete the redundant outer Options wrapper; surface threshold +
  survivor rule directly, push the rest behind a single Advanced pane.
- Disambiguate competing primaries: top result is an auto-resolved
  preview (secondary download), review decisions are the single primary.
- Plain-English match labels (exact / approximate); clarify the third.
- Lift the match-card caption to a one-time instruction; note delimiter
  is delimited-text-only.

Quality Check (08_validator_reporter) — stub:
- Remove the dead disabled "Load rules file (JSON)" uploader so the stub
  invites a single action; keep the informative feature list.

Map Columns (05_column_mapper):
- Regroup schema -> mapping -> strategy/advanced (core task contiguous).
- Make preset-vs-Advanced precedence legible (Custom + modified marker).
- Adopt the compact file-intake banner; drop the duplicate
  resolved-mapping table; fix the add-row gutter style.

Combine Files (07_multi_file_merger) — stub:
- Actually disable the Merge CTA (add the disabled attribute).

PDF to CSV (10_pdf_extractor):
- Drop page/raw from the default preview to match export + fix the
  horizontal clip; surface raw via per-row affordance + overflow-x.
- Move the column selector above the download button; give auto-excluded
  rows a reason; align the files card to Home; de-dupe the row count.

Automated Workflows (09_pipeline_runner):
- Replace hand-edited JSON step config with per-step control expanders;
  JSON moved behind Advanced import/export.
- Editing the table marks the mode modified; fold the empty error column
  into the status pill; render summaries as plain English; collapse the
  explainer by default.

Cross-cutting items (stub standardization on page 10, shared
disabled-field token, remaining intake rollout) deferred to a holistic
pass.

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 layout-review/01_deduplicator.html       |  59 +++---
 layout-review/05_column_mapper.html      | 121 ++++++------
 layout-review/07_multi_file_merger.html  |   2 +-
 layout-review/08_validator_reporter.html |   9 -
 layout-review/09_pipeline_runner.html    | 225 ++++++++++++++++++-----
 layout-review/10_pdf_extractor.html      |  69 +++----
 6 files changed, 302 insertions(+), 183 deletions(-)

diff --git a/layout-review/01_deduplicator.html b/layout-review/01_deduplicator.html
index b00c79f..e396728 100644
--- a/layout-review/01_deduplicator.html
+++ b/layout-review/01_deduplicator.html
@@ -41,7 +41,8 @@
- +
Comma (,)
@@ -67,32 +68,33 @@
- + +
+
+
85
+
Higher means rows must look more alike to count as a duplicate.
+
+
the most-complete row
+
Which row survives in each group of duplicates.
+
+ +
- Options + Advanced options
-
- Advanced Options -
-
-
-
-
Leave empty to auto-detect
-
-
email
-
-
name
-
-
-
jaro_winkler
-
-
85
-
most-complete
-
-
-
check Merge mode — fill missing fields in the surviving row
+

Leave these empty to auto-detect which columns to compare. Otherwise, list the columns that must match exactly and the ones that only need to match approximately — together these are the columns used to find duplicates.

+
+
+
+
email
+
+
name
-
+
+
jaro_winkler
+
+
+
check Merge mode — fill missing fields in the surviving row
@@ -109,8 +111,9 @@
Match groups
147
Rows kept
18,130
+

Preview of an auto-resolved run: each group keeps its auto-picked survivor. Review the groups below to override any pending picks before the final download.

- +
@@ -123,6 +126,7 @@ +

Differing columns are highlighted. The survivor row is kept; uncheck a row to split it out of the group.

@@ -140,7 +144,6 @@
full_namephoneamountsignup_dateactive
-

Differing columns highlighted. The survivor row is kept; uncheck rows to split the group.

@@ -163,8 +166,8 @@
-

Decisions: 1 merged, 1 pending

- +

Decisions: 1 merged, 1 pending · Pending groups keep their auto-picked survivor unless you review them.

+
diff --git a/layout-review/05_column_mapper.html b/layout-review/05_column_mapper.html index c0c2a02..89bbcbe 100644 --- a/layout-review/05_column_mapper.html +++ b/layout-review/05_column_mapper.html @@ -25,22 +25,12 @@
- -

You can also import a file on the home screen and pick it up here.

- -
-
- upload_file Drag and drop file here - Up to 1.5 GB · CSV, TSV, XLSX, XLS · encoding & delimiter auto-detected -
- -
-
- - crm_contacts_raw.csv - 684 KB - + +
+ description + Using crm_contacts_raw.csv from the upload screen.
+
@@ -93,7 +83,7 @@ signup_datedate✗Signup amount_spentfloat✗0.0Amount Spent sourcestring✗crm-import - add add row + add add row
@@ -101,43 +91,8 @@
- -

Strategy

-
- -
- rename-only (just rename, leave types alone, keep extras) - lenient-schema (rename + coerce + reorder, keep extras) - strict-schema (rename + coerce + reorder, drop extras) -
-
- - -
- Advanced options -
-
-
-
- -
keep
-
-
check Coerce types per schema
-
check Reorder to schema order
-
-
-
check Auto-infer mapping (fuzzy match)
-
- -
0.80
-
-
check Enforce required fields
-
-
-
-
- +

Mapping

@@ -153,7 +108,53 @@
-

Pick a target for each source column. Notes stays unmapped — with the lenient preset it is kept as-is. source is added from the schema default.

+

Pick a target for each source column. Notes stays unmapped — with the keep-extras strategy it is kept as-is. source is added from the schema default.

+ +
+ + + +

Strategy

+
+ +
+ rename-only (just rename, leave types alone, keep extras) + lenient-schema (rename + coerce + reorder, keep extras) + strict-schema (rename + coerce + reorder, drop extras) base + Custom — based on strict-schema, 1 control changed modified +
+
+ rule + Individual Advanced controls win over the preset. You started from strict-schema, then changed Unmapped source columns to keep below — so the preset is now Custom. The controls' current values are what actually run. +
+
Pick a strategy as the baseline. Every Advanced toggle below is still individually overridable; overriding any one switches the preset to Custom.
+
+ + +
+ Advanced options +
+
+
+
+ +
keep
+
Winning value: keep. Overrides the strict-schema base (drop) — so Notes survives into the output.
+
+
check Coerce types per schema
+
check Reorder to schema order
+
+
+
check Auto-infer mapping (fuzzy match)
+
+ +
0.80
+
+
check Enforce required fields
+
+
+
+
@@ -176,20 +177,6 @@
infoAdded (with defaults): source
warningSome cells could not be coerced and were left as NaN: amount_spent (3)
-

Resolved mapping

-
- - - - - - - - - -
sourcetargetauto
Full Namefull_nameTrue
EmailAddremailTrue
Phone #phoneTrue
Signupsignup_dateTrue
Amount Spentamount_spentTrue
-
-

Mapped preview (first 10 rows)

diff --git a/layout-review/07_multi_file_merger.html b/layout-review/07_multi_file_merger.html index ede9b11..c25a344 100644 --- a/layout-review/07_multi_file_merger.html +++ b/layout-review/07_multi_file_merger.html @@ -72,7 +72,7 @@
- + diff --git a/layout-review/08_validator_reporter.html b/layout-review/08_validator_reporter.html index d255430..ab70f04 100644 --- a/layout-review/08_validator_reporter.html +++ b/layout-review/08_validator_reporter.html @@ -57,15 +57,6 @@

Validation Rules

- -
-
- upload_file Drag and drop file here - JSON -
- -
-
diff --git a/layout-review/09_pipeline_runner.html b/layout-review/09_pipeline_runner.html index 63426bd..f7aedf1 100644 --- a/layout-review/09_pipeline_runner.html +++ b/layout-review/09_pipeline_runner.html @@ -67,69 +67,192 @@ Options
- +
- Use the recommended default (text-clean → format → missing → dedup) - Build interactively + Use the recommended default (text-clean → format → missing → dedup) · modified + Build interactively Import a saved pipeline JSON
+
+ edit + You started from the recommended default and edited a step, so the mode switched to Build interactively. The steps below are now yours to change — pick recommended default again to discard your edits and restore the suggested order. +
+

- Edit the table to add, remove, reorder (drag the row index), enable, or configure each step. + Add, remove, reorder (drag the row index), enable, or configure each step. + Open a step's Configure panel to set its options in plain language. Tool order is recommended, not enforced — violations surface as warnings below the table.

- +
- - - + + + - + - - - - - - - - - - - - - - - - - - - - - - - +
ToolEnabledOptions (JSON)StepEnabledConfigure
≡ 0text_clean expand_moretext_clean check{"trim": true, "collapse_whitespace": true}
≡ 1format_standardize expand_morecheck{"column_types": {"phone": "phone", "signup_date": "date"}}
≡ 2missing expand_morecheck{"strategy": "flag", "sentinels": ["N/A", "—"]}
≡ 3dedup expand_morecheck{"survivor_rule": "most_complete", "merge": true}
Add rowtune Configure expand_more
+ +
+ Configure: text_clean +
+
check Trim leading & trailing whitespace
+
check Collapse repeated spaces to one
+
Normalize smart quotes & dashes to plain ASCII
+
+ +
Leave as-is
+
+
+
+ +
+ + + + + + + + + +
≡ 1format_standardizechecktune Configure chevron_right
+
+ +
+ Configure: format_standardize +
+

Choose a target format for each column. Columns left as “Leave as-is” are untouched.

+
+ + + + + + + + +
ColumnFormat as
nameLeave as-is
emailLeave as-is
phonePhone number
signup_dateDate
+
+
+
+ +
+ + + + + + + + + +
≡ 2missingchecktune Configure chevron_right
+
+ +
+ Configure: missing +
+
+ +
+ Flag them (mark blanks, change nothing) + Fill them in (numbers → median, text → most common) + Drop rows that have any blank +
+
+
+ +
N/A, —
+
Matched case-insensitively after stripping whitespace.
+
+
+
+ +
+ + + + + + + + + + + + + +
≡ 3dedupchecktune Configure chevron_right
Add step
+
+ +
+ Configure: dedup +
+
+ +
Keep the most complete row
+
Other options: keep the first seen, keep the last seen.
+
+
check Merge matched rows (fill each survivor's blanks from its duplicates)
+
+ +
+ email + phone +
+
+
+
+ +
+ Advanced — import / export pipeline as JSON +
+

For sharing or version control. Editing is done in the step panels above — this is just the saved form of the same settings.

+
{ + "version": 1, + "steps": [ + {"tool": "text_clean", "enabled": true, "options": {"trim": true, "collapse_whitespace": true}}, + {"tool": "format_standardize", "enabled": true, "options": {"column_types": {"phone": "phone", "signup_date": "date"}}}, + {"tool": "missing", "enabled": true, "options": {"strategy": "flag", "sentinels": ["N/A", "—"]}}, + {"tool": "dedup", "enabled": true, "options": {"survivor_rule": "most_complete", "merge": true, "keys": ["email", "phone"]}} + ] +}
+
+ + +
+
+
+ -
+
Recommended tool order — why each step belongs where it does

text_clean before format_standardize — format parsers (phone / currency / date) fail on smart-quote-contaminated or NBSP-padded input — clean text first

@@ -161,39 +284,49 @@

Per-step summary

+
- + - - - + + - - - + + + + + + - - - + + - - - + +
stepstatuselapsed_mssummaryerror
stepstatuselapsedsummary
text_clean ok214{"cells_changed": 1204, "columns": ["name", "city"]}214 ms1,204 cells changed in name & city
format_standardizeok388{"phone": 18301, "signup_date": 17996}warning ok · 141 skipped388 ms18,301 phones and 17,996 dates standardized
+ info + 141 phone values didn't match any known pattern and were left unchanged. The step still completed — review them in the output preview if needed. +
missing ok121{"flagged_cells": 642, "sentinels_found": ["—"]}121 ms642 blank cells flagged (sentinel “—”)
dedup ok911{"input_rows": 18442, "output_rows": 18130, "duplicates_removed": 312, "groups": 147}911 ms312 duplicates removed across 147 groups (18,442 → 18,130 rows)
diff --git a/layout-review/10_pdf_extractor.html b/layout-review/10_pdf_extractor.html index 3d457b1..1dbf5ae 100644 --- a/layout-review/10_pdf_extractor.html +++ b/layout-review/10_pdf_extractor.html @@ -74,7 +74,7 @@ statement-feb-2026.pdf 147.2 KB
-
@@ -100,84 +100,89 @@

47 candidate transaction(s) from 2 file(s)

-

Uncheck rows to exclude. Edit any cell to fix a value the scanner got wrong. The raw column shows the original PDF text for that row.

+

Uncheck rows to exclude. Edit any cell to fix a value the scanner got wrong. Hover the info on any row to see the original PDF text it came from.

-
+ +
+ - - - + + - + + - + + - + + - + + - + + - + + - + + - + +
Include date description amount_debit amount_credit account_number source_filepageraw
- check | 2026-01-03 | OPENING BALANCE | | | ****4821 | statement-jan-2026.pdf | 1 | 01/03 OPENING BALANCE 2,140.55
+ info | 2026-01-03 | OPENING BALANCE | | | ****4821 | statement-jan-2026.pdf
- check | 2026-01-05 | POS PURCHASE WHOLE FOODS MKT | 84.12 | | ****4821 | statement-jan-2026.pdf | 1 | 01/05 POS PURCHASE WHOLE FOODS MKT (84.12)
+ info | 2026-01-05 | POS PURCHASE WHOLE FOODS MKT | 84.12 | | ****4821 | statement-jan-2026.pdf
- check | 2026-01-08 | ACH DEPOSIT PAYROLL ACME CORP | | 3,250.00 | ****4821 | statement-jan-2026.pdf | 1 | 01/08 ACH DEPOSIT PAYROLL ACME CORP 3,250.00
+ info | 2026-01-08 | ACH DEPOSIT PAYROLL ACME CORP | | 3,250.00 | ****4821 | statement-jan-2026.pdf
- check | 2026-01-11 | ONLINE TRANSFER TO SAVINGS | 500.00 | | ****4821 | statement-jan-2026.pdf | 2 | 01/11 ONLINE TRANSFER TO SAVINGS (500.00)
+ info | 2026-01-11 | ONLINE TRANSFER TO SAVINGS | 500.00 | | ****4821 | statement-jan-2026.pdf
-  | 2026-01-12 | INTEREST RATE 0.50% APY DETAIL | | | ****4821 | statement-jan-2026.pdf | 2 | 01/12 INTEREST RATE 0.50% APY 0.00
+ info | 2026-01-12 | INTEREST RATE 0.50% APY DETAIL auto-excluded · not a transaction line | | | ****4821 | statement-jan-2026.pdf
- check | 2026-01-14 | DEBIT CARD SHELL OIL #2287 | 52.40 | | ****4821 | statement-jan-2026.pdf | 2 | 01/14 DEBIT CARD SHELL OIL #2287 (52.40)
+ info | 2026-01-14 | DEBIT CARD SHELL OIL #2287 | 52.40 | | ****4821 | statement-jan-2026.pdf
- check | 2026-02-02 | POS PURCHASE TRADER JOES #511 | 61.88 | | ****4821 | statement-feb-2026.pdf | 1 | 02/02 POS PURCHASE TRADER JOES #511 (61.88)
+ info | 2026-02-02 | POS PURCHASE TRADER JOES #511 | 61.88 | | ****4821 | statement-feb-2026.pdf
- check | 2026-02-06 | ACH DEPOSIT PAYROLL ACME CORP | | 3,250.00 | ****4821 | statement-feb-2026.pdf | 2 | 02/06 ACH DEPOSIT PAYROLL ACME CORP 3,250.00
+ info | 2026-02-06 | ACH DEPOSIT PAYROLL ACME CORP | | 3,250.00 | ****4821 | statement-feb-2026.pdf
- check | 2026-02-09 | CHECK #1043 | 1,200.00 | | ****4821 | statement-feb-2026.pdf | 2 | 02/09 CHECK #1043 (1,200.00)
+ info | 2026-02-09 | CHECK #1043 | 1,200.00 | | ****4821 | statement-feb-2026.pdf
- -
-
- -

46 of 47 rows selected.

-
-
-
- -
- date - description - amount_debit - amount_credit - account_number - source_file -
-
page and raw are kept off by default; tick them if you want them in the file.
+ +
+
+ +
+ date + description + amount_debit + amount_credit + account_number + source_file
+
page and raw are kept off by default; tick them if you want them in the file.
+ +

1 row excluded (INTEREST RATE detail line).
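
Review note: the export behaviour described in this hunk (unchecked rows dropped, `page` and `raw` left out unless ticked) could be sketched roughly as follows. This is a minimal illustration, not the app's export code: `buildCsv`, `csvEscape`, and the `include` flag on each row are hypothetical names, and only the column names come from the page.

```javascript
// Column names are taken from the checkbox list in this hunk; the split into
// default and optional sets mirrors the "kept off by default" note.
const DEFAULT_COLUMNS = ["date", "description", "amount_debit", "amount_credit", "account_number", "source_file"];
const OPTIONAL_COLUMNS = ["page", "raw"];

// Quote a value only when it contains a comma, a double quote, or a line break.
function csvEscape(value) {
  const text = value === null || value === undefined ? "" : String(value);
  return /[",\r\n]/.test(text) ? '"' + text.replace(/"/g, '""') + '"' : text;
}

// Hypothetical helper: each row carries an `include` flag set by its checkbox.
function buildCsv(rows, extraColumns = []) {
  const columns = DEFAULT_COLUMNS.concat(
    OPTIONAL_COLUMNS.filter((name) => extraColumns.includes(name))
  );
  const included = rows.filter((row) => row.include);
  const lines = [columns.join(",")];
  for (const row of included) {
    lines.push(columns.map((name) => csvEscape(row[name])).join(","));
  }
  return {
    csv: lines.join("\n"),
    selected: included.length,
    excluded: rows.length - included.length,
  };
}
```

The `selected` and `excluded` counts are what a "46 of 47 rows selected" style summary would be built from. Amounts such as `3,250.00` contain a comma, so they are quoted on output.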

From dd0942d71098e29214795a965caf9244563949c8 Mon Sep 17 00:00:00 2001 From: Michael Date: Mon, 8 Jun 2026 16:44:11 +0000 Subject: [PATCH 06/10] =?UTF-8?q?feat(layout-review):=20journey-level=20re?= =?UTF-8?q?design=20=E2=80=94=20front=20door,=20taught=20order,=20consiste?= =?UTF-8?q?ncy?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Addresses the journey-level review (the app felt like 12 tools sharing a stylesheet, not one guided product). File-partitioned changes: Navigation (shell.js): rename Home -> "Start here" with front-door emphasis (.dt-nav-start); reorder Data Cleaners into pipeline order (Clean Text -> Standardize -> Fix Missing -> Find Duplicates); new "Finance" group (Reconcile, PDF to CSV); all stubs moved to a bottom "Coming soon" group, no longer interleaved with working tools. Front door (home.html): a prominent primary "Clean these files for me" that runs the recommended pipeline in order, above the existing per-finding cards (reframed as "fix one thing at a time"). Shared tokens (app.css): .dt-next-step suggestion strip + .dt-nav-start. Teach the order: a slim .dt-next-step strip at the end of each linear cleaner page points to the next pipeline step (Map Columns -> Start here; orchestrator/Finance pages correctly omit it). Local-first: the green "Runs 100% locally" pill now sits in every working tool page's header (home + 8 tools), where client data is entered. Plain English: jargon relabeled on input controls (coerce, E.164, NFC/NFKC, sentinels, survivor rule), technical terms kept in tooltips and audit/output cells only. Stubs (06/08/07): rebuilt to one identical skeleton — info line + plain feature list + a real "Notify me when this ships" button; every disabled control and uploader removed (a dimmed dropzone reads as broken). Intake: full dropzone+chip replaced with the compact "Using " banner on Clean Text, Fix Missing, Find Duplicates, and both Reconcile sides. 
Co-Authored-By: Claude Opus 4.8 (1M context) --- layout-review/01_deduplicator.html | 32 +++++++------ layout-review/02_text_cleaner.html | 41 ++++++++-------- layout-review/03_format_standardizer.html | 18 +++++-- layout-review/04_missing_handler.html | 37 ++++++++------- layout-review/05_column_mapper.html | 24 +++++++--- layout-review/06_outlier_detector.html | 40 ++++++---------- layout-review/07_multi_file_merger.html | 58 ++++++----------------- layout-review/08_validator_reporter.html | 53 +++++---------------- layout-review/09_pipeline_runner.html | 29 ++++++++---- layout-review/10_pdf_extractor.html | 11 ++++- layout-review/11_reconciler.html | 41 +++++++--------- layout-review/app.css | 31 ++++++++++++ layout-review/home.html | 38 +++++++++++++++ layout-review/shell.js | 31 +++++++----- 14 files changed, 269 insertions(+), 215 deletions(-) diff --git a/layout-review/01_deduplicator.html b/layout-review/01_deduplicator.html index e396728..16246fa 100644 --- a/layout-review/01_deduplicator.html +++ b/layout-review/01_deduplicator.html @@ -19,27 +19,27 @@

Find Duplicates

- +
+ + + + + + Runs 100% locally + + +

Find rows that repeat, then keep one and remove the extras.

- - -
-
- upload_file Drag and drop file here - Up to 1.5 GB · CSV, TSV, XLSX, XLS · encoding & delimiter auto-detected -
- -
-
- - customers_export.csv - 2.1 MB - + +
+ description + Using customers_export.csv from the upload screen.
+ @@ -181,6 +181,8 @@
+
arrow_forwardDuplicates handled — your file is cleaned. Review the result or Back to Start here →
+
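
Review note: the next-step strips added in this patch chain the cleaner pages in pipeline order. A rough sketch of that chain as data, assuming the page ids and hrefs from the `NAV` table in shell.js (the strips' real link targets are not visible in this diff, so the `href` values and the `NEXT_STEP` shape are assumptions):

```javascript
// Labels come from the strips in this patch; hrefs are assumed from shell.js.
const NEXT_STEP = {
  "02_text_cleaner": { label: "Standardize Formats", href: "03_format_standardizer.html" },
  "03_format_standardizer": { label: "Fix Missing Values", href: "04_missing_handler.html" },
  "04_missing_handler": { label: "Find Duplicates", href: "01_deduplicator.html" },
  "01_deduplicator": { label: "Back to Start here", href: "home.html" },
  "05_column_mapper": { label: "Run the recommended clean", href: "home.html" },
};

// Pages without a strip (orchestrator and Finance pages) return null.
function nextStepFor(pageId) {
  return NEXT_STEP[pageId] || null;
}

// Follow the chain from a starting page until a page has no next step.
function walkPipeline(startId) {
  const visited = [];
  let id = startId;
  while (id && !visited.includes(id)) {
    visited.push(id);
    const next = NEXT_STEP[id];
    id = next ? next.href.replace(/\.html$/, "") : null;
  }
  return visited;
}
```

Keeping the chain in one table would make it harder for a page's strip to drift out of pipeline order than hand-writing each strip per page.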
diff --git a/layout-review/02_text_cleaner.html b/layout-review/02_text_cleaner.html index 3473806..188b6fe 100644 --- a/layout-review/02_text_cleaner.html +++ b/layout-review/02_text_cleaner.html @@ -27,27 +27,27 @@

Clean Text

- +
+ + + + + + Runs 100% locally + + +

Trim extra spaces and strip out odd characters.

- - -
-
- upload_file Drag and drop file here - Up to 1.5 GB · CSV, TSV, XLSX, XLS · encoding auto-detected -
- -
-
- - contacts_messy.csv - 684 KB - + +
+ description + Using contacts_messy.csv from the upload screen.
+
@@ -89,8 +89,8 @@
minimal: trim and collapse whitespace only — no character substitutions.
- excel-hygiene: trim, collapse whitespace, fold smart quotes, strip invisible chars, normalize line endings, NFC.
- paranoid: everything in excel-hygiene plus strip control characters, strip BOM, and NFKC compatibility fold (lossy). + excel-hygiene: trim, collapse whitespace, fold smart quotes, strip invisible chars, normalize line endings, and normalize accented characters.
+ paranoid: everything in excel-hygiene plus strip control characters, strip BOM, and normalize accented and look-alike characters (lossy).
@@ -108,8 +108,8 @@
check Fold smart characters (curly quotes, em-dash, NBSP)
check Strip zero-width / invisible characters
-
check Unicode NFC normalization
-
Unicode NFKC compat fold (lossy: ① → 1, ﬁ → fi)
+
check Normalize accented characters (NFC)
+
Normalize accented and look-alike characters (lossy: ① → 1, ﬁ → fi)
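
Review note: the difference between the two normalization toggles can be shown with the standard `String.prototype.normalize`. The wrapper name `normalizeText` and its `lossy` option are hypothetical and not taken from the tool's code.

```javascript
// NFC composes characters without changing what they mean. NFKC also folds
// compatibility forms (circled digits, ligatures), which loses information.
function normalizeText(text, options = {}) {
  const form = options.lossy ? "NFKC" : "NFC";
  return text.normalize(form);
}

// "e" followed by a combining acute accent becomes the single character U+00E9.
const composed = normalizeText("e\u0301");

// The circled digit one (U+2460) and the fi ligature (U+FB01) are only
// rewritten when the lossy option is on.
const foldedDigit = normalizeText("\u2460", { lossy: true });
const foldedLigature = normalizeText("\uFB01", { lossy: true });
```

This is why the plain-English label can call the first option safe and the second one lossy: NFC output still round-trips visually, while NFKC output cannot be mapped back to the original characters.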
@@ -211,6 +211,9 @@
+ +
arrow_forwardText cleaned. Next, most files need: Standardize Formats →
+ diff --git a/layout-review/03_format_standardizer.html b/layout-review/03_format_standardizer.html index 8faacb3..ed6ffb5 100644 --- a/layout-review/03_format_standardizer.html +++ b/layout-review/03_format_standardizer.html @@ -19,7 +19,16 @@

Standardize Formats

- +
+ + + + + + Runs 100% locally + + +

Make dates, phones, currency, and names look the same throughout.

@@ -81,7 +90,7 @@
- US (default) — ISO 8601 dates · E.164 phones · USD + US (default) — ISO 8601 dates · international-format phones (+1…) · USD European — DMY input · INTL phones · EUR comma decimal base UK — DD/MM/YYYY · GB phones · Yes/No booleans ISO Strict — ISO 8601 · bare-number currency · true/false @@ -111,7 +120,7 @@

Phones

-
E.164 (+15551234567)
+
Standard international format (+15551234567)
US
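
Review note: for readers unfamiliar with the format behind the relabeled option, here is a minimal sketch of converting a US number to the `+15551234567` shape. It assumes a US default region and simple 10-digit national numbers. `toE164` is a hypothetical helper, not the tool's formatter, and it does not validate area codes.

```javascript
// E.164 numbers are a "+" followed by at most 15 digits, country code first.
function toE164(raw, defaultCountryCode = "1") {
  const trimmed = String(raw).trim();
  const digits = trimmed.replace(/\D/g, "");
  if (trimmed.startsWith("+")) {
    // Already international: keep the digits, drop spaces and punctuation.
    return digits.length >= 1 && digits.length <= 15 ? "+" + digits : null;
  }
  if (defaultCountryCode === "1") {
    if (digits.length === 10) return "+1" + digits;
    if (digits.length === 11 && digits.startsWith("1")) return "+" + digits;
  }
  // Anything else is left for the user to review rather than guessed at.
  return null;
}
```

Returning `null` for inputs that do not fit keeps the sketch from inventing a country code for short or malformed values.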
@@ -244,6 +253,9 @@
+ +
arrow_forwardFormats standardized. Next, most files need: Fix Missing Values →
+
diff --git a/layout-review/04_missing_handler.html b/layout-review/04_missing_handler.html index 53aca8e..9d47537 100644 --- a/layout-review/04_missing_handler.html +++ b/layout-review/04_missing_handler.html @@ -19,28 +19,27 @@

Fix Missing Values

- +
+ + + + + + Runs 100% locally + + +

Find blank cells (even hidden ones) and fill them in or remove them.

- -

Tip: files imported on the Home screen are picked up here automatically.

- -
-
- upload_file Drag and drop file here - Up to 1.5 GB · CSV, TSV, XLSX, XLS -
- -
-
- - survey_responses.csv - 684 KB - + +
+ description + Using survey_responses.csv from the upload screen.
+
@@ -117,9 +116,9 @@

Detection

check Standardize disguised nulls to NaN
- +
N/A, n/a, NA, NULL, null, None, -, --, ?, #N/A
-
Matched case-insensitively after stripping whitespace.
+
Text that really means “empty.” Matched case-insensitively after stripping whitespace.
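
Review note: the matching rule stated here (case-insensitive, after stripping whitespace) could look roughly like the sketch below. The token list is copied from the control above. `isDisguisedNull` and `standardizeNulls` are hypothetical names, and `null` stands in for the NaN the label mentions.

```javascript
// Tokens from the Detection control, lower-cased once so lookups ignore case.
const DISGUISED_NULLS = new Set(
  ["N/A", "n/a", "NA", "NULL", "null", "None", "-", "--", "?", "#N/A"].map((token) => token.toLowerCase())
);

// True when a string, once trimmed and lower-cased, is one of the tokens.
function isDisguisedNull(value) {
  return typeof value === "string" && DISGUISED_NULLS.has(value.trim().toLowerCase());
}

// Replace disguised nulls in one row with null; other values pass through.
function standardizeNulls(row) {
  const cleaned = {};
  for (const [key, value] of Object.entries(row)) {
    cleaned[key] = isDisguisedNull(value) ? null : value;
  }
  return cleaned;
}
```

Values such as `"0"` or `"n.a."` are not in the list and are left untouched, which matches the "text that really means empty" framing.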
@@ -253,6 +252,8 @@
+
arrow_forwardMissing values handled. Next, most files need: Find Duplicates →
+ diff --git a/layout-review/05_column_mapper.html b/layout-review/05_column_mapper.html index 89bbcbe..2656b55 100644 --- a/layout-review/05_column_mapper.html +++ b/layout-review/05_column_mapper.html @@ -19,7 +19,16 @@

Map Columns

- +
+ + + + + + Runs 100% locally + + +

Rename columns, change their order, and set each one as text, number, or date.

@@ -65,9 +74,9 @@
Build interactively (start from current columns) Import schema JSON - Skip (rename / coerce only — no schema) + Skip (rename / convert types only — no schema)
-
An interactive build is fastest for one-off cleanup. Import a JSON when you have a fixed contract (a CRM import format, db schema). Skip when you only want to rename or coerce specific columns.
+
An interactive build is fastest for one-off cleanup. Import a JSON when you have a fixed contract (a CRM import format, db schema). Skip when you only want to rename or convert the type of specific columns.

Edit the table to define your target schema. Add rows for fields the input doesn't have yet (with a default), or remove rows for columns you want to drop.

@@ -119,8 +128,8 @@
rename-only (just rename, leave types alone, keep extras) - lenient-schema (rename + coerce + reorder, keep extras) - strict-schema (rename + coerce + reorder, drop extras) base + lenient-schema (rename + convert types + reorder, keep extras) + strict-schema (rename + convert types + reorder, drop extras) base Custom — based on strict-schema, 1 control changed modified
@@ -141,7 +150,7 @@
keep
Winning value: keep. Overrides the strict-schema base (drop) — so Notes survives into the output.
-
check Coerce types per schema
+
check Convert each column to the right type
check Reorder to schema order
@@ -200,6 +209,9 @@
+ +
arrow_forwardColumns mapped. Run the recommended clean →
+ diff --git a/layout-review/06_outlier_detector.html b/layout-review/06_outlier_detector.html index 4d73b4c..3f9e164 100644 --- a/layout-review/06_outlier_detector.html +++ b/layout-review/06_outlier_detector.html @@ -12,7 +12,7 @@
visibility - Static layout preview of Find Unusual Values — a Coming Soon tool. The page is a stub/teaser: an "under development" notice and disabled placeholder controls. All pages → + Static layout preview of Find Unusual Values — a Coming Soon tool. The page is a stub: a "coming soon" notice, a plain-English list of what the tool will do, and a single "Notify me" action. All pages →
@@ -25,36 +25,26 @@
- +
info - This tool is under development. + This tool is coming soon.
+ +

What it will do:

+
    +
  • Find values that are unusually high or low for a column
  • +
  • Spot values that break the rules you set (out of range, wrong type)
  • +
  • Choose how sensitive the check is
  • +
  • Flag unusual rows by adding a column, without changing your data
  • +
  • Cap extreme values at a limit you choose
  • +
  • See a summary of how many values were flagged
  • +
+
- -

Detection Method

- -
- -
IQR (interquartile range)
-
- -
- -
1.5
-
- -

Handling

- -
- -
Flag only (add column)
-
- -
- +
diff --git a/layout-review/07_multi_file_merger.html b/layout-review/07_multi_file_merger.html index c25a344..aa42434 100644 --- a/layout-review/07_multi_file_merger.html +++ b/layout-review/07_multi_file_merger.html @@ -12,7 +12,7 @@
visibility - Static layout preview of Combine Files — a Coming-Soon tool. The page is a stub: an "under development" notice, a planned-features list, a working multi-file uploader, and disabled placeholder options. All pages → + Static layout preview of Combine Files — a Coming Soon tool. The page is a stub: a "coming soon" notice, a plain-English list of what the tool will do, and a single "Notify me" action. All pages →
@@ -23,56 +23,28 @@

Combine several CSV or Excel files into one — even if columns differ.

- +
+ +
info - This tool is under development. + This tool is coming soon.
- -

Features:

-
    -
  • Import multiple CSV/Excel files at once
  • -
  • Automatic schema alignment (matching columns by name)
  • -
  • Append mode: stack files vertically (union)
  • -
  • Join mode: merge files on shared key columns
  • -
  • Handle mismatched columns (fill missing with nulls or drop)
  • -
  • Source file tracking column
  • + +

    What it will do:

    +
      +
    • Import several CSV or Excel files at once
    • +
    • Line up columns automatically by matching their names
    • +
    • Stack files on top of each other into one long file
    • +
    • Merge files side by side using shared key columns
    • +
    • Handle columns that don't match (fill the gaps with blanks or drop them)
    • +
    • Add a column showing which file each row came from

    - - -
    -
    - upload_file Drag and drop files here - CSV, TSV, XLSX, XLS · multiple files allowed -
    - -
    -
    Import multiple files to preview. Processing is not yet available.
    - - -

    Merge Strategy

    - -
    - -
    Append (stack vertically)
    -
    - -
    - -
    Fill with null
    -
    - -
    - check Add source filename column -
    - -
    - - +
diff --git a/layout-review/08_validator_reporter.html b/layout-review/08_validator_reporter.html index ab70f04..4dafd96 100644 --- a/layout-review/08_validator_reporter.html +++ b/layout-review/08_validator_reporter.html @@ -12,7 +12,7 @@
visibility - Static layout preview of Quality Check, a Coming-Soon tool. The page is a stub: an "under development" notice, a feature list, a working file uploader, and disabled placeholder controls. All pages → + Static layout preview of Quality Check — a Coming Soon tool. The page is a stub: a "coming soon" notice, a plain-English list of what the tool will do, and a single "Notify me" action. All pages →
@@ -25,55 +25,26 @@
- +
info - This tool is under development. + This tool is coming soon.
- -

Features:

+ +

What it will do:

    -
  • Column-level validation rules (not null, unique, regex pattern, range, enum)
  • -
  • Cross-column validation (e.g., start_date < end_date)
  • -
  • Data quality score per column and overall
  • -
  • Generate PDF quality report
  • -
  • Generate Excel report with flagged rows highlighted
  • -
  • Summary dashboard: pass/fail counts, severity breakdown
  • +
  • Check each column against rules you set (no blanks, no duplicates, matches a pattern, within a range, from a set list)
  • +
  • Check rules across columns (for example, start date is before end date)
  • +
  • Give each column and the whole file a quality score
  • +
  • Export a PDF quality report
  • +
  • Export an Excel report with the problem rows highlighted
  • +
  • Show a summary of what passed, what failed, and how serious each issue is

- - -
-
- upload_file Drag and drop file here - Import a file to preview. Processing is not yet available. -
- -
- - -

Validation Rules

- -
- -
- Choose options -
-
- -

Report Format

- -
- -
Excel (flagged rows)
-
- -
- - +
diff --git a/layout-review/09_pipeline_runner.html b/layout-review/09_pipeline_runner.html index f7aedf1..4c3de7e 100644 --- a/layout-review/09_pipeline_runner.html +++ b/layout-review/09_pipeline_runner.html @@ -19,7 +19,16 @@

Automated Workflows

- +
+ + + + + + Runs 100% locally + + +

Run several tools in a row — save the steps once, reuse them anytime.

@@ -74,7 +83,7 @@
- Use the recommended default (text-clean → format → missing → dedup) · modified + Use the recommended default (Clean Text → Standardize → Fix Missing → Find Duplicates) · modified Build interactively Import a saved pipeline JSON
@@ -108,7 +117,7 @@ ≡ 0 - text_clean +
Clean Text
Trim spaces, collapse repeats, leave case as-is
check tune Configure expand_more @@ -117,7 +126,7 @@
- Configure: text_clean + Configure: Clean Text
check Trim leading & trailing whitespace
check Collapse repeated spaces to one
@@ -134,7 +143,7 @@ ≡ 1 - format_standardize +
Standardize Formats
Format phone as phone, signup_date as a date
check tune Configure chevron_right @@ -143,7 +152,7 @@
- Configure: format_standardize + Configure: Standardize Formats

Choose a target format for each column. Columns left as “Leave as-is” are untouched.

@@ -165,7 +174,7 @@ ≡ 2 - missing +
Fix Missing Values
Flag blank cells (treat “N/A” and “—” as blank)
check tune Configure chevron_right @@ -174,7 +183,7 @@
- Configure: missing + Configure: Fix Missing Values
@@ -197,7 +206,7 @@ ≡ 3 - dedup +
Find Duplicates
Match on email & phone; keep the most complete row, merge in missing fields
check tune Configure chevron_right @@ -210,7 +219,7 @@
- Configure: dedup + Configure: Find Duplicates
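
Review note: this file swaps internal step ids for display names in four places. One way to keep those in sync is a single lookup. The ids and labels below come from this diff, while `STEP_LABELS`, `labelFor`, and `describePipeline` are hypothetical names.

```javascript
// Internal step id -> user-facing name, as relabeled in this patch.
const STEP_LABELS = {
  text_clean: "Clean Text",
  format_standardize: "Standardize Formats",
  missing: "Fix Missing Values",
  dedup: "Find Duplicates",
};

// The recommended default order shown on the page.
const RECOMMENDED_ORDER = ["text_clean", "format_standardize", "missing", "dedup"];

// Fall back to the raw id so an unknown step is still visible.
function labelFor(stepId) {
  return STEP_LABELS[stepId] || stepId;
}

function describePipeline(stepIds) {
  return stepIds.map(labelFor).join(" → ");
}
```

With a lookup like this, the card title and the "Configure: …" heading would read from the same source instead of being edited separately.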
diff --git a/layout-review/10_pdf_extractor.html b/layout-review/10_pdf_extractor.html index 1dbf5ae..5e68f6c 100644 --- a/layout-review/10_pdf_extractor.html +++ b/layout-review/10_pdf_extractor.html @@ -19,7 +19,16 @@

PDF to CSV

- +
+ + + + + + Runs 100% locally + + +

Pull transactions out of bank-statement PDFs into a clean CSV file.

diff --git a/layout-review/11_reconciler.html b/layout-review/11_reconciler.html index a867b3e..dcd323f 100644 --- a/layout-review/11_reconciler.html +++ b/layout-review/11_reconciler.html @@ -19,7 +19,16 @@

Reconcile Two Files

- +
+ + + + + + Runs 100% locally + + +

Compare two lists of transactions (e.g. bank vs. ledger) and flag what doesn't match.

@@ -30,18 +39,11 @@

Left (e.g. bank feed)

-
-
- upload_file Drag and drop file here - CSV, TSV, XLSX, XLS -
- -
-
- - bank_feed_may.csv - 214 KB +
+ description + Using bank_feed_may.csv from the upload screen.
+

bank_feed_may.csv — 1,204 rows, 4 columns

Preview left (e.g. bank feed) @@ -63,18 +65,11 @@

Right (e.g. ledger)

-
-
- upload_file Drag and drop file here - CSV, TSV, XLSX, XLS -
- -
-
- - ledger_may.xlsx - 96 KB +
+ description + Using ledger_may.xlsx from the upload screen.
+

ledger_may.xlsx — 1,198 rows, 5 columns

Preview right (e.g. ledger) diff --git a/layout-review/app.css b/layout-review/app.css index 9ce8263..d853e7b 100644 --- a/layout-review/app.css +++ b/layout-review/app.css @@ -122,6 +122,19 @@ code, .dt-mono { font-family: var(--font-mono); font-size: 0.92em; font-feature- .dt-nav-link .dt-mi { font-family: "Material Symbols Outlined"; font-size: 18px; color: var(--ink-secondary); line-height: 1; } .dt-nav-link.is-active .dt-mi { color: var(--ink); } .dt-nav-link.is-soon { opacity: 0.55; } + +/* "Start here" front-door item — weightier than ordinary nav links so the + obvious entry point reads at a glance. Accent-fill ground + accent-hover ink, + slightly larger hit area, with bottom margin to part it from the groups below. + Layers on .dt-nav-link, so the .is-active treatment still overrides cleanly. */ +.dt-nav-start { + background: var(--accent-fill); color: var(--accent-hover); font-weight: 600; + padding: 8px 10px; margin-bottom: 12px; +} +.dt-nav-start:hover { background: var(--accent-fill-strong); color: var(--accent-hover); } +.dt-nav-start .dt-mi { color: var(--accent); } +.dt-nav-start.is-active { background: var(--accent-fill-strong); color: var(--accent-hover); } +.dt-nav-start.is-active .dt-mi { color: var(--accent); } .dt-nav-soon-tag { margin-left: auto; font-size: 9px; font-weight: 600; letter-spacing: 0.06em; text-transform: uppercase; color: var(--ink-tertiary); @@ -288,6 +301,24 @@ code, .dt-mono { font-family: var(--font-mono); font-size: 0.92em; font-feature- .dt-alert.error { background: var(--danger-fill); color: var(--danger); } .dt-alert code { background: rgba(0,0,0,0.05); padding: 1px 5px; border-radius: 4px; } +/* Next-step strip — slim single-line "what to do next" suggestion shown at the + end of a tool's results. Subtle accent ground + left accent rule so it nudges + without competing with alerts; the trailing dismiss control is unobtrusive. 
*/ +.dt-next-step { + display: flex; align-items: center; gap: 10px; + background: var(--accent-fill); border-left: 3px solid var(--accent); + border-radius: var(--r-md); padding: 10px 14px; margin: 16px 0; + font-size: 13.5px; line-height: 1.4; color: var(--ink); +} +.dt-next-step .dt-mi { font-family: "Material Symbols Outlined"; font-size: 18px; color: var(--accent); flex-shrink: 0; } +.dt-next-step a { color: var(--accent); font-weight: 500; } +.dt-next-step a:hover { color: var(--accent-hover); } +.dt-next-step-dismiss { + margin-left: auto; background: transparent; border: none; cursor: pointer; + color: var(--ink-tertiary); font-size: 13px; line-height: 1; padding: 2px 4px; +} +.dt-next-step-dismiss:hover { color: var(--ink-secondary); } + /* =========================================================================== Inputs (static representations of Streamlit widgets) =========================================================================== */ diff --git a/layout-review/home.html b/layout-review/home.html index 38557da..6f2d8cb 100644 --- a/layout-review/home.html +++ b/layout-review/home.html @@ -96,6 +96,44 @@
+ +
+
+ + auto_awesome + +
+

Recommended

+

Runs the recommended clean — fix text, standardize formats, fill blanks, remove duplicates — in the right order, then hands you the cleaned file.

+
+ +
+ +
+ 1 · Clean Text + arrow_forward + 2 · Standardize + arrow_forward + 3 · Fix Missing + arrow_forward + 4 · Find Duplicates + Result downloads when finished +
+
+ + +

Or fix issues one at a time

+

Prefer to handle things yourself? Open any finding to jump straight to the right tool.

+
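
Review note: the "in the right order" claim on the front-door card can be illustrated with a small sketch. It implements only two of the four steps, in simplified form, and every function name here is hypothetical. The point is that duplicates which differ only by stray whitespace are caught only when text cleaning runs first.

```javascript
// Simplified stand-in for Clean Text: trim and collapse repeated whitespace.
function cleanText(rows) {
  return rows.map((row) => {
    const cleaned = {};
    for (const [key, value] of Object.entries(row)) {
      cleaned[key] = typeof value === "string" ? value.trim().replace(/\s+/g, " ") : value;
    }
    return cleaned;
  });
}

// Simplified stand-in for Find Duplicates: keep the first of each exact match.
function findDuplicates(rows) {
  const seen = new Set();
  return rows.filter((row) => {
    const key = JSON.stringify(row);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

// Run each step on the output of the previous one.
function runPipeline(rows, steps) {
  return steps.reduce((current, step) => step(current), rows);
}
```

Given two rows that differ only in spacing, `runPipeline(rows, [cleanText, findDuplicates])` leaves one row, while the reverse order leaves both.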
diff --git a/layout-review/shell.js b/layout-review/shell.js index 68aecc9..94c55f7 100644 --- a/layout-review/shell.js +++ b/layout-review/shell.js @@ -3,28 +3,32 @@ src/gui/components/_legacy.py:render_sticky_footer(). Each page sets to mark the active nav item. */ (function () { - // Sections + entries in the same order app.py registers them. + // Front-door entry — rendered standalone above the section groups. + var START = { id: "home", icon: "insert_chart_outlined", name: "Start here", href: "home.html" }; + + // Sections + entries in pipeline / job order. var NAV = [ - { label: "Analysis", items: [ - { id: "home", icon: "insert_chart_outlined", name: "File Analysis", href: "home.html" }, - { id: "11_reconciler", icon: "compare_arrows", name: "Reconcile Two Files", href: "11_reconciler.html" }, - ]}, { label: "Data Cleaners", items: [ - { id: "04_missing_handler", icon: "help_outline", name: "Fix Missing Values", href: "04_missing_handler.html" }, - { id: "06_outlier_detector", icon: "insights", name: "Find Unusual Values", href: "06_outlier_detector.html", soon: true }, { id: "02_text_cleaner", icon: "text_format", name: "Clean Text", href: "02_text_cleaner.html" }, { id: "03_format_standardizer", icon: "format_list_bulleted", name: "Standardize Formats", href: "03_format_standardizer.html" }, + { id: "04_missing_handler", icon: "help_outline", name: "Fix Missing Values", href: "04_missing_handler.html" }, { id: "01_deduplicator", icon: "search", name: "Find Duplicates", href: "01_deduplicator.html" }, - { id: "08_validator_reporter", icon: "check_circle", name: "Quality Check", href: "08_validator_reporter.html", soon: true }, ]}, { label: "Transformations", items: [ { id: "05_column_mapper", icon: "view_column", name: "Map Columns", href: "05_column_mapper.html" }, - { id: "07_multi_file_merger", icon: "account_tree", name: "Combine Files", href: "07_multi_file_merger.html", soon: true }, - { id: "10_pdf_extractor", icon: "picture_as_pdf", name: "PDF to 
CSV", href: "10_pdf_extractor.html" }, ]}, { label: "Automations", items: [ { id: "09_pipeline_runner", icon: "auto_awesome", name: "Automated Workflows", href: "09_pipeline_runner.html" }, ]}, + { label: "Finance", items: [ + { id: "11_reconciler", icon: "compare_arrows", name: "Reconcile Two Files", href: "11_reconciler.html" }, + { id: "10_pdf_extractor", icon: "picture_as_pdf", name: "PDF to CSV", href: "10_pdf_extractor.html" }, + ]}, + { label: "Coming soon", items: [ + { id: "06_outlier_detector", icon: "insights", name: "Find Unusual Values", href: "06_outlier_detector.html", soon: true }, + { id: "08_validator_reporter", icon: "check_circle", name: "Quality Check", href: "08_validator_reporter.html", soon: true }, + { id: "07_multi_file_merger", icon: "account_tree", name: "Combine Files", href: "07_multi_file_merger.html", soon: true }, + ]}, ]; var active = document.body.getAttribute("data-page") || ""; @@ -41,8 +45,13 @@ '' + '' + '