feat: 3 new tools, format streaming, distribution-ready demo + landing pages

Tools shipped this batch (4 → 6 of 9 Ready):
  04 Missing Value Handler   src/core/missing.py + cli_missing.py + GUI
  05 Column Mapper           src/core/column_mapper.py + cli_column_map.py + GUI
  09 Pipeline Runner         src/core/pipeline.py + cli_pipeline.py + GUI
                             with soft tool-dependency graph (recommended,
                             not enforced) and JSON save/load for repeatable
                             weekly cleanups.
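
Illustration of the "recommended, not enforced" ordering plus the JSON
round-trip. This is a minimal sketch, not the contents of
src/core/pipeline.py: the tool names, RECOMMENDED_BEFORE, check_order,
dump_pipeline and load_pipeline are all hypothetical.

```python
import json

# Hypothetical table: tool -> tools that are recommended (not required) to run first.
RECOMMENDED_BEFORE = {
    "missing_values": ["column_mapper"],
    "format_standardizer": ["column_mapper", "missing_values"],
}


def check_order(steps):
    """Return advisory warnings for out-of-order steps; never raises or blocks."""
    warnings = []
    for i, step in enumerate(steps):
        for dep in RECOMMENDED_BEFORE.get(step, []):
            # Warn only when the recommended predecessor is present but runs later.
            if dep in steps[i + 1:]:
                warnings.append(f"{step}: recommended to run after {dep}")
    return warnings


def dump_pipeline(steps):
    """Serialise an ordered step list so the same cleanup can be replayed."""
    return json.dumps({"steps": steps}, indent=2)


def load_pipeline(text):
    """Inverse of dump_pipeline."""
    return json.loads(text)["steps"]
```

Because check_order only returns messages, a caller can surface them in the
GUI or CLI and still run the steps in whatever order the user picked.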

Format Standardizer reworked for 1 GB international files:
  • Vectorised dispatch + LRU cache over phone/date/currency/boolean/email
  • Per-row country / address columns drive parsing
  • Audit cap (default 10 k rows, ~50 MB RAM)
  • standardize_file(): chunked streaming entry point (~165 k rows/sec)
  • currency_decimal="auto" for EU comma-decimal locales
  • R$ / kr / zł multi-char currency prefixes
  • cli_format.py with auto-stream above 100 MB inputs
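
Illustration of three of the ideas above: an LRU-cached per-value parser,
separator detection for currency_decimal="auto", and fixed-size chunking
for streaming. This is a sketch under assumptions, not the shipped code:
parse_amount, iter_chunks, PREFIXES and the "lone separator followed by
exactly three digits is a thousands mark" heuristic are all hypothetical.

```python
from decimal import Decimal
from functools import lru_cache
from itertools import islice

# Multi-char prefixes first so "R$" is matched before "$".
PREFIXES = ("R$", "kr", "zł", "$", "€", "£")


@lru_cache(maxsize=65536)
def parse_amount(text, currency_decimal="auto"):
    """Parse a currency string to Decimal; repeated values are served from the cache."""
    s = text.strip()
    for prefix in PREFIXES:
        if s.startswith(prefix):
            s = s[len(prefix):]
            break
    s = s.replace(" ", "").replace("\u00a0", "")
    if currency_decimal == "auto":
        last = max(s.rfind(","), s.rfind("."))
        if last == -1:
            dec = "."
        elif not ("," in s and "." in s) and len(s) - last - 1 == 3:
            # Only one kind of separator and 3 trailing digits: treat it as grouping.
            dec = "," if s[last] == "." else "."
        else:
            # Otherwise the right-most separator is the decimal mark.
            dec = s[last]
    else:
        dec = currency_decimal
    group = "," if dec == "." else "."
    return Decimal(s.replace(group, "").replace(dec, "."))


def iter_chunks(rows, chunk_rows=50_000):
    """Yield lists of at most chunk_rows items so memory stays bounded."""
    it = iter(rows)
    while chunk := list(islice(it, chunk_rows)):
        yield chunk
```

A streaming caller would wrap a csv.reader in iter_chunks and apply
parse_amount per cell. The cache pays off when a column repeats the same
few values many times.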

Encoding detection arbiter + language-aware probe:
  Closes the last 4 xfails (cp1250 / mac_iceland / shift_jis_2004 / lying-BOM)
  via tied-confidence arbiter + Cyrillic / EE-Latin coverage probes.

Distribution-readiness assets:
  • streamlit_app.py — Streamlit Community Cloud entry shim
  • src/gui/app_demo.py — single-page demo, ?p=<persona> routing,
    100-row cap + watermark, free-vs-paid boundary enforced at surface
  • samples/demo/ — 3 niche datasets + pre-tuned pipeline JSONs
  • landing/ — 4 static HTML pages (apex chooser + 3 niche),
    shared CSS, deploy.py URL-substitution script,
    auto-generated robots.txt + sitemap.xml + 404.html + favicon
  • docs/PLAN.md, DEMO-PLAN.md, DEPLOYMENT.md, POST-LAUNCH.md, NEXT-STEPS.md
    — full strategy + measurement + deployment + master checklist

Test counts:
  before: 1,520 passed · 4 skipped · 17 xfailed
  after:  1,729 passed · 0 skipped · 0  xfailed

Tier-1 corpora added:
  • missing-corpus           3 use cases + 16 edge cases
  • column-mapper-corpus     3 use cases + 5 edge cases
  • format-cleaner intl      20-row 13-country stress fixture

Engine hardening flushed out by the corpora:
  • interpolate guards against object-dtype columns
  • mean/median skip all-NaN columns (silences numpy warning)
  • fillna runs under future.no_silent_downcasting (silences pandas warning)
  • mojibake test no longer skips when ftfy installed (monkeypatch path)
  • drop-row threshold semantics: strict-greater (consistent across rows / cols)
  • currency_decimal validator allow-set updated for "auto"
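
Illustration of the strict-greater threshold rule. This is a minimal
sketch, not the code in src/core/missing.py: drop_sparse_rows and the
use of None as the missing marker are hypothetical.

```python
def drop_sparse_rows(rows, threshold=0.5):
    """Drop a row only when its missing fraction is strictly greater than threshold."""
    kept = []
    for row in rows:
        missing = sum(value is None for value in row)
        fraction = missing / len(row) if row else 0.0
        # A row sitting exactly on the threshold is kept (strict comparison).
        if fraction > threshold:
            continue
        kept.append(row)
    return kept
```

With threshold=0.5, a two-column row with one missing value sits exactly
on the boundary and is kept. Only a row with both values missing is
dropped.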

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 22:31:26 +00:00
parent d18b95880d
commit 966af8ef94
89 changed files with 12039 additions and 284 deletions

landing/_shared/styles.css

@@ -0,0 +1,234 @@
/* DataTools landing-page styles: single shared sheet for all niches.
 *
 * Design constraints:
 *   • No external font / CSS dependencies (works on Cloudflare Pages
 *     with zero build step, no privacy banner needed).
 *   • Mobile-first; layout reflows below 720 px.
 *   • Dark, focused, content-first. Buyer reads this on a laptop
 *     between Shopify exports, so keep it readable and skimmable.
 *   • Persona pages all share this sheet; niche differences live in
 *     copy + accent-color variables overridden in each page's <style>.
 */
:root {
--bg: #0f1115;
--surface: #161922;
--surface-2: #1d212b;
--text: #e8eaed;
--text-mute: #9aa3b2;
--text-soft: #c8ced8;
--rule: #252a36;
--accent: #6ee7b7; /* Shopify pet default — overridden per persona */
--accent-ink: #052e1a;
--warn: #fbbf24;
--max: 1080px;
--radius: 12px;
--shadow: 0 1px 3px rgba(0,0,0,0.3), 0 8px 24px rgba(0,0,0,0.2);
--mono: ui-monospace, SFMono-Regular, "SF Mono", Menlo, monospace;
--sans: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto,
"Helvetica Neue", Arial, sans-serif;
}
* { box-sizing: border-box; }
html, body {
margin: 0; padding: 0;
background: var(--bg);
color: var(--text);
font-family: var(--sans);
font-size: 16px;
line-height: 1.55;
-webkit-font-smoothing: antialiased;
}
a { color: var(--accent); text-decoration: none; }
a:hover { text-decoration: underline; }
/* ----- Sticky buy bar ----- */
.buybar {
position: sticky; top: 0; z-index: 50;
background: rgba(15,17,21,0.92);
-webkit-backdrop-filter: blur(8px); /* prefixed form for older Safari */
backdrop-filter: blur(8px);
border-bottom: 1px solid var(--rule);
padding: 10px 20px;
}
.buybar-inner {
max-width: var(--max); margin: 0 auto;
display: flex; align-items: center; justify-content: space-between;
gap: 16px;
}
.buybar .brand { font-weight: 600; letter-spacing: -0.01em; }
.buybar .brand-mark { color: var(--accent); margin-right: 6px; }
.buybar .price-tag { color: var(--text-mute); font-size: 14px; margin-right: 12px; }
/* ----- Buttons ----- */
.btn {
display: inline-block;
background: var(--accent); color: var(--accent-ink);
font-weight: 600; font-size: 15px;
padding: 11px 18px; border-radius: 8px;
border: 0; cursor: pointer;
transition: transform 0.05s ease, box-shadow 0.15s ease;
}
.btn:hover { transform: translateY(-1px); text-decoration: none; box-shadow: var(--shadow); }
.btn-large {
padding: 14px 24px; font-size: 17px;
}
.btn-ghost {
background: transparent; color: var(--text-soft);
border: 1px solid var(--rule);
}
.btn-ghost:hover { background: var(--surface); }
/* ----- Layout ----- */
section {
padding: 60px 20px;
border-bottom: 1px solid var(--rule);
}
section:last-of-type { border-bottom: 0; }
.container { max-width: var(--max); margin: 0 auto; }
h1, h2, h3 { line-height: 1.2; letter-spacing: -0.02em; margin-top: 0; }
h1 { font-size: 44px; margin-bottom: 18px; }
h2 { font-size: 30px; margin-bottom: 16px; }
h3 { font-size: 19px; margin-bottom: 8px; }
p { margin: 0 0 14px 0; color: var(--text-soft); }
.muted { color: var(--text-mute); }
.eyebrow { color: var(--accent); font-size: 13px; font-weight: 600;
text-transform: uppercase; letter-spacing: 0.08em; margin-bottom: 10px; }
ul.bullets { padding-left: 20px; margin: 0 0 14px 0; }
ul.bullets li { margin-bottom: 8px; color: var(--text-soft); }
/* ----- Hero ----- */
.hero {
padding: 80px 20px 60px;
background: radial-gradient(ellipse at top, var(--surface), var(--bg) 60%);
}
.hero h1 strong { color: var(--accent); font-weight: 700; }
.hero .lead {
font-size: 19px; color: var(--text-soft); max-width: 720px;
margin-bottom: 28px;
}
.hero .cta-row { display: flex; gap: 12px; flex-wrap: wrap; align-items: center; }
.hero .price-note { color: var(--text-mute); font-size: 14px; }
/* ----- Demo embed ----- */
.demo-frame {
background: var(--surface);
border: 1px solid var(--rule);
border-radius: var(--radius);
overflow: hidden;
box-shadow: var(--shadow);
}
.demo-frame iframe {
width: 100%; height: 720px; border: 0; display: block;
background: var(--surface-2);
}
.demo-caption {
font-size: 14px; color: var(--text-mute);
padding: 10px 16px; border-top: 1px solid var(--rule);
}
/* ----- Cards / grids ----- */
.grid {
display: grid; gap: 18px;
grid-template-columns: repeat(auto-fit, minmax(260px, 1fr));
}
.card {
background: var(--surface);
border: 1px solid var(--rule);
border-radius: var(--radius);
padding: 22px;
}
.card h3 { color: var(--text); }
.card p:last-child { margin-bottom: 0; }
.card .icon {
display: inline-block; font-size: 22px; margin-bottom: 8px;
}
/* ----- Stats row ----- */
.stats { display: flex; gap: 28px; flex-wrap: wrap; margin: 18px 0 0; }
.stats .stat .num {
font-family: var(--mono); font-size: 26px; font-weight: 600;
color: var(--accent);
}
.stats .stat .label { font-size: 13px; color: var(--text-mute); }
/* ----- Privacy / audit callout panels ----- */
.callout {
background: var(--surface);
border-left: 3px solid var(--accent);
border-radius: 0 var(--radius) var(--radius) 0;
padding: 18px 22px;
margin: 18px 0;
}
.callout strong { color: var(--text); }
/* ----- Code-ish blocks ----- */
.terminal {
font-family: var(--mono); font-size: 14px;
background: #0a0c10;
color: #d8dfe8;
border: 1px solid var(--rule);
border-radius: var(--radius);
padding: 16px 18px;
overflow-x: auto;
white-space: pre;
line-height: 1.45;
}
.terminal .prompt { color: var(--text-mute); }
.terminal .ok { color: var(--accent); }
.terminal .warn { color: var(--warn); }
/* ----- Pricing ----- */
.pricing {
display: grid; gap: 18px;
grid-template-columns: repeat(auto-fit, minmax(260px, 1fr));
}
.pricing .card .price {
font-size: 38px; font-weight: 700; letter-spacing: -0.02em;
color: var(--text);
}
.pricing .card .price-suffix { font-size: 14px; color: var(--text-mute); margin-left: 4px; }
.pricing .card.featured { border-color: var(--accent); }
.pricing .card .row { display: flex; align-items: baseline; gap: 4px; margin-bottom: 12px; }
.pricing .card ul { padding-left: 18px; margin: 12px 0 18px; }
.pricing .card li { color: var(--text-soft); margin-bottom: 6px; }
/* ----- FAQ ----- */
details.faq {
border-bottom: 1px solid var(--rule);
padding: 14px 0;
}
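details.faq summary {
font-weight: 600; color: var(--text);
cursor: pointer; list-style: none;
display: flex; align-items: center; justify-content: space-between;
}
/* WebKit draws its own disclosure triangle unless this pseudo-element is hidden. */
details.faq summary::-webkit-details-marker { display: none; }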
details.faq summary::after {
content: "+"; color: var(--accent); font-size: 22px;
margin-left: 14px;
}
details.faq[open] summary::after { content: "\2212"; } /* minus sign while expanded */
details.faq p { margin-top: 10px; }
/* ----- Footer ----- */
footer {
padding: 40px 20px 60px;
font-size: 14px;
color: var(--text-mute);
}
footer .container { display: flex; gap: 28px; flex-wrap: wrap; justify-content: space-between; }
footer a { color: var(--text-soft); }
footer p { color: var(--text-mute); }
/* ----- Responsive ----- */
@media (max-width: 720px) {
h1 { font-size: 32px; }
h2 { font-size: 24px; }
section { padding: 40px 18px; }
.hero { padding: 56px 18px 40px; }
.demo-frame iframe { height: 560px; }
.buybar-inner .price-tag { display: none; }
}