Tools shipped this batch (4 → 6 of 9 Ready):
  04 Missing Value Handler   src/core/missing.py + cli_missing.py + GUI
  05 Column Mapper           src/core/column_mapper.py + cli_column_map.py + GUI
  09 Pipeline Runner         src/core/pipeline.py + cli_pipeline.py + GUI
                             with soft tool-dependency graph (recommended,
                             not enforced) and JSON save/load for repeatable
                             weekly cleanups.
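A minimal sketch of what "recommended, not enforced" ordering plus JSON save/load could look like. This is not the code in src/core/pipeline.py: the tool names, the RECOMMENDED_BEFORE graph, and the JSON layout are all assumptions made for illustration.

```python
import json
import warnings

# Hypothetical soft-dependency graph: tool -> tools that ideally ran earlier.
RECOMMENDED_BEFORE = {
    "format_standardizer": ["text_cleaner"],
    "deduplicator": ["text_cleaner", "format_standardizer"],
    "column_mapper": ["missing_value_handler"],
}


def check_order(steps):
    """Return advisory notes for a list of step dicts; never raises."""
    seen, notes = set(), []
    for step in steps:
        tool = step["tool"]
        for dep in RECOMMENDED_BEFORE.get(tool, []):
            if dep not in seen:
                notes.append(f"{tool}: recommended to run {dep} first")
        seen.add(tool)
    return notes


def save_pipeline(steps, path):
    """Write the steps to a JSON file (assumed layout: version + steps)."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump({"version": 1, "steps": steps}, fh, indent=2)


def load_pipeline(path):
    """Read the steps back; ordering problems only produce warnings."""
    with open(path, encoding="utf-8") as fh:
        steps = json.load(fh)["steps"]
    for note in check_order(steps):
        warnings.warn(note)  # advisory only: the pipeline still loads
    return steps
```

Because check_order only returns notes, an out-of-order pipeline still loads and runs, which is the "soft" part of the graph.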
Format Standardizer reworked for 1 GB international files:
• Vectorised dispatch + LRU cache over phone/date/currency/boolean/email
• Per-row country / address columns drive parsing
• Audit cap (default 10 k rows, ~50 MB RAM)
• standardize_file(): chunked streaming entry point (~165 k rows/sec)
• currency_decimal="auto" for EU comma-decimal locales
• R$ / kr / zł multi-char currency prefixes
• cli_format.py with auto-stream above 100 MB inputs
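To illustrate the currency bullets above, here is a rough sketch of auto-detecting the decimal separator and checking multi-character prefixes before single symbols. It is not the Format Standardizer's implementation: parse_amount, split_symbol, detect_decimal, and the SYMBOLS list are hypothetical names, and the "right-most separator wins" heuristic is an assumption.

```python
from decimal import Decimal

# Multi-character symbols come first so "R$" is never read as "$".
SYMBOLS = ["R$", "kr", "zł", "$", "€", "£", "¥"]


def split_symbol(text):
    """Split a leading or trailing currency symbol from the numeric part."""
    text = text.strip()
    for sym in SYMBOLS:
        if text.startswith(sym):
            return sym, text[len(sym):].strip()
        if text.endswith(sym):
            return sym, text[:-len(sym)].strip()
    return None, text


def detect_decimal(number):
    """Guess the decimal separator; the right-most of '.' and ',' wins."""
    last_dot, last_comma = number.rfind("."), number.rfind(",")
    if last_dot >= 0 and last_comma >= 0:
        return "," if last_comma > last_dot else "."
    if last_comma >= 0:
        # A lone comma followed by exactly three digits is read as a
        # thousands separator ("1,234"); anything else as a decimal ("250,00").
        return "." if len(number) - last_comma - 1 == 3 else ","
    return "."  # a lone dot stays ambiguous; treated as a decimal point here


def parse_amount(text):
    """Return (symbol, Decimal) for strings like '($89.50)' or 'R$ 250,00'."""
    text = text.strip()
    negative = text.startswith("(") and text.endswith(")")  # accounting negative
    if negative:
        text = text[1:-1]
    if text.startswith("-"):
        negative, text = True, text[1:]
    symbol, number = split_symbol(text)
    if number.startswith("-"):
        negative, number = True, number[1:]
    dec = detect_decimal(number)
    thousands = "." if dec == "," else ","
    number = number.replace(thousands, "").replace(" ", "").replace(dec, ".")
    value = Decimal(number)
    return symbol, (-value if negative else value)
```

The symbol is returned as written rather than mapped to an ISO 4217 code, since "kr" and "$" are each shared by several currencies.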
Encoding detection arbiter + language-aware probe:
Closes the last 4 xfails (cp1250 / mac_iceland / shift_jis_2004 / lying-BOM)
via tied-confidence arbiter + Cyrillic / EE-Latin coverage probes.
Distribution-readiness assets:
• streamlit_app.py — Streamlit Community Cloud entry shim
• src/gui/app_demo.py — single-page demo, ?p=<persona> routing,
100-row cap + watermark, free-vs-paid boundary enforced at surface
• samples/demo/ — 3 niche datasets + pre-tuned pipeline JSONs
• landing/ — 4 static HTML pages (apex chooser + 3 niche),
shared CSS, deploy.py URL-substitution script,
auto-generated robots.txt + sitemap.xml + 404.html + favicon
• docs/PLAN.md, DEMO-PLAN.md, DEPLOYMENT.md, POST-LAUNCH.md, NEXT-STEPS.md
— full strategy + measurement + deployment + master checklist
Test counts:
before: 1,520 passed · 4 skipped · 17 xfailed
after: 1,729 passed · 0 skipped · 0 xfailed
Tier-1 corpora added:
• missing-corpus 3 use cases + 16 edge cases
• column-mapper-corpus 3 use cases + 5 edge cases
• format-cleaner intl 20-row 13-country stress fixture
Engine hardening flushed out by the corpora:
• interpolate guards against object-dtype columns
• mean/median skip all-NaN columns (silences numpy warning)
• fillna runs under future.no_silent_downcasting (silences pandas warning)
• mojibake test no longer skips when ftfy installed (monkeypatch path)
• drop-row threshold semantics: strict-greater (consistent across rows / cols)
• currency_decimal validator allow-set updated for "auto"
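The strict-greater threshold rule can be sketched as follows. This assumes the threshold is a missing-value fraction and uses None as a stand-in for NaN; the function names are hypothetical and this is not the code in src/core/missing.py.

```python
def missing_fraction(values):
    """Share of entries that are None (stand-in for NaN / disguised nulls)."""
    values = list(values)
    return sum(v is None for v in values) / len(values) if values else 0.0


def should_drop(values, threshold):
    # Strict-greater: a row or column sitting exactly at the threshold is kept.
    return missing_fraction(values) > threshold


def drop_sparse_rows(rows, threshold=0.5):
    """Keep rows whose missing fraction does not exceed the threshold."""
    return [row for row in rows if not should_drop(row, threshold)]


def drop_sparse_columns(rows, threshold=0.5):
    """Apply the same rule column-wise so both axes behave identically."""
    if not rows:
        return rows
    keep = [i for i, col in enumerate(zip(*rows)) if not should_drop(col, threshold)]
    return [[row[i] for i in keep] for row in rows]
```

Routing both axes through should_drop is one way to keep the comparison from drifting apart between rows and columns.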
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
355 lines
18 KiB
HTML
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8" />
  <meta name="viewport" content="width=device-width, initial-scale=1" />
  <title>DataTools for Bookkeepers — Reconcile Bank Exports With An Audit Trail · $49</title>
  <meta name="description" content="Reconcile messy bank exports. Catch duplicate transactions QuickBooks imported twice. Standardize dates, amounts, and vendor casing — locally. Every change auditable. $49 one-time." />
  <meta name="keywords" content="reconcile bank export csv, quickbooks duplicate transactions, vendor list cleanup, bookkeeper csv tool, bank export deduplicator, bookkeeper audit trail" />
  <link rel="canonical" href="https://datatools.app/bookkeeper/" />
  <link rel="stylesheet" href="../_shared/styles.css" />

  <!-- Persona accent: Bookkeeper → calm steel-blue -->
  <style>
    :root {
      --accent: #7dd3fc;
      --accent-ink: #042c43;
    }
  </style>

  <!-- Open Graph -->
  <meta property="og:title" content="DataTools for Bookkeepers — Reconcile Bank Exports With An Audit Trail" />
  <meta property="og:description" content="Catch duplicate transactions. Standardize dates and amounts. Hand your client an audit trail. $49 one-time." />
  <meta property="og:type" content="product" />
  <meta property="og:url" content="https://datatools.app/bookkeeper/" />

  <script type="application/ld+json">
  {
    "@context": "https://schema.org",
    "@type": "SoftwareApplication",
    "name": "DataTools for Bookkeepers",
    "operatingSystem": "Windows, macOS, Linux",
    "applicationCategory": "BusinessApplication",
    "offers": {
      "@type": "Offer",
      "price": "49",
      "priceCurrency": "USD"
    },
    "description": "Reconcile bank exports, dedupe vendor lists, and produce a hand-off-ready audit trail. Six-tool data-cleaning bundle for bookkeepers and freelance accountants.",
    "softwareVersion": "1.0"
  }
  </script>
</head>
<body>

<div class="buybar">
  <div class="buybar-inner">
    <div class="brand"><span class="brand-mark">●</span> DataTools <span class="muted">/ for Bookkeepers</span></div>
    <div>
      <span class="price-tag">$49 — one-time, no subscription</span>
      <a class="btn" href="https://gumroad.com/l/datatools?from=bookkeeper" rel="noopener">Get DataTools →</a>
    </div>
  </div>
</div>

<section class="hero">
  <div class="container">
    <div class="eyebrow">For bookkeepers · freelance accountants · small-firm partners</div>
    <h1>Reconcile messy bank exports.<br /><strong>Hand your client an audit trail.</strong></h1>
    <p class="lead">
      The Jan and Feb exports overlap and you've got the same transaction
      booked twice. Vendor names are <em>"Amazon"</em>, <em>"amazon.com"</em>,
      and <em>"AMAZON.COM*4F2X9"</em> in three different rows. Dates are a
      smoosh of <code>01/15/2025</code>, <code>2025-01-15</code>, and
      <code>Jan 18 2025</code>. DataTools fixes all of it in one pass —
      and produces a row-by-row CSV showing every change so your client
      can verify your work.
    </p>
    <div class="cta-row">
      <a class="btn btn-large" href="https://gumroad.com/l/datatools?from=bookkeeper" rel="noopener">Get DataTools — $49 →</a>
      <a class="btn btn-ghost btn-large" href="#demo">Try the live demo ↓</a>
      <span class="price-note">One-time payment · cross-platform · runs offline</span>
    </div>
    <div class="stats">
      <div class="stat"><div class="num">6</div><div class="label">tools, one bundle</div></div>
      <div class="stat"><div class="num">100 %</div><div class="label">auditable changes</div></div>
      <div class="stat"><div class="num">0</div><div class="label">cloud uploads ever</div></div>
    </div>
  </div>
</section>

<!-- ============= Pain points ============= -->
<section>
  <div class="container">
    <div class="eyebrow">If you've spent a Saturday on this, you already know</div>
    <h2>Six pains DataTools fixes in one pass</h2>
    <div class="grid">
      <div class="card">
        <span class="icon">📅</span>
        <h3>Jan and Feb bank exports overlap — the same transaction posts twice</h3>
        <p>QuickBooks (or any reconciler) silently double-counts the month-boundary rows. Your client's books misstate cash by 1–4 % and nobody notices until tax season.</p>
        <p class="muted"><strong>What it costs:</strong> 2–4 hours per month per client + reconciliation errors that can compound.</p>
      </div>
      <div class="card">
        <span class="icon">📒</span>
        <h3>1099 reports break because vendors are spelled three ways</h3>
        <p>"Amazon", "amazon.com", "AMAZON.COM*4F2X9" become three separate vendors in QBO. You ship three 1099s instead of one — and the 1099-NEC threshold breaks both ways.</p>
        <p class="muted"><strong>What it costs:</strong> 1–2 hours per 1099 cycle + IRS-paper-trail risk.</p>
      </div>
      <div class="card">
        <span class="icon">🛡️</span>
        <h3>"Show me what you changed" — your liability hangs on the answer</h3>
        <p>Cloud cleaners that "just clean your data" don't give you a row-level audit log. Your professional indemnity insurance hates that. Your client's auditor hates that. You hate explaining it.</p>
        <p class="muted"><strong>What it costs:</strong> per-firm liability premium + 24–48 hr audit-response window stress.</p>
      </div>
      <div class="card">
        <span class="icon">👥</span>
        <h3>Per-client SaaS pricing destroys your margins at 10+ clients</h3>
        <p>$30/mo per client × 20 clients = $600/mo, every month, for tooling. DataTools is a one-time desktop license you use on every client's books for the same $49. Forever.</p>
        <p class="muted"><strong>What it costs:</strong> the difference between a $30/mo/client subscription and $49 once.</p>
      </div>
      <div class="card">
        <span class="icon">🌍</span>
        <h3>Multi-currency books break standard parsers</h3>
        <p>Your client has EU customers. Their amounts come in as <code>€1.234,56</code> (comma decimal). Standard import tools see "1.234" as the whole-dollar amount and drop the rest. Parens-negative <code>($89.50)</code> gets read as positive.</p>
        <p class="muted"><strong>What it costs:</strong> 30–60 min per multi-currency client per month + occasional silent errors.</p>
      </div>
      <div class="card">
        <span class="icon">🔒</span>
        <h3>Your client's books are too sensitive for a cloud cleaner</h3>
        <p>One "vendor breach" email to your clients ends the relationship. DataTools is desktop-only. No upload, no SaaS account, no third party seeing a single transaction. Verifiable in your browser's network tab.</p>
        <p class="muted"><strong>What it costs:</strong> nothing — and that's exactly the point.</p>
      </div>
    </div>
  </div>
</section>

<section id="demo">
  <div class="container">
    <div class="eyebrow">Live demo · runs in your browser</div>
    <h2>Try it on a sample bank export with a known overlap</h2>
    <p>
      The demo below loads a 25-row export combining January and February
      activity, with the month-boundary rows duplicated across exports —
      the exact scenario where QuickBooks (or any reconciler) silently
      double-counts transactions. Click <strong>Run pipeline</strong> and
      watch the dedup catch every overlap, dates land in ISO format, and
      the parens-negative amounts (<code>($89.50)</code>) become proper
      negative numbers.
    </p>
    <div class="demo-frame">
      <iframe
        src="https://demo.datatools.app/?p=bookkeeper"
        loading="lazy"
        title="DataTools live demo — Bookkeeper"
        sandbox="allow-scripts allow-same-origin allow-downloads allow-forms"></iframe>
      <div class="demo-caption">
        Demo runs on free hosting. Capped at 100 input rows · output
        watermarked. The paid product has no caps and runs entirely offline.
      </div>
    </div>
  </div>
</section>

<section>
  <div class="container">
    <div class="eyebrow">Built for the bookkeeper's actual day</div>
    <h2>Four workflows the rest of the industry tax-codes around</h2>
    <div class="grid">
      <div class="card">
        <span class="icon">🏦</span>
        <h3>Bank export reconciliation</h3>
        <p>Two months of activity overlap at the boundary. The same transaction posts twice — once in each export — with different formatting. DataTools dedupes on Date + Amount + fuzzy Vendor and catches all of them.</p>
      </div>
      <div class="card">
        <span class="icon">📒</span>
        <h3>Vendor list consolidation</h3>
        <p>QuickBooks has <code>amazon.com</code>. Your spreadsheet has <code>Amazon</code>. The bank statement has <code>AMAZON.COM*4F2X9</code>. Standardize the casing, fuzzy-match across sources, hand the client one clean vendor list.</p>
      </div>
      <div class="card">
        <span class="icon">👥</span>
        <h3>Customer master cleanup pre-migration</h3>
        <p>Before moving from one accounting system to another, the customer master needs to be deduped, standardized, and audited. One tool, one pipeline, one CSV in / clean CSV out.</p>
      </div>
      <div class="card">
        <span class="icon">🧾</span>
        <h3>Expense report dedup</h3>
        <p>Same receipt scanned twice. Same Uber ride entered manually and then imported from the corporate card. Catch them once — and produce the audit log that proves the duplicate <em>was</em> a duplicate.</p>
      </div>
    </div>
  </div>
</section>

<section>
  <div class="container">
    <div class="eyebrow">The feature your liability insurance cares about</div>
    <h2>Every change auditable. Period.</h2>
    <p>
      Every cell DataTools modifies is logged with the original value, the
      new value, and which rule fired. When your client asks why a
      transaction got merged or a date got reformatted, you don't say
      "the AI did it." You hand them the CSV.
    </p>
    <div class="callout">
      <strong>Why this matters specifically to bookkeepers:</strong> your
      professional liability hangs on traceability. Cloud cleaners that
      "just clean your data" without a row-level audit are unsafe at any
      price. DataTools writes the audit by default, downloadable as a
      separate CSV alongside the cleaned file.
    </div>
<div class="terminal"><span class="prompt">$</span> head -5 client_jan2025_changes.csv
row,column,field_type,old,new
0,"Date ",date,"01/15/2025","2025-01-15"
0,Description,name," AMAZON.COM*4F2X9 PURCHASE","Amazon.com*4F2X9 Purchase"
0,Amount,currency,"-$129.99","-129.99"
1,"Date ",date,"2025-01-15","2025-01-15"
<span class="prompt">$</span> # one row of audit per cell change. handed to the client. signed off.</div>
  </div>
</section>

<section>
  <div class="container">
    <div class="eyebrow">The thing every cloud reconciler can't say</div>
    <h2>Your client's books never leave your computer.</h2>
    <p>
      Your clients trust you with their books. That trust is one
      "we noticed our data appeared in a vendor breach" email away from
      gone. DataTools is a desktop app — no upload, no SaaS, no
      subscription, no third party seeing a single transaction.
    </p>
    <div class="callout">
      <strong>Confirm it yourself.</strong> Open your browser's network
      tab when DataTools is running. Click around. Run the pipeline.
      Zero outbound requests. Ever.
    </div>
  </div>
</section>

<section>
  <div class="container">
    <div class="eyebrow">If your clients run multi-currency books</div>
    <h2>$ £ € ¥ R$ kr zł — handled.</h2>
    <p>
      Standardize <code>$1,234.56</code>, <code>1.234,56 €</code> (EU
      decimal), <code>($89.50)</code> (parens-negative),
      <code>R$ 250,00</code>, <code>kr 1.250,50</code>, and the rest of
      the long tail. Output is canonical numeric (your import tool's
      favorite shape) with an optional ISO 4217 prefix
      (<code>USD 1234.56</code>) when you need to preserve the
      currency.
    </p>
    <ul class="bullets">
      <li><strong>Auto-detect</strong> EU comma decimal so your French and German clients' books reconcile without per-locale config.</li>
      <li><strong>Parens-negative</strong> handled — accounting convention, not just a math style.</li>
      <li><strong>Multi-character prefixes</strong> like <code>R$</code> (Brazilian Real) and <code>kr</code> (Nordic) detected before the single-symbol regex so they don't get bucketed as USD.</li>
    </ul>
  </div>
</section>

<section>
  <div class="container">
    <div class="eyebrow">In the bundle</div>
    <h2>Six tools. One pipeline. One $49 download.</h2>
    <div class="grid">
      <div class="card"><h3>1 · Deduplicator</h3><p>Fuzzy match (Jaro-Winkler), explicit strategies for Date+Amount+Vendor, survivor rules.</p></div>
      <div class="card"><h3>2 · Text Cleaner</h3><p>Header whitespace, smart quotes from copy-paste, em-dash sentinels.</p></div>
      <div class="card"><h3>3 · Format Standardizer</h3><p>ISO dates, numeric amounts (parens-negative), vendor casing, multi-currency.</p></div>
      <div class="card"><h3>4 · Missing Value Handler</h3><p>Disguised-null detection: <code>—</code>, <code>N/A</code>, <code>(blank)</code>, <code>?</code>.</p></div>
      <div class="card"><h3>5 · Column Mapper</h3><p>Project to your accounting tool's required schema, coerce types, drop extras.</p></div>
      <div class="card"><h3>6 · Pipeline Runner</h3><p>Save the cleanup. Run it on next month's export with one command. Same audit, automated.</p></div>
    </div>
  </div>
</section>

<section>
  <div class="container">
    <div class="eyebrow">Pricing — pay once, own it</div>
    <h2>$49. No subscription. No per-client license.</h2>
    <div class="pricing">
      <div class="card featured">
        <div class="row"><div class="price">$49</div><div class="price-suffix">one-time</div></div>
        <h3>DataTools for Bookkeepers</h3>
        <ul>
          <li>All 6 tools, full pipeline</li>
          <li>Mac · Windows · Linux installers</li>
          <li>Code-signed (no Gatekeeper warnings)</li>
          <li>Free updates for the v1.x line</li>
          <li>Bonus: ready-made bank-reconcile and vendor-cleanup pipelines</li>
          <li><strong>Use on any number of clients</strong> — no seat limits</li>
        </ul>
        <a class="btn btn-large" href="https://gumroad.com/l/datatools?from=bookkeeper" rel="noopener">Buy on Gumroad →</a>
      </div>
      <div class="card">
        <div class="row"><div class="price">$199</div><div class="price-suffix">one-time</div></div>
        <h3>+ Priority email support</h3>
        <p class="muted">Available post-launch. 24-hour async response on edge cases. Same product. Targeted at bookkeepers whose own time is &gt; $200/hr.</p>
        <a class="btn btn-ghost btn-large" href="#" aria-disabled="true">Coming soon</a>
      </div>
    </div>
  </div>
</section>

<section>
  <div class="container">
    <h2>Questions</h2>

    <details class="faq">
      <summary>Does this replace QuickBooks / Xero?</summary>
      <p>No — DataTools cleans the data <em>before</em> it goes into your accounting system, or after you export it for analysis. It sits alongside QB/Xero, not in place of them. Think of it as the import-clean-up step that should have shipped with the bank export feature in the first place.</p>
    </details>

    <details class="faq">
      <summary>Can I use it on multiple clients without paying again?</summary>
      <p>Yes. The license is per-bookkeeper, not per-client. Run it on every client's books for the same $49.</p>
    </details>

    <details class="faq">
      <summary>What does the audit log look like in court?</summary>
      <p>It's a CSV with five columns per change: <code>row, column, field_type, old, new</code>. Plus a JSON pipeline file describing exactly which rules ran in which order. Together they reproduce the cleanup deterministically — your client (or their auditor) can verify it on their machine.</p>
    </details>

    <details class="faq">
      <summary>How does it handle Excel-only weirdness like serial dates?</summary>
      <p>Excel serial dates (the number 45306 = 2024-01-15) are detected and converted automatically. So are Unix timestamps in seconds and milliseconds, RFC 2822 dates from email exports, partial-precision dates (<code>2024-01</code>, <code>2024-Q1</code>), and locale-specific month names in English/French/German.</p>
    </details>

    <details class="faq">
      <summary>What about my clients' privacy?</summary>
      <p>Your clients' books never leave your computer. The cleaner is a desktop app with zero network code in the data path. You can verify this in your browser's network tab.</p>
    </details>

    <details class="faq">
      <summary>What's your refund policy?</summary>
      <p>Try the live demo above on the sample dataset before you buy. If DataTools doesn't fit your workflow within 14 days, email for a refund — no questions asked.</p>
    </details>
  </div>
</section>

<section>
  <div class="container" style="text-align: center;">
    <h2>Stop reconciling bank exports by hand.</h2>
    <p class="lead" style="margin: 0 auto 28px;">One $49 download. Catches the duplicate transactions QuickBooks imported twice, standardizes dates, amounts, and vendor casing, and hands you a row-level audit log to share with your client.</p>
    <a class="btn btn-large" href="https://gumroad.com/l/datatools?from=bookkeeper" rel="noopener">Get DataTools — $49 →</a>
  </div>
</section>

<footer>
  <div class="container">
    <div>
      <p><strong>DataTools</strong> — local data-cleaning for Shopify, bookkeepers, and RevOps teams.</p>
      <p class="muted">© 2026 · Built solo · Shipped from a small office.</p>
    </div>
    <div>
      <p>
        <a href="../shopify-pet/">For Shopify operators</a> ·
        <a href="../revops/">For RevOps agencies</a><br />
        <a href="https://gumroad.com/l/datatools?from=bookkeeper">Buy on Gumroad</a> ·
        <a href="mailto:hello@datatools.app">Email support</a>
      </p>
    </div>
  </div>
</footer>

</body>
</html>