feat: Tier B operator scaffolding — bundle, copy SoT, posts, emails

Pick up and finish yesterday's cut-off Tier B pass.

- build/: PyInstaller scaffold (datatools.spec + launcher.py + hook-streamlit.py + README); folder-mode bundle, locked 127.0.0.1, per-OS recipe
- marketing/COPY.md: single source of truth for every customer-facing string (landing H1/sub/CTAs, demo CTAs, email subjects, Gumroad listing, banned phrases)
- marketing/community-posts/: 9 drafts (3 posts × 3 niches: bookkeeper, revops, shopify-pet): story / tip / soft-offer
- marketing/emails/: 18 drafts (Gumroad delivery + 5-touch onboarding × 3 niches), per-niche segmentation guidance
- docs/NEXT-STEPS.md: flip 2.2 / 2.4 / 3.1 / 3.4 to done with pointers to the new assets; add Phase 0 inventory rows
- .gitignore: narrow `build/` ignore so PyInstaller spec + launcher + hooks get tracked; only generated artifacts (build/build/, build/__pycache__/, build/dist/) stay ignored

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 14:04:37 +00:00
parent 966af8ef94
commit e1f364f010
36 changed files with 1741 additions and 15 deletions
--- a/marketing/emails/bookkeeper/00-delivery.md
+++ b/marketing/emails/bookkeeper/00-delivery.md
@@ -0,0 +1,34 @@
+# Bookkeeper · Day 0 — Delivery email
+
+**Subject:** Your DataTools download (start here)
+**Send:** immediately on Gumroad purchase confirmation
+**Goal:** buyer downloads + opens the app within 24h
+
+---
+
+Hi {{first_name}},
+
+Thanks for buying DataTools. Your download:
+
+→ **{{download_url}}**
+
+Three things to do in the next 5 minutes so you don't lose this email under the next 200:
+
+**1. Download the installer for your OS** (Mac `.dmg`, Windows `.exe`, or Linux `.tar.gz`). About 280 MB. The link above auto-detects your OS.
+
+**2. Run it.** First launch takes ~5 seconds; a browser tab opens to `127.0.0.1:8501`. That's the app — running locally on your machine, no network calls. If your browser doesn't open automatically, the terminal window shows the URL.
+
+**3. Drop in a real bank export.** Don't bother with the bundled samples — DataTools is built for messy real-world files. Pull last month's bank export from any client, drag it into the analyzer, and click "Run all". You'll see what the pipeline catches in about 20 seconds.
+
+If something doesn't work: just reply to this email. I read every reply (it goes to my own inbox, not a queue).
+
+If you want a refund: also just reply. 30 days, no questions asked, no form to fill out.
+
+Tomorrow I'll send a sample bank export with a few of the tricky cases pre-built in, so you can see what the gate report looks like on a known input. After that you'll get four more emails over the next month (days 3, 7, 14, and 30), each focused on one thing. Feel free to unsubscribe at the bottom of any of them.
+
+Welcome aboard.
+
+— Michael
+{{support_email}}
+
+P.S. If you have a bookkeeper friend who'd find this useful, the share-friendly landing page is {{landing_page}}.
--- a/marketing/emails/bookkeeper/01-day1.md
+++ b/marketing/emails/bookkeeper/01-day1.md
@@ -0,0 +1,31 @@
+# Bookkeeper · Day 1 — Try it on this messy bank export first
+
+**Subject:** Try it on this messy bank export first
+**Send:** Day 1, ~9am buyer-local-time
+**Goal:** convert "I bought it" → "I ran it on something"
+
+---
+
+Hi {{first_name}},
+
+Yesterday's email had your download. Today's email has a *file* — a sample bank export I built specifically to break things.
+
+→ **{{sample_file_url}}** (260 KB CSV, 1,400 rows of synthetic data — no real account info)
+
+It's modeled after real exports I've seen from US, UK, and Canadian banks. Hidden in there:
+
+- Mixed date formats (some `MM/DD/YYYY`, some `DD-MM-YY`, one row in `YYYY-MM-DD`)
+- Six different spellings of "Amazon" across the merchant column
+- Trailing whitespace + non-breaking spaces in the description column
+- Three obvious duplicate transactions and two non-obvious ones (different timestamps, same amount + merchant)
+- A totals row at the bottom that's not a transaction
+- One row with currency in `€` instead of `$`
+
+Drop it into DataTools, click **"Run all"** in the analyzer, and look at the gate report. It'll catch all of the above and tell you exactly what changed and why.
+
+The audit trail (a sidecar CSV called `<filename>.audit.csv`) is the part most bookkeepers are surprised by. Open it in Excel: every change has its own row with the original value, new value, rule that fired, confidence score, and timestamp. That's the file you hand to your client when they ask "wait, why did you re-classify that?"
+
+Try it once on the sample, then once on a real client export. Reply and tell me what it caught (or missed) — I'm building the v1.1 detector list from real-world feedback.
+
+— Michael
+{{support_email}}
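
Review note on `01-day1.md`: the "trailing whitespace + non-breaking spaces" case is easy to reproduce when checking the sample file by hand. A minimal sketch of that one cleanup, assuming plain Python strings read from the CSV; `clean_description` is a hypothetical helper for illustration, not DataTools' actual rule.

```python
def clean_description(text):
    """Trim the ends and collapse inner runs of whitespace to one space.

    str.split() with no arguments splits on any Unicode whitespace,
    which includes the non-breaking space (U+00A0).
    """
    return " ".join(text.split())


# Hypothetical messy value: leading NBSP, an NBSP between words, trailing spaces.
messy = "\u00a0 AMZN\u00a0Mktp  US  "
cleaned = clean_description(messy)
print(repr(cleaned))
```

Running the sample file's description column through something like this before and after the pipeline is a quick way to confirm the gate report's whitespace count.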
--- a/marketing/emails/bookkeeper/02-day3.md
+++ b/marketing/emails/bookkeeper/02-day3.md
@@ -0,0 +1,35 @@
+# Bookkeeper · Day 3 — The audit trail your client will actually open
+
+**Subject:** The audit trail your client will actually open
+**Send:** Day 3
+**Goal:** deepen feature understanding around the audit trail (the
+real differentiator vs. spreadsheet workflow)
+
+---
+
+Hi {{first_name}},
+
+Most "data cleaning" tools spit out a clean file and call it done. The thing your *client* needs — and what protects you in a year when they ask "why did you change that?" — is the audit trail.
+
+Here's the file DataTools writes alongside every cleaned export. It's a CSV called `<filename>.audit.csv` and it sits next to the cleaned file in your output folder.
+
+Five columns, append-only:
+
+| original_value | new_value | rule_applied | confidence | timestamp |
+|----------------|-----------|--------------|------------|-----------|
+| `AMZN Mktp` | `Amazon` | `merchant_canonicalize` | 0.94 | 2026-05-04T09:12:03 |
+| `  Starbucks  ` | `Starbucks` | `whitespace_strip` | 1.00 | 2026-05-04T09:12:03 |
+| `01/02/26` | `2026-02-01` | `date_normalize_dmy` | 0.88 | 2026-05-04T09:12:03 |
+
+Why this matters in a real client conversation:
+
+- **The client asks "why is this Amazon when my statement says AMZN Mktp?"** — open the audit CSV, point at the `merchant_canonicalize` row. Done in 10 seconds.
+- **A reviewer (auditor, accountant, you in 6 months) asks "what changed?"** — the audit CSV is the answer. Diffable, openable in Excel, no proprietary format.
+- **You spot a wrong rule firing** — the `confidence` column tells you which rules to tune. Anything <0.90 is worth eyeballing.
+
+One workflow change worth making: when you send the cleaned file to QuickBooks, send the audit CSV to the client at the same time, in a folder labeled "month-end audit trail". Most clients won't open it. The 10% that do will trust you forever.
+
+Reply if you want me to walk through the audit format on a call — happy to do a quick screen-share for any buyer in the first 30 days.
+
+— Michael
+{{support_email}}
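
Review note on `02-day3.md`: the "anything <0.90 is worth eyeballing" tip can be checked with a few lines of standard-library Python. This is a sketch under the assumption that the audit CSV has exactly the five columns shown in the email's table; `low_confidence_rows` and the in-memory sample are hypothetical, not part of DataTools.

```python
import csv
import io


def low_confidence_rows(audit_file, threshold=0.90):
    """Return the audit rows whose confidence is below the threshold."""
    reader = csv.DictReader(audit_file)
    return [row for row in reader if float(row["confidence"]) < threshold]


# Sample data mirroring the three rows in the email's table.
SAMPLE = """original_value,new_value,rule_applied,confidence,timestamp
AMZN Mktp,Amazon,merchant_canonicalize,0.94,2026-05-04T09:12:03
  Starbucks  ,Starbucks,whitespace_strip,1.00,2026-05-04T09:12:03
01/02/26,2026-02-01,date_normalize_dmy,0.88,2026-05-04T09:12:03
"""

flagged = low_confidence_rows(io.StringIO(SAMPLE))
for row in flagged:
    print(row["rule_applied"], row["original_value"], "->", row["new_value"])
```

With a real file, `open("<filename>.audit.csv", newline="")` would replace the `io.StringIO` sample.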
--- a/marketing/emails/bookkeeper/03-day7.md
+++ b/marketing/emails/bookkeeper/03-day7.md
@@ -0,0 +1,32 @@
+# Bookkeeper · Day 7 — One pipeline, every client, every month
+
+**Subject:** One pipeline, every client, every month
+**Send:** Day 7
+**Goal:** reframe from one-off tool to monthly workflow
+
+---
+
+Hi {{first_name}},
+
+A week in. By now you've probably run DataTools on 1-2 client exports and confirmed it does what the landing page promised.
+
+The thing buyers tell me they wish they'd done from day one: **set it up as a workflow, not a one-off.**
+
+The pattern that works:
+
+**1. Make a folder per client.** Inside each client folder, a subfolder per month: `Acme Co/2026-05/`. Drop the raw export here.
+
+**2. Save your DataTools settings as a per-client preset.** The "Save settings" button in the analyzer drops a `.datatools-preset.json` file. Stash that in the client folder. Next month, load the preset and the analyzer pre-configures with the rules you tuned for that client (e.g., your "Amazon Marketplace" canonical name, your client's specific merchant aliases).
+
+**3. Run the pipeline. Get three files back:** the cleaned CSV, the audit CSV, the gate report. Move them into `Acme Co/2026-05/cleaned/`.
+
+**4. Import the cleaned CSV to QuickBooks. Email the audit CSV to the client.**
+
+Total elapsed time per client per month, after the first: 3-5 minutes. The first month per client is longer (~15 min) because you're tuning the preset.
+
+The buyers who do this are the ones still emailing me 3 months later — usually with feature requests for the next client they want to onboard. The buyers who only ever run it ad-hoc tend to drift back to spreadsheets within 2 months.
+
+If you want, reply with a sanitized export and I'll show you what your starting preset should look like — happy to do this for the first 50 buyers.
+
+— Michael
+{{support_email}}
--- a/marketing/emails/bookkeeper/04-day14.md
+++ b/marketing/emails/bookkeeper/04-day14.md
@@ -0,0 +1,35 @@
+# Bookkeeper · Day 14 — Two-minute trick: the gate report
+
+**Subject:** Two-minute trick: the gate report
+**Send:** Day 14
+**Goal:** surface the gate tool — non-obvious, high-value once seen
+
+---
+
+Hi {{first_name}},
+
+The tool inside DataTools that buyers find last is the **gate** — and it's the one that quietly does the most for you.
+
+What it does: before any row gets written to the cleaned CSV, the gate runs a per-row pass-through check. Rows that fail get *quarantined* into a separate file (`<filename>.quarantine.csv`) instead of being silently dropped or silently passed through.
+
+Default rules (you can add your own):
+
+- Missing required fields (date, amount)
+- Amount in unexpected currency without a flag
+- Date outside the export's stated range (catches the "totals row" issue from Day 1)
+- Duplicate of another row already in the file (per the dedupe pass)
+- Confidence below your threshold on a field that got auto-corrected
+
+The 2-minute workflow:
+
+1. Run the pipeline as usual.
+2. Open `<filename>.quarantine.csv`. (It'll be tiny — typically 0-5% of rows.)
+3. Eyeball it. Anything that's a real transaction, fix-and-re-include manually. Anything that's a totals row / blank row / corrupt row — confirm it's correctly quarantined and delete it.
+4. Re-run the pipeline on the fixed-up version (or just append the manually-fixed rows to the cleaned CSV).
+
+The reason this matters: silent drops are the worst possible failure mode for a bookkeeper. You'd rather a row come out wrong (you'll catch it on review) than disappear (you won't catch it for months). The gate makes the silent-drop case impossible.
+
+Set the gate's confidence threshold to `0.85` for client work. Lower (0.75) for personal / exploratory; higher (0.92+) only if you've spent time tuning your client's preset.
+
+— Michael
+{{support_email}}
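
Review note on `04-day14.md`: the key property the email promises is that every input row ends up either in the cleaned output or in quarantine, never nowhere. A minimal sketch of that idea, covering only three of the five default rules listed; the function name `gate`, the reason strings, and the row layout are all assumptions for illustration, not DataTools' implementation.

```python
from datetime import date

REQUIRED_FIELDS = ("date", "amount")


def gate(rows, start, end, min_confidence=0.85):
    """Split rows into (passed, quarantined); no row is ever dropped.

    Rows are dicts of strings as read from a CSV, plus an optional
    float "confidence" for fields that were auto-corrected.
    """
    passed, quarantined = [], []
    for row in rows:
        reason = None
        if any(not row.get(field) for field in REQUIRED_FIELDS):
            reason = "missing_required_field"
        elif not (start <= date.fromisoformat(row["date"]) <= end):
            reason = "date_out_of_range"
        elif row.get("confidence", 1.0) < min_confidence:
            reason = "low_confidence"

        if reason is None:
            passed.append(row)
        else:
            quarantined.append({**row, "quarantine_reason": reason})
    return passed, quarantined


rows = [
    {"date": "2026-04-03", "amount": "12.50", "confidence": 0.97},
    {"date": "", "amount": "4310.22"},                             # totals row, no date
    {"date": "2026-04-09", "amount": "8.00", "confidence": 0.80},  # shaky auto-correction
    {"date": "2026-03-30", "amount": "19.99"},                     # before the export range
]
passed, quarantined = gate(rows, date(2026, 4, 1), date(2026, 4, 30))

# The invariant from the email: nothing disappears.
assert len(passed) + len(quarantined) == len(rows)
```

Recording a reason on each quarantined row is a design choice in this sketch; it makes the manual review in step 3 faster because the reviewer can sort by reason.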
--- a/marketing/emails/bookkeeper/05-day30.md
+++ b/marketing/emails/bookkeeper/05-day30.md
@@ -0,0 +1,26 @@
+# Bookkeeper · Day 30 — Heard from a fellow bookkeeper?
+
+**Subject:** Heard from a fellow bookkeeper?
+**Send:** Day 30
+**Goal:** referral / review ask. Last touch in the sequence.
+
+---
+
+Hi {{first_name}},
+
+A month in. If DataTools earned its $49 — would you do me one (very small) favor?
+
+**Pick one of these. Whichever is easiest.**
+
+1. **Gumroad review** (60 seconds): {{download_url}}#reviews — even a single line helps the next bookkeeper trust the listing enough to click "buy".
+2. **Reply to this email with one sentence I can quote** on the bookkeeper landing page. Anonymous if you prefer; I'll never use a name without explicit permission.
+3. **Share the landing page** with one bookkeeper friend who'd benefit: {{landing_page}}. No referral commission scheme, just a link.
+
+If DataTools *didn't* earn its $49 — also reply. Tell me what's missing or what's broken. The 30-day refund window is still open and I'd rather refund a buyer who didn't get value than have an unhappy customer in the wild.
+
+Either way, this is the last automated email you'll get from me. After this you'll only hear from me when there's a v1.x update or when you reply to one of the previous emails.
+
+Thanks for being an early buyer — the first 50 customers shape the next 5,000.
+
+— Michael
+{{support_email}}