feat: Tier B operator scaffolding — bundle, copy SoT, posts, emails
Pick up and finish yesterday's cut-off Tier B pass.

- build/: PyInstaller scaffold (datatools.spec + launcher.py + hook-streamlit.py + README); folder-mode bundle, locked 127.0.0.1, per-OS recipe
- marketing/COPY.md: single source of truth for every customer-facing string (landing H1/sub/CTAs, demo CTAs, email subjects, Gumroad listing, banned phrases)
- marketing/community-posts/: 9 drafts (3 posts × 3 niches: bookkeeper, revops, shopify-pet), covering story / tip / soft-offer
- marketing/emails/: 18 drafts (Gumroad delivery + 5-touch onboarding × 3 niches), per-niche segmentation guidance
- docs/NEXT-STEPS.md: flip 2.2 / 2.4 / 3.1 / 3.4 to done with pointers to the new assets; add Phase 0 inventory rows
- .gitignore: narrow `build/` ignore so PyInstaller spec + launcher + hooks get tracked, only generated artifacts (build/build/, build/__pycache__/, build/dist/) stay ignored

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
60
marketing/emails/README.md
Normal file
@@ -0,0 +1,60 @@
# Email sequences

Per niche (`bookkeeper/`, `revops/`, `shopify-pet/`):

- **`00-delivery.md`**: Day 0 Gumroad delivery email. Triggered when
  Gumroad confirms the purchase. Job #1: get the buyer to download
  and open the app inside the first 24h. Buyers who don't open within
  72h refund at ~3× the rate of buyers who do.
- **`01-day1.md`**: Day 1 nudge with a sample file matched to the
  niche. The Day-1 email is the highest-leverage one in the
  sequence; it converts "I bought it" into "I used it".
- **`02-day3.md`**: Day 3 deep-dive on one specific feature the
  niche cares about most.
- **`03-day7.md`**: Day 7 workflow framing. "Use it every {month /
  campaign / sync}, not as a one-off."
- **`04-day14.md`**: Day 14 power-user tip. Surfaces a non-obvious
  feature; converts "I use it" into "I rely on it".
- **`05-day30.md`**: Day 30 referral / review ask.

## Sender setup

- **From:** `support@datatools.app` (single-sender to keep replies in
  one inbox; don't fan out to per-niche aliases until volume warrants)
- **Reply-To:** same; every email expects a reply pathway
- **List provider:** Gumroad's built-in for delivery; Buttondown or
  ConvertKit for the 5-touch sequence (Gumroad's drip is too crude
  for niche segmentation)
- **Segmentation:** customers self-tag at checkout (Gumroad custom
  field "What do you do?"). Map: `bookkeeper`, `revops`,
  `shopify-pet`, `other`. `other` gets a generic sequence (not
  drafted yet; Tier C).

## Variables

All emails use these placeholders. Set them at sequence-import time,
not per-email:

- `{{first_name}}`: Gumroad provides; fall back to "there" if blank
- `{{download_url}}`: niche-specific download URL from Gumroad
- `{{sample_file_url}}`: niche-specific sample CSV (`samples/demo/...`)
- `{{landing_page}}`: niche-specific landing page URL
- `{{support_email}}`: `support@datatools.app`
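
As a rough illustration of the fallback rule, here is a minimal Python sketch of placeholder substitution. The `render` helper and the sample values are hypothetical; in practice the list provider does the substitution at import time, and only the "there" fallback comes from this README.

```python
import re

# Hypothetical defaults; only the "there" fallback is specified above.
FALLBACKS = {"first_name": "there"}

def render(template: str, values: dict) -> str:
    """Replace {{name}} placeholders, falling back when a value is blank."""
    def substitute(match: re.Match) -> str:
        key = match.group(1)
        value = (values.get(key) or "").strip()
        if value:
            return value
        if key in FALLBACKS:
            return FALLBACKS[key]
        # Fail loudly rather than send an email with a hole in it.
        raise KeyError(f"no value for placeholder {key!r}")
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)

print(render("Hi {{first_name}},", {"first_name": ""}))  # Hi there,
```

Raising on any other missing placeholder is a design choice for the sketch: a blank `{{download_url}}` in a delivery email is worse than a failed import.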

## Cadence and quiet rules

- Don't send between 10pm and 7am buyer-local time (Buttondown supports
  TZ-aware send; ConvertKit doesn't out of the box)
- If the buyer replies to *any* email in the sequence, pause the
  remaining touches until you've replied to them. A drip that
  ignores a customer reply reads as worse than no drip.
- If the buyer requests a refund, kill the sequence immediately.
- Day 14 + Day 30 emails are skippable if the buyer has already
  emailed support with a feature request or bug report: they're
  engaged enough; don't pile on.
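
If the provider can't do TZ-aware sends, the quiet-hours rule is simple enough to check yourself. A minimal Python sketch, assuming you already have the buyer's local time; the function names and the fixed UTC-5 offset in the example are illustrative only.

```python
from datetime import datetime, time, timedelta, timezone

QUIET_START = time(22, 0)  # 10pm buyer-local
QUIET_END = time(7, 0)     # 7am buyer-local

def in_quiet_hours(local_dt: datetime) -> bool:
    """True if the buyer-local time falls in the 10pm-7am window."""
    t = local_dt.time()
    # The window wraps past midnight, so it is "late OR early", not a range.
    return t >= QUIET_START or t < QUIET_END

def next_allowed_send(local_dt: datetime) -> datetime:
    """Return local_dt unchanged, or push it to 7am if it is in quiet hours."""
    if not in_quiet_hours(local_dt):
        return local_dt
    target = local_dt.replace(hour=7, minute=0, second=0, microsecond=0)
    if local_dt.time() >= QUIET_START:
        target += timedelta(days=1)  # late evening: wait for tomorrow morning
    return target

buyer_tz = timezone(timedelta(hours=-5))  # assumed fixed offset for the example
late = datetime(2026, 5, 4, 23, 30, tzinfo=buyer_tz)
print(next_allowed_send(late))  # 2026-05-05 07:00:00-05:00
```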

## Subject lines

Subjects are owned by `marketing/COPY.md` § 4. Don't edit subjects
in-line in the email files; edit COPY.md and re-propagate. Same
discipline applies to the closing CTA, owned by COPY.md § 0.

34
marketing/emails/bookkeeper/00-delivery.md
Normal file
@@ -0,0 +1,34 @@
# Bookkeeper · Day 0 — Delivery email

**Subject:** Your DataTools download (start here)
**Send:** immediately on Gumroad purchase confirmation
**Goal:** buyer downloads + opens the app within 24h

---

Hi {{first_name}},

Thanks for buying DataTools. Your download:

→ **{{download_url}}**

Three things to do in the next 5 minutes so you don't lose this email under the next 200:

**1. Download the installer for your OS** (Mac `.dmg`, Windows `.exe`, or Linux `.tar.gz`). About 280 MB. The link above auto-detects your OS.

**2. Run it.** First launch takes ~5 seconds; a browser tab opens to `127.0.0.1:8501`. That's the app, running locally on your machine with no network calls. If your browser doesn't open automatically, the terminal window shows the URL.

**3. Drop in a real bank export.** Don't bother with the bundled samples; DataTools is built for messy real-world files. Pull last month's bank export from any client, drag it into the analyzer, and click "Run all". You'll see what the pipeline catches in about 20 seconds.

If something doesn't work: just reply to this email. I read every reply (it goes to my own inbox, not a queue).

If you want a refund: also just reply. 30-day no-questions; no form to fill out.

Tomorrow I'll send a sample bank export with a few of the tricky cases pre-built in, so you can see what the gate report looks like on a known input. After that you'll get four more emails over the next month (Days 3, 7, 14, and 30), mostly one tip each. Feel free to unsubscribe at the bottom of any of them.

Welcome aboard.

— Michael
{{support_email}}

P.S. If you have a bookkeeper friend who'd find this useful, the share-friendly landing page is {{landing_page}}.
31
marketing/emails/bookkeeper/01-day1.md
Normal file
@@ -0,0 +1,31 @@
# Bookkeeper · Day 1 — Try it on this messy bank export first

**Subject:** Try it on this messy bank export first
**Send:** Day 1, ~9am buyer-local-time
**Goal:** convert "I bought it" → "I ran it on something"

---

Hi {{first_name}},

Yesterday's email had your download. Today's email has a *file*: a sample bank export I built specifically to break things.

→ **{{sample_file_url}}** (260 KB CSV, 1,400 rows of synthetic data, no real account info)

It's modeled after real exports I've seen from US, UK, and Canadian banks. Hidden in there:

- Mixed date formats (some `MM/DD/YYYY`, some `DD-MM-YY`, one row in `YYYY-MM-DD`)
- Six different spellings of "Amazon" across the merchant column
- Trailing whitespace + non-breaking spaces in the description column
- Three obvious duplicate transactions and two non-obvious ones (different timestamps, same amount + merchant)
- A totals row at the bottom that's not a transaction
- One row with currency in `€` instead of `$`

Drop it into DataTools, click **"Run all"** in the analyzer, and look at the gate report. It'll catch all of the above and tell you exactly what changed and why.

The audit trail (a sidecar CSV called `<filename>.audit.csv`) is the part most bookkeepers are surprised by. Open it in Excel: every change has a row with the original value, new value, rule that fired, confidence, and timestamp. That's the file you hand to your client when they ask "wait, why did you re-classify that?".

Try it once on the sample, then once on a real client export. Reply and tell me what it caught (or missed). I'm building the v1.1 detector list from real-world feedback.

— Michael
{{support_email}}
35
marketing/emails/bookkeeper/02-day3.md
Normal file
@@ -0,0 +1,35 @@
# Bookkeeper · Day 3 — The audit trail your client will actually open

**Subject:** The audit trail your client will actually open
**Send:** Day 3
**Goal:** deepen feature understanding around the audit trail (the
real differentiator vs. spreadsheet workflow)

---

Hi {{first_name}},

Most "data cleaning" tools spit out a clean file and call it done. The thing your *client* needs, and what protects you in a year when they ask "why did you change that?", is the audit trail.

Here's the file DataTools writes alongside every cleaned export. It's a CSV called `<filename>.audit.csv` and it sits next to the cleaned file in your output folder.

Five columns, append-only:

| original_value | new_value | rule_applied | confidence | timestamp |
|----------------|-----------|--------------|------------|-----------|
| `AMZN Mktp` | `Amazon` | `merchant_canonicalize` | 0.94 | 2026-05-04T09:12:03 |
| ` Starbucks ` | `Starbucks` | `whitespace_strip` | 1.00 | 2026-05-04T09:12:03 |
| `01/02/26` | `2026-02-01` | `date_normalize_dmy` | 0.88 | 2026-05-04T09:12:03 |

Why this matters in a real client conversation:

- **The client asks "why is this Amazon when my statement says AMZN Mktp?"** Open the audit CSV, point at the `merchant_canonicalize` row. Done in 10 seconds.
- **A reviewer (auditor, accountant, you in 6 months) asks "what changed?"** The audit CSV is the answer. Diffable, openable in Excel, no proprietary format.
- **You spot a wrong rule firing.** The `confidence` column tells you which rules to tune. Anything <0.90 is worth eyeballing.
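
If you'd rather script that last check than scroll in Excel, here's a minimal Python sketch. It assumes the five column names shown above; the inline sample data and the `low_confidence_rows` helper are mine for illustration, not part of DataTools.

```python
import csv
import io

# Inline sample standing in for a real <filename>.audit.csv
SAMPLE_AUDIT = """original_value,new_value,rule_applied,confidence,timestamp
AMZN Mktp,Amazon,merchant_canonicalize,0.94,2026-05-04T09:12:03
 Starbucks ,Starbucks,whitespace_strip,1.00,2026-05-04T09:12:03
01/02/26,2026-02-01,date_normalize_dmy,0.88,2026-05-04T09:12:03
"""

def low_confidence_rows(audit_file, threshold=0.90):
    """Return audit rows whose confidence is below the threshold."""
    reader = csv.DictReader(audit_file)
    return [row for row in reader if float(row["confidence"]) < threshold]

for row in low_confidence_rows(io.StringIO(SAMPLE_AUDIT)):
    print(row["rule_applied"], row["original_value"], "->", row["new_value"])
# date_normalize_dmy 01/02/26 -> 2026-02-01
```

To run it on a real file, pass an opened audit CSV (for example `open(path, newline="")`) in place of the `StringIO` sample.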

One workflow change worth making: when you import the cleaned file into QuickBooks, send the audit CSV to the client at the same time, in a folder labeled "month-end audit trail". Most clients won't open it. The 10% that do will trust you forever.

Reply if you want me to walk through the audit format on a call. I'm happy to do a quick screen-share for any buyer in the first 30 days.

— Michael
{{support_email}}
32
marketing/emails/bookkeeper/03-day7.md
Normal file
@@ -0,0 +1,32 @@
# Bookkeeper · Day 7 — One pipeline, every client, every month

**Subject:** One pipeline, every client, every month
**Send:** Day 7
**Goal:** reframe from one-off tool to monthly workflow

---

Hi {{first_name}},

A week in. By now you've probably run DataTools on 1-2 client exports and confirmed it does what the landing page promised.

The thing buyers tell me they wish they'd done from day one: **set it up as a workflow, not a one-off.**

The pattern that works:

**1. Make a folder per client.** Inside each client folder, a subfolder per month: `Acme Co/2026-05/`. Drop the raw export there.

**2. Save your DataTools settings as a per-client preset.** The "Save settings" button in the analyzer writes a `.datatools-preset.json` file. Stash that in the client folder. Next month, load the preset and the analyzer pre-configures with the rules you tuned for that client (e.g., your "Amazon Marketplace" canonical name, your client's specific merchant aliases).

**3. Run the pipeline. Get three files back:** the cleaned CSV, the audit CSV, the gate report. Move them into `Acme Co/2026-05/cleaned/`.

**4. Import the cleaned CSV to QuickBooks. Email the audit CSV to the client.**

Total elapsed time per client per month, after the first: 3-5 minutes. The first month per client is longer (~15 min) because you're tuning the preset.

The buyers who do this are the ones still emailing me 3 months later, usually with feature requests for the next client they want to onboard. The buyers who only ever run it ad-hoc tend to drift back to spreadsheets within 2 months.

If you want, reply with a sanitized export and I'll show you what your starting preset should look like. I'm happy to do this for the first 50 buyers.

— Michael
{{support_email}}
35
marketing/emails/bookkeeper/04-day14.md
Normal file
@@ -0,0 +1,35 @@
# Bookkeeper · Day 14 — Two-minute trick: the gate report

**Subject:** Two-minute trick: the gate report
**Send:** Day 14
**Goal:** surface the gate tool (non-obvious, high-value once seen)

---

Hi {{first_name}},

The tool inside DataTools that buyers find last is the **gate**, and it's the one that quietly does the most for you.

What it does: before any row gets written to the cleaned CSV, the gate runs a per-row pass/fail check. Rows that fail get *quarantined* into a separate file (`<filename>.quarantine.csv`) instead of silently dropped or silently passed.

Default rules (you can add your own):

- Missing required fields (date, amount)
- Amount in unexpected currency without a flag
- Date outside the export's stated range (catches the "totals row" issue from Day 1)
- Duplicate of another row already in the file (per the dedupe pass)
- Confidence below your threshold on a field that got auto-corrected

The 2-minute workflow:

1. Run the pipeline as usual.
2. Open `<filename>.quarantine.csv`. (It'll be tiny, typically 0-5% of rows.)
3. Eyeball it. Anything that's a real transaction, fix and re-include manually. Anything that's a totals row / blank row / corrupt row: confirm it's correctly quarantined and delete it.
4. Re-run the pipeline on the fixed-up version (or just append the manually-fixed rows to the cleaned CSV).
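
For the curious, the idea behind the gate fits in a few lines of Python. This is an illustration of quarantine-instead-of-drop, not DataTools' actual code: the `gate` function, the field names, and the two rules shown are assumptions.

```python
REQUIRED_FIELDS = ("date", "amount")  # assumed field names

def gate(rows):
    """Split rows into (passed, quarantined); nothing is ever discarded."""
    passed, quarantined = [], []
    seen = set()
    for row in rows:
        missing = [f for f in REQUIRED_FIELDS if not (row.get(f) or "").strip()]
        key = (row.get("date"), row.get("amount"), row.get("merchant"))
        if missing:
            quarantined.append({**row, "reason": "missing " + ", ".join(missing)})
        elif key in seen:
            quarantined.append({**row, "reason": "duplicate row"})
        else:
            seen.add(key)
            passed.append(row)
    return passed, quarantined

rows = [
    {"date": "2026-05-01", "amount": "12.50", "merchant": "Starbucks"},
    {"date": "2026-05-01", "amount": "12.50", "merchant": "Starbucks"},
    {"date": "", "amount": "4310.22", "merchant": "TOTAL"},
]
passed, quarantined = gate(rows)
print(len(passed), [q["reason"] for q in quarantined])
# 1 ['duplicate row', 'missing date']
```

The property worth noticing is that `len(passed) + len(quarantined)` always equals the number of input rows, and every quarantined row carries its reason.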

The reason this matters: silent drops are the worst possible failure mode for a bookkeeper. You'd rather a row come out wrong (you'll catch it on review) than disappear (you won't catch it for months). With the gate, every row ends up in either the cleaned CSV or the quarantine file, so nothing disappears silently.

Set the gate's confidence threshold to `0.85` for client work. Go lower (0.75) for personal / exploratory work; go higher (0.92+) only if you've spent time tuning your client's preset.

— Michael
{{support_email}}
26
marketing/emails/bookkeeper/05-day30.md
Normal file
@@ -0,0 +1,26 @@
# Bookkeeper · Day 30 — Heard from a fellow bookkeeper?

**Subject:** Heard from a fellow bookkeeper?
**Send:** Day 30
**Goal:** referral / review ask. Last touch in the sequence.

---

Hi {{first_name}},

A month in. If DataTools earned its $49, would you do me one (very small) favor?

**Pick one of these. Whichever is easiest.**

1. **Gumroad review** (60 seconds): {{download_url}}#reviews (even a single line helps the next bookkeeper trust the listing enough to click "buy").
2. **Reply to this email with one sentence I can quote** on the bookkeeper landing page. Anonymous if you prefer; I'll never use a name without explicit permission.
3. **Share the landing page** with one bookkeeper friend who'd benefit: {{landing_page}}. No referral commission scheme, just a link.

If DataTools *didn't* earn its $49, also reply. Tell me what's missing or what's broken. The 30-day refund window is still open and I'd rather refund a buyer who didn't get value than have an unhappy customer in the wild.

Either way, this is the last automated email you'll get from me. After this you'll only hear from me when there's a v1.x update or if you reply to one of the previous emails.

Thanks for being an early buyer. The first 50 customers shape the next 5,000.

— Michael
{{support_email}}
34
marketing/emails/revops/00-delivery.md
Normal file
@@ -0,0 +1,34 @@
# RevOps · Day 0 — Delivery email

**Subject:** Your DataTools download (start here)
**Send:** immediately on Gumroad purchase confirmation
**Goal:** download + first run within 24h

---

Hi {{first_name}},

Thanks for buying DataTools. Your download:

→ **{{download_url}}**

Three things to do in the next 5 minutes:

**1. Download the installer for your OS** (Mac `.dmg`, Windows `.exe`, or Linux `.tar.gz`). About 280 MB. The link auto-detects your OS.

**2. Run it.** First launch takes ~5 seconds; a browser tab opens to `127.0.0.1:8501`. That's the app, running locally on your machine. No data leaves the box. (Yes, even if you're on the corporate VPN. Especially then.)

**3. Drop in a real lead list.** Don't bother with the bundled samples; the gate report only gets interesting when the data is real. Pull last quarter's webform export, or your most recent Apollo / LinkedIn pull, drag it into the analyzer, and click **"Run all"**. You'll see what the dedupe + format pipeline does in about 30 seconds.

If something doesn't work: just reply. I read every reply.

Refund: also just reply. 30-day no-questions; no form.

Tomorrow I'll send a sample 3-vendor lead list (HubSpot + LinkedIn + Apollo, synthetic data) so you can see the dedupe confidence tiers in action on a known input. After that you'll get four more emails over the next month (Days 3, 7, 14, and 30): practical tips, no upsell. Unsubscribe at the bottom of any of them.

Welcome aboard.

— Michael
{{support_email}}

P.S. If you have a RevOps friend who'd find this useful: {{landing_page}}.
36
marketing/emails/revops/01-day1.md
Normal file
@@ -0,0 +1,36 @@
# RevOps · Day 1 — Try it on this 3-vendor lead list first

**Subject:** Try it on this 3-vendor lead list first
**Send:** Day 1, ~9am buyer-local-time

---

Hi {{first_name}},

Yesterday's email had your download. Today's email has a *file*: a synthetic 3-vendor lead list (HubSpot + LinkedIn scrape + Apollo pull) that I built specifically to break naive dedupe.

→ **{{sample_file_url}}** (1.2 MB CSV, 4,800 rows, fully synthetic, no real prospects)

What's hidden in there:

- The same person from 3 sources, with intentionally inconsistent fields:
  - HubSpot row: full email + company; no LinkedIn URL
  - LinkedIn row: name + title + LinkedIn URL; no email
  - Apollo row: email + phone + company; misspelled name
- ~120 obvious duplicates (same email, different case)
- ~80 cross-source duplicates (different keys, same person; these are the ones HubSpot's native dedupe misses)
- ~40 phone numbers in 5 different formats per country (+1, +44, +61)
- One row in every 200 with a hidden zero-width space in the email

Drop it into DataTools, click **"Run all"** in the analyzer, then run the **dedupe** tool with the default 0.85 threshold.

Look at three things in the output:

1. **The cleaned CSV**: what your import would look like
2. **The audit CSV**: every change, every rule, confidence per change
3. **The manual-review queue** (`<filename>.review.csv`): the 0.85-0.95 confidence range. This is where the real dedupe value is; auto-merging this range is what gets people in trouble.

Try it once on the sample, then once on a real list. Reply and tell me what it caught (or missed). The v1.1 fuzzy-matching tuning comes from real-world feedback.

— Michael
{{support_email}}
36
marketing/emails/revops/02-day3.md
Normal file
@@ -0,0 +1,36 @@
# RevOps · Day 3 — The dedupe rule that catches LinkedIn drift

**Subject:** The dedupe rule that catches LinkedIn drift
**Send:** Day 3
**Goal:** deepen feature understanding around the cross-source dedupe

---

Hi {{first_name}},

The thing native HubSpot / Salesforce dedupe can't do, and the thing DataTools is actually best at: **cross-source matching**, where the same person shows up via LinkedIn, a webform, and a trade-show import, with no shared key.

The rule that does the work is in the dedupe tool's **"Block by domain, fuzzy on name+title"** mode. Here's what it does:

**Step 1: Block.** Group rows by email domain. (LinkedIn rows with no email get bucketed by `domain(linkedin_url)`, usually their company website if they listed it.) This avoids the O(n²) explosion and rules out cross-company false positives.

**Step 2: Within each block, fuzzy-match on `first_name + last_name + title`.** Token-set ratio at 0.85 default. Catches:

- "Sarah O'Brien, VP Marketing" = "sarah obrien, vp of marketing"
- "Mike Chen, Head of Sales" = "Michael Chen, Sales Lead" (this one needs a 0.78 threshold; configurable)
- "J. Smith, Director" = "Jane Smith, Director" (only with a strong company-name match)

**Step 3: Confidence-tier the merge.** ≥0.95 auto-merges. 0.85-0.95 goes to `<filename>.review.csv` for you to eyeball. <0.85 stays unmerged.

**Step 4: Field-precedence on merge.** When records merge, you choose which source wins per field. Default precedence (configurable):

- `title`, `company`, `linkedin_url` → LinkedIn wins (more recent)
- `email`, `phone` → Webform wins (verified)
- `lifecycle_stage`, `owner` → HubSpot wins (your CRM is canonical)

**One trap to avoid:** don't run dedupe before format standardization. If phone formats are inconsistent across sources, the dedupe tool sees "+14155550143" and "(415) 555-0143" as different keys. Always run **format → analyzer → dedupe → gate** in that order. The pipeline UI enforces this; the per-tool runs don't.
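
If it helps to see the shape of Steps 1 and 2, here's a rough Python sketch. It's a simplified illustration, not the shipped matcher: it uses the standard-library `difflib` ratio on sorted tokens as a stand-in for token-set ratio, and the field names and the `candidate_pairs` helper are assumptions.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations
import re

def normalize(text: str) -> str:
    """Lowercase, drop punctuation, sort tokens so word order doesn't matter."""
    tokens = re.sub(r"[^a-z0-9 ]", "", text.lower()).split()
    return " ".join(sorted(tokens))

def similarity(a: dict, b: dict) -> float:
    """Stand-in score in [0, 1] on first_name + last_name + title."""
    def key(r):
        return normalize(f"{r['first_name']} {r['last_name']} {r['title']}")
    return SequenceMatcher(None, key(a), key(b)).ratio()

def candidate_pairs(rows):
    """Block rows by domain, then score every pair inside each block."""
    blocks = defaultdict(list)
    for row in rows:
        blocks[row["domain"]].append(row)
    for block in blocks.values():
        # Pairwise comparison stays cheap because blocks are small.
        for a, b in combinations(block, 2):
            yield a, b, similarity(a, b)

rows = [
    {"first_name": "Sarah", "last_name": "O'Brien", "title": "VP Marketing", "domain": "acme.com"},
    {"first_name": "sarah", "last_name": "obrien", "title": "vp of marketing", "domain": "acme.com"},
    {"first_name": "Sarah", "last_name": "O'Brien", "title": "VP Marketing", "domain": "other.io"},
]
for a, b, score in candidate_pairs(rows):
    print(a["domain"], round(score, 2))
```

Only the two `acme.com` rows get compared; the `other.io` row never meets them, which is how blocking rules out cross-company false positives.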

Reply if you want me to walk through the precedence config on a screen-share. I'm happy to do this for any buyer in the first 30 days.

— Michael
{{support_email}}
34
marketing/emails/revops/03-day7.md
Normal file
@@ -0,0 +1,34 @@
# RevOps · Day 7 — Run it before every HubSpot import

**Subject:** Run it before every HubSpot import
**Send:** Day 7
**Goal:** reframe from one-off tool to per-campaign workflow

---

Hi {{first_name}},

A week in. By now you've probably run DataTools on a real list once or twice and confirmed the dedupe catches more than HubSpot's native check.

The thing that turns DataTools from a one-off purchase into a monthly cost saver: **make it the gate on every import.**

The pattern that works:

**1. One DataTools run per campaign source.** Webform pull → DataTools. LinkedIn scrape → DataTools. Apollo export → DataTools. Each run produces a "clean" CSV.

**2. Concatenate the cleaned CSVs.** Standard pandas `concat` or just paste in Excel.

**3. One more DataTools run on the concatenation.** This is the cross-source dedupe pass, the one that catches the same person across the three sources.

**4. Compare against your current HubSpot export.** Running DataTools' dedupe with your existing CRM export as the second source catches the people you already paid for last quarter and don't need to import again.

**5. Import only the residue** (the rows that survived every pass) into HubSpot.

The buyers running this pipeline tell me they've cut their HubSpot marketing-contact bill 15-25% within two months. Not because their pipeline got smaller, but because they stopped paying for duplicates.

**One thing to set up once:** save your dedupe settings as a `.datatools-preset.json` and commit it to your RevOps team's repo (or a shared Drive folder). Same preset every campaign means consistent results across whoever's running it that week.

If you want, reply with a sanitized lead list and I'll suggest a starting preset for your sources. I'm happy to do this for the first 50 buyers.

— Michael
{{support_email}}
34
marketing/emails/revops/04-day14.md
Normal file
@@ -0,0 +1,34 @@
# RevOps · Day 14 — Two-minute trick: the confidence tiers

**Subject:** Two-minute trick: the confidence tiers
**Send:** Day 14
**Goal:** surface the manual-review queue (non-obvious, high-value)

---

Hi {{first_name}},

The single most-skipped feature in DataTools is also the one with the highest payoff per minute: the **manual-review queue**.

Here's what's happening under the hood: every dedupe decision DataTools makes has a confidence score (0.0 to 1.0). By default, the dedupe tool sorts decisions into three buckets:

- **≥0.95** → auto-merge (cleaned CSV)
- **0.85 - 0.95** → manual-review queue (`<filename>.review.csv`)
- **<0.85** → unmerged (kept as separate rows)

The 0.85-0.95 bucket is the magic. It's the range where a tuned algorithm catches *most* duplicates but where the wrong choice is a real cost (merging two genuinely different people = lost prospect; not merging two duplicates = paid contact you didn't need).
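
In code terms, the bucketing is just two comparisons. A minimal Python sketch, assuming the default thresholds above; the `tier` function and its names are illustrative, not DataTools' API.

```python
AUTO_MERGE = 0.95  # default auto-merge threshold from the list above
REVIEW = 0.85      # default lower bound of the manual-review queue

def tier(confidence: float) -> str:
    """Map a dedupe confidence score to one of the three buckets."""
    if confidence >= AUTO_MERGE:
        return "auto-merge"
    if confidence >= REVIEW:
        return "review"
    return "unmerged"

for score in (0.97, 0.95, 0.90, 0.85, 0.78):
    print(score, tier(score))
```

Note the boundary choice in the sketch: a score of exactly 0.95 auto-merges and exactly 0.85 goes to review, which is my reading of the ranges above.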

The 2-minute workflow:

1. Run dedupe.
2. Open `<filename>.review.csv`. Each row is a candidate merge with: confidence, the two records side-by-side, the rule that fired.
3. Eyeball each row. Mark `keep_merge` (Y/N) in the rightmost column.
4. Re-run dedupe with the `--apply-review-decisions <filename>.review.csv` flag (or click "Apply review decisions" in the GUI).
5. Final cleaned CSV reflects your manual choices.

For a 5,000-row lead list, the review queue is typically 20-60 rows. ~3 minutes of work. The output is dramatically better than auto-merge-everything-≥0.85, which is what most tools (including HubSpot's) do silently.

**Pro move:** save your `keep_merge` decisions over time. After 3-4 campaigns you'll have a corpus of "yes-merges" and "no-merges" you can use to retune the auto-merge threshold for *your* data. Most teams find their sweet spot is somewhere in 0.88-0.92.

— Michael
{{support_email}}
26
marketing/emails/revops/05-day30.md
Normal file
@@ -0,0 +1,26 @@
# RevOps · Day 30 — Heard from another RevOps lead?

**Subject:** Heard from another RevOps lead?
**Send:** Day 30
**Goal:** referral / review ask

---

Hi {{first_name}},

A month in. If DataTools earned its $49, would you do me one small favor?

**Pick the one that's easiest.**

1. **Gumroad review** (60 seconds): {{download_url}}#reviews (every line helps the next RevOps lead trust the listing enough to click "buy").
2. **Reply to this email with one sentence I can quote** on the RevOps landing page. Anonymous if you prefer; I'll never use a name without explicit permission.
3. **Share the landing page** with one RevOps friend who'd benefit: {{landing_page}}. No referral commission, just a link.

If DataTools *didn't* earn its $49, also reply. Tell me what's missing or broken. The 30-day refund window is still open and I'd rather refund than have an unhappy customer in the wild.

Either way, this is the last automated email you'll get from me. After this you'll only hear from me when there's a v1.x update or if you reply to one of the previous emails.

Thanks for being an early buyer. The first 50 customers shape the next 5,000.

— Michael
{{support_email}}
34
marketing/emails/shopify-pet/00-delivery.md
Normal file
@@ -0,0 +1,34 @@
# Shopify-pet · Day 0 — Delivery email

**Subject:** Your DataTools download (start here)
**Send:** immediately on Gumroad purchase confirmation
**Goal:** download + first run within 24h

---

Hi {{first_name}},

Thanks for buying DataTools. Your download:

→ **{{download_url}}**

Three things to do in the next 5 minutes:

**1. Download the installer for your OS** (Mac `.dmg`, Windows `.exe`, or Linux `.tar.gz`). About 280 MB. The link auto-detects your OS.

**2. Run it.** First launch takes ~5 seconds; a browser tab opens to `127.0.0.1:8501`. That's the app, running locally on your machine. No data leaves the box. Your customer list never goes to a server.

**3. Drop in a real Shopify customer export.** Don't bother with the bundled samples. In Shopify admin: Customers > Export > "All customers" > CSV. Drag it into DataTools' analyzer and click **"Run all"**. In about 30 seconds you'll see what it catches: typically a few hundred phone-format issues, some hidden-character emails, and a handful of cross-row duplicates.

If something doesn't work: reply to this email. It goes to my inbox.

Refund: also reply. 30-day no-questions; no form.

Tomorrow I'll send a sample Shopify customer export with the tricky cases pre-built in, so you can see what the cleanup catches on a known input. After that you'll get four more emails over the next month (Days 3, 7, 14, and 30), mostly one tip each. Unsubscribe at the bottom of any of them.

Welcome aboard.

— Michael
{{support_email}}

P.S. Got a fellow store owner who'd find this useful? {{landing_page}}.
32
marketing/emails/shopify-pet/01-day1.md
Normal file
@@ -0,0 +1,32 @@
# Shopify-pet · Day 1 — Try it on this Shopify customer export first

**Subject:** Try it on this Shopify customer export first
**Send:** Day 1, ~9am buyer-local-time

---

Hi {{first_name}},

Yesterday's email had your download. Today's email has a *file*: a synthetic Shopify customer export I built specifically to include the things Klaviyo silently chokes on.

→ **{{sample_file_url}}** (480 KB CSV, 2,200 rows, fully synthetic, no real customer data)

What's hidden in there:

- Phone numbers in 6 different formats (`(415) 555-0143`, `415.555.0143`, `4155550143`, `+44 20 7946 0958` without country field, `+1-415-555-0143 ext 12`, `415 555 0143`)
- Email addresses with embedded zero-width spaces (they look identical to a clean email; Klaviyo treats them as different addresses)
- ~80 obvious customer duplicates (same email, different case)
- ~40 cross-row duplicates (different email, same name + same shipping address; usually the same person ordering with two emails)
- Shipping addresses with mixed `St.` / `Street` / `St` / `STREET` for the same street name
- 12 customers from outside North America with country field blank
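
To see why the zero-width case is so nasty, here's a small Python sketch. The character list and the `clean_email` helper are my own illustration, not DataTools internals.

```python
# Common invisible characters: zero-width space, non-joiner, joiner, BOM.
ZERO_WIDTH = "\u200b\u200c\u200d\ufeff"
STRIP_TABLE = {ord(ch): None for ch in ZERO_WIDTH}

def clean_email(raw: str) -> str:
    """Remove invisible characters, trim whitespace, and lowercase."""
    return raw.translate(STRIP_TABLE).strip().lower()

a = "jane@example.com"
b = "jane\u200b@example.com"   # prints the same, compares differently
print(a == b)                            # False
print(clean_email(a) == clean_email(b))  # True
```

The same normalization also folds the "same email, different case" duplicates together, since the cleaned value is lowercased.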
|
||||
|
||||
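
For the curious, here is a minimal sketch of how the "same email, different case" duplicates in that list can be spotted. It is an illustration only, not DataTools' actual detector; the `email` column name and the `find_case_duplicates` helper are assumptions made up for the example.

```python
from collections import defaultdict


def find_case_duplicates(rows):
    """Group row indexes whose emails differ only by case or outer whitespace."""
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        # casefold() is a stricter lower(); strip() drops stray outer spaces.
        key = row["email"].strip().casefold()
        groups[key].append(index)
    # Keep only the keys that more than one row maps to.
    return {key: idxs for key, idxs in groups.items() if len(idxs) > 1}


rows = [
    {"email": "Sarah@Acme.com"},
    {"email": "sarah@acme.com"},
    {"email": "lee@example.com"},
]
duplicates = find_case_duplicates(rows)
print(duplicates)
```

Note that this key would not catch the zero-width-space case from the same list, because the hidden character survives `casefold()`.
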
Drop it into DataTools. Click **"Run all"** in the analyzer. Then run **format → text-clean → dedupe → gate** in that order.

Look at the **gate report** at the end. It tells you exactly which rows would have broken Klaviyo, with a one-line "why" per row.

If you want to see the difference: import the **raw** file to a test Klaviyo list, then import the **cleaned** file to a different test list. Compare the SMS-deliverable count. The delta is what you've been losing every month.

Reply and tell me what it caught (or missed). v1.1 detector improvements come from real-world feedback.

— Michael
{{support_email}}
33
marketing/emails/shopify-pet/02-day3.md
Normal file
@@ -0,0 +1,33 @@
# Shopify-pet · Day 3 — The phone-format step Klaviyo cares about

**Subject:** The phone-format step Klaviyo cares about
**Send:** Day 3
**Goal:** deepen feature understanding around the format standardizer

---

Hi {{first_name}},

The single biggest source of "Klaviyo dropped this customer silently" is phone formatting. DataTools fixes this in one tool, the **format standardizer**, but the *settings* matter.

Klaviyo (and basically every modern SMS platform) wants phones in **E.164** format: `+`, then country code, then number, with no spaces, no dashes, and no extension. Like: `+14155550143`.

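
If you want to sanity-check a column yourself, the E.164 shape can be tested with a short pattern. This is a rough sketch, assuming a plain string per phone; `looks_like_e164` is a made-up helper name, and it checks the shape only, not whether the number actually exists.

```python
import re

# "+", then a first digit of 1-9, then up to 14 more digits (15 digits max).
E164_PATTERN = re.compile(r"\+[1-9]\d{1,14}")


def looks_like_e164(phone: str) -> bool:
    """Return True if the whole string has the E.164 shape."""
    return E164_PATTERN.fullmatch(phone) is not None


for candidate in ["+14155550143", "(415) 555-0143", "+1-415-555-0143 ext 12"]:
    print(candidate, looks_like_e164(candidate))
```
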
Three settings in DataTools' format standardizer that get this right:

**1. Set "Phone output format" to `E.164`.** Default is `national` (`(415) 555-0143`), which is fine for display but broken for Klaviyo. Change it once; the preset remembers.

**2. Set "Default country" per row, not per file.** This is the non-obvious one. For each customer:
- If the `country` field has a value (e.g., "Canada", "CA", "Canadá"), use it.
- If blank, fall back to the country in the *shipping address*.
- If still blank, fall back to the file-level default (you set this; typically your store's primary market).

DataTools does this automatically when you check "Use per-row country detection". *Skip this and ~30% of international customers will end up with the US country code prepended to their numbers, which Klaviyo accepts but routes wrong, and your SMS never arrives.*

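
The per-row fallback order above can be sketched in a few lines. This is a simplified illustration, not DataTools' code: the `country` and `shipping_country` keys and the `resolve_country` helper are hypothetical names, and a real implementation would also have to normalize values like "Canada" and "CA" to one code.

```python
def resolve_country(row, file_default):
    """Return the first non-blank country, in the fallback order described above."""
    for field in ("country", "shipping_country"):
        # Treat missing keys, None, and whitespace-only values as blank.
        value = (row.get(field) or "").strip()
        if value:
            return value
    return file_default


print(resolve_country({"country": "Canada", "shipping_country": "US"}, "US"))
print(resolve_country({"country": "", "shipping_country": "GB"}, "US"))
print(resolve_country({"country": " ", "shipping_country": None}, "US"))
```
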
**3. Set "Quarantine un-parseable phones" to ON.** Don't drop them silently; don't pass them to Klaviyo broken. Send them to `<filename>.quarantine.csv` so you can fix the worst 10-20 by hand and re-include them.

The combination (E.164 + per-row country + quarantine) typically takes a Shopify export from "60-70% of phones survive Klaviyo's import" to "97-99%". On a 10,000-customer list where every customer has a phone number, that's roughly 2,700 to 3,900 more customers reachable per campaign.

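
To run the same arithmetic on your own list size, a back-of-envelope sketch follows. It assumes every row has a phone number, which real lists rarely do, so treat the result as an upper bound; `extra_reachable` is a made-up helper name.

```python
def extra_reachable(list_size, before_pct, after_pct):
    """Extra customers reachable when the survival rate rises from before_pct to after_pct."""
    # Integer percentages keep the arithmetic exact (no float rounding).
    return list_size * (after_pct - before_pct) // 100


low = extra_reachable(10_000, before_pct=70, after_pct=97)
high = extra_reachable(10_000, before_pct=60, after_pct=99)
print(low, high)
```
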
Reply if you want me to walk through these settings on a screen-share. I'm happy to do this for any buyer in the first 30 days.

— Michael
{{support_email}}
35
marketing/emails/shopify-pet/03-day7.md
Normal file
@@ -0,0 +1,35 @@
# Shopify-pet · Day 7 — Run it before every Klaviyo sync

**Subject:** Run it before every Klaviyo sync
**Send:** Day 7
**Goal:** reframe from one-off tool to per-sync workflow

---

Hi {{first_name}},

A week in. By now you've probably run DataTools on a real customer export once or twice and seen the cleanup catch things you'd been losing in Klaviyo for months.

The thing that turns DataTools into a recurring win instead of a one-off purchase: **run it before every sync, not just the first time.**

The pattern that works for most stores:

**1. Pick a cadence.** Most stores I talk to do this monthly; high-volume stores do it weekly. The cadence should match your "I'm planning a campaign" rhythm.

**2. The Sunday-morning ritual:**
- Pull a fresh customer export from Shopify (Customers > Export > "All customers")
- Drop into DataTools
- Run the pipeline (analyzer → format → text-clean → dedupe → gate)
- Review the gate quarantine file (typically 0.5-2% of rows)
- Push the cleaned CSV to Klaviyo (their CSV import or via their API)

**3. Save your settings as a preset.** The "Save settings" button writes a `.datatools-preset.json`. Keep it in your store's Drive / Notion / wherever your shop docs live. Next month, load preset, run pipeline, done in 4 minutes.

**4. After 3 months, retune the preset.** Look at your manual-review queue across the 3 runs. If you're consistently approving 0.86-confidence merges, drop the auto-merge threshold to 0.85. If you're rejecting 0.92 merges, raise it to 0.94. The preset improves with use.

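
The retuning rule in step 4 can be written down as a deliberately crude heuristic. Everything here is an assumption for illustration: the `(confidence, approved)` pairs stand in for whatever your review notes look like, the margins are picked to match the two examples above, and `suggest_threshold` is not a DataTools function.

```python
def suggest_threshold(current, reviewed, raise_margin=0.02, lower_margin=0.01):
    """Suggest an auto-merge threshold from (confidence, approved) review decisions."""
    approved = [conf for conf, ok in reviewed if ok]
    rejected = [conf for conf, ok in reviewed if not ok]
    if rejected:
        # Any rejection wins: move the threshold safely above the worst miss.
        return max(current, round(max(rejected) + raise_margin, 2))
    if approved:
        # Only approvals: move the threshold just below the lowest approved merge.
        return min(current, round(min(approved) - lower_margin, 2))
    return current


print(suggest_threshold(0.90, [(0.86, True), (0.88, True)]))
print(suggest_threshold(0.90, [(0.92, False), (0.95, True)]))
```

A real retune should look at how consistent the decisions are across all three runs; this sketch reacts to a single rejection, which errs on the side of fewer automatic merges.
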
The store owners doing this monthly tell me their open rates go up 8-15% in the first 90 days, not from new content, just from the email actually reaching the inbox.

If you want, reply with a sanitized export and I'll suggest a starting preset for your store. I'm happy to do this for the first 50 buyers.

— Michael
{{support_email}}
32
marketing/emails/shopify-pet/04-day14.md
Normal file
@@ -0,0 +1,32 @@
# Shopify-pet · Day 14 — Two-minute trick: hidden-character cleanup

**Subject:** Two-minute trick: hidden-character cleanup
**Send:** Day 14
**Goal:** surface the text cleaner (non-obvious, high-value)

---

Hi {{first_name}},

The tool inside DataTools that buyers find last is the **text cleaner**, and on Shopify customer exports it's usually the one with the most "wait, that was a problem?" moments.

What it catches: invisible characters that got into your customer data when customers typed it on their phones or pasted it in. The most common offenders:

- **Zero-width space** (`U+200B`) inside emails: Klaviyo treats `sarah@acme.com` (with hidden char) and `sarah@acme.com` (without) as different addresses
- **Non-breaking space** (`U+00A0`) inside addresses: Shopify accepts it, Klaviyo accepts it, but USPS address validation fails on it
- **Byte order mark** (BOM, `U+FEFF`) at the start of CSV cells: usually from a customer pasting from Word or a PDF
- **Right-to-left mark** (`U+200F`): rare, but appears in customer names from Hebrew/Arabic locales

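
To see one of these for yourself, a few lines of Python will point at the exact position of a hidden character. This is a small sketch covering only the four characters listed above; the `HIDDEN_CHARS` table and `find_hidden_chars` helper are made up for the example and are not how the text cleaner is implemented.

```python
# The four characters from the list above, with readable labels.
HIDDEN_CHARS = {
    "\u200b": "ZERO WIDTH SPACE",
    "\u00a0": "NO-BREAK SPACE",
    "\ufeff": "BYTE ORDER MARK",
    "\u200f": "RIGHT-TO-LEFT MARK",
}


def find_hidden_chars(text):
    """Return (position, code point, label) for each hidden character in text."""
    return [
        (position, f"U+{ord(char):04X}", HIDDEN_CHARS[char])
        for position, char in enumerate(text)
        if char in HIDDEN_CHARS
    ]


found = find_hidden_chars("sarah\u200b@acme.com")
print(found)
```
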
The 2-minute workflow:

1. After the format standardizer pass, run the text cleaner.
2. It produces an additional sidecar file, `<filename>.hidden-chars.csv`, listing every cell where it found a hidden char, with a "what was hidden where" annotation.
3. Skim it. Most are fine to silently strip (zero-width spaces, BOMs). For rare ones (right-to-left marks in a name), confirm before stripping; sometimes they're load-bearing.
4. Click "Apply cleanup". The text cleaner replaces the hidden chars in the cleaned CSV.

The reason this matters: **dedupe runs after text-clean.** Two emails that differ only by a hidden char look identical in the GUI but get treated as two separate customers, and your dedupe pass won't catch them unless the text cleaner ran first.

The pipeline order baked into the GUI is: `analyzer → format → text-clean → dedupe → gate`. Stick to it; per-tool runs out of order are the most common source of "wait, why didn't dedupe catch this?".

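
A tiny demonstration of why the order matters, under simplified assumptions: `text_clean` and `dedupe` below are toy stand-ins written for this example (they only handle two zero-width characters and exact email matches), not the DataTools tools themselves.

```python
# Zero-width characters to delete (a small illustrative subset).
ZERO_WIDTH = {ord("\u200b"): None, ord("\ufeff"): None}


def text_clean(email):
    """Delete zero-width characters from a single value."""
    return email.translate(ZERO_WIDTH)


def dedupe(emails):
    """Keep the first occurrence of each email, compared case-insensitively."""
    seen, unique = set(), []
    for email in emails:
        key = email.casefold()
        if key not in seen:
            seen.add(key)
            unique.append(email)
    return unique


emails = ["sarah@acme.com", "sarah\u200b@acme.com"]

# Wrong order: dedupe sees two different strings and keeps both.
dedupe_first = [text_clean(e) for e in dedupe(emails)]
# Right order: cleaning first makes the two rows identical.
clean_first = dedupe([text_clean(e) for e in emails])

print(len(dedupe_first), len(clean_first))
```
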
— Michael
{{support_email}}
26
marketing/emails/shopify-pet/05-day30.md
Normal file
@@ -0,0 +1,26 @@
# Shopify-pet · Day 30 — Heard from another store owner?

**Subject:** Heard from another store owner?
**Send:** Day 30
**Goal:** referral / review ask

---

Hi {{first_name}},

A month in. If DataTools earned its $49, would you do me one small favor?

**Pick the one that's easiest.**

1. **Gumroad review** (60 seconds): {{download_url}}#reviews. Every line helps the next Shopify owner trust the listing enough to click "buy".
2. **Reply to this email with one sentence I can quote** on the landing page. Anonymous if you prefer; I'll never use a name without explicit permission.
3. **Share the landing page** with one fellow store owner who'd benefit: {{landing_page}}. No referral commission, just a link.

If DataTools *didn't* earn its $49, also reply. Tell me what's missing or broken. The 30-day refund window is still open and I'd rather refund than have an unhappy customer in the wild.

Either way, this is the last automated email you'll get from me. After this you only hear from me when there's a v1.x update or if you reply to one of the previous emails.

Thanks for being an early buyer. The first 50 customers shape the next 5,000.

— Michael
{{support_email}}