feat: Tier B operator scaffolding — bundle, copy SoT, posts, emails
Pick up and finish yesterday's cut-off Tier B pass.

- build/: PyInstaller scaffold (datatools.spec + launcher.py + hook-streamlit.py + README) — folder-mode bundle, locked 127.0.0.1, per-OS recipe
- marketing/COPY.md: single source of truth for every customer-facing string — landing H1/sub/CTAs, demo CTAs, email subjects, Gumroad listing, banned phrases
- marketing/community-posts/: 9 drafts (3 posts × 3 niches: bookkeeper, revops, shopify-pet) — story / tip / soft-offer
- marketing/emails/: 18 drafts (Gumroad delivery + 5-touch onboarding × 3 niches), per-niche segmentation guidance
- docs/NEXT-STEPS.md: flip 2.2 / 2.4 / 3.1 / 3.4 to done with pointers to the new assets; add Phase 0 inventory rows
- .gitignore: narrow `build/` ignore so PyInstaller spec + launcher + hooks get tracked, only generated artifacts (build/build/, build/__pycache__/, build/dist/) stay ignored

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
34
marketing/emails/revops/00-delivery.md
Normal file
@@ -0,0 +1,34 @@
# RevOps · Day 0 — Delivery email

**Subject:** Your DataTools download (start here)
**Send:** immediately on Gumroad purchase confirmation
**Goal:** download + first run within 24h

---

Hi {{first_name}},

Thanks for buying DataTools. Your download:

→ **{{download_url}}**

Three things to do in the next 5 minutes:

**1. Download the installer for your OS** (Mac `.dmg`, Windows `.exe`, or Linux `.tar.gz`). About 280 MB. The link auto-detects your OS.

**2. Run it.** First launch takes ~5 seconds; a browser tab opens to `127.0.0.1:8501`. That's the app — running locally on your machine. No data leaves the box. (Yes, even if you're on the corporate VPN. Especially then.)

**3. Drop in a real lead list.** Don't bother with the bundled samples — the gate report only gets interesting when the data is real. Pull last quarter's webform export, or your most recent Apollo / LinkedIn pull, drag it into the analyzer, and click **"Run all"**. You'll see what the dedupe + format pipeline does in about 30 seconds.

If something doesn't work: just reply. I read every reply.

Refund: also just reply. 30-day no-questions; no form.

Tomorrow I'll send a sample 3-vendor lead list (HubSpot + LinkedIn + Apollo, synthetic data) so you can see the dedupe confidence tiers in action on a known input. After that you'll get one email a week for the next month — practical tips, no upsell. Unsubscribe at the bottom of any of them.

Welcome aboard.

— Michael
{{support_email}}

P.S. If you have a RevOps friend who'd find this useful: {{landing_page}}.
36
marketing/emails/revops/01-day1.md
Normal file
@@ -0,0 +1,36 @@
# RevOps · Day 1 — Try it on this 3-vendor lead list first

**Subject:** Try it on this 3-vendor lead list first
**Send:** Day 1, ~9am buyer-local-time

---

Hi {{first_name}},

Yesterday's email had your download. Today's email has a *file* — a synthetic 3-vendor lead list (HubSpot + LinkedIn scrape + Apollo pull) that I built specifically to break naive dedupe.

→ **{{sample_file_url}}** (1.2 MB CSV, 4,800 rows — fully synthetic, no real prospects)

What's hidden in there:

- The same person from 3 sources, with intentionally inconsistent fields:
  - HubSpot row: full email + company; no LinkedIn URL
  - LinkedIn row: name + title + LinkedIn URL; no email
  - Apollo row: email + phone + company; misspelled name
- ~120 obvious duplicates (same email, different case)
- ~80 cross-source duplicates (different keys, same person — these are the ones HubSpot's native dedupe misses)
- ~40 phone numbers in 5 different formats per country (+1, +44, +61)
- One row in every 200 with a hidden zero-width space in the email
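
To illustrate why the case-only and zero-width-space rows count as duplicates, here is a minimal sketch of an email comparison key. The `normalize_email` helper and the sample addresses are hypothetical and only show the idea; this is not DataTools' own code.

```python
# Zero-width characters that can hide inside copy-pasted email addresses.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\ufeff"}


def normalize_email(raw: str) -> str:
    """Build a comparison key: drop invisible characters, trim, lowercase."""
    visible = "".join(ch for ch in raw if ch not in ZERO_WIDTH)
    return visible.strip().lower()


# Three strings that differ byte-for-byte but refer to one mailbox.
rows = [
    "Sarah.OBrien@Example.com",
    "sarah.obrien@example.com",
    "sarah.obrien\u200b@example.com",
]
keys = {normalize_email(r) for r in rows}
print(len(set(rows)), "raw values,", len(keys), "distinct key(s)")
```

A naive exact-match check treats the three raw values as different people; comparing on the normalized key collapses them.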

Drop it into DataTools, click **"Run all"** in the analyzer, then run the **dedupe** tool with the default 0.85 threshold.

Look at three things in the output:

1. **The cleaned CSV** — what your import would look like
2. **The audit CSV** — every change, every rule, confidence per change
3. **The manual-review queue** (`<filename>.review.csv`) — the 0.85-0.95 confidence range. This is where the real dedupe value is; auto-merging this range is what gets people in trouble.

Try it once on the sample, then once on a real list. Reply and tell me what it caught (or missed) — the v1.1 fuzzy-matching tuning comes from real-world feedback.

— Michael
{{support_email}}
36
marketing/emails/revops/02-day3.md
Normal file
@@ -0,0 +1,36 @@
# RevOps · Day 3 — The dedupe rule that catches LinkedIn drift

**Subject:** The dedupe rule that catches LinkedIn drift
**Send:** Day 3
**Goal:** deepen feature understanding around the cross-source dedupe

---

Hi {{first_name}},

The thing native HubSpot / Salesforce dedupe can't do, and the thing DataTools is actually best at: **cross-source matching**, where the same person shows up via LinkedIn, a webform, and a trade-show import — with no shared key.

The rule that does the work is in the dedupe tool's **"Block by domain, fuzzy on name+title"** mode. Here's what it does:

**Step 1 — Block.** Group rows by email domain. (LinkedIn rows with no email get bucketed by `domain(linkedin_url)` — usually their company website if they listed it.) This avoids the O(n²) explosion and rules out cross-company false positives.

**Step 2 — Within each block, fuzzy-match on `first_name + last_name + title`.** Token-set ratio at 0.85 default. Catches:

- "Sarah O'Brien, VP Marketing" = "sarah obrien, vp of marketing"
- "Mike Chen, Head of Sales" = "Michael Chen, Sales Lead" (this one needs a 0.78 threshold; configurable)
- "J. Smith, Director" = "Jane Smith, Director" (only with a strong company-name match)

**Step 3 — Confidence-tier the merge.** ≥0.95 auto-merges. 0.85-0.95 goes to `<filename>.review.csv` for you to eyeball. <0.85 stays unmerged.
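
Steps 1 to 3 can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not DataTools' implementation: the `name_title` field, the row layout, and the function names are hypothetical, and `token_set_ratio` here is a simplified approximation built on `difflib`, not the scorer the product ships with.

```python
import re
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def tokens(text):
    """Lowercase, drop punctuation, and split into a set of words."""
    return set(re.sub(r"[^\w\s]", "", text.lower()).split())


def token_set_ratio(a, b):
    """Simplified token-set similarity in [0, 1] (an approximation)."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    shared = " ".join(sorted(ta & tb))
    full_a = (shared + " " + " ".join(sorted(ta - tb))).strip()
    full_b = (shared + " " + " ".join(sorted(tb - ta))).strip()
    pairs = [(shared, full_a), (shared, full_b), (full_a, full_b)]
    return max(SequenceMatcher(None, x, y).ratio() for x, y in pairs)


def tier(score, auto=0.95, review=0.85):
    """Map a score to one of the three buckets from Step 3."""
    if score >= auto:
        return "auto_merge"
    if score >= review:
        return "review"
    return "unmerged"


def candidate_pairs(rows):
    """Step 1: block by email domain. Step 2: score pairs inside each block."""
    blocks = defaultdict(list)
    for row in rows:
        blocks[row["email"].rsplit("@", 1)[-1].lower()].append(row)
    for block in blocks.values():
        for left, right in combinations(block, 2):
            score = token_set_ratio(left["name_title"], right["name_title"])
            yield left["id"], right["id"], tier(score)


rows = [
    {"id": 1, "email": "sarah@acme.example", "name_title": "Sarah O'Brien, VP Marketing"},
    {"id": 2, "email": "s.obrien@acme.example", "name_title": "sarah obrien, vp of marketing"},
    {"id": 3, "email": "sarah@other.example", "name_title": "Sarah O'Brien, VP Marketing"},
]
results = list(candidate_pairs(rows))
print(results)
```

Row 3 carries the same name and title as row 1 but sits in a different domain block, so the pair is never compared. That is the blocking trade-off: fewer comparisons and no cross-company merges, at the cost of missing matches that span domains.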

**Step 4 — Field-precedence on merge.** When records merge, you choose which source wins per field. Default precedence (configurable):

- `title`, `company`, `linkedin_url` → LinkedIn wins (more recent)
- `email`, `phone` → Webform wins (verified)
- `lifecycle_stage`, `owner` → HubSpot wins (your CRM is canonical)
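
The default precedence above can be expressed as a small lookup table. The sketch below is illustrative only: the source labels (`linkedin`, `webform`, `hubspot`), the `merge_records` helper, and the fallback to another source when the preferred one has no value are all assumptions, not documented DataTools behavior.

```python
# Which source wins for each field, taken from the default list above.
FIELD_WINNER = {
    "title": "linkedin",
    "company": "linkedin",
    "linkedin_url": "linkedin",
    "email": "webform",
    "phone": "webform",
    "lifecycle_stage": "hubspot",
    "owner": "hubspot",
}


def merge_records(records):
    """Merge per-source dicts for one person using FIELD_WINNER.

    Assumption: if the preferred source has no value for a field,
    fall back to the first other source that does.
    """
    merged = {}
    for field, winner in FIELD_WINNER.items():
        order = [winner] + [source for source in records if source != winner]
        for source in order:
            value = records.get(source, {}).get(field)
            if value:
                merged[field] = value
                break
    return merged


records = {
    "hubspot": {"email": "old@acme.example", "title": "Manager",
                "owner": "dana", "lifecycle_stage": "lead"},
    "linkedin": {"title": "VP Marketing", "company": "Acme",
                 "linkedin_url": "https://www.linkedin.com/in/example"},
    "webform": {"email": "sarah@acme.example", "phone": "+14155550143"},
}
merged = merge_records(records)
print(merged)
```

Keeping the rule as data instead of branching logic makes it easy to store per team and to review alongside other settings.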

**One trap to avoid:** don't run dedupe before format standardization. If phone formats are inconsistent across sources, the dedupe tool sees "+14155550143" and "(415) 555-0143" as different keys. Always run **format → analyzer → dedupe → gate** in that order. The pipeline UI enforces this; the per-tool runs don't.
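
A minimal sketch of why the ordering matters, using the two phone strings from the paragraph above. The `normalize_phone` helper is hypothetical and deliberately crude: it assumes a bare 10-digit number belongs to country code 1, which is not how a real formatter should decide.

```python
import re


def normalize_phone(raw, default_country_code="1"):
    """Reduce a phone string to a '+digits' comparison key (simplified)."""
    digits = re.sub(r"\D", "", raw)
    if raw.strip().startswith("+"):
        return "+" + digits
    if len(digits) == 10:
        # Assumption: a bare 10-digit number is missing its country code.
        return "+" + default_country_code + digits
    return "+" + digits


a = "+14155550143"
b = "(415) 555-0143"
print(a == b)                                     # raw strings differ
print(normalize_phone(a) == normalize_phone(b))   # keys agree after formatting
```

Before formatting, the two values are different keys and the rows cannot match on phone; after formatting they are the same key.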

Reply if you want me to walk through the precedence config on a screen-share — happy to do this for any buyer in the first 30 days.

— Michael
{{support_email}}
34
marketing/emails/revops/03-day7.md
Normal file
@@ -0,0 +1,34 @@
# RevOps · Day 7 — Run it before every HubSpot import

**Subject:** Run it before every HubSpot import
**Send:** Day 7
**Goal:** reframe from one-off tool to per-campaign workflow

---

Hi {{first_name}},

A week in. By now you've probably run DataTools on a real list once or twice and confirmed the dedupe catches more than HubSpot's native check.

The thing that turns DataTools from a one-off purchase into a monthly cost saver: **make it the gate on every import.**

The pattern that works:

**1. One DataTools run per campaign source.** Webform pull → DataTools. LinkedIn scrape → DataTools. Apollo export → DataTools. Each run produces a "clean" CSV.

**2. Concatenate the cleaned CSVs.** Standard pandas `concat` or just paste in Excel.

**3. One more DataTools run on the concatenation.** This is the cross-source dedupe pass — the one that catches the same person across the three sources.

**4. Compare against your current HubSpot export.** Running DataTools' dedupe with your existing CRM export as the second source catches the people you already paid for last quarter and don't need to import again.

**5. Import only the residue** — the rows that survived all four steps — into HubSpot.
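
The shape of steps 2 to 5 can be sketched with pandas. This is a rough illustration only: the inline CSV text, the column names, and the `source` and `email_key` columns are made up, and the exact-email matching is a stand-in for the DataTools dedupe runs in steps 3 and 4, which do far more than this.

```python
import io

import pandas as pd

# Stand-ins for the per-source "clean" CSVs produced in step 1.
cleaned = {
    "webform": "email,name\nann@acme.example,Ann Lee\n",
    "linkedin": "email,name\nbob@acme.example,Bob Ray\n",
    "apollo": "email,name\nAnn@Acme.example,Ann Lee\ncy@other.example,Cy Wu\n",
}
crm_export = "email\nbob@acme.example\n"

# Step 2: concatenate, remembering where each row came from.
frames = []
for source, text in cleaned.items():
    frame = pd.read_csv(io.StringIO(text))
    frame["source"] = source
    frames.append(frame)
combined = pd.concat(frames, ignore_index=True)

# Step 3 (stand-in): drop rows that share a normalized email.
combined["email_key"] = combined["email"].str.strip().str.lower()
deduped = combined.drop_duplicates(subset="email_key", keep="first")

# Step 4 (stand-in): remove anyone already present in the CRM export.
crm = pd.read_csv(io.StringIO(crm_export))
known = set(crm["email"].str.strip().str.lower())

# Step 5: the residue is what would be imported.
residue = deduped[~deduped["email_key"].isin(known)]
print(residue[["email", "source"]])
```

Tagging each row with its source before concatenating keeps the provenance available for per-field precedence decisions later.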

The buyers running this pipeline tell me they've cut their HubSpot marketing-contact bill 15-25% within two months. Not because their pipeline got smaller — because they stopped paying for duplicates.

**One thing to set up once:** save your dedupe settings as a `.datatools-preset.json` and commit it to your RevOps team's repo (or a shared Drive folder). Same preset every campaign means consistent results across whoever's running it that week.

If you want, reply with a sanitized lead list and I'll suggest a starting preset for your sources — happy to do this for the first 50 buyers.

— Michael
{{support_email}}
34
marketing/emails/revops/04-day14.md
Normal file
@@ -0,0 +1,34 @@
# RevOps · Day 14 — Two-minute trick: the confidence tiers

**Subject:** Two-minute trick: the confidence tiers
**Send:** Day 14
**Goal:** surface the manual-review queue — non-obvious, high-value

---

Hi {{first_name}},

The single most-skipped feature in DataTools is also the one with the highest payoff per minute: the **manual-review queue**.

Here's what's happening under the hood: every dedupe decision DataTools makes has a confidence score (0.0 to 1.0). The dedupe tool by default puts decisions into three buckets:

- **≥0.95** → auto-merge (cleaned CSV)
- **0.85 - 0.95** → manual-review queue (`<filename>.review.csv`)
- **<0.85** → unmerged (kept as separate rows)

The 0.85-0.95 bucket is the magic. It's the range where a tuned algorithm catches *most* duplicates but where the wrong choice is a real cost (merging two genuinely different people = lost prospect; not merging two duplicates = paid contact you didn't need).

The 2-minute workflow:

1. Run dedupe.
2. Open `<filename>.review.csv`. Each row is a candidate merge with: confidence, the two records side-by-side, the rule that fired.
3. Eyeball each row. Mark `keep_merge` (Y/N) in the rightmost column.
4. Re-run dedupe with the `--apply-review-decisions <filename>.review.csv` flag (or click "Apply review decisions" in the GUI).
5. Final cleaned CSV reflects your manual choices.
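
To make step 3 concrete, here is a small sketch of reading a marked-up review file and collecting the merges that were approved. Only the `keep_merge` column name comes from the workflow above; the other column names, the sample rows, and the `approved_merges` helper are hypothetical, and the real file layout may differ.

```python
import csv
import io

# Hypothetical review file after a reviewer has filled in keep_merge.
review_csv = """confidence,left_id,right_id,rule,keep_merge
0.93,101,207,domain+name,Y
0.87,102,315,domain+name,N
0.90,103,412,domain+name,y
"""


def approved_merges(handle):
    """Return (left_id, right_id) pairs the reviewer marked Y."""
    reader = csv.DictReader(handle)
    return [
        (row["left_id"], row["right_id"])
        for row in reader
        if row["keep_merge"].strip().upper() == "Y"
    ]


merges = approved_merges(io.StringIO(review_csv))
print(merges)
```

Accepting `y` as well as `Y` is a small courtesy for hand-edited files, where casing tends to drift.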

For a 5,000-row lead list, the review queue is typically 20-60 rows. ~3 minutes of work. The output is dramatically better than auto-merge-everything-≥0.85, which is what most tools (including HubSpot's) do silently.

**Pro move:** save your `keep_merge` decisions over time. After 3-4 campaigns you'll have a corpus of "yes-merges" and "no-merges" you can use to retune the auto-merge threshold for *your* data. Most teams find their sweet spot is somewhere in 0.88-0.92.
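
One simple way to retune from saved decisions is to pick the threshold that disagrees with your past choices the least. The sketch below assumes the decisions are stored as `(confidence, keep_merge)` pairs; the data and the `best_threshold` helper are invented for illustration, and a real retune would want far more than six rows.

```python
def best_threshold(decisions, candidates):
    """Pick the candidate threshold with the fewest disagreements.

    A disagreement is a pair that would auto-merge at this threshold
    but was marked N, or would not auto-merge but was marked Y.
    """
    def errors(threshold):
        return sum((conf >= threshold) != keep for conf, keep in decisions)

    return min(candidates, key=errors)


# Hypothetical saved decisions: (confidence, reviewer said merge?).
decisions = [
    (0.86, False),
    (0.87, False),
    (0.88, False),
    (0.89, True),
    (0.91, True),
    (0.93, True),
]
# Candidate thresholds from 0.85 to 0.95 in steps of 0.01.
candidates = [round(0.85 + i * 0.01, 2) for i in range(11)]
best = best_threshold(decisions, candidates)
print(best)
```

With these six rows, 0.89 is the only candidate that agrees with every saved decision, so it is the one returned.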

— Michael
{{support_email}}
26
marketing/emails/revops/05-day30.md
Normal file
@@ -0,0 +1,26 @@
# RevOps · Day 30 — Heard from another RevOps lead?

**Subject:** Heard from another RevOps lead?
**Send:** Day 30
**Goal:** referral / review ask

---

Hi {{first_name}},

A month in. If DataTools earned its $49 — would you do me one small favor?

**Pick the one that's easiest.**

1. **Gumroad review** (60 seconds): {{download_url}}#reviews — every line helps the next RevOps lead trust the listing enough to click "buy".
2. **Reply to this email with one sentence I can quote** on the RevOps landing page. Anonymous if you prefer; I'll never use a name without explicit permission.
3. **Share the landing page** with one RevOps friend who'd benefit: {{landing_page}}. No referral commission, just a link.

If DataTools *didn't* earn its $49 — also reply. Tell me what's missing or broken. The 30-day refund window is still open and I'd rather refund than have an unhappy customer in the wild.

Either way, this is the last automated email you'll get from me. After this you only hear from me when there's a v1.x update or if you reply to one of the previous emails.

Thanks for being an early buyer — the first 50 customers shape the next 5,000.

— Michael
{{support_email}}