datatools-dev/marketing/emails/revops/01-day1.md
Michael e1f364f010 feat: Tier B operator scaffolding — bundle, copy SoT, posts, emails
Pick up and finish yesterday's cut-off Tier B pass.

- build/: PyInstaller scaffold (datatools.spec + launcher.py +
  hook-streamlit.py + README) — folder-mode bundle, locked
  127.0.0.1, per-OS recipe
- marketing/COPY.md: single source of truth for every customer-facing
  string — landing H1/sub/CTAs, demo CTAs, email subjects, Gumroad
  listing, banned phrases
- marketing/community-posts/: 9 drafts (3 posts × 3 niches:
  bookkeeper, revops, shopify-pet) — story / tip / soft-offer
- marketing/emails/: 18 drafts (Gumroad delivery + 5-touch
  onboarding × 3 niches), per-niche segmentation guidance
- docs/NEXT-STEPS.md: flip 2.2 / 2.4 / 3.1 / 3.4 to done with
  pointers to the new assets; add Phase 0 inventory rows
- .gitignore: narrow `build/` ignore so PyInstaller spec + launcher
  + hooks get tracked, only generated artifacts (build/build/,
  build/__pycache__/, build/dist/) stay ignored

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 14:04:37 +00:00


# RevOps · Day 1 — Try it on this 3-vendor lead list first
**Subject:** Try it on this 3-vendor lead list first
**Send:** Day 1, ~9am buyer-local-time

---

Hi {{first_name}},
Yesterday's email had your download. Today's email has a *file* — a synthetic 3-vendor lead list (HubSpot + LinkedIn scrape + Apollo pull) that I built specifically to break naive dedupe.
**{{sample_file_url}}** (1.2 MB CSV, 4,800 rows — fully synthetic, no real prospects)
What's hidden in there:
- The same person from 3 sources, with intentionally inconsistent fields:
  - HubSpot row: full email + company; no LinkedIn URL
  - LinkedIn row: name + title + LinkedIn URL; no email
  - Apollo row: email + phone + company; misspelled name
- ~120 obvious duplicates (same email, different case)
- ~80 cross-source duplicates (different keys, same person — these are the ones HubSpot's native dedupe misses)
- ~40 phone numbers across three country codes (+1, +44, +61), in 5 different formats per country
- One row in every 200 has a hidden zero-width space in the email
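
If you're curious why exact-match dedupe trips on the email cases above, here is a minimal Python sketch. It is not DataTools' actual code: `normalize_email`, the `ZERO_WIDTH` set, and the sample addresses are made up for illustration, and lowercasing the whole address assumes your mail systems treat the local part as case-insensitive, which most do in practice.

```python
# Illustrative only: hypothetical helper, not part of DataTools.

# Invisible characters that can hide inside a copy-pasted email address.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}


def normalize_email(raw: str) -> str:
    """Build a comparison key: drop invisible characters, trim, lowercase."""
    cleaned = "".join(ch for ch in raw if ch not in ZERO_WIDTH)
    return cleaned.strip().lower()


# Three spellings of one synthetic address.
rows = [
    "Ana.Lopez@example.com",        # mixed case
    "ana.lopez@EXAMPLE.com",        # different case
    "ana.lopez\u200b@example.com",  # hidden zero-width space
]

naive_keys = set(rows)                               # exact string match
normalized_keys = {normalize_email(r) for r in rows}

print(len(naive_keys), len(normalized_keys))  # 3 1
```

Exact matching sees three different people; the normalized key sees one. The cross-source duplicates need more than this, because those rows do not share a key at all.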
Drop it into DataTools, click **"Run all"** in the analyzer, then run the **dedupe** tool with the default 0.85 threshold.

Look at three things in the output:
1. **The cleaned CSV** — what your import would look like
2. **The audit CSV** — every change, every rule, confidence per change
3. **The manual-review queue** (`<filename>.review.csv`) — the 0.85-0.95 confidence range. This is where the real dedupe value is; auto-merging this range is what gets people in trouble.
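
To make the confidence bands concrete, here is a rough Python sketch of how a match score could be routed. This is an illustration, not how DataTools is implemented: the `route` function, the label names, the 0.95 auto-merge cutoff, and the inclusive lower bounds are all assumptions read off the 0.85 default threshold and the 0.85-0.95 review range above.

```python
# Illustrative only: hypothetical routing rule, not DataTools' implementation.

def route(score: float,
          review_floor: float = 0.85,
          auto_merge_floor: float = 0.95) -> str:
    """Decide what to do with a candidate duplicate pair.

    Assumed bands:
      score >= auto_merge_floor           -> merge without asking
      review_floor <= score < auto_merge  -> send to a human
      score < review_floor                -> leave as separate records
    """
    if score >= auto_merge_floor:
        return "auto-merge"
    if score >= review_floor:
        return "review"
    return "keep-separate"


for score in (0.99, 0.90, 0.60):
    print(score, route(score))
```

Lowering `auto_merge_floor` to 0.85 is the "auto-merge everything" setting: the review band disappears and every borderline pair gets merged unseen.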
Try it once on the sample, then once on a real list. Reply and tell me what it caught (or missed) — the v1.1 fuzzy-matching tuning comes from real-world feedback.
— Michael
{{support_email}}