# RevOps · Day 1: Try it on this 3-vendor lead list first

**Subject:** Try it on this 3-vendor lead list first
**Send:** Day 1, ~9am buyer-local-time

---

Hi {{first_name}},

Yesterday's email had your download. Today's email has a *file*: a synthetic 3-vendor lead list (HubSpot + LinkedIn scrape + Apollo pull) that I built specifically to break naive dedupe.

→ **{{sample_file_url}}** (1.2 MB CSV, 4,800 rows; fully synthetic, no real prospects)

What's hidden in there:

- The same person from 3 sources, with intentionally inconsistent fields:
  - HubSpot row: full email + company; no LinkedIn URL
  - LinkedIn row: name + title + LinkedIn URL; no email
  - Apollo row: email + phone + company; misspelled name
- ~120 obvious duplicates (same email, different case)
- ~80 cross-source duplicates (different keys, same person; these are the ones HubSpot's native dedupe misses)
- ~40 phone numbers spread over 5 different formats for each country code (+1, +44, +61)
- One row in every 200 has a hidden zero-width space in the email

Drop it into DataTools, click **"Run all"** in the analyzer, then run the **dedupe** tool with the default 0.85 threshold.

Look at three things in the output:

1. **The cleaned CSV**: what your import would look like
2. **The audit CSV**: every change, every rule, confidence per change
3. **The manual-review queue** (`<filename>.review.csv`): the 0.85 to 0.95 confidence range. This is where the real dedupe value is; auto-merging this range is what gets people in trouble.

Try it once on the sample, then once on a real list. Reply and tell me what it caught (or missed); the v1.1 fuzzy-matching tuning comes from real-world feedback.

Michael
{{support_email}}