feat: Tier B operator scaffolding — bundle, copy SoT, posts, emails
Pick up and finish yesterday's cut-off Tier B pass.

- build/: PyInstaller scaffold (datatools.spec + launcher.py + hook-streamlit.py + README) — folder-mode bundle, locked 127.0.0.1, per-OS recipe
- marketing/COPY.md: single source of truth for every customer-facing string — landing H1/sub/CTAs, demo CTAs, email subjects, Gumroad listing, banned phrases
- marketing/community-posts/: 9 drafts (3 posts × 3 niches: bookkeeper, revops, shopify-pet) — story / tip / soft-offer
- marketing/emails/: 18 drafts (Gumroad delivery + 5-touch onboarding × 3 niches), per-niche segmentation guidance
- docs/NEXT-STEPS.md: flip 2.2 / 2.4 / 3.1 / 3.4 to done with pointers to the new assets; add Phase 0 inventory rows
- .gitignore: narrow `build/` ignore so PyInstaller spec + launcher + hooks get tracked, only generated artifacts (build/build/, build/__pycache__/, build/dist/) stay ignored

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
39
marketing/community-posts/revops/01-story.md
Normal file
@@ -0,0 +1,39 @@
# RevOps · Post 1 — Story

**Where to post:** r/revops, r/sales, RevGenius Slack, Modern Sales Pros,
Pavilion communities, LinkedIn (your own feed).

**Format:** ~400 words. Tactical war-story style. Don't pitch in the body.

---

## Title

We were paying HubSpot for 4,200 duplicate contacts. Here's the dedupe pipeline that caught them.

## Body

Last quarter I ran a count on our HubSpot instance: ~4,200 contacts that were almost certainly the same person as another contact already in the system. Our HubSpot bill is per-marketing-contact, so this was a real number. ($X/month — pick your tier.)

The problem is that HubSpot's native "find duplicates" tool is exact-match-only on a small set of fields. It misses:

- "Sarah O'Brien" vs "Sarah Obrien" (apostrophe / no-apostrophe)
- "+1 (415) 555-0143" vs "415-555-0143" vs "4155550143" (phone formats)
- "sarah@acme.com" vs "Sarah@acme.com" (case)
- Same person from a LinkedIn scrape (no phone) + a webform fill (no LinkedIn URL) + a trade-show import (only email + company)

Here's the 4-step pipeline I run before *every* HubSpot import now. You can build the first 3 with Python + pandas + rapidfuzz; the 4th is the one that matters and is the easiest to skip:

**Step 1 — Normalize before comparing.** Lowercase emails, convert phone numbers to E.164, trim whitespace, normalize Unicode (NFKC). This alone catches ~40% of dupes.
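
A minimal sketch of Step 1 using only the Python standard library. The function names, and the assumption that a bare 10-digit phone is a US number, are illustrative and not part of the original pipeline; real E.164 conversion needs per-country rules.

```python
import re
import unicodedata


def normalize_email(email: str) -> str:
    """NFKC-normalize, trim whitespace, and lowercase an email address."""
    return unicodedata.normalize("NFKC", email).strip().lower()


def normalize_phone(phone: str, default_country_code: str = "1") -> str:
    """Reduce a phone string to an E.164-style value: "+" followed by digits.

    Assumption: a bare 10-digit number belongs to default_country_code.
    """
    raw = unicodedata.normalize("NFKC", phone).strip()
    digits = re.sub(r"\D", "", raw)
    if not digits:
        return ""  # nothing usable; leave blank for manual review
    if not raw.startswith("+") and len(digits) == 10:
        digits = default_country_code + digits
    return "+" + digits


# The three phone formats listed earlier collapse to one value.
for sample in ("+1 (415) 555-0143", "415-555-0143", "4155550143"):
    print(normalize_phone(sample))  # +14155550143 each time
```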

**Step 2 — Fuzzy-match on name + company, blocked by email domain.** Don't fuzzy-match across the whole list (O(n²) and full of false positives). Block by email domain first — only compare contacts within the same company. Use rapidfuzz token-set ratio at threshold 85.
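
Step 2 can be sketched as follows. This uses `difflib.SequenceMatcher` from the standard library on sorted tokens as a stand-in for rapidfuzz's token-set ratio, so scores will differ somewhat from rapidfuzz. The `id`, `name`, `company`, and `email` keys are assumed field names.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Score 0-100 on lowercased, sorted tokens (stand-in for token-set ratio)."""
    tokens_a = " ".join(sorted(a.lower().split()))
    tokens_b = " ".join(sorted(b.lower().split()))
    return 100 * SequenceMatcher(None, tokens_a, tokens_b).ratio()


def candidate_pairs(contacts, threshold=85.0):
    """Return (id_a, id_b, score) for pairs at or above threshold, per domain block."""
    blocks = defaultdict(list)
    for contact in contacts:
        email = contact.get("email") or ""
        if "@" not in email:
            continue  # no domain to block on; handle these rows separately
        blocks[email.rsplit("@", 1)[1].lower()].append(contact)

    pairs = []
    for block in blocks.values():
        # Pairwise comparison only happens inside one domain block.
        for a, b in combinations(block, 2):
            score = similarity(f"{a['name']} {a['company']}",
                               f"{b['name']} {b['company']}")
            if score >= threshold:
                pairs.append((a["id"], b["id"], score))
    return pairs


contacts = [
    {"id": 1, "name": "Sarah O'Brien", "company": "Acme", "email": "sarah@acme.com"},
    {"id": 2, "name": "Sarah Obrien", "company": "Acme", "email": "s.obrien@acme.com"},
    {"id": 3, "name": "Sarah Obrien", "company": "Other Co", "email": "sarah@other.com"},
]
pairs = candidate_pairs(contacts)  # only contacts 1 and 2 share a block
```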

**Step 3 — Cross-source merge logic.** When LinkedIn-source and webform-source records match, *the LinkedIn one wins on title/company* (more recent), *the webform one wins on phone/email* (verified). Document this rule somewhere your team can read it.
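
One way to make the Step 3 rule explicit is a per-field source priority table. The source labels, the field names, and the fallback to the other source when the preferred one is empty are assumptions for illustration.

```python
# Preferred source order per field: LinkedIn for title/company,
# webform for phone/email.
FIELD_PRIORITY = {
    "title": ("linkedin", "webform"),
    "company": ("linkedin", "webform"),
    "phone": ("webform", "linkedin"),
    "email": ("webform", "linkedin"),
}


def merge_records(records_by_source: dict) -> dict:
    """Merge matched records, taking each field from the first source that has a value."""
    merged = {}
    for field, sources in FIELD_PRIORITY.items():
        merged[field] = None
        for source in sources:
            value = records_by_source.get(source, {}).get(field)
            if value:
                merged[field] = value
                break
    return merged


merged = merge_records({
    "linkedin": {"title": "VP Sales", "company": "Acme", "phone": None, "email": None},
    "webform": {"title": "Sales Manager", "company": "ACME Inc",
                "phone": "+14155550143", "email": "sarah@acme.com"},
})
```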

**Step 4 — Confidence tiers, not yes/no.** Auto-merge only at 95% confidence or higher. Queue scores from 85 up to (but not including) 95 for manual review. Drop any match below 85. The manual queue is the magic — it catches the cases the algorithm doesn't dare touch and trains you on what your data actually looks like.
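
The tiers in Step 4 reduce to a small routing function. This sketch assumes scores on a 0-100 scale, that a score of exactly 95 auto-merges, and that exactly 85 goes to review; the tier names are made up for the example.

```python
def triage(score: float) -> str:
    """Map a 0-100 match score to an action tier."""
    if score >= 95:
        return "auto_merge"
    if score >= 85:
        return "manual_review"
    return "drop"


def route(pairs):
    """Split (id_a, id_b, score) tuples into per-tier queues."""
    queues = {"auto_merge": [], "manual_review": [], "drop": []}
    for pair in pairs:
        queues[triage(pair[2])].append(pair)
    return queues


queues = route([(1, 2, 97.1), (3, 4, 90.0), (5, 6, 40.0)])
```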

I eventually wrapped all this into a desktop tool I called DataTools because I got tired of re-running the script every campaign. Local-only, $49 if anyone wants it: datatools.app/revops. But the 4-step framework above is the real takeaway — works regardless of what tool you use.

What does your dedupe pipeline look like?

— {{your-name}}
27
marketing/community-posts/revops/02-tip.md
Normal file
@@ -0,0 +1,27 @@
# RevOps · Post 2 — Tip

**Where to post:** LinkedIn, RevGenius Slack #tips channel,
RevOps Co-op, Modern Sales Pros.

**Format:** ~150 words. Tactical. One idea, one-sentence pitch
at the bottom.

---

## Title

The 30-second pre-import check that catches LinkedIn-scrape duplicates before they hit HubSpot

## Body

Before you import a LinkedIn scrape (Apollo, Lusha, Cognism — same problem) into HubSpot:

Open the file. Sort by `email`. Look for blanks.

LinkedIn-sourced rows often have *no email* — just name + company + LinkedIn URL. If you import them as-is, HubSpot creates a new contact for each one. The next time someone fills your webform with the same name + company, HubSpot creates *another* new contact, because there's no key to match on.

Two-minute fix: before import, generate a synthetic dedupe key as `lower(first_name)|lower(last_name)|domain(company_url)`. Sort by it. Anything with >1 row is a likely dupe — review and merge before HubSpot ever sees it.
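
The synthetic key can be sketched in a few lines of standard-library Python. The column names `first_name`, `last_name`, and `company_url`, and the choice to strip a leading `www.`, are assumptions for the example.

```python
from collections import defaultdict
from urllib.parse import urlparse


def domain(company_url: str) -> str:
    """Extract a bare hostname from a URL or a bare domain string."""
    url = company_url.strip().lower()
    if "//" not in url:
        url = "//" + url  # lets urlparse treat "acme.com/about" as a host
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def dedupe_key(row: dict) -> str:
    """Build lower(first_name)|lower(last_name)|domain(company_url)."""
    return "|".join([
        row["first_name"].strip().lower(),
        row["last_name"].strip().lower(),
        domain(row["company_url"]),
    ])


def likely_dupes(rows):
    """Group rows by key and keep only keys that appear more than once."""
    groups = defaultdict(list)
    for row in rows:
        groups[dedupe_key(row)].append(row)
    return {key: group for key, group in groups.items() if len(group) > 1}


rows = [
    {"first_name": "Sarah", "last_name": "Obrien", "company_url": "https://www.acme.com"},
    {"first_name": "sarah ", "last_name": "OBRIEN", "company_url": "acme.com/about"},
    {"first_name": "Tom", "last_name": "Lee", "company_url": "https://beta.io"},
]
dupes = likely_dupes(rows)
```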

If you're doing this monthly across multiple lead sources and want to automate it (plus phone normalization, fuzzy matching, the whole pipeline), I built a $49 desktop tool: datatools.app/revops. Local — your prospect list never goes to a server.

— {{your-name}}
35
marketing/community-posts/revops/03-soft-offer.md
Normal file
@@ -0,0 +1,35 @@
# RevOps · Post 3 — Soft offer

**Where to post:** IndieHackers, r/revops monthly self-promo,
RevGenius #tools-and-software, LinkedIn (your own feed).

**Format:** ~250 words.

---

## Title

DataTools — a $49 desktop CSV pipeline for the lead-list cleanup you do before every HubSpot import

## Body

Built this for myself first. {{your-context — e.g., "I run RevOps at a 30-person SaaS"}} and the part of the job I dreaded was the pre-import scrub: LinkedIn export + Apollo pull + last quarter's webform list, deduped against each other and against what's already in HubSpot. Six tabs in a Google Sheet, regexes I half-remember, vlookups, an hour and a half.

**DataTools** does the six steps as one pipeline:

- **Format standardizer** — phones to E.164 (50+ country codes, per-row country awareness), emails lowercased, URLs canonicalized
- **Dedupe** — fuzzy matching with confidence tiers (95+ auto, 85-95 manual queue, <85 dropped), blocked by email domain so it scales to 50k-row lists
- **Gate** — block bad rows from your import with a per-rule report ("142 rows missing email, 38 rows with malformed phones, 12 rows with corporate-blacklist domains")
- **Text cleaner** — strips hidden chars, BOMs, weird unicode
- **Analyzer** — finds problems before you process (mixed encodings, inconsistent delimiters, near-duplicate rows)
- **Splitter** — chunk huge files for tools with row limits

Runs **locally** — Mac/Win/Linux. Your prospect data never goes to a server. (This was the actual reason I shipped it instead of using Clearbit / cloud tools — legal didn't want a third party touching prospect data after the {{2024 / 2025}} compliance review.)

**$49 one-time.** No subscription. No per-record fee. v1.x updates included.

Demo (with synthetic data) and download: datatools.app/revops

Happy to answer questions in the thread.

— {{your-name}}