Michael e1f364f010 feat: Tier B operator scaffolding — bundle, copy SoT, posts, emails

Pick up and finish yesterday's cut-off Tier B pass.

- build/: PyInstaller scaffold (datatools.spec + launcher.py +
  hook-streamlit.py + README) — folder-mode bundle, locked
  127.0.0.1, per-OS recipe
- marketing/COPY.md: single source of truth for every customer-facing
  string — landing H1/sub/CTAs, demo CTAs, email subjects, Gumroad
  listing, banned phrases
- marketing/community-posts/: 9 drafts (3 posts × 3 niches:
  bookkeeper, revops, shopify-pet) — story / tip / soft-offer
- marketing/emails/: 18 drafts (Gumroad delivery + 5-touch
  onboarding × 3 niches), per-niche segmentation guidance
- docs/NEXT-STEPS.md: flip 2.2 / 2.4 / 3.1 / 3.4 to done with
  pointers to the new assets; add Phase 0 inventory rows
- .gitignore: narrow `build/` ignore so PyInstaller spec + launcher
  + hooks get tracked, only generated artifacts (build/build/,
  build/__pycache__/, build/dist/) stay ignored

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-02 14:04:37 +00:00

RevOps · Day 14 — Two-minute trick: the confidence tiers

Subject: Two-minute trick: the confidence tiers
Send: Day 14
Goal: surface the manual-review queue (non-obvious, high-value)

Hi {{first_name}},

The single most-skipped feature in DataTools is also the one with the highest payoff per minute: the manual-review queue.

Here's what's happening under the hood: every dedupe decision DataTools makes carries a confidence score (0.0 to 1.0). By default, the dedupe tool sorts those decisions into three buckets:

≥0.95 → auto-merge (cleaned CSV)
0.85 to <0.95 → manual-review queue (<filename>.review.csv)
<0.85 → unmerged (kept as separate rows)

The 0.85-0.95 bucket is the magic. It's the range where a tuned algorithm catches most duplicates but where the wrong choice is a real cost (merging two genuinely different people = lost prospect; not merging two duplicates = paid contact you didn't need).
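
If it helps to see the three buckets as code, here is a minimal sketch of the thresholds described above. The function name `tier` and its parameters are made up for illustration; this is not DataTools' actual implementation, only the same cut-offs written out.

```python
def tier(confidence: float, auto_merge: float = 0.95, review: float = 0.85) -> str:
    """Map a dedupe confidence score to one of the three buckets.

    Hypothetical helper: the default thresholds mirror the ones in this
    email, nothing more.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    if confidence >= auto_merge:
        return "auto-merge"      # lands in the cleaned CSV
    if confidence >= review:
        return "manual-review"   # lands in <filename>.review.csv
    return "unmerged"            # kept as separate rows


for score in (0.97, 0.90, 0.70):
    print(score, tier(score))
```

Note that a score of exactly 0.95 auto-merges and a score of exactly 0.85 goes to review.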

The 2-minute workflow:

1. Run dedupe.
2. Open <filename>.review.csv. Each row is a candidate merge with: confidence, the two records side-by-side, the rule that fired.
3. Eyeball each row. Mark keep_merge (Y/N) in the rightmost column.
4. Re-run dedupe with the --apply-review-decisions <filename>.review.csv flag (or click "Apply review decisions" in the GUI).
5. Final cleaned CSV reflects your manual choices.
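
Before re-running, it can be worth checking that no row was left unmarked. The sketch below is one way to do that, assuming the review file has a header row with a `keep_merge` column holding Y or N; the real file layout may differ, and `summarize_review` is a hypothetical helper, not part of DataTools.

```python
import csv
import io
from collections import Counter


def summarize_review(csv_text: str) -> Counter:
    """Count Y, N, and unmarked rows in the text of a review CSV.

    Assumes a header row containing a `keep_merge` column (an assumption
    about the file layout, based on the column name used in this email).
    """
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        mark = (row.get("keep_merge") or "").strip().upper()
        counts[mark if mark in ("Y", "N") else "unmarked"] += 1
    return counts


# Tiny inline sample standing in for a real review file.
sample = "confidence,keep_merge\n0.91,Y\n0.88,n\n0.86,\n"
print(summarize_review(sample))
```

To check a real file, read its text first (for example with `pathlib.Path(...).read_text()`) and pass that in. If the `unmarked` count is above zero, finish step 3 before moving on to step 4.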

For a 5,000-row lead list, the review queue is typically 20-60 rows. ~3 minutes of work. The output is dramatically better than auto-merge-everything-≥0.85, which is what most tools (including HubSpot's) do silently.

Pro move: save your keep_merge decisions over time. After 3-4 campaigns you'll have a corpus of "yes-merges" and "no-merges" you can use to retune the auto-merge threshold for your data. Most teams find their sweet spot is somewhere in 0.88-0.92.
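
Here is a rough sketch of what that retuning could look like, assuming you have collected past decisions as (confidence, keep_merge) pairs. The function name, the candidate thresholds, and the 98% precision target are all illustrative assumptions, not DataTools settings; pick a target that matches how costly a wrong merge is for you.

```python
def suggest_threshold(decisions, candidates=(0.88, 0.89, 0.90, 0.91, 0.92),
                      min_precision=0.98):
    """Return the lowest candidate threshold whose past merges were
    almost always approved, or None if no candidate qualifies.

    decisions: iterable of (confidence, keep_merge) pairs, where
    keep_merge is True for a "Y" and False for an "N".
    """
    decisions = list(decisions)
    for threshold in sorted(candidates):
        at_or_above = [keep for conf, keep in decisions if conf >= threshold]
        if not at_or_above:
            continue  # no reviewed rows at this level, nothing to judge by
        precision = sum(at_or_above) / len(at_or_above)
        if precision >= min_precision:
            return threshold
    return None


# Made-up history: the only "no-merges" sit at 0.88 and below.
history = [(0.86, False), (0.88, False), (0.89, True), (0.90, True), (0.93, True)]
print(suggest_threshold(history))
```

With only a handful of rows the result will be noisy, which is why the advice above is to wait for 3-4 campaigns' worth of decisions first.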

— Michael {{support_email}}