datatools-dev/marketing/emails/revops/02-day3.md

# RevOps · Day 3 — The dedupe rule that catches LinkedIn drift

**Subject:** The dedupe rule that catches LinkedIn drift
**Send:** Day 3
**Goal:** deepen feature understanding around the cross-source dedupe

---

Hi {{first_name}},

The thing native HubSpot / Salesforce dedupe can't do, and the thing DataTools is actually best at: **cross-source matching**, where the same person shows up via LinkedIn, a webform, and a trade-show import — with no shared key.

The rule that does the work is in the dedupe tool's **"Block by domain, fuzzy on name+title"** mode. Here's what it does:

**Step 1 — Block.** Group rows by email domain. (LinkedIn rows with no email get bucketed by `domain(linkedin_url)`, which is usually their company website if they listed it.) Fuzzy matching then only compares rows inside the same block, which avoids the O(n²) all-pairs comparison and rules out cross-company false positives.
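
If it helps to see the blocking idea in code, here is a minimal Python sketch. It is not the tool's implementation; the `email` and `company_website` field names and the helper names are assumptions made up for illustration.

```python
from collections import defaultdict
from itertools import combinations
from urllib.parse import urlparse


def block_key(row):
    """Return a lowercase domain to bucket the row by, or None if unknown."""
    email = (row.get("email") or "").strip().lower()
    if "@" in email:
        return email.rsplit("@", 1)[1]
    # Assumption: rows without an email carry a company website field.
    site = (row.get("company_website") or "").strip().lower()
    if site:
        host = urlparse(site if "//" in site else "//" + site).hostname or ""
        if host.startswith("www."):
            host = host[4:]
        return host or None
    return None


def build_blocks(rows):
    """Group rows by domain; rows with no usable domain are skipped here."""
    blocks = defaultdict(list)
    for row in rows:
        key = block_key(row)
        if key is not None:
            blocks[key].append(row)
    return blocks


def candidate_pairs(blocks):
    """Yield only within-block pairs instead of every pair in the file."""
    for rows in blocks.values():
        yield from combinations(rows, 2)


rows = [
    {"email": "sarah@acme.com"},
    {"email": "", "company_website": "https://www.acme.com/about"},
    {"email": "bob@other.io"},
]
blocks = build_blocks(rows)
pairs = list(candidate_pairs(blocks))
```

With these three sample rows, only the two `acme.com` rows are ever compared with each other; the `other.io` row never enters a comparison.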

**Step 2 — Within each block, fuzzy-match on `first_name + last_name + title`.** Token-set ratio at 0.85 default. Catches:

- "Sarah O'Brien, VP Marketing" = "sarah obrien, vp of marketing"
- "Mike Chen, Head of Sales" = "Michael Chen, Sales Lead" (this pair falls below the 0.85 default, so it only matches if you lower the threshold to 0.78; the threshold is configurable)
- "J. Smith, Director" = "Jane Smith, Director" (only with a strong company-name match)
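
For a feel of how a token-set ratio behaves, here is a rough sketch using only Python's standard library. It approximates the well-known token-set approach with `difflib`, so its scores will not exactly match whatever scorer the dedupe tool uses; the function names are hypothetical.

```python
import re
from difflib import SequenceMatcher


def _tokens(text):
    """Lowercase, drop punctuation, and split into a set of words."""
    return set(re.sub(r"[^\w\s]", "", text.lower()).split())


def _ratio(a, b):
    return SequenceMatcher(None, a, b).ratio()


def token_set_ratio(left, right):
    """Approximate token-set similarity in the range 0.0 to 1.0."""
    a, b = _tokens(left), _tokens(right)
    if not a or not b:
        return 0.0
    shared = " ".join(sorted(a & b))
    only_a = " ".join(sorted(a - b))
    only_b = " ".join(sorted(b - a))
    with_a = (shared + " " + only_a).strip()
    with_b = (shared + " " + only_b).strip()
    # Best of: shared vs. each side, and the two sides against each other.
    return max(_ratio(shared, with_a), _ratio(shared, with_b), _ratio(with_a, with_b))


same = token_set_ratio("Sarah O'Brien, VP Marketing", "sarah obrien, vp of marketing")
close = token_set_ratio("Mike Chen, Head of Sales", "Michael Chen, Sales Lead")
```

In this sketch the Sarah O'Brien pair scores a full 1.0 because one side's words are a subset of the other's once case and punctuation are ignored. The Mike Chen pair lands somewhere below that, which is why nickname-plus-retitled matches need a looser threshold.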

**Step 3 — Confidence-tier the merge.** Scores of 0.95 and above auto-merge. Scores from 0.85 up to (but not including) 0.95 go to `<filename>.review.csv` for you to eyeball. Anything below 0.85 stays unmerged.
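
The tiering boils down to two cutoffs. A minimal sketch, assuming the defaults quoted above (the function and tier names are made up for illustration):

```python
def confidence_tier(score, auto_merge_at=0.95, review_at=0.85):
    """Map a match score to what happens to the pair."""
    if score >= auto_merge_at:
        return "auto_merge"
    if score >= review_at:
        return "review"  # would land in <filename>.review.csv
    return "unmerged"
```

Lowering `review_at` to 0.78 is the kind of change that would send the "Mike Chen" style of match to review instead of leaving it unmerged.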

**Step 4 — Field-precedence on merge.** When records merge, you choose which source wins per field. Default precedence (configurable):

- `title`, `company`, `linkedin_url` → LinkedIn wins (more recent)
- `email`, `phone` → Webform wins (verified)
- `lifecycle_stage`, `owner` → HubSpot wins (your CRM is canonical)
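
Here is one way to picture per-field precedence in Python. This is a sketch, not the tool's config format: the source labels and the fallback order after the first entry are my assumptions, and only the winning source per field comes from the defaults above.

```python
# Hypothetical labels and fallback order; only the first entry per field
# (the winner) reflects the documented defaults.
PRECEDENCE = {
    "title": ["linkedin", "webform", "hubspot"],
    "company": ["linkedin", "webform", "hubspot"],
    "linkedin_url": ["linkedin", "webform", "hubspot"],
    "email": ["webform", "hubspot", "linkedin"],
    "phone": ["webform", "hubspot", "linkedin"],
    "lifecycle_stage": ["hubspot", "webform", "linkedin"],
    "owner": ["hubspot", "webform", "linkedin"],
}


def merge_records(by_source, precedence=PRECEDENCE):
    """Merge one matched person's rows, keyed by source name."""
    merged = {}
    for field, order in precedence.items():
        for source in order:
            value = by_source.get(source, {}).get(field)
            if value:  # skip missing or empty values
                merged[field] = value
                break
    return merged


matched = {
    "linkedin": {"title": "VP Marketing", "email": ""},
    "webform": {"title": "Marketing", "email": "sarah@acme.com"},
    "hubspot": {"title": "Marketing Manager", "owner": "michael"},
}
merged = merge_records(matched)
```

In this sample the title comes from LinkedIn, the email from the webform (LinkedIn's is blank), and the owner from HubSpot.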

**One trap to avoid:** don't run dedupe before format standardization. If phone formats are inconsistent across sources, the dedupe tool sees "+14155550143" and "(415) 555-0143" as different keys. Always run **format → analyzer → dedupe → gate** in that order. The pipeline UI enforces this; the per-tool runs don't.
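
To make the phone example concrete, here is a deliberately naive normalizer. It assumes 10-digit numbers are US numbers and is only meant to show why formatting has to happen first; it is not how the format tool works.

```python
import re


def normalize_phone(raw, default_country_code="1"):
    """Strip formatting and return a +<digits> string (US-only assumption)."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 10:  # assume a national US number without country code
        digits = default_country_code + digits
    return "+" + digits
```

Once both values pass through a step like this, "+14155550143" and "(415) 555-0143" compare as equal.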

Reply if you want me to walk through the precedence config on a screen-share — happy to do this for any buyer in the first 30 days.

— Michael
{{support_email}}