DataTools / for RevOps
$49 — one-time, no subscription Get DataTools →
For RevOps · marketing ops · agency lead-gen · audience-builders

Dedupe lead lists across HubSpot, LinkedIn,
and manual scrapes — locally.

The same prospect shows up as alice@acme.com in HubSpot, Alice.Johnson@acme.com in LinkedIn Sales Navigator, and alice@acme.com again from your VA's manual scrape. Their phone is (415) 555-1234 in one source and 4155551234 in another. DataTools fuzzy-matches across sources, normalizes phones to E.164 with per-row country awareness, and produces one canonical lead per real person — without uploading a single contact to a third-party tool.

Get DataTools — $49 → Try the live demo ↓ One-time payment · cross-platform · runs offline
50+ country codes · 3 CRM sources unified · 0 cloud uploads ever
If your last campaign launch was held up by data hygiene

Six pains DataTools fixes before you import to HubSpot

💸

HubSpot / Marketo / Iterable bills you for every duplicate contact

10 k contacts → enterprise tier at $4–8 k/mo. 18 % cross-source duplicate rate from Apollo + ZoomInfo + LinkedIn means you're at 8.2 k unique people but paying for 10 k. Every month. Forever.

What it costs: $200–$800 per 1 k duplicate contacts — recurring, every month.

🚫

Sender reputation tanks when you mail to invalid or duplicate addresses

One bad sending session — to addresses your team scraped or imported without hygiene — and your domain reputation takes weeks to recover. Your good campaigns sit in spam folders during the recovery.

What it costs: catastrophic — entire email programme degraded for 2–6 weeks.

⚖️

GDPR makes uploading to a cloud cleaner a legal-review marathon

Every cloud-based lead-cleaner needs you to upload your prospect list. Your legal team needs 4–8 weeks to bless that. DataTools is desktop-only — no upload, no DPA, no review, no delay.

What it costs: 4–8 weeks of legal-review delay per tool, every time.

🪢

Apollo + ZoomInfo + LinkedIn + manual scrapes all use different schemas

Each export has its own column names, scoring scale, country format. Unifying them by hand for one campaign costs 1–3 days. Doing it for every campaign is unsustainable.

What it costs: 1–3 days per campaign of manual unification + judgement calls that drift across team members.

🛡️

Suppression lists across 5+ marketing platforms get out of sync

Each platform has its own suppression format. Out-of-sync lists let opted-out contacts slip through, triggering CAN-SPAM / GDPR exposure and the kind of "we got a complaint" email no one wants.

What it costs: compliance risk + churn-back cost + stakeholder trust.

📞

International dialer fails because phone formats vary

A calling list spanning 15 countries with mixed formats means the dialer rejects 8–15% of numbers, and your reps spend the day on "number invalid" tones instead of conversations.

What it costs: rep productivity × failure rate × team size.

Live demo · runs in your browser

Try it on a real-looking 3-vendor lead list

The demo below loads a 25-row lead worksheet combining HubSpot, LinkedIn Sales Navigator, and manual scraping — with the same prospect appearing in two or three sources, country names spelled three different ways (USA, US, United States), and 13 different international phone formats. Click Run pipeline and watch the 5-step pipeline (text clean → format → missing → column map → dedup) collapse 25 rows to 19 with a single canonical record per prospect.

Demo runs on free hosting. Capped at 100 input rows · output watermarked. The paid product has no caps and runs entirely offline.
Built for the agency RevOps day

Three workflows you do every campaign

🪢

Email-list dedup across lead sources

HubSpot exports + LinkedIn Sales Navigator + the VA's spreadsheet, all merged. Fuzzy match across email + phone + name catches the cross-source duplicates that broke your last campaign send.

🌍

Multi-platform audience reconciliation

Build one canonical audience from Meta, Google Ads, LinkedIn, and your CRM. Each platform exports a different shape; Map Columns aligns them all, dedup merges the survivors with their most-complete fields.

🛡️

Suppression-list management

Suppression lists need to dedupe across email + phone + first-party identifiers. Add a row, dedupe, ship the canonical CSV to every platform — without uploading the suppression list to any of them.
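As a rough illustration of the "merge the survivors with their most-complete fields" step these workflows rely on, here is a minimal Python sketch. The function name and the tie-breaking rule (the record with the most filled-in fields wins, earlier input wins ties) are assumptions made for the example, not DataTools' documented behaviour.

```python
def merge_most_complete(records):
    """Merge records already judged to be the same person into one row.

    Assumption for this sketch: the record with the most non-empty fields
    is treated as the survivor, and remaining gaps are filled from the others.
    """
    def filled_count(record):
        return sum(1 for value in record.values() if value not in (None, ""))

    # sorted() is stable, so records with equal counts keep their input order.
    ordered = sorted(records, key=filled_count, reverse=True)

    merged = {}
    for record in ordered:
        for field, value in record.items():
            # Only fill a field that is still missing or empty.
            if not merged.get(field):
                merged[field] = value
    return merged


duplicates = [
    {"email": "alice@acme.com", "phone": "", "company": "Acme"},
    {"email": "alice@acme.com", "phone": "+14155551234", "company": ""},
]
canonical = merge_most_complete(duplicates)
print(canonical)
```

Each source contributes whatever it knows, so the canonical row ends up with the email, the phone, and the company even though no single export had all three.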

If your campaigns target outside the US — almost everyone's do

50+ country codes. Per-row country awareness.

Your HubSpot list has (415) 555-1234. Your scraped list has +1 415 555 1234 for the same prospect. Your Italian prospect entered +39 06 6982. Your Brazilian lead has 11 3071 0000. Each comes from a row tagged with its country. DataTools reads that column per row and parses every phone to E.164.
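To make "per-row country awareness" concrete, here is a deliberately simplified Python sketch of the idea. It is not how DataTools or libphonenumber works internally: the lookup table, the function name, and the handling of the leading "+" are all assumptions for illustration, and the sketch does no length validation or trunk-prefix handling.

```python
import re

# Hypothetical lookup: a few country spellings mapped to their calling codes.
CALLING_CODES = {
    "us": "1", "usa": "1", "united states": "1",
    "it": "39", "italy": "39",
    "br": "55", "brazil": "55",
}


def to_e164(raw_phone: str, country: str) -> str:
    """Normalize one phone number using the country given on the same row."""
    digits = re.sub(r"\D", "", raw_phone)  # drop spaces, brackets, dashes
    if raw_phone.strip().startswith("+"):
        # Already international: trust the calling code that was entered.
        return "+" + digits
    # National format: prepend the calling code for this row's country.
    # Raises KeyError for a country that is not in the table.
    code = CALLING_CODES[country.strip().lower()]
    return "+" + code + digits


rows = [
    ("(415) 555-1234", "USA"),
    ("+1 415 555 1234", "US"),
    ("+39 06 6982", "Italy"),
    ("11 3071 0000", "Brazil"),
]
for phone, country in rows:
    print(phone, "->", to_e164(phone, country))
```

The first two rows normalize to the same string, which is what lets a later dedup step treat them as one phone number.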

For platforms that charge per contact

Every duplicate you don't catch costs you for the life of the contract.

HubSpot prices on contacts. Klaviyo prices on contacts. Marketo, Iterable, ActiveCampaign — all priced on contacts. Every duplicate you don't catch is a recurring tax on your campaign. DataTools catches them once, before import, with a fuzzy matcher that's tuned to the cross-source noise you actually see.

Real numbers from the demo: 25 input rows from three sources collapse to 19. That's 6 duplicates, or 24% of the list, hidden by cross-source noise. On a 50,000-row campaign list, the same ratio would mean about 12,000 contacts you stop paying for, every month.
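The projection is plain arithmetic and can be checked in a few lines of Python. The function name is made up for this example, and the calculation assumes your list has the same duplicate rate as the demo data, which your own exports may not.

```python
def projected_duplicates(sample_rows: int, sample_unique: int, list_size: int) -> int:
    """Scale the duplicate rate seen in a sample up to a full list."""
    duplicate_rate = (sample_rows - sample_unique) / sample_rows
    return round(duplicate_rate * list_size)


# Demo figures from this page: 25 rows in, 19 rows out.
print(projected_duplicates(25, 19, 50_000))  # 12000
```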
The thing every cloud cleaner can't say

Your prospects' contact info never leaves your computer.

Cloud lead-cleaning tools require you to upload your audience. That audience is your single most valuable agency asset — and once it's on someone else's server, your client's privacy story is no longer in your hands. DataTools is a desktop app. There is no upload step.

$ python -m src.cli_pipeline campaign_q1.csv --pipeline revops_pipeline.json --apply
Reading campaign_q1.csv... 53,802 rows, 14 columns
Executing pipeline:
  text_clean          (160 ms)  {cells_changed: 8,205}
  format_standardize  (1.4 s)   {cells_changed: 41,889 — 50 country codes}
  missing             (140 ms)  {sentinels_standardized: 6,710}
  column_map          (220 ms)  {columns_renamed: 4, columns_added: 1}
  dedup               (4.8 s)   {duplicates_removed: 12,344, merged: 12,344}
Initial rows: 53,802 → Final rows: 41,458
Total elapsed: 6.7 s
$ # 12,344 fewer contacts to pay for. for $49.
In the bundle

Six tools. One pipeline. One $49 download.

1 · Find Duplicates

Fuzzy match across email + phone + name + company; merge survivors with most-complete fields.

2 · Clean Text

Smart quotes from copy-paste, NBSP from spreadsheet exports, BOM from Excel.

3 · Standardize Formats

E.164 phones with per-row country, canonical emails, name casing, ISO dates.

4 · Fix Missing Values

Detect placeholder values such as TBD and (unknown) across vendor exports.

5 · Map Columns

Project to your CRM's required schema, coerce score to integer, reorder for import.

6 · Automated Workflows

Save the cleanup as JSON. Drop next campaign's combined export on it. Same dedup, automated.

Pricing — pay once, own it

$49. No subscription. No per-campaign fee.

$99
one-time

Full DataTools Suite

Available when 3+ bundles ship. Includes everything in the RevOps pack plus the Shopify and Bookkeeper bundles. Save $48.

Coming when ready

Questions

Does this replace HubSpot's deduplication?

No — it cleans data before import to HubSpot (or LinkedIn, Marketo, Klaviyo, etc.). HubSpot's dedup runs on already-imported contacts; DataTools catches duplicates that haven't yet cost you a contract slot.

Does it handle international phones correctly?

Yes, via Google's libphonenumber, with 50+ country codes. The key feature is per-row country: point DataTools at any column with values like US, USA, United States, +1, JP, or Japan, and it parses each row in its own region. No more UK numbers flagged as malformed US numbers.

Can I use it on multiple clients without paying again?

Yes. The licence is per-operator, not per-client. Run it on every agency client's lead list for the same $49.

How does fuzzy match work across columns?

Out of the box, the dedup engine builds default strategies from column names: typically exact match on email and phone, and Jaro-Winkler at 85% on name. You can override via JSON: pick which columns to match on, which algorithm, and what threshold. Strategies are stored in the saved pipeline, so the next campaign uses the same rules.
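For readers who want to see what a Jaro-Winkler comparison at an 85% threshold means in practice, here is a self-contained Python sketch of the standard algorithm. It is a textbook version written for illustration, not DataTools' own code, and the `names_match` helper with its lowercasing step is an assumption.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity: shared characters within a sliding window, minus transpositions."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2 + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Boost the Jaro score for strings sharing a prefix of up to 4 characters."""
    score = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return score + prefix * prefix_scale * (1 - score)


def names_match(a: str, b: str, threshold: float = 0.85) -> bool:
    """Hypothetical helper: compare names case-insensitively against a threshold."""
    return jaro_winkler(a.strip().lower(), b.strip().lower()) >= threshold


print(names_match("Alice Johnson", "alice jonson"))  # True
print(names_match("Alice Johnson", "Bob Smith"))     # False
```

A misspelled surname still clears the 85% bar, while two unrelated names score far below it. Lowering the threshold catches more variants at the cost of more false merges.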

What does the audit trail look like?

A row-by-row CSV: every modified cell with its original value, new value, and which rule fired. A separate JSON file describes the pipeline that produced it. Together they reproduce the cleanup deterministically — your client can verify it on their machine.
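A minimal Python sketch of what such a per-cell audit log could look like, using the text-cleaning rules mentioned earlier on this page (BOM, NBSP, smart quotes). The column headers, rule names, and function names are invented for the example, and the real audit file may be laid out differently.

```python
import csv
import io

# Hypothetical rules: character to replace -> (replacement, rule name).
REPLACEMENTS = {
    "\ufeff": ("", "strip_bom"),
    "\u00a0": (" ", "nbsp_to_space"),
    "\u201c": ('"', "straighten_quotes"),
    "\u201d": ('"', "straighten_quotes"),
    "\u2018": ("'", "straighten_quotes"),
    "\u2019": ("'", "straighten_quotes"),
}


def clean_with_audit(rows):
    """Clean every cell and record one audit entry per modified cell."""
    cleaned, audit = [], []
    for row_index, row in enumerate(rows):
        new_row = {}
        for column, value in row.items():
            new_value = value
            fired = []
            for bad, (good, rule) in REPLACEMENTS.items():
                if bad in new_value:
                    new_value = new_value.replace(bad, good)
                    if rule not in fired:
                        fired.append(rule)
            if new_value != value:
                audit.append({
                    "row": row_index,
                    "column": column,
                    "original": value,
                    "new": new_value,
                    "rule": "+".join(fired),
                })
            new_row[column] = new_value
        cleaned.append(new_row)
    return cleaned, audit


def audit_to_csv(audit):
    """Serialize audit entries to CSV text, one line per modified cell."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["row", "column", "original", "new", "rule"])
    writer.writeheader()
    writer.writerows(audit)
    return buffer.getvalue()


rows = [{"name": "\ufeffAlice\u00a0Johnson", "email": "alice@acme.com"}]
cleaned, audit = clean_with_audit(rows)
print(audit_to_csv(audit))
```

Because every entry keeps the original value next to the new one, anyone holding the input file and the log can replay or reverse each change.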

What's your refund policy?

Try the live demo above on the sample dataset before you buy. If DataTools doesn't fit your workflow within 14 days, email for a refund — no questions asked.

Stop paying twice for the same contact.

One $49 download. Catches the cross-source duplicates HubSpot and LinkedIn can't see, normalizes phones for 50+ countries, and saves a pipeline you can re-run on next campaign's combined list.

Get DataTools — $49 →