DataTools / for Shopify
$49 — one-time, no subscription Get DataTools →
For Shopify operators · pet supplies · subscription stores · DTC

Klaviyo-import-ready customer lists.
In 30 seconds. Locally.

Your Shopify customer export is a mess of formatting drift, disguised duplicates, and inconsistent phone numbers. DataTools fixes all of it in one pass — fuzzy-dedupes the same customer Klaviyo would charge you for twice, standardises phones across your international subscribers, and hands you a cleaned CSV. Your data never leaves your computer.

Get DataTools — $49 → Try the live demo ↓ One-time payment · cross-platform · runs offline
6 tools, one bundle
1 GB customer file in 2.5 min
0 cloud uploads ever

If any of these sound like your Tuesday

Six pains DataTools fixes in one pass

💸

Klaviyo / Mailchimp / Omnisend bills you for every duplicate

The same customer signs up more than once: with a typo, with a plus-tag, on mobile. Your subscriber list has a 10–18 % duplicate rate, and you pay for every one of them, every month, forever.

What it costs: $30–$300/mo per percent of dupes on a 50k-contact list, recurring.

📵

Your product feed got rejected by Google Merchant Center

Smart quotes from a copy-paste in product titles. NBSP in SKU. Inconsistent attribute casing. Feed bounces, the launch sits for 24–72 hours while you try to find the bad row in a 12,000-line CSV.

What it costs: 1–3 days of delayed campaign × the campaign value.

🪢

Orders from Shopify + Etsy + Amazon + Faire don't speak the same language

Each platform's export uses different column names for "customer email" / "ship country" / "order total." Merging takes hours of manual rename and copy-paste before the analysis can even begin.

What it costs: 4–8 hours per month manually merging exports.

🔁

Subscription churn looks higher than it is

Pet-box subscribers cancel, then re-subscribe three months later under a different email or device. Your cohort report says churn is 20 % when it's actually 12 %, so LTV is miscalculated and your acquisition budget is set against the wrong number.

What it costs: wrong CAC ceiling for the next year of paid ads.

🌍

VAT OSS / EU tax reporting breaks because country is spelled three ways

Your UK customers are tagged UK, U.K., and United Kingdom — all in one export. The VAT report aggregates them as three different markets. Compliance friction every quarter.

What it costs: compliance risk + repeated manual normalization.

🔒

Cloud cleaners want you to upload your customer list

Your customer list is your single most valuable business asset. Uploading it to a SaaS to clean it is the privacy story you do not want. DataTools is desktop-only — your list never leaves your computer.

What it costs: nothing — and that's the point.

Live demo · runs in your browser

Try it on a real-looking Shopify customer export

The demo below loads a sample 15-row Shopify customer file with pollution we've seen in actual stores: smart quotes from copy-paste, duplicates with email-case drift, international phones from the UK, Spain, Germany, Australia, and Japan, and the usual mess of N/A / (blank) / ? sentinels. Click Run pipeline and watch every column get cleaned in under a second.

Demo runs on free hosting (Streamlit Community Cloud). Capped at 100 input rows · output watermarked with one trailing row. The paid product has no caps and runs entirely offline.
Built for the Shopify operator

Six workflows you do every week

🧹

Customer-list cleanup

Catches the same customer who shows up as john@gmail.com, John@Gmail.com, and j.ohn@gmail.com. Fuzzy matching merges the near-miss spellings; exact matching catches the obvious ones.

📦

Product catalogue dedup

SKU whitespace, near-identical product names, copy-paste smart quotes in titles — gone. Audit log shows every change.

🛒

Abandoned-cart hygiene

Before re-engagement: dedupe across email + phone, treat sentinel values (N/A, ?, blank) as missing, and format dates so your sequence triggers fire correctly.

📥

Subscriber-list import to Klaviyo

Klaviyo charges per contact. Every duplicate you don't catch costs you for the life of the subscription. Catch them once, pay once.

🔗

Multi-channel order consolidation

Orders from Shopify + Etsy + a wholesale spreadsheet, each with a different column for "customer email." Map Columns aligns them; dedup merges across channels.

⚙️

Repeatable pipeline

Save the cleanup as a JSON file. Drop next week's export on it. Same cleanup, zero re-configuration. Automatable via the CLI.
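For the customer-list cleanup workflow above, most of the "obvious" duplicates disappear once emails are reduced to a comparison key. The sketch below is an illustration only, not DataTools' actual code: `normalize_email` and `GMAIL_DOMAINS` are hypothetical names, and stripping plus-tags for every provider is an assumption that will not suit all mail hosts.

```python
from collections import defaultdict

# Assumption: only Gmail-style domains ignore dots in the local part.
GMAIL_DOMAINS = {"gmail.com", "googlemail.com"}

def normalize_email(email: str) -> str:
    """Return a comparison key for an email address (hypothetical helper)."""
    cleaned = email.strip().lower()
    local, sep, domain = cleaned.partition("@")
    if not sep:
        return cleaned  # not an email address; leave it for other rules
    local = local.split("+", 1)[0]  # drop a plus-tag such as "+promo"
    if domain in GMAIL_DOMAINS:
        local = local.replace(".", "")  # Gmail ignores dots in the local part
    return f"{local}@{domain}"

def group_duplicates(emails):
    """Group raw emails that share the same comparison key."""
    groups = defaultdict(list)
    for raw in emails:
        groups[normalize_email(raw)].append(raw)
    return {key: raws for key, raws in groups.items() if len(raws) > 1}

sample = ["john@gmail.com", "John@Gmail.com", "j.ohn@gmail.com", "ana@example.com"]
print(group_duplicates(sample))
```

The key is used only for matching. The survivor row keeps whichever original spelling your survivor rule picks.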

The thing every cloud cleaner can't say

Your customer list never leaves your computer.

DataTools is a desktop app. There's no upload step, no SaaS account, no subscription, no "trust our security policy." The first thing you can do after install is open your browser's network tab, run the cleaner on your real customer file, and verify zero outbound requests.

Why it matters for Shopify: your customer list is your single most valuable business asset. Cloud cleaners require you to upload it. We don't.

$ python -m src.cli_pipeline customers.csv --apply
Reading customers.csv... 47,832 rows, 14 columns
Executing pipeline:
  text_clean (140 ms) {cells_changed: 12,408}
  format_standardize (810 ms) {cells_changed: 31,202}
  missing (95 ms) {sentinels_standardized: 8,129}
  dedup (3.1 s) {duplicates_removed: 2,347}
Initial rows: 47,832 → Final rows: 45,485
Total elapsed: 4.2 s
$ # zero network calls. zero. promise.

For when your client asks "what changed?"

Every change auditable. Every cell logged.

Every modification is recorded with the original value, the new value, and which rule fired. Hand the audit CSV to your accountant, your marketing manager, or your boss along with the cleaned file. No "I trust the AI" hand-waving — they see exactly what happened.

Real example: the demo above standardized 27 cells across 15 customers. The audit log lists each one — row, column, before, after, which standardizer fired. The dedup audit lists every duplicate group with the survivor and its losers.
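To make the audit idea concrete, here is a minimal sketch of a cell-level change log with the fields described above (row, column, before, after, rule). The function names, the rule names, and the CSV column headers are all assumptions for illustration; the real audit file may be laid out differently.

```python
import csv
import io

def apply_with_audit(rows, rules):
    """Apply per-column fixes and record every changed cell.

    rows:  list of dicts (one per CSV row)
    rules: {column: (rule_name, function)}, a hypothetical shape
    """
    cleaned, audit = [], []
    for index, row in enumerate(rows, start=1):
        new_row = dict(row)
        for column, (rule_name, fix) in rules.items():
            before = row.get(column, "")
            after = fix(before)
            if after != before:
                new_row[column] = after
                audit.append({"row": index, "column": column,
                              "before": before, "after": after,
                              "rule": rule_name})
        cleaned.append(new_row)
    return cleaned, audit

def audit_to_csv(audit):
    """Serialize the audit entries as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["row", "column", "before", "after", "rule"])
    writer.writeheader()
    writer.writerows(audit)
    return buffer.getvalue()

rows = [{"email": "John@Gmail.com ", "country": "U.K."},
        {"email": "ana@example.com", "country": "GB"}]
rules = {
    "email": ("lowercase_trim", lambda v: v.strip().lower()),
    "country": ("country_alias", lambda v: {"U.K.": "GB", "UK": "GB"}.get(v, v)),
}
cleaned, audit = apply_with_audit(rows, rules)
print(audit_to_csv(audit))
```

Only cells that actually changed are logged, so an untouched row leaves no trace in the audit file.
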
If you sell internationally — most pet brands do

Phones, addresses, and currencies from anywhere on Earth.

Your subscriber from London entered her phone as 020 7946 0958. Your Tokyo customer entered 03-3210-7000. Your German wholesale buyer wrote €2.410,75. Excel thinks all of them are mistakes. DataTools knows what country each row is from (per-row country column) and parses every one correctly to E.164 phones, ISO dates, and numeric amounts.
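The three values above can be worked through by hand, and a short sketch shows why a per-row country matters. This is a simplified illustration, not the product's parser: the `DIALING` table, `to_e164`, and `parse_amount` are hypothetical, cover only the countries in the example, and skip the many special cases a full phone-number library handles.

```python
import re
from decimal import Decimal

# Hypothetical lookup: ISO country code -> (calling code, national trunk prefix).
# Only the countries from the example; a real tool needs a complete table.
DIALING = {"GB": ("44", "0"), "JP": ("81", "0"), "DE": ("49", "0")}

def to_e164(raw: str, country: str) -> str:
    """Convert a national-format number to E.164 using the row's country.

    Raises KeyError for a country missing from DIALING.
    """
    raw = raw.strip()
    digits = re.sub(r"\D", "", raw)
    if raw.startswith("+"):
        return "+" + digits  # already in international form
    calling_code, trunk = DIALING[country]
    if digits.startswith(trunk):
        digits = digits[len(trunk):]  # drop the domestic trunk prefix
    return "+" + calling_code + digits

def parse_amount(raw: str, decimal_comma: bool) -> Decimal:
    """Parse a currency string, given the row's decimal-separator convention."""
    cleaned = re.sub(r"[^\d.,-]", "", raw)  # strip currency symbols and spaces
    if decimal_comma:
        cleaned = cleaned.replace(".", "").replace(",", ".")
    else:
        cleaned = cleaned.replace(",", "")
    return Decimal(cleaned)

print(to_e164("020 7946 0958", "GB"))   # +442079460958
print(to_e164("03-3210-7000", "JP"))    # +81332107000
print(parse_amount("€2.410,75", True))  # 2410.75
```

Without the country, "2.410,75" is ambiguous between two thousand four hundred ten and a malformed decimal, which is why the same string cannot be parsed safely from the value alone.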

In the bundle

Six tools. One pipeline. One $49 download.

1 · Find Duplicates

Fuzzy match (Jaro-Winkler), 5 normalizers, survivor rules, interactive review.

2 · Clean Text

Whitespace, smart chars, NBSP, BOM, line endings, case ops.

3 · Standardize Formats

Dates, phones, emails, addresses, names, currencies, booleans.

4 · Fix Missing Values

Disguised-null detection, profile, mean/median/mode/ffill, drop strategies.

5 · Map Columns

Fuzzy auto-rename, target schema, type coercion, required-field defaults.

6 · Automated Workflows

Chain tools in recommended order, save/load JSON, automate weekly cleanups.
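Find Duplicates names Jaro-Winkler as its fuzzy-match method. For readers who want to see what that score looks like, the following is a textbook implementation of the metric in plain Python. It is a sketch of the published algorithm, not the product's code, and the 0.9 threshold in the usage lines is an arbitrary assumption.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity between two strings, in the range 0.0 to 1.0."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this sliding window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i in range(len1):
        start = max(0, i - window)
        end = min(i + window + 1, len2)
        for j in range(start, end):
            if not matched2[j] and s1[i] == s2[j]:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    out_of_order = 0
    k = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3

def jaro_winkler(s1: str, s2: str, scaling: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to four characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * scaling * (1 - base)

THRESHOLD = 0.9  # assumption: tune per column and per dataset
print(jaro_winkler("MARTHA", "MARHTA"))  # about 0.961
print(jaro_winkler("jon smith", "john smith") >= THRESHOLD)
```

The prefix boost is what makes the metric a good fit for names: two strings that start the same way score higher than two that merely share letters.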

Pricing — pay once, own it

$49. No subscription. No ceiling on rows or files.

Full DataTools Suite
$149 one-time · coming when ready

Available when 3+ bundles ship. Includes everything in the Shopify pack plus the Bookkeeper and RevOps bundles. Save $48.

Questions

Does this work with Shopify Plus?

Yes. The input is just CSV or Excel from any source. Shopify Plus exports work the same as standard-plan exports, and the same as the output of a Shopify-to-CSV pipeline you've stitched together yourself. The cleaner doesn't care.

How does this compare to Excel's "Remove Duplicates"?

Excel does exact deduplication. John@Gmail.com and john@gmail.com are different customers to Excel. DataTools fuzzy-matches across case, whitespace, formatting, and even close-but-not-identical strings. The demo above merges 4 customer pairs Excel would leave duplicated.

How big a file can it handle?

A 1 GB CSV with international phones and addresses processes in about 2.5 minutes on a typical workstation. Streaming mode keeps memory bounded regardless of input size; we tested it on 26 million rows.

Do I need to know Python to use it?

No. The GUI is a browser interface that opens automatically when you double-click the app. You load your file, click Run, and download the cleaned file. The CLI is there for power users who want to script weekly cleanups.

What about my privacy?

Your customer list never leaves your computer. There is no cloud component, no telemetry, no "anonymous usage stats." When the app is running you can confirm zero outbound network requests in your browser's developer tools.

What's your refund policy?

Try the live demo above on the sample dataset before you buy. If you find within 14 days of purchase that DataTools doesn't fit your workflow, email for a refund, no questions asked.

Will there be updates?

Yes. The v1.x line is included free for everyone who buys DataTools today. We ship a patch every 30 days adding country support, edge-case fixes, and small features.

Stop deduplicating customers by hand.

One $49 download. Mac, Windows, or Linux. Runs offline. Catches the duplicates Excel misses, standardizes the phones from your international customers, and saves a pipeline you can re-run on next week's export.

Get DataTools — $49 →