Sweep follow-up to 93e43fc. Display labels now consistent across docs,
landing pages, CLI output, code comments, docstrings, and test prose.
Five parallel surfaces touched:
- docs (EN + ES): README, USER-GUIDE, CLI-REFERENCE, and 11 internal
design/planning docs
- landing pages: index + bookkeeper/revops/shopify-pet
- src: CLI module docstrings, _TOOL_DISPLAY dicts in cli_analyze.py
and gui/components/_legacy.py, core module headers, every tool
page's module docstring
- tests: class/method/module docstrings and section-header comments
- test-cases READMEs
Page slugs (1_Deduplicator etc.), tool_id strings (01_deduplicator
etc.), Python class names (TestDeduplicatorWorkflow, FeatureFlag.*),
URL paths, anchor IDs, CSS classes, and asset filenames were left
intact since they're code identifiers / structural references.
All 2033 tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
9.9 KiB
Business
Creator-only. Do not ship to buyers. Version: 1.6 · Updated: 2026-05-01 · Owner: Michael
1. Executive summary
Sell niche Python automation tools as one-time downloadable digital products. Target non-technical users who hate Excel/CSV grunt work but can't code. Distribute via Gumroad / Lemon Squeezy with automated delivery. Cross-platform from launch. Each bundle ships GUI (primary, browser-local) + CLI.
- Pricing: $49-79 per bundle · $149 full suite (when 3+ exist).
- Goal: lifestyle cashflow. No saleable-asset exit required.
2. Market opportunity
- Persistent, evergreen pain: data cleaning is universal.
- Low competition in vertical niches (Shopify pet-supplies feeds vs. generic CSV cleaners).
- ~100% gross margin after creation.
- Hosted browser demo as try-before-buy conversion lever (added v1.3).
Timing reality: marketplaces + community posts → days/weeks to first sale. Own-domain SEO is a 6-18 month compounding asset, not an early channel.
3. Target customers
Primary:
- Shopify owners (Pet Supplies = priority niche).
- Small business owners needing reporting + finance.
- Freelancers / consultants handling client data.
- Local marketing agencies.
Anti-personas:
- Enterprise data teams (build their own).
- Pure technical buyers (
pip installsomething free).
4. Product strategy
Lead: Excel & CSV Data Cleaning Mastery Bundle (highest pain, broadest demand).
Roadmap:
- Data Cleaning Mastery (in progress)
- Automated Business Reporting
- Ecommerce Data Pipeline
- Small Business Finance
- Marketing Public Data Aggregation
- AI Ecommerce Aggregation — Shopify Pet Supplies
Sequence rule: don't start bundle 2 until bundle 1 has paying customers + one external review. Five parallel skeletons is a known failure mode.
Surface: desktop install per OS (PyInstaller) with Streamlit GUI + CLI. Constrained demo on Streamlit Community Cloud.
4a. Lead bundle — Find Duplicates
Highest pain density across all 4 personas. Feeds landing copy, demo design, feature priority. Tech spec: TECHNICAL.md §11.1.
Use cases by persona
Shopify:
- Customer list cleanup (
john@gmail.comvsJohn@Gmail.comvsj.ohn@gmail.com). - Product catalog dedup (SKU whitespace, near-identical names).
- Abandoned-cart cleanup before re-engagement.
- Order export consolidation across channels.
- Subscriber-list hygiene before Klaviyo / Mailchimp import (per-contact pricing).
Bookkeeper: 6. Bank export reconciliation across overlapping date ranges. 7. Vendor list consolidation across QB + spreadsheets + email. 8. Customer master cleanup pre-invoicing migration. 9. Expense report dedup (same receipt twice).
Freelancer: 10. Pre-analysis cleanup of client dumps. 11. Survey response dedup (same respondent, multiple devices). 12. Lead list cleanup before client handoff.
Marketing agency: 13. Email-list dedup across lead sources. 14. Multi-platform audience reconciliation. 15. Suppression-list management.
Highest pain × frequency: 1, 5, 6, 13. Build feature set + demo dataset + landing copy around these.
Competitive landscape
| Tool | Price | Strength | Weakness |
|---|---|---|---|
| Excel Remove Duplicates | Free | Universal, zero install | Exact only. No fuzzy. No audit. |
Pandas drop_duplicates |
Free | Powerful | Requires Python. |
| OpenRefine | Free | Powerful clustering | Steep curve, dated GUI. |
| Dedupe.io | $30+/mo | ML fuzzy | Recurring + cloud upload. |
| WinPure / Data Ladder | $300-2000+ | Enterprise | Wrong tier. |
| Power Query | Free | Integrated | Exact only without M-code. |
Market gap
Fuzzy match quality of OpenRefine, with the zero-learning UX of Excel, sold once for under $100, runs locally.
Defensible only if fuzzy matching works without docs. Mediocre fuzzy → loses to free Excel. Learning required → loses to free OpenRefine. Tier 1 spec (TECHNICAL.md §11.1) is the minimum viable feature set to occupy this gap.
5. Pricing
| Tier | Price | Notes |
|---|---|---|
| Single bundle | $49-79 | Impulse-purchase sweet spot for solo operators |
| Full suite (3+ bundles) | $149 | Anchor; drives bundle attach |
Why: < $99 avoids procurement friction. > $99 triggers SaaS-support expectations that conflict with no-touch. < $30 competes with free, signals "toy".
6. Revenue targets
| Horizon | Target | Notes |
|---|---|---|
| 90 days | First paying customer | Validates funnel, not business |
| 6 months | $1k-3k/mo | Lead bundle + marketplace + demo |
| 12 months | $5k/mo | Triggers "fully async" revisit |
| 24 months | $10k/mo | Stretch. Needs hit product or 3+ bundles compounding |
$20k+/mo achievable but requires audience/brand asset that operator constraints exclude.
7. Marketing
Channels (priority order, early stage)
- Hosted browser demo — free Streamlit Community Cloud, linked from every listing. Direct conversion lever for digital downloads where buyers can't evaluate quality otherwise.
- Marketplace listings — Gumroad search, Lemon Squeezy directory, GitHub.
- Niche communities — value-first posts in subreddits, Indie Hackers, niche Slack/Discord. Demo doubles as the shareable asset.
- Programmatic SEO — long-tail landing pages (compounds over months).
- Strong GitHub README as discovery surface.
Demo design
- Same core engine as paid product, GUI-only.
- Constraints: row limit (100), output watermark, sample dataset preloaded + small upload (capped).
- Persistent CTA: "Like what you see? Get the full version for $49 →".
- No login. Friction kills conversion.
- Streamlit Community Cloud (free) at launch. $5/mo VPS if rate-limited.
Target keywords
python csv cleaner bundle · excel data cleaning scripts · automated data deduplicator python · csv duplicate removal tool · shopify product feed cleaner.
Funnel
Discovery → Demo (try-before-buy) → Landing page → Gumroad → Stripe → automated email delivery → upsell sequence to next bundle.
Support
Self-serve docs in every download. Email only. No live chat, no calls.
8. The "fully async, no-touch" constraint
Locked criteria require automated, no-touch marketing + sales. Long-term steady state.
Revisit trigger: $5k/mo MRR.
Why: pre-PMF, the no-touch rule excludes the channels most likely to produce first traction (founder outreach to 50 Shopify pet operators, community participation, customer interviews). Strict adherence may cost more revenue than it saves time.
Action at trigger: re-evaluate selective non-async (e.g., 2 hr/wk community participation) vs. additional bundle dev. Decision lives with the operator; this just flags the trigger.
9. Cost structure
Recurring monthly cap: $1,200.
| Item | Cost |
|---|---|
| Gumroad / Lemon Squeezy fees | ~10% of revenue |
| Domain | ~$15/yr |
| Landing-page hosting | $0-20/mo (static via Cloudflare/Netlify/GH Pages) |
| Demo hosting | $0 at launch (Streamlit Community Cloud); plan $5-10/mo VPS migration |
| Email service | $0-30/mo |
| Apple Developer Program | $99/yr (~$8/mo) |
| Inno Setup, PyInstaller, Python | Free |
| Total fixed monthly | ~$30-70/mo |
Headroom enables optional ad spend ($100-200/mo) once a bundle has proven conversion data.
10. macOS code signing
Cost: $99/yr to Apple Developer Program. Decision: pay it.
Why required: macOS Gatekeeper hard-blocks unsigned apps with "This app cannot be opened because the developer cannot be verified" — the only obvious button is "Move to Trash." The bypass (right-click → Open) exists but the target buyer won't perform it.
What $99 buys: code-signing certificate (removes hard block) + notarization service (removes "downloaded from internet" warning). Result: clean double-click experience.
Setup: Apple ID + government ID (individuals) or D-U-N-S number (orgs). First approval takes 1-2 weeks. Once approved, sign + notarize is automated in CI.
11. Risks & mitigation
| Risk | Mitigation |
|---|---|
| Free GitHub scripts commoditize | Niche verticals + polished GUI + cross-platform installers + hosted demo |
| Slow early traction | Lead with demo + marketplaces + communities, not own-domain SEO |
| Refund chargebacks | Clear scope on landing, demo lets buyers validate, working samples included |
| macOS install friction | Apple Dev Program ($99/yr), code sign + notarize |
| Browser-launch UX confusion | One sentence in installer + email; persistent in-app "runs locally" message; pywebview wrap as v1.1 if needed |
| Support burden | Robust installers, idiot-proof docs, sample data included |
| IP theft / resale | License file. Accept partial protection; focus on staying ahead via updates |
| Marketplace policy change | Multi-marketplace day 1; own domain as fallback |
| Streamlit direction change | Low probability; flagged as criteria-relock trigger in DECISIONS §8 |
12. Success metrics (monthly)
- Units sold per bundle.
- Conversion rate (landing → purchase).
- Demo-to-purchase rate (added v1.3): demo visits → Gumroad clicks → purchases.
- Refund rate (target < 5%).
- Support tickets / 100 sales (target < 10).
- Organic traffic to product pages.
- Per-platform install success.
13. Honest status (2026-05-01)
- 3 of 9 tools shipped (Find Duplicates, Clean Text, Standardize Formats).
- Cross-platform build pipeline designed, not yet built.
- macOS code signing not yet set up.
- Streamlit GUI shipped for the 3 ready tools.
- Hosted demo not yet deployed.
- No paying customers.
- No live landing page.
Next concrete steps before marketing spend:
- Stand up the PyInstaller pipeline with Streamlit launcher (1-3 days first time).
- Deploy constrained demo to Streamlit Community Cloud.
- Enroll in Apple Developer Program (start in parallel — 1-2 wk lead time).
- Single landing page for the lead bundle, demo prominently linked.
- Finish 2 more tools to Ready state (CLI + GUI).
- List on Gumroad with sample output proof, per-platform installers, demo link.