- NEW LICENSE_TESSERACT.txt at the repo root: header noting it covers the bundled Tesseract OCR binary (Apache 2.0, upstream tesseract-ocr/tesseract, copyright Google + contributors) and the eng.traineddata from tessdata_best (also Apache 2.0). Clarifies DataTools itself remains proprietary. Full canonical Apache 2.0 license text included. - README.md + README.es.md (Download section): bumped size estimate ~200 MB → ~300 MB, added a short paragraph stating Tesseract OCR is bundled (no separate install required), with a link to the new license file. - docs/USER-GUIDE.md + docs/USER-GUIDE.es.md (§1.6 System requirements): bumped disk estimate, added a paragraph stating Tesseract 5.5 + eng.traineddata ship inside every installer / portable / AppImage, with a source-install fallback hint pointing developers to DEVELOPER.md. - docs/DEVELOPER.md: new "PDF Extractor — bundled Tesseract" section documenting the runtime layout (sys._MEIPASS / tesseract / …), discovery order, source of bytes (build/vendor/tessdata + per- platform fetch in make_release.py), version pin, update recipe. - docs/TECHNICAL.md: new §3.10 "Bundled Tesseract (PDF Extractor OCR)" — short version of the discovery order for the build pipeline section. - build/README.md: distribution-outputs paragraph now lists Tesseract among bundled deps with the ~250-300 MB estimate; new "Tesseract bundling" section: layout diagram, resolver order, source of bytes + 5.5.0 pin, update steps, license-file ref. Out-of-scope gaps noted by the docs sweep: - docs/FUTURE-TOOLS.md §D still describes Tesseract bundling as a high-risk packaging headache; now superseded. Worth a one-line "(resolved — bundled as of v1.x)" callout in a future pass. - USER-GUIDE §2 "What's included" table doesn't list PDF Extractor at all (it shipped in b8aff86…967d3f6). Separate gap to close. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
🌐 Language: English · Español
Excel & CSV Data Cleaning Mastery Bundle
9 Python data-cleaning tools, every one with a CLI and a browser GUI. Local-only, no internet. Windows / macOS / Linux.
Quick Start
- Download from your purchase email. Two flavors per OS — pick one:
- Installer (
.dmgon macOS,.exeon Windows) — wires up Desktop + Start Menu / Launchpad shortcuts. - Portable .zip — unzip and double-click. No install, no admin rights, runs from anywhere.
- Installer (
- Open it (no Python needed; everything is bundled inside).
- The app starts a local server and opens your browser. Nothing leaves your machine.
Full step-by-step including SmartScreen / Gatekeeper workarounds: USER-GUIDE.md §1.
Docs
Buyer-facing (ships with the product):
- English: USER-GUIDE.md · CLI-REFERENCE.md
- Español: USER-GUIDE.es.md · CLI-REFERENCE.es.md
Creator-only (do not ship):
- BUSINESS.md — market, pricing, marketing
- TECHNICAL.md — architecture, build pipeline, standards
- DECISIONS.md — locked criteria, decision log
- RECOVERY.md — full rebuild guide
- REQUIREMENTS.md — numbered support matrix
Version: 1.6 · Updated: 2026-05-01 · Owner: Michael