Build — DataTools desktop installer

Cross-platform PyInstaller bundle for Mac / Windows / Linux. The single deliverable the buyer downloads from Gumroad.

Owner: Michael · Updated: 2026-05-01

This directory is the build pipeline. It is the source of truth for the bundle shape, hidden-import lists, per-platform recipes, and the launcher that boots Streamlit inside the bundle.

Files

build/
├── launcher.py           Entry point PyInstaller wraps. Boots a local
│                         Streamlit server, opens browser, locks server
│                         to 127.0.0.1 so the privacy claim holds.
├── datatools.spec        PyInstaller spec — hidden imports, data files,
│                         Mac .app bundle config. Reads the version
│                         from src/__init__.py.
├── installer.iss         Inno Setup script — Windows .exe installer.
├── macos/
│   └── build_dmg.sh      Wraps dist/DataTools.app into a .dmg with a
│                         drag-to-/Applications layout.
├── appimage/
│   ├── AppRun            Entry point invoked when the AppImage runs.
│   ├── datatools.desktop Linux desktop-entry metadata.
│   └── build.sh          Wraps dist/DataTools/ into an .AppImage.
├── hooks/                PyInstaller hooks for libs the static analyser
│   └── hook-streamlit.py misses (Streamlit's dynamic imports).
├── icon.icns             macOS app icon (TODO: produce from a 1024×1024
│                         PNG. Optional — bundle still builds without).
├── icon.ico              Windows app icon (TODO).
├── icon.png              Linux AppImage icon (TODO — build.sh generates
│                         a placeholder if missing).
└── README.md             this file
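
As a rough illustration of what launcher.py is described as doing (pick a port, pin the server to 127.0.0.1), here is a minimal sketch. It is not the actual launcher: the function names are hypothetical, the 8501-8509 port range is an assumption read off the "850x" URL in the testing section, and running headless so the launcher opens the browser itself is also an assumption. The `--server.*` and `--browser.*` options are standard Streamlit configuration flags.

```python
import socket

HOST = "127.0.0.1"  # loopback only, so nothing else on the network can reach the server


def pick_port(candidates=range(8501, 8510)):
    """Return the first candidate port that can be bound on the loopback interface."""
    for port in candidates:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((HOST, port))
            except OSError:
                continue  # port already in use, try the next one
            # The socket closes on return, so another process could still
            # grab the port before Streamlit binds it (small race window).
            return port
    raise RuntimeError("no free port in the candidate range")


def streamlit_argv(app_path, port):
    """Build a `streamlit run` argument list pinned to the loopback address."""
    return [
        "streamlit", "run", app_path,
        "--server.address", HOST,
        "--server.port", str(port),
        "--server.headless", "true",            # assumption: launcher opens the browser
        "--browser.gatherUsageStats", "false",  # assumption: no telemetry call-outs
    ]
```

Inside a PyInstaller bundle, one common pattern is to assign a list like this to `sys.argv` before handing control to Streamlit's CLI entry point, then open the browser once the server answers.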

CI: .github/workflows/build.yml runs the full pipeline on tag push (matrix: macos-latest, windows-latest, ubuntu-latest) and attaches the resulting installers to a GitHub Release. Manual workflow_dispatch runs upload them as workflow artifacts only.

Releasing

  1. Bump __version__ in src/__init__.py.
  2. git commit -am "release: vX.Y.Z" && git tag vX.Y.Z.
  3. git push && git push --tags.
  4. CI builds all three platforms and creates a GitHub Release with the installers attached.
  5. Mirror the GitHub Release assets to Gumroad (manual until v2).
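
Steps 1 and 2 are easy to get out of sync (version bumped, tag mistyped). A small pre-release check could compare the two. This is a sketch under the assumption that src/__init__.py holds a plain `__version__ = "X.Y.Z"` assignment; the helper names are hypothetical.

```python
import re

# Matches a line such as: __version__ = "1.2.3"
_VERSION_RE = re.compile(r"""^__version__\s*=\s*["']([^"']+)["']""", re.MULTILINE)


def read_version(init_source):
    """Extract the __version__ string from the text of src/__init__.py."""
    match = _VERSION_RE.search(init_source)
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)


def tag_matches_version(tag, init_source):
    """True when a git tag such as 'v1.2.3' equals 'v' + __version__."""
    return tag == "v" + read_version(init_source)
```

In CI the tag could come from the `GITHUB_REF_NAME` environment variable, and the source text from `Path("src/__init__.py").read_text()`.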

Signing (Phase 2 — needs accounts/credentials)

Both code-signing steps are intentionally not in CI yet because they require credentials the owner sets up first.

macOS — Apple Developer Program enrollment ($99/yr). Once enrolled, add these GitHub Secrets and uncomment the codesign + notarytool steps in build.yml:

| Secret | Value |
|---|---|
| MACOS_DEVELOPER_ID_CERT_P12_BASE64 | base64-encoded .p12 cert |
| MACOS_DEVELOPER_ID_CERT_PASSWORD | password for the .p12 |
| MACOS_NOTARY_APPLE_ID | Apple ID email |
| MACOS_NOTARY_TEAM_ID | 10-char team ID |
| MACOS_NOTARY_PASSWORD | app-specific password |

Windows — Code-signing cert from Sectigo / DigiCert (~$200-400/yr, or ~$300-500 for an EV cert that bypasses SmartScreen). Add:

| Secret | Value |
|---|---|
| WINDOWS_CERT_PFX_BASE64 | base64-encoded .pfx cert |
| WINDOWS_CERT_PASSWORD | password for the .pfx |
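
Both certificate secrets are stored base64-encoded, so a signing step has to turn them back into a file first. The sketch below shows one way that could look; the function name is hypothetical and how build.yml will actually do this is not decided here. Whitespace is stripped before decoding because some `base64` tools wrap their output across lines.

```python
import base64
import os
from pathlib import Path


def write_cert_from_env(var_name, dest, environ=os.environ):
    """Decode a base64-encoded certificate from an environment variable into a file."""
    encoded = environ.get(var_name)
    if not encoded:
        raise KeyError(f"{var_name} is not set")
    # Remove line wraps and trailing newlines, then decode strictly.
    data = base64.b64decode("".join(encoded.split()), validate=True)
    path = Path(dest)
    path.write_bytes(data)
    path.chmod(0o600)  # owner-only; has limited effect on Windows
    return path
```

For example, `write_cert_from_env("WINDOWS_CERT_PFX_BASE64", "cert.pfx")` would leave a .pfx on disk for the signing tool, to be deleted once the step finishes.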

Until those are wired, buyers will see:

  • macOS: "DataTools is damaged and can't be opened" — fix by removing the quarantine attribute (xattr -cr /Applications/DataTools.app). Acceptable for the technical buyer; blocking for the non-technical buyer. Don't ship to non-technical without notarization.
  • Windows: SmartScreen "Windows protected your PC" — buyer clicks "More info → Run anyway". Friction but not blocking.
  • Linux: AppImage runs without complaint (Linux has no equivalent trust-store).

Per-platform recipe

Each platform builds on its own machine because PyInstaller does not cross-compile. Pick the platform that matches the bundle you need. GitHub Actions matrix runners are the simplest way to produce all three from one push (see the .github/workflows/build.yml template below).

Mac (Intel + Apple Silicon, universal2)

# One-time:
pyenv install 3.12
pyenv local 3.12
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
pip install pyinstaller

# Build:
pyinstaller build/datatools.spec --clean

# Output:
#   dist/DataTools/         — folder mode (faster cold start)
#   dist/DataTools.app/     — macOS .app bundle (drag-drop into /Applications)

# Sign + notarize (after Apple Developer Program enrollment per BUSINESS.md §10):
codesign --deep --force --options runtime \
  --sign "Developer ID Application: <YOUR-NAME> (<TEAMID>)" \
  dist/DataTools.app

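# Notarize (notarytool accepts a .zip, .dmg, or .pkg, not a bare .app,
# so zip the signed bundle first):
ditto -c -k --keepParent dist/DataTools.app dist/DataTools.zip
xcrun notarytool submit dist/DataTools.zip \
  --apple-id "<YOUR-APPLE-ID>" \
  --team-id  "<TEAMID>" \
  --password "<APP-SPECIFIC-PASSWORD>" \
  --wait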

# Staple the notarization ticket so Gatekeeper sees it offline:
xcrun stapler staple dist/DataTools.app

# Wrap for distribution:
hdiutil create -volname "DataTools" -srcfolder dist/DataTools.app \
  -ov -format UDZO dist/DataTools-1.0.0-mac.dmg

Windows

# One-time:
py -3.12 -m venv .venv
.venv\Scripts\activate
pip install -r requirements.txt
pip install pyinstaller

# Build:
pyinstaller build\datatools.spec --clean

# Output:
#   dist\DataTools\          — folder mode
#   dist\DataTools\DataTools.exe

# Wrap with Inno Setup (free):
#   1. Install Inno Setup (https://jrsoftware.org/isdl.php)
#   2. Put this in installer.iss next to this README (relative paths
#      resolve from the script's folder, build\, so ..\dist is the
#      repo-level dist\ that PyInstaller wrote):
#        [Setup]
#        AppName=DataTools
#        AppVersion=1.0.0
#        DefaultDirName={autopf}\DataTools
#        OutputDir=..\dist
#        OutputBaseFilename=DataTools-1.0.0-win-setup
#        Compression=lzma
#        SolidCompression=yes
#        [Files]
#        Source: "..\dist\DataTools\*"; DestDir: "{app}"; Flags: recursesubdirs
#        [Icons]
#        Name: "{autoprograms}\DataTools"; Filename: "{app}\DataTools.exe"
#   3. Compile: ISCC.exe build\installer.iss

# Code-sign (optional but reduces SmartScreen warnings):
#   Use signtool with a code-signing cert (Sectigo / DigiCert).
#   Without signing, buyer sees "Windows protected your PC" once;
#   they click "More info → Run anyway." Acceptable for v1.

Linux (AppImage)

python3.12 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
pip install pyinstaller

pyinstaller build/datatools.spec --clean
# dist/DataTools/ — folder mode

# Wrap as AppImage (single-file portable app):
#   1. Download appimagetool from https://appimage.org/
#   2. Set up the AppDir layout:
#        DataTools.AppDir/
#        ├── AppRun                     -> usr/bin/DataTools
#        ├── DataTools.desktop          (icon + entry config)
#        ├── icon.png
#        └── usr/bin/                   -> dist/DataTools/*
#   3. ./appimagetool DataTools.AppDir dist/DataTools-1.0.0-linux-x86_64.AppImage

.github/workflows/build.yml (template):

name: Build installers
on:
  workflow_dispatch:
  push:
    tags: [ 'v*' ]
jobs:
  build:
    strategy:
      matrix:
        os: [macos-latest, windows-latest, ubuntu-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with: { python-version: '3.12' }
      - run: pip install -r requirements.txt pyinstaller
      - run: pyinstaller build/datatools.spec --clean
      - uses: actions/upload-artifact@v4
        with:
          name: DataTools-${{ matrix.os }}
          path: dist/

Mac code-signing in CI requires the cert + private key as a GitHub secret (encoded with base64). Detailed walkthrough belongs in a later doc — for v1, sign locally and upload to GitHub Releases.

Common pitfalls

| Symptom | Fix |
|---|---|
| Bundle is 800+ MB | Check the excludes list in datatools.spec. matplotlib / scipy / tkinter are the usual suspects. |
| App launches, browser opens, page is blank | Streamlit's static assets aren't bundled. Re-run with --log-level=DEBUG and confirm the static dir was collected by collect_data_files('streamlit'). |
| App launches but logs ImportError: streamlit.runtime.X | Add X to hidden_imports in the spec or to hook-streamlit.py. |
| Mac Gatekeeper says "DataTools is damaged and can't be opened" | The bundle wasn't signed + notarized. Don't ship to buyers without these; see the Mac recipe above. |
| Windows SmartScreen blocks first launch | Buyer clicks "More info → Run anyway". Code-signing reduces but doesn't eliminate this; for v1 it's an accepted friction. |
| Bundle works on dev machine but crashes on a clean machine | Likely a missing C runtime. On Windows, bundle the VC++ redistributable into the installer so it is installed alongside the app. |
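
For the oversized-bundle case, it helps to see which collected packages take the space before editing the excludes list. The helper below is a hypothetical diagnostic using only the standard library, not part of the build pipeline.

```python
from pathlib import Path


def largest_entries(bundle_dir, top=10):
    """Return (size_in_bytes, name) for the biggest top-level entries in a folder."""
    sizes = []
    for entry in Path(bundle_dir).iterdir():
        if entry.is_dir():
            # Sum every regular file below this directory.
            size = sum(f.stat().st_size for f in entry.rglob("*") if f.is_file())
        else:
            size = entry.stat().st_size
        sizes.append((size, entry.name))
    return sorted(sizes, reverse=True)[:top]
```

Point it at dist/DataTools/ after a build. If that folder only shows the executable plus one large subfolder (recent PyInstaller releases collect dependencies under `_internal`), run it on that subfolder instead.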

Testing the bundle

Smoke-test on a clean machine (or VM) — your dev machine has too much state to trust:

1. Boot a clean Mac / Win / Linux VM.
2. Copy the .dmg / .exe / .AppImage onto it.
3. Install / drag-drop into Applications / chmod +x.
4. Double-click the app icon.
5. Browser should open to http://127.0.0.1:850x within 5 seconds.
6. Drop samples/demo/shopify_pet_customers.csv into the
   Automated Workflows page; click Run; AFTER preview should appear.
7. Confirm in the network tab: zero outbound calls except to
   127.0.0.1 and the Streamlit static asset paths (also local).

Step 7 is the privacy-claim integrity check from docs/POST-LAUNCH.md §6 — do this once per release, then trust it.
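
Step 7 can be made less error-prone by exporting the browser's network tab as a HAR file and checking it mechanically. The sketch below assumes the standard HAR layout (`log.entries[].request.url`); the function names are hypothetical, and treating `data:` and `blob:` URLs as local is a judgment call.

```python
import ipaddress
from urllib.parse import urlsplit


def is_local(url):
    """True for loopback hosts and for in-page data:/blob: URLs."""
    parts = urlsplit(url)
    if parts.scheme in {"data", "blob"}:
        return True  # never leaves the browser
    host = parts.hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # any other DNS name counts as outbound


def outbound_requests(har):
    """List the request URLs in a HAR capture that do not stay on this machine."""
    entries = har.get("log", {}).get("entries", [])
    return [e["request"]["url"] for e in entries if not is_local(e["request"]["url"])]
```

Loading the export with `json.load` and asserting that `outbound_requests(har)` is empty gives a repeatable per-release check.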

Versioning

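Bump the version string in each place that hardcodes it per release:

  • datatools.spec (CFBundleVersion + CFBundleShortVersionString), unless the spec already reads the version from src/__init__.py as described under Files
  • the Inno Setup AppVersion line
  • the versioned output filenames (the .dmg, the Inno Setup OutputBaseFilename, and the AppImage)

Making src/__init__.py the single source of truth for all of these is a future refactor; for v1 the manual update is fine.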
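
Until that refactor happens, a tiny helper can at least derive the three installer filenames from one version string, so they cannot drift apart. The naming patterns are taken from the recipes above; the function itself is a hypothetical sketch, and the X.Y.Z check is an assumption about the version scheme.

```python
import re


def artifact_names(version):
    """Map each platform to its installer filename for the given X.Y.Z version."""
    if not re.fullmatch(r"\d+\.\d+\.\d+", version):
        raise ValueError(f"expected X.Y.Z, got {version!r}")
    return {
        "mac": f"DataTools-{version}-mac.dmg",
        # Inno Setup appends .exe to OutputBaseFilename.
        "windows": f"DataTools-{version}-win-setup.exe",
        "linux": f"DataTools-{version}-linux-x86_64.AppImage",
    }
```

For example, `artifact_names("1.0.0")["mac"]` yields `DataTools-1.0.0-mac.dmg`, matching the name used in the Mac recipe.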