Files
datatools-dev/build/README.md
Michael 9c426194b1 build: add single-command release script + portable zip artifacts
One-developer workflow: ``python build/make_release.py`` on each
target OS produces both the installer and a portable .zip for that
platform. Preflight checks PyInstaller / Pillow / iscc / hdiutil /
ditto / appimagetool and bails with install hints if anything is
missing — no half-built dist/.

New scripts:
- build/make_release.py   — orchestrator, auto-detects host OS.
- build/generate_icons.py — icon.ico / icon.icns / icon.png from
  src/gui/assets/datatools_icon_256.png (Pillow ships ICO + ICNS
  writers; no platform tooling needed).
- build/build_portable_zip.py — Win/Linux portable zip via stdlib.
- build/macos/build_zip.sh — Mac portable .app via ditto so
  bundle metadata survives.

installer.iss now adds: Quick Launch task (opt-in, legacy Win 7),
App Paths registry entry (Win+R "DataTools" works), SetupIconFile,
UninstallDisplayIcon, AppSupportURL, AppUpdatesURL.

CI workflow uploads installer + portable per platform and attaches
both to GitHub Releases on tag push.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 19:30:17 +00:00

331 lines
13 KiB
Markdown

# Build — DataTools desktop installer
> Cross-platform PyInstaller bundle for Mac / Windows / Linux. The
> single deliverable the buyer downloads from Gumroad.
> **Owner**: Michael · **Updated**: 2026-05-01
This directory is the build pipeline. Source of truth for the bundle
shape, hidden-import lists, per-platform recipes, and the launcher
that boots Streamlit inside the bundle.
## Files
```
build/
├── launcher.py Entry point PyInstaller wraps. Boots a local
│ Streamlit server, opens browser, locks server
│ to 127.0.0.1 so the privacy claim holds.
├── datatools.spec PyInstaller spec — hidden imports, data files,
│ Mac .app bundle config. Reads the version
│ from src/__init__.py.
├── installer.iss Inno Setup script — Windows .exe installer.
│ Adds Start Menu + Desktop + App Paths entries.
├── generate_icons.py Builds icon.ico / icon.icns / icon.png from
│ src/gui/assets/datatools_icon_256.png. Run
│ once before pyinstaller (CI does this).
├── build_portable_zip.py Cross-platform: zips dist/DataTools/ into a
│ no-install portable download. Used by the
│ Windows + Linux portable artifacts.
├── macos/
│ ├── build_dmg.sh Wraps dist/DataTools.app into a .dmg with a
│ │ drag-to-/Applications layout (installer).
│ └── build_zip.sh Wraps dist/DataTools.app into a portable
│ .zip via ditto (preserves bundle metadata).
├── appimage/
│ ├── AppRun Entry point invoked when the AppImage runs.
│ ├── datatools.desktop Linux desktop-entry metadata.
│ └── build.sh Wraps dist/DataTools/ into an .AppImage.
├── hooks/ PyInstaller hooks for libs the static analyser
│ └── hook-streamlit.py misses (Streamlit's dynamic imports).
├── icon.{ico,icns,png} Generated by generate_icons.py — gitignored.
└── README.md this file
```
## Distribution outputs per platform
Each CI run produces two downloads per platform — an installer for
buyers who want shortcuts wired automatically, and a portable .zip
for buyers (or IT-locked-down machines) that can't run installers:
| Platform | Installer | Portable |
|----------|----------------------------------------|------------------------------------------------|
| macOS | `DataTools-<ver>-mac.dmg` | `DataTools-<ver>-mac-portable.zip` (ditto .app)|
| Windows | `DataTools-<ver>-win-setup.exe` | `DataTools-<ver>-win-portable.zip` |
| Linux | `DataTools-<ver>-linux-x86_64.AppImage`| (the AppImage IS the portable) |
All six outputs are self-contained: every dependency (Python, pandas,
streamlit, pdfplumber, the lot) is frozen into the bundle. The buyer
does not need to install Python, pip, or anything else first.
## Easy-launch surface
| Affordance | Windows | macOS |
|------------------|--------------------------------------------------|------------------------------------------------------|
| Desktop shortcut | Inno Setup `desktopicon` task (checked default) | The .app bundle in /Applications is the icon |
| App menu | Start Menu → DataTools (always installed) | Launchpad + Spotlight (auto from /Applications) |
| Taskbar / Dock | User pins manually (OS forbids programmatic pin) | User pins manually after first launch |
| Run from terminal| `DataTools` (registered via App Paths) | `open -a DataTools` (auto from .app bundle) |
CI: `.github/workflows/build.yml` runs the full pipeline on tag push
(matrix: macos-latest, windows-latest, ubuntu-latest) and attaches
the resulting installers to a GitHub Release. Manual
`workflow_dispatch` runs upload them as workflow artifacts only.
## Releasing
### Single-command local build (recommended for one-developer workflow)
PyInstaller can't cross-compile, so a single machine produces one
platform's packages. Run this on each target OS:
```bash
# One-time setup per machine:
pip install -r requirements.txt
pip install pyinstaller pillow
# Windows only: install Inno Setup from https://jrsoftware.org/isdl.php
# Linux only: drop appimagetool onto PATH (see preflight output)
# Build everything for the current OS:
python build/make_release.py
```
Outputs land in `dist/`:
- Windows host → `DataTools-<ver>-win-setup.exe` + `DataTools-<ver>-win-portable.zip`
- macOS host → `DataTools-<ver>-mac.dmg` + `DataTools-<ver>-mac-portable.zip`
- Linux host → `DataTools-<ver>-linux-x86_64.AppImage`
Useful flags:
```bash
python build/make_release.py --preflight # check tooling, build nothing
python build/make_release.py --clean # wipe dist/ first
python build/make_release.py --skip-installer # just the portable zip
python build/make_release.py --skip-portable # just the installer
```
### CI build (push tag → GitHub Release)
If you have CI runners for all three OSes:
1. Bump `__version__` in `src/__init__.py`.
2. `git commit -am "release: vX.Y.Z" && git tag vX.Y.Z`.
3. `git push && git push --tags`.
4. CI builds all three platforms and creates a Release with the
installers + portable zips attached.
5. Mirror the Release assets to Gumroad (manual until v2).
## Signing (Phase 2 — needs accounts/credentials)
Both code-signing steps are intentionally not in CI yet because they
require credentials the owner sets up first.
**macOS** — Apple Developer Program enrollment ($99/yr). Once enrolled,
add these GitHub Secrets and uncomment the `codesign` + `notarytool`
steps in `build.yml`:
| Secret | Value |
|---|---|
| `MACOS_DEVELOPER_ID_CERT_P12_BASE64` | base64-encoded `.p12` cert |
| `MACOS_DEVELOPER_ID_CERT_PASSWORD` | password for the .p12 |
| `MACOS_NOTARY_APPLE_ID` | Apple ID email |
| `MACOS_NOTARY_TEAM_ID` | 10-char team ID |
| `MACOS_NOTARY_PASSWORD` | app-specific password |
**Windows** — Code-signing cert from Sectigo / DigiCert (~$200-400/yr,
or ~$300-500 for an EV cert that bypasses SmartScreen). Add:
| Secret | Value |
|---|---|
| `WINDOWS_CERT_PFX_BASE64` | base64-encoded `.pfx` cert |
| `WINDOWS_CERT_PASSWORD` | password for the .pfx |
Until those are wired, buyers will see:
- macOS: "DataTools is damaged and can't be opened" — fix by removing
the quarantine attribute (`xattr -cr /Applications/DataTools.app`).
Acceptable for the technical buyer; **blocking** for the
non-technical buyer. Don't ship to non-technical without notarization.
- Windows: SmartScreen "Windows protected your PC" — buyer clicks
"More info → Run anyway". Friction but not blocking.
- Linux: AppImage runs without complaint (Linux has no equivalent
trust-store).
## Per-platform recipe
Each platform builds on its own machine — PyInstaller does **not**
cross-compile. Pick the platform that matches the bundle you need.
GitHub Actions matrix runners are the simplest way to produce all
three from one push (see "CI build" below).
### Mac (Intel + Apple Silicon, universal2)
```bash
# One-time:
pyenv install 3.12
pyenv local 3.12
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
pip install pyinstaller
# Build:
pyinstaller build/datatools.spec --clean
# Output:
# dist/DataTools/ — folder mode (faster cold start)
# dist/DataTools.app/ — macOS .app bundle (drag-drop into /Applications)
# Sign + notarize (after Apple Developer Program enrollment per BUSINESS.md §10):
codesign --deep --force --options runtime \
--sign "Developer ID Application: <YOUR-NAME> (<TEAMID>)" \
dist/DataTools.app
# Notarize:
xcrun notarytool submit dist/DataTools.app \
--apple-id "<YOUR-APPLE-ID>" \
--team-id "<TEAMID>" \
--password "<APP-SPECIFIC-PASSWORD>" \
--wait
# Staple the notarization ticket so Gatekeeper sees it offline:
xcrun stapler staple dist/DataTools.app
# Wrap for distribution:
hdiutil create -volname "DataTools" -srcfolder dist/DataTools.app \
-ov -format UDZO dist/DataTools-1.0.0-mac.dmg
```
### Windows
```powershell
# One-time:
py -3.12 -m venv .venv
.venv\Scripts\activate
pip install -r requirements.txt
pip install pyinstaller
# Build:
pyinstaller build\datatools.spec --clean
# Output:
# dist\DataTools\ — folder mode
# dist\DataTools\DataTools.exe
# Wrap with Inno Setup (free):
# 1. Install Inno Setup (https://jrsoftware.org/isdl.php)
# 2. Create installer.iss next to this README:
# [Setup]
# AppName=DataTools
# AppVersion=1.0.0
# DefaultDirName={autopf}\DataTools
# OutputDir=..\..\dist
# OutputBaseFilename=DataTools-1.0.0-win-setup
# Compression=lzma
# SolidCompression=yes
# [Files]
# Source: "..\..\dist\DataTools\*"; DestDir: "{app}"; Flags: recursesubdirs
# [Icons]
# Name: "{autoprograms}\DataTools"; Filename: "{app}\DataTools.exe"
# 3. Compile: ISCC.exe build\installer.iss
# Code-sign (optional but reduces SmartScreen warnings):
# Use signtool with a code-signing cert (Sectigo / DigiCert).
# Without signing, buyer sees "Windows protected your PC" once;
# they click "More info → Run anyway." Acceptable for v1.
```
### Linux (AppImage)
```bash
python3.12 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
pip install pyinstaller
pyinstaller build/datatools.spec --clean
# dist/DataTools/ — folder mode
# Wrap as AppImage (single-file portable app):
# 1. Download appimagetool from https://appimage.org/
# 2. Set up the AppDir layout:
# DataTools.AppDir/
# ├── AppRun -> ./DataTools/DataTools
# ├── DataTools.desktop (icon + entry config)
# ├── icon.png
# └── usr/bin/ -> dist/DataTools/*
# 3. ./appimagetool DataTools.AppDir dist/DataTools-1.0.0-linux-x86_64.AppImage
```
## CI build (recommended once the spec is stable)
`.github/workflows/build.yml` (template):
```yaml
name: Build installers
on:
workflow_dispatch:
push:
tags: [ 'v*' ]
jobs:
build:
strategy:
matrix:
os: [macos-latest, windows-latest, ubuntu-latest]
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with: { python-version: '3.12' }
- run: pip install -r requirements.txt pyinstaller
- run: pyinstaller build/datatools.spec --clean
- uses: actions/upload-artifact@v4
with:
name: DataTools-${{ matrix.os }}
path: dist/
```
Mac code-signing in CI requires the cert + private key as a GitHub
secret (encoded with `base64`). Detailed walkthrough belongs in a
later doc — for v1, sign locally and upload to GitHub Releases.
## Common pitfalls
| Symptom | Fix |
|---|---|
| Bundle is 800+ MB | Check the ``excludes`` list in ``datatools.spec``. ``matplotlib`` / ``scipy`` / ``tkinter`` are the usual suspects. |
| App launches, browser opens, page is blank | Streamlit's static assets aren't bundled. Re-run with `--log-level=DEBUG` and confirm the static dir was collected by `collect_data_files('streamlit')`. |
| App launches but logs ``ImportError: streamlit.runtime.X`` | Add ``X`` to ``hidden_imports`` in the spec or to ``hook-streamlit.py``. |
| Mac Gatekeeper says "DataTools is damaged and can't be opened" | The bundle wasn't signed + notarized. Don't ship to buyers without these — see the Mac recipe above. |
| Windows SmartScreen blocks first launch | Buyer clicks "More info → Run anyway". Code-signing reduces but doesn't eliminate this; for v1 it's an accepted friction. |
| Bundle works on dev machine but crashes on a clean machine | Likely a missing C runtime. On Windows, install [VC++ redistributable](https://aka.ms/vs/17/release/vc_redist.x64.exe) into the installer alongside the bundle. |
## Testing the bundle
Smoke-test on a **clean** machine (or VM) — your dev machine has too
much state to trust:
```
1. Boot a clean Mac / Win / Linux VM.
2. Copy the .dmg / .exe / .AppImage onto it.
3. Install / drag-drop into Applications / chmod +x.
4. Double-click the app icon.
5. Browser should open to http://127.0.0.1:850x within 5 seconds.
6. Drop samples/demo/shopify_pet_customers.csv into the
Automated Workflows page; click Run; AFTER preview should appear.
7. Confirm in the network tab: zero outbound calls except to
127.0.0.1 and the Streamlit static asset paths (also local).
```
Step 7 is the privacy-claim integrity check from
`docs/POST-LAUNCH.md` §6 — do this once per release, then trust it.
## Versioning
Bump the version string in three places per release:
- `datatools.spec` (CFBundleVersion + CFBundleShortVersionString)
- the Inno Setup `AppVersion` line
- the AppImage filename
A single source of truth (e.g. `src/__init__.py`) is a future
refactor — for v1 the three-spot update is fine.