Files
datatools-dev/build/README.md
Michael db5ec084da docs+code: rename tool labels everywhere
Sweep follow-up to 93e43fc. Display labels now consistent across docs,
landing pages, CLI output, code comments, docstrings, and test prose.
Five parallel surfaces touched:

- docs (EN + ES): README, USER-GUIDE, CLI-REFERENCE, and 11 internal
  design/planning docs
- landing pages: index + bookkeeper/revops/shopify-pet
- src: CLI module docstrings, _TOOL_DISPLAY dicts in cli_analyze.py
  and gui/components/_legacy.py, core module headers, every tool
  page's module docstring
- tests: class/method/module docstrings and section-header comments
- test-cases READMEs

Page slugs (1_Deduplicator etc.), tool_id strings (01_deduplicator
etc.), Python class names (TestDeduplicatorWorkflow, FeatureFlag.*),
URL paths, anchor IDs, CSS classes, and asset filenames were left
intact since they're code identifiers / structural references.

All 2033 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 19:50:09 +00:00

267 lines
9.8 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Build — DataTools desktop installer
> Cross-platform PyInstaller bundle for Mac / Windows / Linux. The
> single deliverable the buyer downloads from Gumroad.
> **Owner**: Michael · **Updated**: 2026-05-01
This directory is the build pipeline. Source of truth for the bundle
shape, hidden-import lists, per-platform recipes, and the launcher
that boots Streamlit inside the bundle.
## Files
```
build/
├── launcher.py Entry point PyInstaller wraps. Boots a local
│ Streamlit server, opens browser, locks server
│ to 127.0.0.1 so the privacy claim holds.
├── datatools.spec PyInstaller spec — hidden imports, data files,
│ Mac .app bundle config. Reads the version
│ from src/__init__.py.
├── installer.iss Inno Setup script — Windows .exe installer.
├── macos/
│ └── build_dmg.sh Wraps dist/DataTools.app into a .dmg with a
│ drag-to-/Applications layout.
├── appimage/
│ ├── AppRun Entry point invoked when the AppImage runs.
│ ├── datatools.desktop Linux desktop-entry metadata.
│ └── build.sh Wraps dist/DataTools/ into an .AppImage.
├── hooks/ PyInstaller hooks for libs the static analyser
│ └── hook-streamlit.py misses (Streamlit's dynamic imports).
├── icon.icns macOS app icon (TODO: produce from a 1024×1024
│ PNG. Optional — bundle still builds without).
├── icon.ico Windows app icon (TODO).
├── icon.png Linux AppImage icon (TODO — build.sh generates
│ a placeholder if missing).
└── README.md this file
```
CI: `.github/workflows/build.yml` runs the full pipeline on tag push
(matrix: macos-latest, windows-latest, ubuntu-latest) and attaches
the resulting installers to a GitHub Release. Manual
`workflow_dispatch` runs upload them as workflow artifacts only.
## Releasing
1. Bump `__version__` in `src/__init__.py`.
2. `git commit -am "release: vX.Y.Z" && git tag vX.Y.Z`.
3. `git push && git push --tags`.
4. CI builds all three platforms and creates a GitHub Release with
the installers attached.
5. Mirror the GitHub Release assets to Gumroad (manual until v2).
## Signing (Phase 2 — needs accounts/credentials)
Both code-signing steps are intentionally not in CI yet because they
require credentials the owner sets up first.
**macOS** — Apple Developer Program enrollment ($99/yr). Once enrolled,
add these GitHub Secrets and uncomment the `codesign` + `notarytool`
steps in `build.yml`:
| Secret | Value |
|---|---|
| `MACOS_DEVELOPER_ID_CERT_P12_BASE64` | base64-encoded `.p12` cert |
| `MACOS_DEVELOPER_ID_CERT_PASSWORD` | password for the .p12 |
| `MACOS_NOTARY_APPLE_ID` | Apple ID email |
| `MACOS_NOTARY_TEAM_ID` | 10-char team ID |
| `MACOS_NOTARY_PASSWORD` | app-specific password |
**Windows** — Code-signing cert from Sectigo / DigiCert (~$200-400/yr,
or ~$300-500 for an EV cert that bypasses SmartScreen). Add:
| Secret | Value |
|---|---|
| `WINDOWS_CERT_PFX_BASE64` | base64-encoded `.pfx` cert |
| `WINDOWS_CERT_PASSWORD` | password for the .pfx |
Until those are wired, buyers will see:
- macOS: "DataTools is damaged and can't be opened" — fix by removing
the quarantine attribute (`xattr -cr /Applications/DataTools.app`).
Acceptable for the technical buyer; **blocking** for the
non-technical buyer. Don't ship to non-technical without notarization.
- Windows: SmartScreen "Windows protected your PC" — buyer clicks
"More info → Run anyway". Friction but not blocking.
- Linux: AppImage runs without complaint (Linux has no equivalent
trust-store).
## Per-platform recipe
Each platform builds on its own machine — PyInstaller does **not**
cross-compile. Pick the platform that matches the bundle you need.
GitHub Actions matrix runners are the simplest way to produce all
three from one push (see "CI build" below).
### Mac (Intel + Apple Silicon, universal2)
```bash
# One-time:
pyenv install 3.12
pyenv local 3.12
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
pip install pyinstaller
# Build:
pyinstaller build/datatools.spec --clean
# Output:
# dist/DataTools/ — folder mode (faster cold start)
# dist/DataTools.app/ — macOS .app bundle (drag-drop into /Applications)
# Sign + notarize (after Apple Developer Program enrollment per BUSINESS.md §10):
codesign --deep --force --options runtime \
--sign "Developer ID Application: <YOUR-NAME> (<TEAMID>)" \
dist/DataTools.app
# Notarize:
xcrun notarytool submit dist/DataTools.app \
--apple-id "<YOUR-APPLE-ID>" \
--team-id "<TEAMID>" \
--password "<APP-SPECIFIC-PASSWORD>" \
--wait
# Staple the notarization ticket so Gatekeeper sees it offline:
xcrun stapler staple dist/DataTools.app
# Wrap for distribution:
hdiutil create -volname "DataTools" -srcfolder dist/DataTools.app \
-ov -format UDZO dist/DataTools-1.0.0-mac.dmg
```
### Windows
```powershell
# One-time:
py -3.12 -m venv .venv
.venv\Scripts\activate
pip install -r requirements.txt
pip install pyinstaller
# Build:
pyinstaller build\datatools.spec --clean
# Output:
# dist\DataTools\ — folder mode
# dist\DataTools\DataTools.exe
# Wrap with Inno Setup (free):
# 1. Install Inno Setup (https://jrsoftware.org/isdl.php)
# 2. Create installer.iss next to this README:
# [Setup]
# AppName=DataTools
# AppVersion=1.0.0
# DefaultDirName={autopf}\DataTools
# OutputDir=..\..\dist
# OutputBaseFilename=DataTools-1.0.0-win-setup
# Compression=lzma
# SolidCompression=yes
# [Files]
# Source: "..\..\dist\DataTools\*"; DestDir: "{app}"; Flags: recursesubdirs
# [Icons]
# Name: "{autoprograms}\DataTools"; Filename: "{app}\DataTools.exe"
# 3. Compile: ISCC.exe build\installer.iss
# Code-sign (optional but reduces SmartScreen warnings):
# Use signtool with a code-signing cert (Sectigo / DigiCert).
# Without signing, buyer sees "Windows protected your PC" once;
# they click "More info → Run anyway." Acceptable for v1.
```
### Linux (AppImage)
```bash
python3.12 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
pip install pyinstaller
pyinstaller build/datatools.spec --clean
# dist/DataTools/ — folder mode
# Wrap as AppImage (single-file portable app):
# 1. Download appimagetool from https://appimage.org/
# 2. Set up the AppDir layout:
# DataTools.AppDir/
# ├── AppRun -> ./DataTools/DataTools
# ├── DataTools.desktop (icon + entry config)
# ├── icon.png
# └── usr/bin/ -> dist/DataTools/*
# 3. ./appimagetool DataTools.AppDir dist/DataTools-1.0.0-linux-x86_64.AppImage
```
## CI build (recommended once the spec is stable)
`.github/workflows/build.yml` (template):
```yaml
name: Build installers
on:
workflow_dispatch:
push:
tags: [ 'v*' ]
jobs:
build:
strategy:
matrix:
os: [macos-latest, windows-latest, ubuntu-latest]
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with: { python-version: '3.12' }
- run: pip install -r requirements.txt pyinstaller
- run: pyinstaller build/datatools.spec --clean
- uses: actions/upload-artifact@v4
with:
name: DataTools-${{ matrix.os }}
path: dist/
```
Mac code-signing in CI requires the cert + private key as a GitHub
secret (encoded with `base64`). Detailed walkthrough belongs in a
later doc — for v1, sign locally and upload to GitHub Releases.
## Common pitfalls
| Symptom | Fix |
|---|---|
| Bundle is 800+ MB | Check the ``excludes`` list in ``datatools.spec``. ``matplotlib`` / ``scipy`` / ``tkinter`` are the usual suspects. |
| App launches, browser opens, page is blank | Streamlit's static assets aren't bundled. Re-run with `--log-level=DEBUG` and confirm the static dir was collected by `collect_data_files('streamlit')`. |
| App launches but logs ``ImportError: streamlit.runtime.X`` | Add ``X`` to ``hidden_imports`` in the spec or to ``hook-streamlit.py``. |
| Mac Gatekeeper says "DataTools is damaged and can't be opened" | The bundle wasn't signed + notarized. Don't ship to buyers without these — see the Mac recipe above. |
| Windows SmartScreen blocks first launch | Buyer clicks "More info → Run anyway". Code-signing reduces but doesn't eliminate this; for v1 it's an accepted friction. |
| Bundle works on dev machine but crashes on a clean machine | Likely a missing C runtime. On Windows, install [VC++ redistributable](https://aka.ms/vs/17/release/vc_redist.x64.exe) into the installer alongside the bundle. |
## Testing the bundle
Smoke-test on a **clean** machine (or VM) — your dev machine has too
much state to trust:
```
1. Boot a clean Mac / Win / Linux VM.
2. Copy the .dmg / .exe / .AppImage onto it.
3. Install / drag-drop into Applications / chmod +x.
4. Double-click the app icon.
5. Browser should open to http://127.0.0.1:850x within 5 seconds.
6. Drop samples/demo/shopify_pet_customers.csv into the
Automated Workflows page; click Run; AFTER preview should appear.
7. Confirm in the network tab: zero outbound calls except to
127.0.0.1 and the Streamlit static asset paths (also local).
```
Step 7 is the privacy-claim integrity check from
`docs/POST-LAUNCH.md` §6 — do this once per release, then trust it.
## Versioning
Bump the version string in three places per release:
- `datatools.spec` (CFBundleVersion + CFBundleShortVersionString)
- the Inno Setup `AppVersion` line
- the AppImage filename
A single source of truth (e.g. `src/__init__.py`) is a future
refactor — for v1 the three-spot update is fine.