build: drop the local Python release method, return to CI-only installer builds

Removes the single-command Python packaging method (build/make_release.py
+ build/build_portable_zip.py + build/macos/build_zip.sh) and the portable
.zip artifacts it produced. Release builds go back to the original GitHub
Actions process: the CI matrix builds one installer per platform (.dmg /
.exe / .AppImage) on tag push and attaches them to a GitHub Release.

Tesseract OCR bundling is preserved: the fetch helpers the workflow depends
on (fetch_tessdata, fetch_tesseract_for_platform) are extracted into a
standalone build/tesseract.py, which build.yml now imports.

Docs (README, build/README, DEVELOPER, TECHNICAL, USER-GUIDE, vendor README,
es translations) updated to drop the portable-zip flavor and point at the
new module.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 17:47:36 +00:00
parent 28ab51a869
commit fd9606c67b
13 changed files with 127 additions and 608 deletions
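
The spec-file hunks below describe a two-step lookup for the Tesseract staging directory: use DATATOOLS_TESS_STAGING when CI sets it, otherwise fall back to a per-host directory under build/_tesseract/<target>/. A minimal sketch of that lookup follows. It is not the code from datatools.spec: the function names are hypothetical, and only the "win" target name appears in the diff, so "mac" and "linux" are assumed placeholders.

```python
import os
import sys
from pathlib import Path


def guess_target(platform: str = sys.platform) -> str:
    """Map a sys.platform string to a staging sub-directory name.

    Only "win" is visible in the diff; "mac" and "linux" are assumed names.
    """
    if platform.startswith("win"):
        return "win"
    if platform == "darwin":
        return "mac"  # assumed name, not confirmed by the diff
    return "linux"  # assumed name, not confirmed by the diff


def resolve_tess_staging(env=None, platform=sys.platform, build_dir=Path("build")):
    """Return the staging dir: the env override if set, else the per-host default."""
    env = os.environ if env is None else env
    override = env.get("DATATOOLS_TESS_STAGING")
    if override:
        # CI path: build.yml exports the variable before running PyInstaller.
        return Path(override)
    # Ad-hoc path: `pyinstaller build/datatools.spec` without the env var.
    return build_dir / "_tesseract" / guess_target(platform)
```

Passing `env` and `platform` as parameters keeps the lookup testable without touching the real environment; the spec file itself reads `os.environ` and `sys.platform` directly.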


@@ -105,7 +105,7 @@ datas += [
]
# ----- Tesseract OCR bundle ----------------------------------------
-# ``build/make_release.py`` stages the per-platform Tesseract binary
+# ``build/tesseract.py`` stages the per-platform Tesseract binary
# + its runtime libs (DLLs/dylibs/sos) into
# ``build/_tesseract/<target>/`` and the shared eng.traineddata into
# ``build/vendor/tessdata/``. We add both to ``datas`` so PyInstaller
@@ -119,16 +119,16 @@ datas += [
# from ``Path(sys._MEIPASS) / "tesseract" / ...``. Keep the two ends
# in sync — if you rename "tesseract" here, update pdf_extract.py too.
#
-# The orchestrator (make_release.py) sets DATATOOLS_TESS_STAGING to
-# the right per-platform dir before invoking PyInstaller. For ad-hoc
-# `pyinstaller build/datatools.spec` runs without the orchestrator,
-# fall back to the canonical staging path.
+# CI (.github/workflows/build.yml) sets DATATOOLS_TESS_STAGING to the
+# right per-platform dir before invoking PyInstaller. For ad-hoc
+# `pyinstaller build/datatools.spec` runs without that env var, fall
+# back to the canonical staging path.
_tess_staging_env = os.environ.get("DATATOOLS_TESS_STAGING")
if _tess_staging_env:
_tess_staging = Path(_tess_staging_env)
else:
# Pick the obvious per-host staging dir as a fallback so spec-only
-# builds (without the orchestrator) still work in dev.
+# builds (without the CI env var) still work in dev.
import sys as _sys_for_target
_target_guess = (
"win" if _sys_for_target.platform.startswith("win")
@@ -149,8 +149,8 @@ else:
# though, since the OCR feature will silently fail at runtime.
print(
f"WARNING: {_tess_staging} is empty or missing OCR will be "
-"disabled in the bundle. Run build/make_release.py (which "
-"calls fetch_tesseract_for_platform) before pyinstaller, or "
+"disabled in the bundle. Run build/tesseract.py's "
+"fetch_tesseract_for_platform before pyinstaller, or "
"pre-stage the binary manually."
)
@@ -159,8 +159,8 @@ if (_tessdata / "eng.traineddata").exists():
else:
print(
f"WARNING: {_tessdata}/eng.traineddata is missing OCR will "
-"have no language data at runtime. Run build/make_release.py "
-"or fetch manually per build/vendor/README.md."
+"have no language data at runtime. Run build/tesseract.py's "
+"fetch_tessdata or fetch manually per build/vendor/README.md."
)
# Bundle the Apache-2.0 LICENSE text alongside the binary. The docs