Compare commits

3 Commits

Author SHA1 Message Date
2bd94c4441 docs: document installer + portable downloads in en/es
Repo READMEs now show both download flavors side-by-side with
first-launch warnings (SmartScreen, Gatekeeper) and link to the
deeper walkthrough.

USER-GUIDE §1 rewritten from a 9-line stub into six subsections:
- §1.1 Windows: installer (5 steps) + portable (4 steps)
- §1.2 macOS:   DMG (5 steps incl. right-click-Open) + portable
- §1.3 Linux:   AppImage flow (unchanged)
- §1.4 First-launch: port selection, localhost binding, browser open
- §1.5 How the GUI works
- §1.6 System requirements

§6 Troubleshooting picks up portable-specific items: Safari unzip
quirks, antivirus quarantine on Win portable, license file location.

docs/README and Spanish mirrors updated to match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 19:30:28 +00:00
9c426194b1 build: add single-command release script + portable zip artifacts
One-developer workflow: ``python build/make_release.py`` on each
target OS produces both the installer and a portable .zip for that
platform. Preflight checks PyInstaller / Pillow / iscc / hdiutil /
ditto / appimagetool and bails with install hints if anything is
missing — no half-built dist/.

New scripts:
- build/make_release.py   — orchestrator, auto-detects host OS.
- build/generate_icons.py — icon.ico / icon.icns / icon.png from
  src/gui/assets/datatools_icon_256.png (Pillow ships ICO + ICNS
  writers; no platform tooling needed).
- build/build_portable_zip.py — Win/Linux portable zip via stdlib.
- build/macos/build_zip.sh — Mac portable .app via ditto so
  bundle metadata survives.

installer.iss now adds: Quick Launch task (opt-in, legacy Win 7),
App Paths registry entry (Win+R "DataTools" works), SetupIconFile,
UninstallDisplayIcon, AppSupportURL, AppUpdatesURL.

CI workflow uploads installer + portable per platform and attaches
both to GitHub Releases on tag push.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 19:30:17 +00:00
6627895a10 test: fix v3 branding drift, add reconcile CLI + registry coverage
GUI/lang-pack tests were asserting against pre-v3 strings ("Data
Cleaning Mastery", "Maestría en limpieza…") that the brand refresh
replaced with "UNALOGIX DataTools" + "Clean. Normalize. Transform."
Updated assertions to the current copy and switched the findings
panel tests to the redesigned flat-list layout (per-finding "Open
Tool →" buttons instead of per-tool expanders).

New coverage:
- tests/test_cli_reconcile.py (13) — preview/apply, tolerance flags,
  sign inversion, key flags, error paths, Excel input.
- tests/test_tools_registry.py (27) — unique tool_ids, page_slug →
  real file, valid sections/tiers, localized accessor fallbacks,
  explicit pins for PDF Extractor + Reconciler entries.
- tests/test_reconcile.py — one-side-empty, key-pass tagging,
  additional validation cases, input-DataFrame immutability.
- tests/gui/test_smoke.py — PAGE_SLUGS now includes 10_PDF_Extractor
  and 11_Reconciler in both en/es.
- tests/gui/test_workflows.py — TestPdfExtractorWorkflow and
  TestReconcilerWorkflow render checks.

Net: 2317 passed → 2418 passed, 0 failures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 19:30:02 +00:00
23 changed files with 1632 additions and 160 deletions


@@ -1,8 +1,18 @@
name: Build installers
# Triggers:
# * Tag push (v*) → produces installers, attaches to a GitHub Release.
# * Manual dispatch → produces installers as workflow artifacts only.
# * Tag push (v*) → produces installers + portable zips, attaches them
# to a GitHub Release.
# * Manual dispatch → uploads everything as workflow artifacts only.
#
# Outputs per platform (downloadable by buyers):
# * macOS: .dmg installer + portable .zip (signed .app inside).
# * Windows: .exe installer + portable .zip (no-install).
# * Linux: .AppImage (already portable; no separate zip).
#
# Self-contained: every artifact ships its own Python interpreter + every
# runtime dep through PyInstaller. No pre/post install steps on the
# buyer's machine.
#
# What this workflow doesn't do (yet):
# * Code signing (Mac Developer ID, Windows code-signing cert).
@@ -29,14 +39,17 @@ jobs:
matrix:
include:
- os: macos-latest
artifact_name: DataTools-mac.dmg
artifact_path: dist/DataTools-*-mac.dmg
platform: mac
installer_glob: dist/DataTools-*-mac.dmg
portable_glob: dist/DataTools-*-mac-portable.zip
- os: windows-latest
artifact_name: DataTools-win.exe
artifact_path: dist/DataTools-*-win-setup.exe
platform: win
installer_glob: dist/DataTools-*-win-setup.exe
portable_glob: dist/DataTools-*-win-portable.zip
- os: ubuntu-latest
artifact_name: DataTools-linux.AppImage
artifact_path: dist/DataTools-*-linux-x86_64.AppImage
platform: linux
installer_glob: dist/DataTools-*-linux-x86_64.AppImage
portable_glob: '' # AppImage is already a portable single file
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v4
@@ -50,7 +63,7 @@ jobs:
run: |
pip install --upgrade pip
pip install -r requirements.txt
pip install pyinstaller
pip install pyinstaller pillow
- name: Read version
id: version
@@ -59,15 +72,22 @@ jobs:
VER=$(python -c "import re; print(re.search(r'__version__\s*=\s*\"([^\"]+)\"', open('src/__init__.py').read()).group(1))")
echo "version=$VER" >> "$GITHUB_OUTPUT"
- name: Generate platform icons
run: python build/generate_icons.py
- name: Build PyInstaller bundle
run: pyinstaller build/datatools.spec --clean --noconfirm
# ---- Per-platform packaging ----------------------------------
# ---- Per-platform installer packaging ------------------------
- name: Package macOS DMG
- name: Package macOS DMG (installer)
if: matrix.os == 'macos-latest'
run: bash build/macos/build_dmg.sh "${{ steps.version.outputs.version }}"
- name: Package macOS portable .zip
if: matrix.os == 'macos-latest'
run: bash build/macos/build_zip.sh "${{ steps.version.outputs.version }}"
- name: Install Inno Setup (Windows)
if: matrix.os == 'windows-latest'
run: choco install innosetup --no-progress -y
@@ -78,6 +98,10 @@ jobs:
run: |
iscc /DAppVersion=${{ steps.version.outputs.version }} build\installer.iss
- name: Package Windows portable .zip
if: matrix.os == 'windows-latest'
run: python build/build_portable_zip.py win ${{ steps.version.outputs.version }}
- name: Install AppImage tooling (Linux)
if: matrix.os == 'ubuntu-latest'
run: |
@@ -92,17 +116,32 @@ jobs:
# ---- Upload + release ----------------------------------------
- name: Upload artifact
- name: Upload installer artifact
uses: actions/upload-artifact@v4
with:
name: ${{ matrix.artifact_name }}
path: ${{ matrix.artifact_path }}
name: DataTools-${{ matrix.platform }}-installer
path: ${{ matrix.installer_glob }}
if-no-files-found: error
- name: Attach to Release (tag push only)
- name: Upload portable artifact
if: matrix.portable_glob != ''
uses: actions/upload-artifact@v4
with:
name: DataTools-${{ matrix.platform }}-portable
path: ${{ matrix.portable_glob }}
if-no-files-found: error
- name: Attach installer to Release (tag push only)
if: startsWith(github.ref, 'refs/tags/v')
uses: softprops/action-gh-release@v2
with:
files: ${{ matrix.artifact_path }}
files: ${{ matrix.installer_glob }}
fail_on_unmatched_files: true
generate_release_notes: true
- name: Attach portable to Release (tag push only)
if: startsWith(github.ref, 'refs/tags/v') && matrix.portable_glob != ''
uses: softprops/action-gh-release@v2
with:
files: ${{ matrix.portable_glob }}
fail_on_unmatched_files: true
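The upload and release steps above are gated per platform by `installer_glob` and `portable_glob`, with an empty portable glob meaning "nothing extra to upload" (the Linux case). A minimal sketch of that gating follows, assuming plain `fnmatch`-style matching; the globbing used by the real GitHub actions may differ in detail, and `artifacts_to_upload` and the `3.0` version string are hypothetical names used only for illustration.

```python
from fnmatch import fnmatchcase

# Globs copied from the workflow matrix. An empty portable glob means
# "no separate portable artifact" (the AppImage is already portable).
MATRIX = {
    "mac": ("dist/DataTools-*-mac.dmg", "dist/DataTools-*-mac-portable.zip"),
    "win": ("dist/DataTools-*-win-setup.exe", "dist/DataTools-*-win-portable.zip"),
    "linux": ("dist/DataTools-*-linux-x86_64.AppImage", ""),
}


def artifacts_to_upload(platform: str, built_files: list[str]) -> dict[str, list[str]]:
    """Hypothetical helper: split built files into installer/portable uploads."""
    installer_glob, portable_glob = MATRIX[platform]
    result = {
        "installer": [f for f in built_files if fnmatchcase(f, installer_glob)],
    }
    # Mirrors the workflow condition `matrix.portable_glob != ''`.
    if portable_glob:
        result["portable"] = [f for f in built_files if fnmatchcase(f, portable_glob)]
    return result


mac_files = ["dist/DataTools-3.0-mac.dmg", "dist/DataTools-3.0-mac-portable.zip"]
mac_uploads = artifacts_to_upload("mac", mac_files)
linux_uploads = artifacts_to_upload("linux", ["dist/DataTools-3.0-linux-x86_64.AppImage"])
```

Keeping the installer and portable globs disjoint is what lets the two upload steps run independently without attaching the same file twice.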

5
.gitignore vendored

@@ -11,6 +11,11 @@ dist/
build/build/
build/__pycache__/
build/dist/
# Generated by build/generate_icons.py from src/gui/assets/datatools_icon_256.png.
# Build artifacts, not source — regenerated each CI run.
build/icon.ico
build/icon.icns
build/icon.png
.pytest_cache/
# Claude Code agent worktrees + local settings


@@ -20,15 +20,21 @@ Limpieza local de CSV / Excel. CLI + GUI en el navegador, sin nube, sin ceremoni
## Descarga (usuarios no técnicos)
Instaladores precompilados — no se requiere Python:
Paquetes precompilados — sin instalar Python, sin permisos de administrador, sin internet en ejecución. Cada versión ofrece dos formatos por sistema operativo: un **instalador** que crea accesos directos en el escritorio + menú Inicio / Launchpad, y un **.zip portable** que descomprimes y haces doble clic. Elige el que te permita tu política de TI.
| Plataforma | Descarga | Nota de primer arranque |
| Plataforma | Instalador (recomendado) | Portable (sin instalar) |
|---|---|---|
| **macOS** | `DataTools-X.Y.Z-mac.dmg` | Arrastra DataTools.app a /Applications y haz doble clic. |
| **Windows** | `DataTools-X.Y.Z-win-setup.exe` | Ejecuta el instalador; se inicia desde el menú Inicio. |
| **Linux** | `DataTools-X.Y.Z-linux-x86_64.AppImage` | `chmod +x` al archivo y luego doble clic. |
| **macOS** | `DataTools-X.Y.Z-mac.dmg` — ábrelo, arrastra DataTools.app a /Applications, ejecútalo desde Launchpad. | `DataTools-X.Y.Z-mac-portable.zip` — descomprime donde quieras, doble clic en `DataTools.app`. |
| **Windows** | `DataTools-X.Y.Z-win-setup.exe` — ejecuta el instalador (por usuario, sin admin). Crea acceso directo en el escritorio + entrada en el menú Inicio. | `DataTools-X.Y.Z-win-portable.zip` — descomprime donde quieras, doble clic en `DataTools.exe`. |
| **Linux** | `DataTools-X.Y.Z-linux-x86_64.AppImage` — `chmod +x` y doble clic. | El AppImage ya es portable. |
Última versión: consulta [GitHub Releases](https://git.invixiom.com/giteadmin/datatools-dev/releases) (o el listado de Gumroad). Los instaladores ocupan ~150–200 MB; el lanzador arranca un servidor local en http://127.0.0.1:8501 y abre tu navegador. Nada se envía a la nube.
Última versión: consulta [GitHub Releases](https://git.invixiom.com/giteadmin/datatools-dev/releases) (o el listado de Gumroad). Cada paquete ocupa ~200 MB descomprimido; al primer arranque la app levanta un servidor local en http://127.0.0.1:8501 y abre tu navegador predeterminado. Nada sale de tu equipo — instalador y portable son idénticos por dentro.
**Avisos del primer arranque (una sola vez):**
- **macOS** sin firma: clic derecho → **Abrir** → confirma. (Las compilaciones firmadas se lo saltan.)
- **Windows** SmartScreen: pulsa **Más información** → **Ejecutar de todas formas**.
Guía detallada de instalación y resolución de problemas: [Guía del usuario §1](docs/USER-GUIDE.es.md#1-instalaci%C3%B3n).
## Instalar desde el código (desarrolladores)


@@ -20,15 +20,21 @@ Local CSV / Excel cleaning. CLI + browser GUI, no cloud, no install ceremony. GU
## Download (non-technical users)
Pre-built installers — no Python required:
Pre-built bundles — no Python install, no admin rights, no internet at runtime. Each release ships two flavors per OS: an **installer** that wires up Desktop + Start Menu / Launchpad shortcuts, and a **portable .zip** you unzip and double-click. Pick whichever your IT policy allows.
| Platform | Download | First-launch note |
| Platform | Installer (recommended) | Portable (no install) |
|---|---|---|
| **macOS** | `DataTools-X.Y.Z-mac.dmg` | Drag DataTools.app into /Applications, then double-click. |
| **Windows** | `DataTools-X.Y.Z-win-setup.exe` | Run the installer; launches from Start Menu. |
| **Linux** | `DataTools-X.Y.Z-linux-x86_64.AppImage` | `chmod +x` the file, then double-click. |
| **macOS** | `DataTools-X.Y.Z-mac.dmg` — open, drag DataTools.app into /Applications, launch from Launchpad. | `DataTools-X.Y.Z-mac-portable.zip` — unzip anywhere, double-click `DataTools.app`. |
| **Windows** | `DataTools-X.Y.Z-win-setup.exe` — run installer (per-user, no admin). Desktop shortcut + Start Menu entry created. | `DataTools-X.Y.Z-win-portable.zip` — unzip anywhere, double-click `DataTools.exe`. |
| **Linux** | `DataTools-X.Y.Z-linux-x86_64.AppImage` — `chmod +x`, double-click. | The AppImage is already portable. |
Latest release: see [GitHub Releases](https://git.invixiom.com/giteadmin/datatools-dev/releases) (or the Gumroad listing). The installers are ~150–200 MB; the launcher boots a local server at http://127.0.0.1:8501 and opens your browser. Nothing is sent to the cloud.
Latest release: see [GitHub Releases](https://git.invixiom.com/giteadmin/datatools-dev/releases) (or the Gumroad listing). Each bundle is ~200 MB unpacked; on first launch the app starts a local server at http://127.0.0.1:8501 and opens your default browser. Nothing leaves your machine — installers and portables are byte-identical inside.
**First-launch warnings (one-time):**
- **macOS** unsigned builds: right-click → **Open** → confirm. (Signed builds skip this.)
- **Windows** SmartScreen: click **More info** → **Run anyway**.
Detailed install + troubleshooting walkthrough: [User Guide §1](docs/USER-GUIDE.md#1-install).
## Install from source (developers)


@@ -19,23 +19,53 @@ build/
│ Mac .app bundle config. Reads the version
│ from src/__init__.py.
├── installer.iss Inno Setup script — Windows .exe installer.
│ Adds Start Menu + Desktop + App Paths entries.
├── generate_icons.py Builds icon.ico / icon.icns / icon.png from
│ src/gui/assets/datatools_icon_256.png. Run
│ once before pyinstaller (CI does this).
├── build_portable_zip.py Cross-platform: zips dist/DataTools/ into a
│ no-install portable download. Used by the
│ Windows + Linux portable artifacts.
├── macos/
│ └── build_dmg.sh Wraps dist/DataTools.app into a .dmg with a
│ drag-to-/Applications layout.
│ ├── build_dmg.sh Wraps dist/DataTools.app into a .dmg with a
│ drag-to-/Applications layout (installer).
│ └── build_zip.sh Wraps dist/DataTools.app into a portable
│ .zip via ditto (preserves bundle metadata).
├── appimage/
│ ├── AppRun Entry point invoked when the AppImage runs.
│ ├── datatools.desktop Linux desktop-entry metadata.
│ └── build.sh Wraps dist/DataTools/ into an .AppImage.
├── hooks/ PyInstaller hooks for libs the static analyser
│ └── hook-streamlit.py misses (Streamlit's dynamic imports).
├── icon.icns macOS app icon (TODO: produce from a 1024×1024
│ PNG. Optional — bundle still builds without).
├── icon.ico Windows app icon (TODO).
├── icon.png Linux AppImage icon (TODO — build.sh generates
│ a placeholder if missing).
├── icon.{ico,icns,png} Generated by generate_icons.py — gitignored.
└── README.md this file
```
## Distribution outputs per platform
Each CI run produces two downloads per platform — an installer for
buyers who want shortcuts wired automatically, and a portable .zip
for buyers (or IT-locked-down machines) that can't run installers:
| Platform | Installer | Portable |
|----------|----------------------------------------|------------------------------------------------|
| macOS | `DataTools-<ver>-mac.dmg` | `DataTools-<ver>-mac-portable.zip` (ditto .app)|
| Windows | `DataTools-<ver>-win-setup.exe` | `DataTools-<ver>-win-portable.zip` |
| Linux | `DataTools-<ver>-linux-x86_64.AppImage`| (the AppImage IS the portable) |
All six outputs are self-contained: every dependency (Python, pandas,
streamlit, pdfplumber, the lot) is frozen into the bundle. The buyer
does not need to install Python, pip, or anything else first.
## Easy-launch surface
| Affordance | Windows | macOS |
|------------------|--------------------------------------------------|------------------------------------------------------|
| Desktop shortcut | Inno Setup `desktopicon` task (checked default) | The .app bundle in /Applications is the icon |
| App menu | Start Menu → DataTools (always installed) | Launchpad + Spotlight (auto from /Applications) |
| Taskbar / Dock | User pins manually (OS forbids programmatic pin) | User pins manually after first launch |
| Run from terminal| `DataTools` (registered via App Paths) | `open -a DataTools` (auto from .app bundle) |
CI: `.github/workflows/build.yml` runs the full pipeline on tag push
(matrix: macos-latest, windows-latest, ubuntu-latest) and attaches
the resulting installers to a GitHub Release. Manual
@@ -43,12 +73,46 @@ the resulting installers to a GitHub Release. Manual
## Releasing
### Single-command local build (recommended for one-developer workflow)
PyInstaller can't cross-compile, so a single machine produces one
platform's packages. Run this on each target OS:
```bash
# One-time setup per machine:
pip install -r requirements.txt
pip install pyinstaller pillow
# Windows only: install Inno Setup from https://jrsoftware.org/isdl.php
# Linux only: drop appimagetool onto PATH (see preflight output)
# Build everything for the current OS:
python build/make_release.py
```
Outputs land in `dist/`:
- Windows host → `DataTools-<ver>-win-setup.exe` + `DataTools-<ver>-win-portable.zip`
- macOS host → `DataTools-<ver>-mac.dmg` + `DataTools-<ver>-mac-portable.zip`
- Linux host → `DataTools-<ver>-linux-x86_64.AppImage`
Useful flags:
```bash
python build/make_release.py --preflight # check tooling, build nothing
python build/make_release.py --clean # wipe dist/ first
python build/make_release.py --skip-installer # just the portable zip
python build/make_release.py --skip-portable # just the installer
```
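The four flags above combine in a predictable way. The sketch below shows one way such a command line could be parsed with `argparse`; the real parser in `build/make_release.py` is not shown in this diff, so `build_parser` and `planned_outputs` are hypothetical names, and on Linux the "portable" step would be a no-op because the AppImage is already portable.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering the flags documented above."""
    parser = argparse.ArgumentParser(prog="make_release.py")
    parser.add_argument("--preflight", action="store_true",
                        help="check tooling, build nothing")
    parser.add_argument("--clean", action="store_true",
                        help="wipe dist/ first")
    parser.add_argument("--skip-installer", action="store_true",
                        help="build only the portable zip")
    parser.add_argument("--skip-portable", action="store_true",
                        help="build only the installer")
    return parser


def planned_outputs(args: argparse.Namespace) -> list[str]:
    """Return which package kinds a run with these flags would produce."""
    if args.preflight:
        return []  # preflight checks tooling only
    outputs = []
    if not args.skip_installer:
        outputs.append("installer")
    if not args.skip_portable:
        outputs.append("portable")
    return outputs


parser = build_parser()
default_plan = planned_outputs(parser.parse_args([]))
portable_only = planned_outputs(parser.parse_args(["--skip-installer"]))
```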
### CI build (push tag → GitHub Release)
If you have CI runners for all three OSes:
1. Bump `__version__` in `src/__init__.py`.
2. `git commit -am "release: vX.Y.Z" && git tag vX.Y.Z`.
3. `git push && git push --tags`.
4. CI builds all three platforms and creates a GitHub Release with
the installers attached.
5. Mirror the GitHub Release assets to Gumroad (manual until v2).
4. CI builds all three platforms and creates a Release with the
installers + portable zips attached.
5. Mirror the Release assets to Gumroad (manual until v2).
## Signing (Phase 2 — needs accounts/credentials)


@@ -0,0 +1,69 @@
"""Wrap the PyInstaller folder build into a portable .zip.
Self-contained download: unzip → double-click the launcher → app runs.
No installer, no Python install, no admin rights required.
Usage:
python build/build_portable_zip.py <platform> <version>
Where ``platform`` is one of ``win`` / ``mac`` / ``linux``. The
script just produces a generic ``dist/DataTools/`` zip; on macOS the
preferred portable format is the ``ditto``-wrapped .app — see
``build/macos/build_zip.sh`` for that flow. This helper exists mainly
for Windows + Linux, where there's no .app bundle to wrap.
Output:
dist/DataTools-<version>-<platform>-portable.zip
The zip root is the ``DataTools/`` folder so an unzip produces a
self-contained dir the user can drop anywhere (Desktop, USB stick,
network share). On Windows, the launcher is ``DataTools.exe`` inside
that folder; on Linux, ``DataTools``.
"""
from __future__ import annotations

import shutil
import sys
from pathlib import Path

REPO = Path(__file__).resolve().parent.parent
DIST_DIR = REPO / "dist"
BUNDLE_DIR = DIST_DIR / "DataTools"


def main() -> int:
    if len(sys.argv) < 3:
        sys.stderr.write(
            "usage: python build/build_portable_zip.py <platform> <version>\n"
        )
        return 2
    platform = sys.argv[1]
    version = sys.argv[2]

    if not BUNDLE_DIR.is_dir():
        sys.stderr.write(
            f"Bundle dir not found at {BUNDLE_DIR}.\n"
            "Run ``pyinstaller build/datatools.spec --clean --noconfirm`` first.\n"
        )
        return 1

    out_stem = DIST_DIR / f"DataTools-{version}-{platform}-portable"
    # ``make_archive`` takes a base name (no extension) and produces
    # ``<base>.zip``. ``root_dir`` = parent of what we want compressed,
    # ``base_dir`` = the folder name inside the archive root. This
    # combo yields a single top-level ``DataTools/`` directory inside
    # the .zip rather than dumping its contents loose.
    archive = shutil.make_archive(
        base_name=str(out_stem),
        format="zip",
        root_dir=str(DIST_DIR),
        base_dir="DataTools",
    )
    size_mb = Path(archive).stat().st_size / (1024 * 1024)
    print(f"wrote {archive} ({size_mb:.1f} MB)")
    return 0


if __name__ == "__main__":
    sys.exit(main())
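The `root_dir` / `base_dir` combination used above can be checked in isolation. The sketch below builds a throwaway `dist/DataTools/` tree in a temporary directory, zips it the same way, and lists the archive members; `zip_bundle`, the stub file contents, and the `0.0.0-dev` version are assumptions made for the demonstration only.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def zip_bundle(dist_dir: Path, out_stem: Path) -> Path:
    """Zip dist_dir/DataTools so the archive has one top-level folder."""
    archive = shutil.make_archive(
        base_name=str(out_stem),   # extension is appended by make_archive
        format="zip",
        root_dir=str(dist_dir),    # paths in the zip are relative to this
        base_dir="DataTools",      # only this folder is archived
    )
    return Path(archive)


with tempfile.TemporaryDirectory() as tmp:
    dist = Path(tmp) / "dist"
    (dist / "DataTools").mkdir(parents=True)
    (dist / "DataTools" / "DataTools.exe").write_bytes(b"stub launcher")

    archive = zip_bundle(dist, Path(tmp) / "DataTools-0.0.0-dev-win-portable")
    archive_name = archive.name
    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()
```

Every member name starts with `DataTools/`, so unzipping produces a single folder rather than scattering files into the extraction directory.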

78
build/generate_icons.py Normal file

@@ -0,0 +1,78 @@
"""Generate platform-specific app icons from the source PNG asset.
Outputs:
build/icon.ico Windows multi-resolution icon (16..256 px sizes).
build/icon.icns macOS icon bundle (16..1024 px scaled tiers).
build/icon.png Plain 256x256 PNG used by the Linux AppImage.
Source: ``src/gui/assets/datatools_icon_256.png`` (the same icon
``st.set_page_config`` uses, so the installer / Dock / Taskbar match
the in-app tab favicon).
Run manually:
python build/generate_icons.py
CI runs this automatically before invoking PyInstaller (see
``.github/workflows/build.yml``). All three files are .gitignored; they
are build artifacts derived from the committed PNG.
Self-contained: pulls only Pillow (already a transitive dep of
``pdfplumber``) so no extra installs are required.
"""
from __future__ import annotations

import sys
from pathlib import Path

from PIL import Image

# Repo layout: this script lives at <REPO>/build/. The source PNG is at
# <REPO>/src/gui/assets/datatools_icon_256.png.
BUILD_DIR = Path(__file__).resolve().parent
REPO = BUILD_DIR.parent
SOURCE_PNG = REPO / "src" / "gui" / "assets" / "datatools_icon_256.png"

# Windows ICO needs every size the OS might render at: taskbar (16/24),
# Start Menu (32/48), tile (64/128), shell properties dialog (256).
ICO_SIZES = [(16, 16), (24, 24), (32, 32), (48, 48), (64, 64),
             (128, 128), (256, 256)]


def main() -> int:
    if not SOURCE_PNG.exists():
        sys.stderr.write(
            f"Source icon not found at {SOURCE_PNG}.\n"
            "Add a 256x256 (or larger) RGBA PNG there and re-run.\n"
        )
        return 1

    src = Image.open(SOURCE_PNG).convert("RGBA")
    if src.size[0] < 256 or src.size[1] < 256:
        sys.stderr.write(
            f"Source icon is {src.size}; recommend 256x256 or larger "
            "so downscaled tiers look crisp.\n"
        )

    ico_path = BUILD_DIR / "icon.ico"
    src.save(ico_path, format="ICO", sizes=ICO_SIZES)
    print(f"wrote {ico_path} ({ico_path.stat().st_size:,} bytes)")

    icns_path = BUILD_DIR / "icon.icns"
    # Pillow's ICNS writer derives the per-tier sizes from the source
    # image; passing a 256x256 source yields ic07..ic12 entries which
    # cover Finder, Dock, and the Get Info panel.
    src.save(icns_path, format="ICNS")
    print(f"wrote {icns_path} ({icns_path.stat().st_size:,} bytes)")

    # AppImage uses a plain PNG for its desktop entry. Copy the source
    # so the AppImage build script doesn't have to know the asset path.
    png_path = BUILD_DIR / "icon.png"
    src.save(png_path, format="PNG")
    print(f"wrote {png_path} ({png_path.stat().st_size:,} bytes)")
    return 0


if __name__ == "__main__":
    sys.exit(main())
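One way to sanity-check the generated `icon.ico` without Pillow is to read the ICO directory header directly. The sketch below is a stdlib-only reader for the standard ICONDIR / ICONDIRENTRY layout; `ico_sizes` is a hypothetical helper, not part of this change, and the synthetic two-entry header exists only to exercise it.

```python
import struct


def ico_sizes(data: bytes) -> list[tuple[int, int]]:
    """List the (width, height) of each image in an ICO file's directory."""
    # ICONDIR: reserved (must be 0), type (1 = icon), image count.
    reserved, kind, count = struct.unpack_from("<HHH", data, 0)
    if reserved != 0 or kind != 1:
        raise ValueError("not an ICO file")
    sizes = []
    for index in range(count):
        # Each ICONDIRENTRY is 16 bytes; width and height come first.
        width, height = struct.unpack_from("<BB", data, 6 + 16 * index)
        # A stored value of 0 means 256 pixels.
        sizes.append((width or 256, height or 256))
    return sizes


def _entry(width: int, height: int) -> bytes:
    """Build a dummy ICONDIRENTRY (no image payload) for the demo."""
    return struct.pack("<BBBBHHII", width, height, 0, 0, 1, 32, 0, 0)


demo_header = struct.pack("<HHH", 0, 1, 2) + _entry(16, 16) + _entry(0, 0)
demo_sizes = ico_sizes(demo_header)
```

Against a real build, `ico_sizes(Path("build/icon.ico").read_bytes())` could be compared with `ICO_SIZES` to confirm which tiers were actually written.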


@@ -1,11 +1,26 @@
; Inno Setup script for DataTools — Windows installer.
;
; Compile from the repo root:
; iscc /DAppVersion=1.0.0 build\installer.iss
; iscc /DAppVersion=3.0 build\installer.iss
;
; CI passes the version via /DAppVersion to keep src/__init__.py the
; single source of truth. Local manual builds: pass /DAppVersion or
; let the default kick in.
;
; What this installer wires up (covers the "easy launch" surface):
; * Start Menu group: Start → DataTools → DataTools / Uninstall
; * Desktop shortcut: optional, checked by default during install
; * Quick Launch: optional, off by default. The task carries
; OnlyBelowVersion: 6.1, so it is offered only on
; Windows versions older than 7. Windows 10/11
; users pin to taskbar manually via right-click;
; OS security policy forbids programmatic pinning.
; * App Paths entry: so ``DataTools`` typed into Win+R / cmd works.
;
; Self-contained: the installer contains a frozen PyInstaller bundle
; (Python + every runtime dep). No pre-install or post-install steps
; on the buyer's machine. UAC is NOT required because we install
; per-user by default; the prompt only fires if the buyer asks for an
; all-users install.
#ifndef AppVersion
#define AppVersion "0.0.0-dev"
@@ -18,11 +33,15 @@ AppVersion={#AppVersion}
AppVerName=DataTools {#AppVersion}
AppPublisher=DataTools
AppPublisherURL=https://datatools.app
AppSupportURL=https://datatools.app/support
AppUpdatesURL=https://datatools.app/releases
DefaultDirName={autopf}\DataTools
DefaultGroupName=DataTools
DisableProgramGroupPage=yes
OutputDir=..\dist
OutputBaseFilename=DataTools-{#AppVersion}-win-setup
SetupIconFile=icon.ico
UninstallDisplayIcon={app}\DataTools.exe
Compression=lzma2/max
SolidCompression=yes
WizardStyle=modern
@@ -30,20 +49,37 @@ ArchitecturesInstallIn64BitMode=x64
PrivilegesRequired=lowest
PrivilegesRequiredOverridesAllowed=dialog
; Allow per-user install (no UAC prompt) when admin isn't available.
; Buyers without admin rights can still install without IT involvement.
ChangesAssociations=no
CloseApplications=force
RestartApplications=no
[Languages]
Name: "english"; MessagesFile: "compiler:Default.isl"
[Tasks]
Name: "desktopicon"; Description: "Create a &desktop shortcut"; GroupDescription: "Additional shortcuts:"
Name: "quicklaunchicon"; Description: "Create a &Quick Launch shortcut"; GroupDescription: "Additional shortcuts:"; Flags: unchecked; OnlyBelowVersion: 6.1
[Files]
Source: "..\dist\DataTools\*"; DestDir: "{app}"; Flags: recursesubdirs ignoreversion
[Icons]
Name: "{group}\DataTools"; Filename: "{app}\DataTools.exe"
; Start Menu entries — created unconditionally so the app is always
; discoverable via Start search.
Name: "{group}\DataTools"; Filename: "{app}\DataTools.exe"; IconFilename: "{app}\DataTools.exe"
Name: "{group}\Uninstall DataTools"; Filename: "{uninstallexe}"
Name: "{autodesktop}\DataTools"; Filename: "{app}\DataTools.exe"; Tasks: desktopicon
; Desktop shortcut — opt-in via the Tasks page.
Name: "{autodesktop}\DataTools"; Filename: "{app}\DataTools.exe"; IconFilename: "{app}\DataTools.exe"; Tasks: desktopicon
; Quick Launch (legacy): the task's OnlyBelowVersion: 6.1 limits it to Windows versions older than 7.
Name: "{userappdata}\Microsoft\Internet Explorer\Quick Launch\DataTools"; Filename: "{app}\DataTools.exe"; IconFilename: "{app}\DataTools.exe"; Tasks: quicklaunchicon
[Registry]
; App Paths — lets the buyer launch from Win+R or cmd with just
; "DataTools" instead of a full path. Per-user hive so the per-user
; install path doesn't need admin to register.
Root: HKCU; Subkey: "Software\Microsoft\Windows\CurrentVersion\App Paths\DataTools.exe"; ValueType: string; ValueName: ""; ValueData: "{app}\DataTools.exe"; Flags: uninsdeletekey
[Run]
Filename: "{app}\DataTools.exe"; Description: "Launch DataTools"; Flags: nowait postinstall skipifsilent

38
build/macos/build_zip.sh Executable file

@@ -0,0 +1,38 @@
#!/usr/bin/env bash
# Wrap dist/DataTools.app into a no-install portable .zip.
#
# Usage:
# bash build/macos/build_zip.sh <version>
#
# Why a portable .zip in addition to the .dmg:
# * Buyers who don't want an installer can unzip and double-click the
# .app directly — no drag-to-/Applications step, no installer
# chrome. Self-contained: the .app holds Python + every dep.
# * IT-locked-down machines often block .dmg auto-mount but allow
# .zip download + extraction.
#
# Run after ``pyinstaller build/datatools.spec --clean --noconfirm``
# has produced ``dist/DataTools.app``. Output goes to
# ``dist/DataTools-<version>-mac-portable.zip``.
set -euo pipefail
VERSION="${1:-0.0.0-dev}"
APP="dist/DataTools.app"
ZIP="dist/DataTools-${VERSION}-mac-portable.zip"
if [[ ! -d "$APP" ]]; then
  echo "Error: $APP not found. Run pyinstaller build/datatools.spec first." >&2
  exit 1
fi
# ``ditto`` preserves the .app bundle's extended attributes and
# resource forks (a plain ``zip`` strips them and can break code
# signatures + Info.plist resolution on the buyer's machine).
#
# --sequesterRsrc keeps the AppleDouble metadata inside the archive
# rather than as parallel ._ files on disk after extraction.
rm -f "$ZIP"
ditto -c -k --sequesterRsrc --keepParent "$APP" "$ZIP"
echo "Built $ZIP ($(du -h "$ZIP" | cut -f1))"
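For callers that prefer to drive this step from Python (for example from an orchestrator script), the same `ditto` invocation could be wrapped as below. This is a sketch under assumptions: `ditto_zip_command` and `build_mac_zip` are hypothetical names, the flags are copied from the shell script above, and the function only runs on macOS where `ditto` exists.

```python
import subprocess
import sys
from pathlib import Path


def ditto_zip_command(app: Path, zip_path: Path) -> list[str]:
    """Return the ditto argv used by build_zip.sh for a given .app and output."""
    return ["ditto", "-c", "-k", "--sequesterRsrc", "--keepParent",
            str(app), str(zip_path)]


def build_mac_zip(version: str, dist: Path = Path("dist")) -> Path:
    """Hypothetical Python equivalent of build/macos/build_zip.sh."""
    if sys.platform != "darwin":
        raise RuntimeError("ditto is only available on macOS")
    app = dist / "DataTools.app"
    if not app.is_dir():
        raise FileNotFoundError(f"{app} not found; run PyInstaller first")
    zip_path = dist / f"DataTools-{version}-mac-portable.zip"
    zip_path.unlink(missing_ok=True)  # same effect as `rm -f "$ZIP"`
    subprocess.run(ditto_zip_command(app, zip_path), check=True)
    return zip_path


demo_cmd = ditto_zip_command(Path("dist/DataTools.app"), Path("dist/out.zip"))
```

Separating the argv builder from the subprocess call keeps the flag list testable on any OS.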

348
build/make_release.py Normal file

@@ -0,0 +1,348 @@
"""Single-command release builder for DataTools.
PyInstaller can't cross-compile — to produce a Windows .exe you run
this on Windows, for a Mac .dmg you run it on macOS, for a Linux
AppImage you run it on Linux. One script, one OS at a time.
What this script does (in order):
1. Preflight — checks PyInstaller, Pillow, and the platform's
packager (Inno Setup on Win / hdiutil + ditto on Mac /
appimagetool on Linux) are reachable. Bails with install
instructions if anything is missing.
2. Generates icon.ico / icon.icns / icon.png from the PNG asset.
3. Runs PyInstaller against build/datatools.spec.
4. Wraps the PyInstaller output into:
* Windows: DataTools-<ver>-win-setup.exe (Inno Setup)
+ DataTools-<ver>-win-portable.zip
* macOS: DataTools-<ver>-mac.dmg
+ DataTools-<ver>-mac-portable.zip
* Linux: DataTools-<ver>-linux-x86_64.AppImage
5. Prints what landed in dist/ and the byte sizes.
Usage:
python build/make_release.py # build everything for this OS
python build/make_release.py --preflight # check tooling, don't build
python build/make_release.py --skip-installer # only the portable zip
python build/make_release.py --skip-portable # only the installer
python build/make_release.py --clean # wipe dist/ first
Run from the repo root or from build/ — either works.
"""
from __future__ import annotations

import argparse
import platform
import re
import shutil
import subprocess
import sys
from pathlib import Path

REPO = Path(__file__).resolve().parent.parent
BUILD = REPO / "build"
DIST = REPO / "dist"


# ---------------------------------------------------------------------------
# Output helpers — colourless so logs stay readable in any terminal/CI tail.
# ---------------------------------------------------------------------------

def _step(msg: str) -> None:
    print(f"\n==> {msg}", flush=True)


def _ok(msg: str) -> None:
    print(f" ok: {msg}", flush=True)


def _warn(msg: str) -> None:
    print(f" warn: {msg}", flush=True)


def _err(msg: str) -> None:
    print(f" ERROR: {msg}", file=sys.stderr, flush=True)


def _run(cmd: list[str], cwd: Path | None = None, env: dict | None = None) -> None:
    """Run *cmd*, stream output, exit on failure with a useful banner."""
    printable = " ".join(map(str, cmd))
    print(f" $ {printable}", flush=True)
    try:
        subprocess.run(cmd, check=True, cwd=cwd or REPO, env=env)
    except subprocess.CalledProcessError as e:
        _err(f"command failed (exit {e.returncode}): {printable}")
        sys.exit(e.returncode)
    except FileNotFoundError:
        _err(f"command not found: {cmd[0]}")
        sys.exit(127)


# ---------------------------------------------------------------------------
# Platform detection
# ---------------------------------------------------------------------------

def _detect_platform() -> str:
    """Return ``win`` / ``mac`` / ``linux`` based on sys.platform."""
    p = sys.platform
    if p.startswith("win"):
        return "win"
    if p == "darwin":
        return "mac"
    if p.startswith("linux"):
        return "linux"
    _err(f"unsupported platform {p!r}; this script handles win/mac/linux only.")
    sys.exit(2)


# ---------------------------------------------------------------------------
# Version — single source of truth in src/__init__.py
# ---------------------------------------------------------------------------

def _read_version() -> str:
    init_py = (REPO / "src" / "__init__.py").read_text(encoding="utf-8")
    m = re.search(r'__version__\s*=\s*["\']([^"\']+)["\']', init_py)
    if not m:
        _err("could not parse __version__ from src/__init__.py")
        sys.exit(1)
    return m.group(1)


# ---------------------------------------------------------------------------
# Preflight — check tooling before doing anything destructive
# ---------------------------------------------------------------------------

def _have_module(name: str) -> bool:
    try:
        __import__(name)
        return True
    except ImportError:
        return False


def _have_command(name: str) -> bool:
    return shutil.which(name) is not None
# Per-platform install hints. The error messages quote these so a buyer
# building from source isn't left guessing what to install next.
_INSTALL_HINTS = {
"pyinstaller": "pip install pyinstaller",
"pil": "pip install pillow",
"iscc": "Inno Setup (Windows): https://jrsoftware.org/isdl.php — install, then re-open the shell so iscc lands on PATH.",
"hdiutil": "ships with macOS — if it's missing your Mac install is broken.",
"ditto": "ships with macOS — if it's missing your Mac install is broken.",
"appimagetool": "Linux: download appimagetool-x86_64.AppImage from https://github.com/AppImage/AppImageKit/releases, chmod +x, drop on PATH.",
}
def preflight(target: str) -> None:
    """Verify every tool the target build needs is reachable; exit if not."""
    _step(f"preflight ({target})")
    missing: list[tuple[str, str]] = []

    # Python-side deps — same on every platform. The ``_INSTALL_HINTS``
    # lookup uses lowercase keys so module name capitalization doesn't
    # need to match.
    for mod in ("PyInstaller", "PIL"):
        if not _have_module(mod):
            hint = _INSTALL_HINTS.get(mod.lower(), f"pip install {mod}")
            missing.append((mod.lower(), hint))
        else:
            _ok(f"{mod} importable")

    # PyInstaller's CLI must also be reachable as a binary, not just as
    # an importable module — the spec is invoked via the ``pyinstaller``
    # command. ``python -m PyInstaller`` is a fine fallback so don't
    # hard-fail if only the CLI binary is missing.
    if _have_command("pyinstaller"):
        _ok("pyinstaller on PATH")
    else:
        _warn("pyinstaller binary not on PATH — will fall back to `python -m PyInstaller`")

    # Platform-specific packagers.
    if target == "win":
        if _have_command("iscc"):
            _ok("Inno Setup (iscc) on PATH")
        else:
            missing.append(("iscc", _INSTALL_HINTS["iscc"]))
    elif target == "mac":
        for tool in ("hdiutil", "ditto"):
            if _have_command(tool):
                _ok(f"{tool} on PATH")
            else:
                missing.append((tool, _INSTALL_HINTS[tool]))
    elif target == "linux":
        if _have_command("appimagetool"):
            _ok("appimagetool on PATH")
        else:
            missing.append(("appimagetool", _INSTALL_HINTS["appimagetool"]))

    if missing:
        _err("missing prerequisites:")
        for name, hint in missing:
            print(f" - {name}: {hint}", file=sys.stderr)
        sys.exit(1)
    _ok("all prerequisites present")

# ---------------------------------------------------------------------------
# Build steps
# ---------------------------------------------------------------------------
def step_generate_icons() -> None:
    _step("generate icons")
    _run([sys.executable, str(BUILD / "generate_icons.py")])


def step_pyinstaller(clean: bool) -> None:
    _step("pyinstaller bundle")
    # Use ``python -m PyInstaller`` so we don't depend on the binary
    # being on PATH (Windows users frequently see this — pip's
    # Scripts/ dir isn't auto-added).
    cmd = [sys.executable, "-m", "PyInstaller",
           str(BUILD / "datatools.spec"),
           "--noconfirm"]
    if clean:
        cmd.append("--clean")
    _run(cmd)

def step_package_win(version: str, do_installer: bool, do_portable: bool) -> list[Path]:
    out: list[Path] = []
    if do_installer:
        _step("Windows installer (Inno Setup)")
        _run(["iscc", f"/DAppVersion={version}", str(BUILD / "installer.iss")])
        out.append(DIST / f"DataTools-{version}-win-setup.exe")
    if do_portable:
        _step("Windows portable .zip")
        _run([sys.executable, str(BUILD / "build_portable_zip.py"), "win", version])
        out.append(DIST / f"DataTools-{version}-win-portable.zip")
    return out


def step_package_mac(version: str, do_installer: bool, do_portable: bool) -> list[Path]:
    out: list[Path] = []
    if do_installer:
        _step("macOS DMG (installer)")
        _run(["bash", str(BUILD / "macos" / "build_dmg.sh"), version])
        out.append(DIST / f"DataTools-{version}-mac.dmg")
    if do_portable:
        _step("macOS portable .zip")
        _run(["bash", str(BUILD / "macos" / "build_zip.sh"), version])
        out.append(DIST / f"DataTools-{version}-mac-portable.zip")
    return out


def step_package_linux(version: str, do_installer: bool, do_portable: bool) -> list[Path]:
    # On Linux the AppImage IS the portable, so the two flags collapse
    # into one: the single file is built unless both are skipped.
    # Splitting wouldn't add value.
    if not (do_installer or do_portable):
        return []
    _step("Linux AppImage")
    _run(["bash", str(BUILD / "appimage" / "build.sh"), version])
    return [DIST / f"DataTools-{version}-linux-x86_64.AppImage"]

# ---------------------------------------------------------------------------
# Orchestration
# ---------------------------------------------------------------------------
def _summarise(outputs: list[Path]) -> None:
    _step("done — outputs")
    if not outputs:
        _warn("no files produced (everything skipped via flags)")
        return
    for p in outputs:
        if p.exists():
            size_mb = p.stat().st_size / (1024 * 1024)
            print(f" {p.relative_to(REPO)} ({size_mb:.1f} MB)")
        else:
            _warn(f"expected output missing: {p.relative_to(REPO)}")

def main() -> int:
    parser = argparse.ArgumentParser(
        prog="make_release.py",
        description=(
            "Build the installer + portable zip for the current OS. "
            "Cross-compilation isn't supported by PyInstaller — run "
            "this once per platform you want to target."
        ),
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    parser.add_argument(
        "--platform", choices=("auto", "win", "mac", "linux"), default="auto",
        help="Override OS detection (mostly for testing). Default: auto.",
    )
    parser.add_argument(
        "--preflight", action="store_true",
        help="Check tooling and exit without building.",
    )
    parser.add_argument(
        "--clean", action="store_true",
        help="Wipe dist/ before building.",
    )
    parser.add_argument(
        "--skip-installer", action="store_true",
        help="Don't build the OS installer (.exe / .dmg).",
    )
    parser.add_argument(
        "--skip-portable", action="store_true",
        help="Don't build the portable .zip.",
    )
    args = parser.parse_args()

    target = _detect_platform() if args.platform == "auto" else args.platform
    version = _read_version()
    do_installer = not args.skip_installer
    do_portable = not args.skip_portable

    # Plain string: no placeholders, so no f-prefix needed.
    print("DataTools release builder")
    print(f" target: {target} (host: {platform.platform()})")
    print(f" version: {version}")
    print(f" installer: {'yes' if do_installer else 'no'}")
    print(f" portable: {'yes' if do_portable else 'no'}")
    print(f" dist dir: {DIST}")

    if target != _detect_platform():
        _warn(
            f"--platform {target} but host is {_detect_platform()}. "
            "PyInstaller can't cross-compile — the bundle will be for "
            "the HOST, only the packaging step will follow your override. "
            "Useful only for testing the packager paths."
        )

    preflight(target)
    if args.preflight:
        return 0

    if args.clean and DIST.exists():
        _step(f"cleaning {DIST}")
        shutil.rmtree(DIST)

    step_generate_icons()
    step_pyinstaller(clean=args.clean)

    if target == "win":
        outputs = step_package_win(version, do_installer, do_portable)
    elif target == "mac":
        outputs = step_package_mac(version, do_installer, do_portable)
    else:
        outputs = step_package_linux(version, do_installer, do_portable)

    _summarise(outputs)
    return 0


if __name__ == "__main__":
    sys.exit(main())
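
The `__version__` lookup in `_read_version()` can be exercised on its own. The following is a minimal sketch, not part of the commit: the helper name `parse_version` and the sample strings are hypothetical, and only the regular expression is taken from the script.

```python
import re
from typing import Optional

# Same pattern as _read_version(); the helper name and samples are illustrative.
_VERSION_RE = re.compile(r'__version__\s*=\s*["\']([^"\']+)["\']')


def parse_version(source: str) -> Optional[str]:
    """Return the first __version__ string literal in ``source``, or None."""
    m = _VERSION_RE.search(source)
    return m.group(1) if m else None


if __name__ == "__main__":
    print(parse_version('__version__ = "1.6.0"'))
    print(parse_version("__version__='2.0.0rc1'  # single quotes work too"))
    print(parse_version("version = 1"))
```

Note that the pattern does not require the opening and closing quote characters to match, so it is lenient about malformed assignments.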


@@ -6,11 +6,13 @@
## Inicio rápido
1. Descarga el instalador para tu sistema operativo desde tu correo de compra.
2. Ejecútalo (no se requieren conocimientos de Python).
3. Lánzalo desde el acceso directo del escritorio → tu navegador predeterminado se abrirá en una página local.
1. Descarga desde tu correo de compra. Dos formatos por sistema operativo — elige uno:
- **Instalador** (`.dmg` en macOS, `.exe` en Windows) — crea acceso directo en el escritorio + entrada en el menú Inicio / Launchpad.
- **.zip portable** — descomprime y haz doble clic. Sin instalación, sin admin, se ejecuta desde cualquier lugar.
2. Ábrelo (no necesitas Python; todo viene incluido).
3. La app arranca un servidor local y abre tu navegador. Nada sale de tu equipo.
Instrucciones completas: [USER-GUIDE.es.md](USER-GUIDE.es.md).
Paso a paso completo incluyendo SmartScreen / Gatekeeper: [USER-GUIDE.es.md §1](USER-GUIDE.es.md#1-instalaci%C3%B3n).
## Documentación


@@ -6,11 +6,13 @@
## Quick Start
1. Download the installer for your OS from your purchase email.
2. Run it (no Python knowledge required).
3. Launch via the desktop shortcut → your default browser opens to a local page.
1. Download from your purchase email. Two flavors per OS — pick one:
- **Installer** (`.dmg` on macOS, `.exe` on Windows) — wires up Desktop + Start Menu / Launchpad shortcuts.
- **Portable .zip** — unzip and double-click. No install, no admin rights, runs from anywhere.
2. Open it (no Python needed; everything is bundled inside).
3. The app starts a local server and opens your browser. Nothing leaves your machine.
Full instructions: [USER-GUIDE.md](USER-GUIDE.md).
Full step-by-step including SmartScreen / Gatekeeper workarounds: [USER-GUIDE.md §1](USER-GUIDE.md#1-install).
## Docs


@@ -25,27 +25,85 @@ Para usar la misma licencia en otro equipo: desactiva éste (página Activar →
## 1. Instalación
No necesitas tener Python instalado — el paquete es autocontenido.
No necesitas tener Python ni permisos de administrador — el paquete trae su propio intérprete y todas las dependencias. Dos formatos por sistema operativo, elige el que tu política de TI permita:
| Sistema operativo | Archivo | Cómo |
|----|------|-----|
| Windows | `BundleName-Setup-1.0.exe` | Doble clic en el instalador → acceso directo en el escritorio. |
| macOS | `BundleName-1.0.dmg` | Monta el DMG y arrástralo a Aplicaciones. Firmado y notarizado. |
| Linux | `BundleName-1.0.AppImage` | `chmod +x`, doble clic. (También hay un `.tar.gz` de respaldo.) |
- **Instalador** — crea automáticamente acceso directo en el escritorio + entrada en el menú Inicio / Launchpad. Recomendado para la mayoría.
- **.zip portable** — descomprime y haz doble clic. No toca el registro, se ejecuta desde cualquier lugar (escritorio, USB, recurso de red). Úsalo si no puedes ejecutar instaladores, quieres una instalación de una sola carpeta que puedas copiar entre equipos, o estás evaluando antes de instalar.
Al iniciar la app, se abre tu navegador predeterminado en una página local (`http://localhost:8501`).
Ambos formatos son idénticos por dentro: mismo Python, mismas dependencias, mismo comportamiento de arranque.
### Cómo funciona la interfaz gráfica (GUI)
### 1.1 Windows
**Opción A — Instalador (`DataTools-<ver>-win-setup.exe`)**
1. Descarga `DataTools-<ver>-win-setup.exe` desde tu correo de licencia o GitHub Releases.
2. Doble clic en el instalador. La primera vez, Windows SmartScreen mostrará **"Windows protegió tu PC"** — pulsa **Más información** → **Ejecutar de todas formas**. (Este aviso solo aparece una vez por compilación hasta que tengamos un certificado EV de firma de código.)
3. Acepta la ruta de instalación por usuario (`%LOCALAPPDATA%\Programs\DataTools` por defecto — no pide UAC). Marca **Crear acceso directo en el escritorio** si lo quieres (activado por defecto).
4. Pulsa **Instalar** y luego **Finalizar**. El instalador te ofrece lanzar DataTools al terminar.
5. A partir de ahora ejecútalo desde: **Menú Inicio → DataTools**, el **acceso directo del escritorio**, o escribiendo `DataTools` en Ejecutar (Win+R) / cmd.
Para anclarlo a la barra de tareas, lanza la app una vez, clic derecho en su icono de la barra de tareas, y **Anclar a la barra de tareas**. Windows requiere este paso manual — ningún instalador puede anclar por programa.
**Opción B — Portable (`DataTools-<ver>-win-portable.zip`)**
1. Descarga `DataTools-<ver>-win-portable.zip`.
2. Clic derecho en el .zip → **Extraer todo…** → elige una carpeta (p. ej. `C:\Tools\DataTools`).
3. Abre la carpeta `DataTools\` extraída, doble clic en `DataTools.exe`. El aviso de SmartScreen aparece solo la primera vez.
4. Para crear tu propio acceso directo en el escritorio: clic derecho en `DataTools.exe` → **Enviar a → Escritorio (crear acceso directo)**.
**Desinstalar** (solo instalador): Configuración → Aplicaciones → DataTools → Desinstalar. Portable: borra la carpeta.
### 1.2 macOS
**Opción A — DMG instalador (`DataTools-<ver>-mac.dmg`)**
1. Descarga `DataTools-<ver>-mac.dmg`.
2. Doble clic en el .dmg. Se abre una ventana de Finder con el icono **DataTools** y un alias **Aplicaciones**.
3. Arrastra **DataTools** sobre **Aplicaciones**. Espera a que termine la copia y expulsa el DMG.
4. En compilaciones sin firma, el primer arranque muestra **"No se puede abrir 'DataTools' porque no se puede verificar al desarrollador"**. Solución: clic derecho en DataTools en /Aplicaciones → **Abrir** → confirma **Abrir** en el diálogo. macOS recuerda la elección — los siguientes arranques no muestran nada.
5. Ejecútalo desde **Launchpad**, **Spotlight** (`⌘ Espacio` → escribe "DataTools"), o **Aplicaciones** en Finder.
Para mantener DataTools en el Dock: lanza la app, clic derecho en su icono del Dock → **Opciones → Mantener en el Dock**. macOS no permite que los instaladores fijen al Dock automáticamente.
**Opción B — Portable (`DataTools-<ver>-mac-portable.zip`)**
1. Descarga `DataTools-<ver>-mac-portable.zip`. Safari descomprime al descargar por defecto; en Finder verás `DataTools.app` directamente.
2. Mueve `DataTools.app` a **Aplicaciones** si quieres que aparezca en Launchpad — o déjalo en el escritorio, un USB o un recurso de red. La .app portable se ejecuta desde cualquier sitio.
3. Doble clic en `DataTools.app`. Clic derecho → **Abrir** la primera vez (misma rutina que con el DMG).
**Desinstalar**: arrastra `DataTools.app` a la Papelera. Tus archivos de datos siguen donde estén — la app no instala nada más.
### 1.3 Linux
`DataTools-<ver>-linux-x86_64.AppImage` ya es portable — no hay .zip aparte.
1. Descarga el .AppImage.
2. `chmod +x DataTools-*.AppImage`.
3. Doble clic, o ejecútalo desde la terminal.
Si tu distro no incluye FUSE 2: `sudo apt install libfuse2` (Debian/Ubuntu) o equivalente.
### 1.4 Qué pasa al arrancar por primera vez
El lanzador (llamado `DataTools.exe` / `DataTools.app` / `DataTools.AppImage`) hace tres cosas, en orden:
1. Elige un puerto TCP libre en `127.0.0.1` — normalmente el 8501; si está ocupado prueba 8502, 8503, …
2. Arranca un servidor Streamlit local en ese puerto. El servidor solo está enlazado a localhost, nunca a tu red.
3. Abre tu navegador predeterminado en `http://127.0.0.1:<puerto>/`. Si el navegador no se abre en 5 segundos, pega esa URL manualmente.
La ventana del lanzador queda abierta en segundo plano. Cerrarla detiene el servidor — la pestaña del navegador dirá "no se puede acceder a este sitio" la próxima vez.
### 1.5 Cómo funciona la GUI
- Se ejecuta localmente en tu equipo. **Sin internet, sin subidas.**
- El navegador es solo la capa de visualización. Cerrarlo detiene el programa subyacente.
- ¿Prefieres la terminal? Cada herramienta incluye también una interfaz de línea de comandos (CLI) — ver Sección 3.
- El navegador es solo la capa de visualización. Cerrarlo NO detiene la app — cierra la ventana del lanzador (o sal de la .app de macOS desde el Dock) para terminar del todo.
- ¿Prefieres la terminal? Cada herramienta incluye también una CLI — ver Sección 3.
### Requisitos del sistema
### 1.6 Requisitos del sistema
- Windows 10/11 (64 bits), macOS 11+, Linux moderno (2020+).
- Navegador moderno (Chrome, Edge, Firefox, Safari, últimos 3 años).
- ~400-500 MB de espacio libre en disco.
- ~400 MB de espacio libre en disco (el paquete ocupa ~200 MB; el resto es espacio de trabajo para CSV grandes).
Matriz de soporte completa: [REQUIREMENTS.md](REQUIREMENTS.md) (solo en inglés).
@@ -137,12 +195,15 @@ El archivo original nunca se modifica.
## 6. Solución de problemas
- **La GUI no se abre / el navegador no se inicia** — espera 10-15 s; visita manualmente `http://localhost:8501`. Error de puerto ocupado → cierra otras instancias.
- **La GUI no se abre / el navegador no se inicia** — espera 10-15 s; visita manualmente `http://127.0.0.1:8501` (o el puerto que muestre la ventana del lanzador). Error de puerto ocupado → cierra otras instancias. El lanzador recorre los puertos 8501–8550 buscando uno libre, así que una instancia colgada puede desplazar la URL.
- **¿Por qué se abre el navegador?** — patrón de aplicación web local (igual que Jupyter o RStudio). Nada sale de tu equipo.
- **Windows SmartScreen** — pulsa "Más información" → "Ejecutar de todas formas". Estándar para software sin firma EV.
- **macOS "La aplicación está dañada"** — descárgala de nuevo (probablemente se corrompió en tránsito).
- **El AppImage de Linux no se ejecuta** — `chmod +x archivo.AppImage`. Si falta FUSE → `sudo apt install libfuse2` o usa el `.tar.gz`.
- **Windows SmartScreen** — pulsa "Más información" → "Ejecutar de todas formas". Una sola vez por compilación hasta que tengamos un certificado EV.
- **macOS "La aplicación está dañada" / "no se puede verificar al desarrollador"** — clic derecho en la app → **Abrir** → confirma. Si el mensaje persiste, el archivo se corrompió en tránsito — vuelve a descargarlo. Último recurso: `xattr -cr /Applications/DataTools.app` limpia el atributo de cuarentena.
- **macOS — el .zip portable extraído no abre** — Safari descomprime al descargar; si ves una carpeta `__MACOSX/` o archivos `._DataTools.app` usaste otro descompresor. Vuelve a extraer con la Utilidad de Archivo integrada (clic derecho en el .zip → **Abrir con → Utilidad de Archivo**) para preservar los metadatos de la .app.
- **Windows — el antivirus pone en cuarentena `DataTools.exe` del portable** — tu antivirus no reconoce el paquete. Añade la carpeta extraída a la lista blanca. El instalador .exe activa menos antivirus porque es un envoltorio Inno Setup conocido.
- **El AppImage de Linux no se ejecuta** — `chmod +x archivo.AppImage`. Si falta FUSE → `sudo apt install libfuse2`.
- **Lento con archivos grandes** — por encima de ~100k filas tarda más; la barra de progreso lo indica. Para millones de filas → usa la CLI directamente.
- **¿Dónde guarda la app mi licencia / configuración?** — `~/.datatools/` en macOS y Linux, `C:\Users\<tú>\.datatools\` en Windows. Tus archivos de entrada y salida siguen donde los dejes; la app nunca los copia a otro sitio.
- **Necesito ayuda** — escribe al correo que aparece en tu recibo de compra.
## 7. Licencia


@@ -25,27 +25,85 @@ To use the same license on a different machine: deactivate this one (Activate pa
## 1. Install
You don't need Python — the bundle is self-contained.
You don't need Python and you don't need admin rights — the bundle ships its own interpreter and every dependency. Two flavors per OS, pick whichever your IT policy allows:
| OS | File | How |
|----|------|-----|
| Windows | `BundleName-Setup-1.0.exe` | Double-click installer → desktop shortcut. |
| macOS | `BundleName-1.0.dmg` | Mount, drag to Applications. Signed + notarized. |
| Linux | `BundleName-1.0.AppImage` | `chmod +x`, double-click. (`.tar.gz` fallback available.) |
- **Installer** — wires up Desktop shortcut + Start Menu / Launchpad entry automatically. Recommended for most users.
- **Portable .zip** — unzip and double-click. No registry writes, runs from anywhere (Desktop, USB stick, network share). Use this if you can't run installers, want a single-folder install you can copy between machines, or are evaluating before committing to install.
Launching opens your default browser to a local page (`http://localhost:8501`).
Both flavors are byte-identical inside: same Python, same dependencies, same launch behavior.
### How the GUI works
### 1.1 Windows
**Option A — Installer (`DataTools-<ver>-win-setup.exe`)**
1. Download `DataTools-<ver>-win-setup.exe` from your release email or GitHub Releases.
2. Double-click the installer. On the first run Windows SmartScreen will say **"Windows protected your PC"** — click **More info** → **Run anyway**. (This warning only appears once per build until we have an EV code-signing cert.)
3. Accept the per-user install location (`%LOCALAPPDATA%\Programs\DataTools` by default — no admin prompt). Check **Create a desktop shortcut** if you want one (on by default).
4. Click **Install**, then **Finish**. The installer offers to launch DataTools immediately.
5. From now on launch from: **Start Menu → DataTools**, the **Desktop shortcut**, or just type `DataTools` into Windows Run (Win+R) / cmd.
To pin to the taskbar, launch the app once, right-click its icon in the taskbar, then **Pin to taskbar**. Windows requires this manual step — no installer is allowed to pin programmatically.
**Option B — Portable (`DataTools-<ver>-win-portable.zip`)**
1. Download `DataTools-<ver>-win-portable.zip`.
2. Right-click the .zip → **Extract All…** → pick a folder (e.g. `C:\Tools\DataTools`).
3. Open the extracted `DataTools\` folder, double-click `DataTools.exe`. SmartScreen warning fires the first time only.
4. To create your own desktop shortcut later: right-click `DataTools.exe` → **Send to → Desktop (create shortcut)**.
**Uninstall** (installer only): Settings → Apps → DataTools → Uninstall. Portable: delete the folder.
### 1.2 macOS
**Option A — Installer DMG (`DataTools-<ver>-mac.dmg`)**
1. Download `DataTools-<ver>-mac.dmg`.
2. Double-click the .dmg. A Finder window opens showing the **DataTools** icon and an **Applications** alias.
3. Drag **DataTools** onto **Applications**. Wait for the copy to finish, then eject the DMG.
4. On unsigned builds the first launch shows **"DataTools" cannot be opened because the developer cannot be verified**. Fix: right-click DataTools in /Applications → **Open** → confirm **Open** in the dialog. macOS remembers this choice — subsequent launches are clean.
5. Launch from **Launchpad**, **Spotlight** (`⌘ Space` → type "DataTools"), or **Applications** in Finder.
To keep DataTools in the Dock: launch the app, right-click its Dock icon → **Options → Keep in Dock**. macOS doesn't allow installers to pin to the Dock automatically.
**Option B — Portable (`DataTools-<ver>-mac-portable.zip`)**
1. Download `DataTools-<ver>-mac-portable.zip`. Safari auto-unzips on download; in Finder you'll see `DataTools.app` directly.
2. Move `DataTools.app` to **Applications** if you want it discoverable via Launchpad — or keep it on your Desktop, a USB stick, or a network share. The portable .app runs from anywhere.
3. Double-click `DataTools.app`. Right-click → **Open** the first time (same unsigned-build dance as the DMG).
**Uninstall**: drag `DataTools.app` to the Trash. Your data files stay where you put them — nothing else is installed.
### 1.3 Linux
`DataTools-<ver>-linux-x86_64.AppImage` is already portable — no separate zip needed.
1. Download the .AppImage.
2. `chmod +x DataTools-*.AppImage`.
3. Double-click, or run it from a terminal.
If your distro doesn't ship FUSE 2: `sudo apt install libfuse2` (Debian/Ubuntu) or equivalent.
### 1.4 What happens on first launch
The launcher (called `DataTools.exe` / `DataTools.app` / `DataTools.AppImage`) does three things, in order:
1. Picks a free TCP port on `127.0.0.1` — usually 8501, falls back through 8502, 8503, … if another app is using 8501.
2. Starts a local Streamlit server on that port. The server is **bound to localhost only**, never to your LAN.
3. Opens your default browser at `http://127.0.0.1:<port>/`. If the browser doesn't open within 5 seconds, paste that URL into your browser manually.
The launcher window stays open in the background. Closing it stops the server — the browser tab will say "this site can't be reached" the next time you click it.
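
The port-selection step can be sketched as follows. This is a hedged illustration, not the launcher's actual code: the function name `pick_free_port` is hypothetical, and the upper bound of 8550 is taken from the Troubleshooting section.

```python
import socket
from typing import Optional


def pick_free_port(start: int = 8501, end: int = 8550,
                   host: str = "127.0.0.1") -> Optional[int]:
    """Return the first port in [start, end] that can be bound on ``host``."""
    for port in range(start, end + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # port is taken (or not permitted); try the next one
            return port  # the probe socket is closed when the block exits
    return None


if __name__ == "__main__":
    port = pick_free_port()
    print(f"http://127.0.0.1:{port}/" if port else "no free port in range")
```

A probe like this is inherently racy: another process can claim the port between the check and the server start, so a real launcher would still need to handle a bind failure from the server itself.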
### 1.5 How the GUI works
- Runs locally on your machine. **No internet, no upload.**
- Browser is just the display surface. Closing it stops the underlying program.
- The browser is just the display surface. Closing it does NOT stop the app — close the launcher window (or quit the macOS .app from the Dock) to fully exit.
- Prefer the terminal? Every tool ships with a CLI too (Section 3).
### System requirements
### 1.6 System requirements
- Windows 10/11 (64-bit), macOS 11+, modern Linux (2020+).
- Modern browser (Chrome, Edge, Firefox, Safari, last 3 years).
- ~400-500 MB free disk space.
- ~400 MB free disk space (the bundle itself is ~200 MB; the rest is working scratch space for large CSVs).
Full numbered support matrix: [REQUIREMENTS.md](REQUIREMENTS.md).
@@ -137,12 +195,15 @@ Original input is never modified.
## 6. Troubleshooting
- **GUI won't launch / browser doesn't open** — wait 10-15 s; manually visit `http://localhost:8501`. Port-in-use error → close other instances.
- **GUI won't launch / browser doesn't open** — wait 10-15 s; manually visit `http://127.0.0.1:8501` (or whichever port the launcher window prints). Port-in-use error → close other instances. The launcher walks ports 8501–8550 looking for a free one, so a stale instance can shift the URL.
- **Why does my browser open?** — local web app pattern (same as Jupyter, RStudio). Nothing leaves your machine.
- **Windows SmartScreen** — click "More info" → "Run anyway". Standard for non-EV-signed software.
- **macOS "App is damaged"** — re-download (file likely corrupted in transit).
- **Linux AppImage won't run** — `chmod +x file.AppImage`. Missing FUSE → `sudo apt install libfuse2` or use `.tar.gz`.
- **Windows SmartScreen** — click "More info" → "Run anyway". One-time per build until we have an EV code-signing cert.
- **macOS "App is damaged" / "developer cannot be verified"** — right-click the app → **Open** → confirm. If the message persists, the file was likely corrupted in transit — re-download. As a last resort: `xattr -cr /Applications/DataTools.app` clears the quarantine attribute.
- **macOS portable .zip — extracted but won't open** — Safari unzips on download by default; if you see a `__MACOSX/` folder or `._DataTools.app` file you used a different unarchiver. Re-extract with the built-in Archive Utility (right-click the .zip → **Open With → Archive Utility**) so the .app's metadata is preserved.
- **Windows portable .zip — antivirus quarantines DataTools.exe** — your AV doesn't recognize the bundle. Allowlist the extracted folder. The installer .exe trips fewer AV products because it's a known Inno Setup wrapper.
- **Linux AppImage won't run** — `chmod +x file.AppImage`. Missing FUSE → `sudo apt install libfuse2`.
- **Slow on large file** — over ~100k rows takes longer; progress bar shows. Multi-million rows → use the CLI directly.
- **Where does the app store my license / settings?** — `~/.datatools/` on macOS + Linux, `C:\Users\<you>\.datatools\` on Windows. Your input/output files stay where you put them; the app never copies them anywhere else.
- **Need help** — email the address on your purchase receipt.
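
To find the settings folder from a script, a sketch like the one below should work on all three platforms, assuming the folder sits directly under the user's home directory as the paths in this section indicate. The function name `datatools_config_dir` is hypothetical and not part of the app.

```python
from pathlib import Path


def datatools_config_dir() -> Path:
    """Return the per-user settings folder described in the guide."""
    # Path.home() resolves to the user's profile folder on Windows
    # and to ~ on macOS and Linux.
    return Path.home() / ".datatools"


if __name__ == "__main__":
    folder = datatools_config_dir()
    state = "exists" if folder.is_dir() else "not created yet"
    print(f"{folder} ({state})")
```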
## 7. License


@@ -85,8 +85,8 @@ class TestGatePassesWithTrialLicense:
home_app.run()
text = collected_text(home_app)
# With a valid license, the activation form should NOT be the
# primary content; we should see the home title + tool cards.
assert "Data Cleaning Mastery" in text
# primary content; we should see the home tagline + tool cards.
assert "Clean. Normalize. Transform." in text
assert "Activate DataTools" not in text # form not shown inline
def test_sidebar_shows_active_status(self, trial_license, home_app):
@@ -150,7 +150,7 @@ class TestActivationFormSubmission:
# After activation the page reruns and the activation form
# should be gone — we should see the home page proper.
text = collected_text(home_app)
assert "Data Cleaning Mastery" in text
assert "Clean. Normalize. Transform." in text
def test_trial_button_absent_paid_only(self, no_license_env, home_app):
"""v1.6 dropped the user-facing trial flow — paid licenses only.


@@ -59,7 +59,7 @@ class TestLanguageSwitch:
lang = home_app.session_state["ui_lang"] if "ui_lang" in home_app.session_state else "en"
assert lang == "en"
text = collected_text(home_app)
assert "Data Cleaning Mastery" in text
assert "Clean. Normalize. Transform." in text
def test_selecting_spanish_persists_in_session(self, home_app):
home_app.run()
@@ -72,22 +72,22 @@ class TestLanguageSwitch:
selector = home_app.sidebar.selectbox[0]
selector.select("es").run()
text = collected_text(home_app)
assert "Maestría" in text, (
"after selecting Spanish, the home title should switch to "
f"'🧹 DataTools — Maestría…'; got:\n{text[:300]}"
assert "Limpia. Normaliza. Transforma." in text, (
"after selecting Spanish, the home tagline should switch to "
f"'Limpia. Normaliza. Transforma.'; got:\n{text[:300]}"
)
def test_selecting_back_to_english_reverts(self, home_app):
# Start in Spanish, then flip back.
with_language(home_app, "es")
home_app.run()
assert "Maestría" in collected_text(home_app)
assert "Limpia. Normaliza. Transforma." in collected_text(home_app)
selector = home_app.sidebar.selectbox[0]
selector.select("en").run()
text = collected_text(home_app)
assert "Data Cleaning Mastery" in text
assert "Maestría" not in text
assert "Clean. Normalize. Transform." in text
assert "Limpia. Normaliza. Transforma." not in text
# ---------------------------------------------------------------------------
@@ -96,26 +96,34 @@ class TestLanguageSwitch:
class TestLocalizedChrome:
"""A spot-check on the parts of the chrome that aren't the selector:
the bottom footer caption and the home-page hero text. Other strings
are pinned indirectly by ``TestEveryPageRenders.test_expected_*``."""
the home-page privacy pill (visible to AppTest) and the upload
section heading. The sticky footer caption is rendered via a
component-iframe and isn't visible through ``collected_text``."""
def test_footer_english(self, home_app):
def test_privacy_pill_english(self, home_app):
home_app.run()
text = collected_text(home_app)
assert "Your data never leaves" in text
assert "Runs 100% locally" in text
def test_footer_spanish(self, home_app):
def test_privacy_pill_spanish(self, home_app):
with_language(home_app, "es")
home_app.run()
text = collected_text(home_app)
assert "Tus datos nunca salen" in text
assert "Se ejecuta 100% en local" in text
def test_upload_section_heading_localizes(self, home_app):
with_language(home_app, "es")
home_app.run()
text = collected_text(home_app)
# ``📤 Sube uno o más archivos para empezar`` from the es pack.
assert "Sube uno o más archivos" in text
# The visible "Files" section heading is hard-coded English
# in the redesigned home page; what's still localized is the
# file_uploader widget's label (``upload.uploader_label_multi``).
# AppTest exposes uploaders separately from the text-bearing
# widget collections, so we check the uploader's label
# attribute directly.
labels = [u.label for u in home_app.file_uploader]
assert any("Importa archivos" in lbl for lbl in labels), (
f"Spanish uploader label missing; got: {labels}"
)
# ---------------------------------------------------------------------------


@@ -98,11 +98,19 @@ class TestHeader:
# ---------------------------------------------------------------------------
# Per-tool grouping → one expander per tool id
# Per-finding row → one "Open Tool" button per targeted finding
# ---------------------------------------------------------------------------
#
# The findings panel was redesigned (mockup-v2): it now renders ONE
# severity-sorted flat list rather than per-tool expanders. Each finding
# with a known tool id gets a tertiary button labelled
# ``"{Tool display name} →"`` that switches pages on click. Findings
# with no tool id (file-level CSV-shape warnings, encoding flags, etc.)
# render without a button — the description still shows so the user
# isn't blind to them.
class TestGrouping:
def test_findings_grouped_into_per_tool_expanders(self):
class TestRowsRenderForFindings:
def test_one_button_per_targeted_finding(self):
findings = [
_make_finding(tool="02_text_cleaner", id="whitespace_padding"),
_make_finding(tool="02_text_cleaner", id="nbsp_padding"),
@@ -110,96 +118,96 @@ class TestGrouping:
]
app = _harness(findings)
app.run()
labels = [e.label for e in app.expander]
# Two unique tools → two expanders. Each label carries the
# tool's display name + finding count.
text_cleaner_expanders = [lbl for lbl in labels if "Clean Text" in lbl]
format_expanders = [lbl for lbl in labels if "Standardize Formats" in lbl]
assert len(text_cleaner_expanders) == 1, (
f"expected one Clean Text expander; got: {labels}"
labels = [b.label for b in app.button]
# Each targeted finding gets its own "Open Tool" button — three
# findings → three buttons (two pointing at Clean Text, one at
# Standardize Formats).
clean_text_buttons = [l for l in labels if l == "Clean Text →"]
format_buttons = [l for l in labels if l == "Standardize Formats →"]
assert len(clean_text_buttons) == 2, (
f"expected 2 Clean Text buttons; got: {labels}"
)
assert len(format_expanders) == 1, (
f"expected one Standardize Formats expander; got: {labels}"
assert len(format_buttons) == 1, (
f"expected 1 Standardize Formats button; got: {labels}"
)
def test_tool_names_localize_in_spanish(self):
findings = [_make_finding(tool="02_text_cleaner")]
app = _harness(findings, lang="es")
app.run()
labels = [e.label for e in app.expander]
labels = [b.label for b in app.button]
assert any("Limpiar texto" in lbl for lbl in labels), (
f"Spanish tool name missing; expanders: {labels}"
)
def test_finding_count_in_expander_label(self):
findings = [
_make_finding(tool="02_text_cleaner", id=f"f{i}")
for i in range(3)
]
app = _harness(findings)
app.run()
labels = [e.label for e in app.expander]
# Pack template: "{tool} — {n} finding(s)"
text_cleaner_label = next(l for l in labels if "Clean Text" in l)
assert "3" in text_cleaner_label, (
f"expected count '3' in expander label; got {text_cleaner_label!r}"
f"Spanish tool name missing; buttons: {labels}"
)
# ---------------------------------------------------------------------------
# Open-tool button localizes
# Open-tool button labels — confirm the arrow + name format
# ---------------------------------------------------------------------------
class TestOpenToolButton:
"""Each tool section has an ``st.page_link`` to jump to that tool's
page. AppTest exposes page_links as ``app.button`` entries with
label ``"Open {tool}"`` (English) / ``"Abrir {tool}"`` (Spanish)."""
"""Each finding with a known tool gets a tertiary button labelled
``"{Tool name} →"``. The arrow + spacing is the affordance that
distinguishes the row's primary action from the title text."""
def test_open_tool_label_english(self):
findings = [_make_finding(tool="02_text_cleaner")]
app = _harness(findings)
app.run()
# ``st.page_link`` may show up under ``app.button`` or in the
# raw markdown. We probe both.
text = collected_text(app)
# Pack template: "Open {tool} →"
assert "Open Clean Text" in text
labels = [b.label for b in app.button]
assert "Clean Text →" in labels, (
f"expected 'Clean Text →' button; got: {labels}"
)
def test_open_tool_label_spanish(self):
findings = [_make_finding(tool="02_text_cleaner")]
app = _harness(findings, lang="es")
app.run()
text = collected_text(app)
# Pack template: "Abrir {tool} →"
assert "Abrir Limpiar texto" in text
labels = [b.label for b in app.button]
assert "Limpiar texto →" in labels, (
f"expected 'Limpiar texto →' button; got: {labels}"
)
# ---------------------------------------------------------------------------
# Untargeted findings (file-level) go in the "Other" expander
# Untargeted findings (file-level) render without an action button
# ---------------------------------------------------------------------------
class TestUntargetedFindings:
def test_untargeted_goes_to_other_expander_en(self):
"""A finding with ``tool=""`` (e.g., CSV BOM stripped at read time)
is file-level — no tool page to jump to — and the redesigned panel
renders the description without a button. We assert that the row
contributes nothing to ``app.button`` while still appearing in the
rendered markdown."""
def test_untargeted_renders_no_button_en(self):
findings = [
_make_finding(tool="", id="csv_bom_stripped"),
_make_finding(tool="", id="csv_bom_stripped", description="BOM stripped"),
_make_finding(tool="02_text_cleaner", id="nbsp_padding"),
]
app = _harness(findings)
app.run()
labels = [e.label for e in app.expander]
# Pack template: "Other / file-level — {n} finding(s)"
assert any("Other / file-level" in lbl for lbl in labels), (
f"untargeted expander missing; got: {labels}"
labels = [b.label for b in app.button]
# Only the targeted finding contributed a button.
assert "Clean Text →" in labels
# The BOM finding's description must still be visible somewhere.
all_md = "\n".join(
m.body for m in app.markdown if hasattr(m, "body")
)
assert "BOM stripped" in all_md, (
"untargeted finding's description should still render"
)
def test_untargeted_label_spanish(self):
findings = [_make_finding(tool="", id="csv_bom_stripped")]
def test_untargeted_renders_no_button_es(self):
findings = [_make_finding(
tool="", id="csv_bom_stripped", description="BOM eliminado",
)]
app = _harness(findings, lang="es")
app.run()
labels = [e.label for e in app.expander]
# Spanish pack: "Otros / a nivel de archivo — {n} hallazgo(s)"
assert any("Otros / a nivel de archivo" in lbl for lbl in labels), (
f"Spanish 'Other' expander missing; got: {labels}"
labels = [b.label for b in app.button]
# No tool id → no tool-jump button at all.
assert not any("→" in lbl for lbl in labels), (
f"untargeted finding should not render a tool button; got: {labels}"
)
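A note on the button-label checks in this file: in Python the empty string is a substring of every string, so a guard written as `"" in lbl` is true for any label and can only pass when the button list is empty. A minimal sketch of a narrower filter, assuming the `"{Tool name} →"` label format that the tests above pin; `tool_jump_labels` is a hypothetical helper, not part of the project.

```python
ARROW = "→"


def tool_jump_labels(labels: list[str]) -> list[str]:
    """Return only the labels that end in the tool-jump arrow (hypothetical helper)."""
    return [lbl for lbl in labels if lbl.endswith(f" {ARROW}")]


labels = ["Clean Text →", "Download report"]

# The empty string is contained in every string, so this holds for any label.
assert all("" in lbl for lbl in labels)

# Filtering on the arrow suffix separates tool-jump buttons from other buttons.
assert tool_jump_labels(labels) == ["Clean Text →"]
assert tool_jump_labels(["Download report"]) == []
```

With a filter of this shape, an untargeted finding can be asserted to add no tool-jump button even if the page renders unrelated buttons.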


@@ -34,6 +34,8 @@ PAGE_SLUGS = [
"7_Multi_File_Merger",
"8_Validator_Reporter",
"9_Pipeline_Runner",
"10_PDF_Extractor",
"11_Reconciler",
"99_Close",
]
@@ -61,17 +63,28 @@ EXPECTED_SUBSTRINGS: dict[str, dict[str, str]] = {
"7_Multi_File_Merger": {"en": "Combine Files", "es": "Combinar archivos"},
"8_Validator_Reporter": {"en": "Quality Check", "es": "Verificación de calidad"},
"9_Pipeline_Runner": {"en": "Automated", "es": "Flujos automatizados"},
# The PDF Extractor and Reconciler pages are English-only today
# (translations tracked as a follow-up). The smoke test still has
# value in 'es': it proves the page *renders at all*. The expected
# substring is the same English hero text under both languages.
"10_PDF_Extractor": {"en": "PDF to CSV", "es": "PDF to CSV"},
"11_Reconciler": {"en": "Reconcile", "es": "Reconcile"},
"99_Close": {"en": "Shutting down", "es": "Cerrando"},
}
class TestHomePageRenders:
"""The home page is the only one with full EN/ES coverage in v1.6.
Pin it independently so its translation is non-regressable."""
"""Pin the home hero in both languages.
Since the v3 brand refresh the title is the literal wordmark
("UNALOGIX DataTools") in both packs; the localized tagline is
what shifts between en and es. We assert against the tagline
string, which lives in ``home.caption`` of each pack.
"""
@pytest.mark.parametrize("lang,expected", [
("en", "DataTools — Data Cleaning Mastery"),
("es", "DataTools — Maestría en limpieza de datos"),
("en", "Clean. Normalize. Transform."),
("es", "Limpia. Normaliza. Transforma."),
])
def test_home_renders_in_language(self, home_app, lang, expected):
with_language(home_app, lang)
@@ -81,11 +94,15 @@ class TestHomePageRenders:
)
assert expected in collected_text(home_app)
def test_home_renders_footer_in_es(self, home_app):
def test_home_renders_privacy_pill_in_es(self, home_app):
# The footer caption is rendered via a component-iframe so
# ``collected_text`` can't see it. The privacy pill on the
# home header IS visible to AppTest and carries the same
# locality story, so we pin that instead.
with_language(home_app, "es")
home_app.run()
text = collected_text(home_app)
assert "Tus datos nunca salen" in text or "Se ejecuta localmente" in text
assert "Se ejecuta 100% en local" in text
class TestEveryPageRenders:
"""Parametrize over (page, language). Failure tells you exactly which


@@ -152,6 +152,48 @@ class TestPipelineRunnerWorkflow:
# ---------------------------------------------------------------------------
# PDF to CSV — file-uploader-driven so we can't fully exercise the
# scan flow through AppTest. Pin the initial render (which carries the
# dep-status banner when deps are missing) so a future regression in
# the dep guard shows up here.
# ---------------------------------------------------------------------------
class TestPdfExtractorWorkflow:
def test_page_renders_without_upload(self, app_factory):
app = app_factory("10_PDF_Extractor")
app.run()
assert not app.exception
text = collected_text(app)
assert "PDF to CSV" in text
# ---------------------------------------------------------------------------
# Reconcile Two Files — early-exits at ``st.stop()`` without both
# uploads. Pin both the no-upload state and the title.
# ---------------------------------------------------------------------------
class TestReconcilerWorkflow:
def test_page_renders_without_uploads(self, app_factory):
app = app_factory("11_Reconciler")
app.run()
assert not app.exception
text = collected_text(app)
assert "Reconcile" in text
def test_prompts_for_both_uploads_when_empty(self, app_factory):
# ``st.info("Upload both files to continue.")`` fires when
# either side is missing; that text is the contract we test
# against — if the prompt disappears the user has no idea
# what to do next.
app = app_factory("11_Reconciler")
app.run()
info_messages = [i.body for i in app.info if hasattr(i, "body")]
assert any("Upload both files" in m for m in info_messages), (
f"missing 'Upload both files' prompt; got: {info_messages}"
)
# ---------------------------------------------------------------------------
# Coming-Soon pages still render (just a stub) — pinned so we know if a
# Coming-Soon goes from "stub renders" to "import error".

tests/test_cli_reconcile.py Normal file

@@ -0,0 +1,284 @@
"""Tests for src.cli_reconcile — Typer CLI for two-source reconciliation.
The reconciliation engine itself is covered by ``test_reconcile.py``;
this file exercises the CLI surface around it: argument parsing
(comma-separated keys, optional dates), preview vs. apply modes, the
four output files, and error paths for bad inputs.
"""
from __future__ import annotations
import sys
from pathlib import Path
import pandas as pd
import pytest
from typer.testing import CliRunner
from src.cli_reconcile import app
runner = CliRunner()
def _write_bank(path: Path) -> None:
"""Bank-feed-shaped CSV with two transactions."""
path.write_text(
"date,amount,desc\n"
"2026-01-05,100.00,ACME\n"
"2026-01-06,250.00,WIDGET CO\n"
)
def _write_ledger(path: Path) -> None:
"""Ledger-shaped CSV with the same two transactions under
different column names — exercises the rename-on-match path."""
path.write_text(
"posted,amt,memo\n"
"2026-01-05,100.00,Acme Inc\n"
"2026-01-06,250.00,Widget\n"
)
class TestPreviewMode:
"""Default mode (no ``--apply``): print stats only, write nothing."""
def test_basic_preview_succeeds(self, tmp_path):
bank = tmp_path / "bank.csv"
ledger = tmp_path / "ledger.csv"
_write_bank(bank)
_write_ledger(ledger)
result = runner.invoke(app, [
str(bank), str(ledger),
"--left-amount", "amount", "--right-amount", "amt",
"--left-date", "date", "--right-date", "posted",
])
assert result.exit_code == 0, result.stdout
assert "Matched:" in result.stdout
assert "Unmatched left:" in result.stdout
# Two-of-two match in the fixture.
assert "Matched: 2" in result.stdout
# The reminder banner is part of the preview UX.
assert "Add --apply" in result.stdout
def test_preview_does_not_write_files(self, tmp_path):
bank = tmp_path / "bank.csv"
ledger = tmp_path / "ledger.csv"
_write_bank(bank)
_write_ledger(ledger)
runner.invoke(app, [
str(bank), str(ledger),
"--left-amount", "amount", "--right-amount", "amt",
"--left-date", "date", "--right-date", "posted",
])
# None of the four output suffixes should land beside the input.
for suffix in ("matched", "unmatched_left", "unmatched_right", "review"):
assert not (tmp_path / f"bank_{suffix}.csv").exists()
class TestApplyMode:
"""``--apply``: write the four output files beside the LEFT input."""
def test_apply_writes_four_files(self, tmp_path):
bank = tmp_path / "bank.csv"
ledger = tmp_path / "ledger.csv"
_write_bank(bank)
_write_ledger(ledger)
result = runner.invoke(app, [
str(bank), str(ledger),
"--left-amount", "amount", "--right-amount", "amt",
"--left-date", "date", "--right-date", "posted",
"--apply",
])
assert result.exit_code == 0, result.stdout
# All four output files land beside the left input, sharing
# its stem.
for suffix in ("matched", "unmatched_left", "unmatched_right", "review"):
out = tmp_path / f"bank_{suffix}.csv"
assert out.exists(), f"missing {out.name}"
# bank_matched.csv carries the two pairs.
matched = pd.read_csv(tmp_path / "bank_matched.csv")
assert len(matched) == 2
def test_apply_with_unmatched_rows(self, tmp_path):
bank = tmp_path / "bank.csv"
ledger = tmp_path / "ledger.csv"
bank.write_text(
"date,amount,desc\n"
"2026-01-05,100.00,ACME\n"
"2026-01-07,99.99,LEFT-ONLY\n"
)
ledger.write_text(
"posted,amt,memo\n"
"2026-01-05,100.00,Acme\n"
"2026-01-08,500.00,RIGHT-ONLY\n"
)
result = runner.invoke(app, [
str(bank), str(ledger),
"--left-amount", "amount", "--right-amount", "amt",
"--left-date", "date", "--right-date", "posted",
"--apply",
])
assert result.exit_code == 0
unmatched_l = pd.read_csv(tmp_path / "bank_unmatched_left.csv")
unmatched_r = pd.read_csv(tmp_path / "bank_unmatched_right.csv")
assert "LEFT-ONLY" in unmatched_l["desc"].tolist()
assert "RIGHT-ONLY" in unmatched_r["memo"].tolist()
class TestToleranceFlags:
def test_amount_tolerance_absorbs_rounding(self, tmp_path):
bank = tmp_path / "bank.csv"
ledger = tmp_path / "ledger.csv"
bank.write_text(
"date,amount,desc\n"
"2026-01-05,100.00,ACME\n"
)
ledger.write_text(
"posted,amt,memo\n"
"2026-01-05,100.02,Acme\n"
)
# Without tolerance: no match.
result_no_tol = runner.invoke(app, [
str(bank), str(ledger),
"--left-amount", "amount", "--right-amount", "amt",
"--left-date", "date", "--right-date", "posted",
])
assert "Matched: 0" in result_no_tol.stdout
# With tolerance: one match.
result_with_tol = runner.invoke(app, [
str(bank), str(ledger),
"--left-amount", "amount", "--right-amount", "amt",
"--left-date", "date", "--right-date", "posted",
"--amount-tolerance", "0.05",
])
assert "Matched: 1" in result_with_tol.stdout
def test_date_tolerance_allows_drift(self, tmp_path):
bank = tmp_path / "bank.csv"
ledger = tmp_path / "ledger.csv"
bank.write_text(
"date,amount,desc\n"
"2026-01-05,100.00,ACME\n"
)
ledger.write_text(
"posted,amt,memo\n"
"2026-01-07,100.00,Acme\n" # 2-day drift
)
result = runner.invoke(app, [
str(bank), str(ledger),
"--left-amount", "amount", "--right-amount", "amt",
"--left-date", "date", "--right-date", "posted",
"--date-tolerance", "3",
])
assert "Matched: 1" in result.stdout
class TestSignInversion:
def test_invert_right_sign(self, tmp_path):
bank = tmp_path / "bank.csv"
ledger = tmp_path / "ledger.csv"
bank.write_text(
"date,amount,desc\n"
"2026-01-05,100.00,ACME\n"
)
ledger.write_text(
"posted,amt,memo\n"
"2026-01-05,-100.00,Acme\n" # sign convention flipped
)
result = runner.invoke(app, [
str(bank), str(ledger),
"--left-amount", "amount", "--right-amount", "amt",
"--left-date", "date", "--right-date", "posted",
"--invert-right-sign",
])
assert "Matched: 1" in result.stdout
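The tolerance and sign-inversion tests above describe three knobs: an absolute amount tolerance, a date tolerance in days, and an option to flip the right-hand sign before comparing. The sketch below shows one way such a pairwise check could work. It is an illustration only: `is_candidate_match`, its zero defaults, and the integer-cents comparison are assumptions, not the engine's actual code.

```python
from datetime import date


def is_candidate_match(
    left_amount: float,
    right_amount: float,
    left_date: date,
    right_date: date,
    *,
    amount_tolerance: float = 0.0,   # assumed default: exact amounts only
    date_tolerance_days: int = 0,    # assumed default: same day only
    invert_right_sign: bool = False,
) -> bool:
    """Hypothetical pairwise check mirroring the CLI flags exercised above."""
    if invert_right_sign:
        right_amount = -right_amount
    # Compare in whole cents so float noise (e.g. 100.02 - 100.00) cannot
    # push a borderline pair over the tolerance.
    gap_cents = abs(round(left_amount * 100) - round(right_amount * 100))
    amount_ok = gap_cents <= round(amount_tolerance * 100)
    date_ok = abs((left_date - right_date).days) <= date_tolerance_days
    return amount_ok and date_ok


jan5, jan7 = date(2026, 1, 5), date(2026, 1, 7)

# Rounding difference: rejected without tolerance, accepted with 0.05.
assert not is_candidate_match(100.00, 100.02, jan5, jan5)
assert is_candidate_match(100.00, 100.02, jan5, jan5, amount_tolerance=0.05)

# Two-day posting drift fits inside a three-day window.
assert is_candidate_match(100.00, 100.00, jan5, jan7, date_tolerance_days=3)

# Opposite sign conventions pair off once the right side is inverted.
assert is_candidate_match(100.00, -100.00, jan5, jan5, invert_right_sign=True)
```

Working in cents is a design choice for the sketch; a real engine might use `decimal.Decimal` instead.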
class TestKeyFlags:
def test_comma_separated_keys_pair_off(self, tmp_path):
# Same check number, mismatched posting dates: the date-only
# pass would miss the pair, but the key match catches it.
bank = tmp_path / "bank.csv"
ledger = tmp_path / "ledger.csv"
bank.write_text(
"date,amount,desc,check_no\n"
"2026-01-05,100.00,ACME,1042\n"
)
ledger.write_text(
"posted,amt,memo,ref\n"
"2026-01-12,100.00,Acme,1042\n" # 7-day drift
)
result = runner.invoke(app, [
str(bank), str(ledger),
"--left-amount", "amount", "--right-amount", "amt",
"--left-date", "date", "--right-date", "posted",
"--left-keys", "check_no",
"--right-keys", "ref",
])
assert "Matched: 1" in result.stdout
class TestErrorPaths:
def test_missing_left_file(self, tmp_path):
ledger = tmp_path / "ledger.csv"
_write_ledger(ledger)
result = runner.invoke(app, [
str(tmp_path / "nope.csv"), str(ledger),
"--left-amount", "amount", "--right-amount", "amt",
])
assert result.exit_code != 0
assert "not found" in result.stdout.lower() or "not found" in (result.stderr or "").lower()
def test_missing_right_file(self, tmp_path):
bank = tmp_path / "bank.csv"
_write_bank(bank)
result = runner.invoke(app, [
str(bank), str(tmp_path / "nope.csv"),
"--left-amount", "amount", "--right-amount", "amt",
])
assert result.exit_code != 0
def test_unknown_amount_column_surfaces_value_error(self, tmp_path):
# The reconcile engine raises ValueError on unknown column names;
# the CLI catches it and exits 1 with a readable banner.
bank = tmp_path / "bank.csv"
ledger = tmp_path / "ledger.csv"
_write_bank(bank)
_write_ledger(ledger)
result = runner.invoke(app, [
str(bank), str(ledger),
"--left-amount", "NOT_A_COLUMN", "--right-amount", "amt",
])
assert result.exit_code == 1
# Banner format: "Error: <message>"
assert "Error" in result.stdout or "Error" in (result.stderr or "")
def test_help_renders(self):
# ``--help`` must work — examples in docstrings reference it.
result = runner.invoke(app, ["--help"])
assert result.exit_code == 0
assert "reconcile" in result.stdout.lower()
class TestExcelInput:
"""Input may be CSV, TSV, or Excel — read_file dispatches by suffix."""
def test_excel_left_file_reads(self, tmp_path):
bank = tmp_path / "bank.xlsx"
df = pd.DataFrame({
"date": ["2026-01-05"],
"amount": [100.00],
"desc": ["ACME"],
})
df.to_excel(bank, index=False)
ledger = tmp_path / "ledger.csv"
_write_ledger(ledger)
result = runner.invoke(app, [
str(bank), str(ledger),
"--left-amount", "amount", "--right-amount", "amt",
"--left-date", "date", "--right-date", "posted",
])
assert result.exit_code == 0, result.stdout
# 1 of 1 left rows matched against the 2-row right ledger.
assert "Matched: 1" in result.stdout
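The apply-mode tests in this file expect four outputs named `{left stem}_{suffix}.csv` beside the left input. A minimal sketch of that naming rule with `pathlib`; `output_paths` is a hypothetical helper, and writing CSV regardless of the input's extension is an assumption (the tests only show a `.csv` left input in apply mode).

```python
from pathlib import Path

# Suffixes taken from the tests above.
SUFFIXES = ("matched", "unmatched_left", "unmatched_right", "review")


def output_paths(left_input: Path) -> dict[str, Path]:
    """Map each suffix to a sibling path that shares the left input's stem."""
    return {
        suffix: left_input.with_name(f"{left_input.stem}_{suffix}.csv")
        for suffix in SUFFIXES
    }


paths = output_paths(Path("data") / "bank.csv")
assert paths["matched"] == Path("data") / "bank_matched.csv"
assert paths["review"].parent == Path("data")
```

Deriving every path from the left input keeps preview mode trivially side-effect free: nothing is created until the caller actually writes to these paths.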


@@ -39,10 +39,16 @@ def _load_pack(code: str) -> dict:
class TestLookup:
def test_returns_english_value_by_default(self):
assert t("home.title", "en").startswith("🧹 DataTools")
# Hero title is "UNALOGIX DataTools" since the v3 rebrand. The
# Spanish value is identical (proper noun); the localized
# tagline lives under ``home.caption`` instead.
assert t("home.title", "en") == "UNALOGIX DataTools"
def test_returns_spanish_value(self):
assert "Maestría" in t("home.title", "es")
# Title stays "UNALOGIX DataTools" in es too; the localized
# tagline is what differs.
assert t("home.title", "es") == "UNALOGIX DataTools"
assert "Limpia" in t("home.caption", "es")
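The lookup tests in this class depend on `t(key, lang)` resolving a dotted key in the requested language pack and falling back to English when the key is missing. A minimal sketch of that behavior, assuming nested dictionaries as packs; the pack layout, the `home.only_in_en` key, and returning the key itself when nothing matches are all illustrative assumptions.

```python
from typing import Optional

# Hypothetical in-memory packs; the real packs are loaded from files.
PACKS = {
    "en": {
        "home": {
            "title": "UNALOGIX DataTools",
            "caption": "Clean. Normalize. Transform.",
            "only_in_en": "English only",  # made-up key to show the fallback
        }
    },
    "es": {
        "home": {
            "title": "UNALOGIX DataTools",
            "caption": "Limpia. Normaliza. Transforma.",
        }
    },
}


def _dig(pack: dict, key: str) -> Optional[str]:
    """Walk a dotted key through nested dicts; None if any part is missing."""
    node = pack
    for part in key.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node if isinstance(node, str) else None


def t(key: str, lang: str = "en") -> str:
    """Look up ``key`` in ``lang``, then English, then return the key itself."""
    value = _dig(PACKS.get(lang, {}), key)
    if value is None:
        value = _dig(PACKS["en"], key)
    return value if value is not None else key


assert t("home.title", "es") == "UNALOGIX DataTools"
assert "Limpia" in t("home.caption", "es")
assert t("home.only_in_en", "es") == "English only"
```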
def test_missing_key_falls_back_to_english(self):
# ``tools.99_pipeline_runner.name`` doesn't exist; the pipeline


@@ -315,3 +315,117 @@ class TestResultShape:
assert result.matched.empty
assert result.unmatched_left.empty
assert result.unmatched_right.empty
def test_one_side_empty_keeps_other_unmatched(self):
# A reconcile against an empty ledger should surface every
# left row as unmatched, not crash. Mirror case for the
# other side.
left = _bank([
("2026-01-05", 100.00, "ACME"),
("2026-01-06", 250.00, "WIDGET"),
])
right = _ledger([])
result = reconcile(left, right, ReconcileOptions(
left_amount="amount", right_amount="amt",
left_date="date", right_date="posted",
))
assert result.stats["matched"] == 0
assert result.stats["unmatched_left"] == 2
assert result.stats["unmatched_right"] == 0
def test_match_pass_tagged_for_key_pass(self):
# Pass name on each matched row tells the user *why* the engine
# accepted the pair — verify the "key" label propagates.
left = pd.DataFrame([
{"date": "2026-01-05", "amount": 100.00, "check_no": "1042"},
])
right = pd.DataFrame([
{"posted": "2099-12-31", "amt": 100.00, "ref": "1042"},
])
result = reconcile(left, right, ReconcileOptions(
left_amount="amount", right_amount="amt",
left_date="date", right_date="posted",
left_keys=["check_no"], right_keys=["ref"],
))
assert result.stats["matched"] == 1
assert result.matched.iloc[0]["match_pass"] == "key"
class TestAdditionalValidation:
"""Boundary cases for ``_validate_options`` not pinned elsewhere."""
def test_unknown_left_amount_column_raises(self):
left = pd.DataFrame([{"date": "2026-01-05", "amount": 1.0}])
right = pd.DataFrame([{"posted": "2026-01-05", "amt": 1.0}])
with pytest.raises(ValueError, match="not in left DataFrame"):
reconcile(left, right, ReconcileOptions(
left_amount="NOT_A_COLUMN", right_amount="amt",
))
def test_unknown_right_amount_column_raises(self):
left = pd.DataFrame([{"date": "2026-01-05", "amount": 1.0}])
right = pd.DataFrame([{"posted": "2026-01-05", "amt": 1.0}])
with pytest.raises(ValueError, match="not in right DataFrame"):
reconcile(left, right, ReconcileOptions(
left_amount="amount", right_amount="NOT_A_COLUMN",
))
def test_unknown_left_key_column_raises(self):
left = pd.DataFrame([{"date": "2026-01-05", "amount": 1.0}])
right = pd.DataFrame([{"posted": "2026-01-05", "amt": 1.0}])
with pytest.raises(ValueError, match="left key column"):
reconcile(left, right, ReconcileOptions(
left_amount="amount", right_amount="amt",
left_keys=["nope"], right_keys=["nope"],
))
def test_negative_date_tolerance_rejected(self):
left = pd.DataFrame([{"date": "2026-01-05", "amount": 1.0}])
right = pd.DataFrame([{"posted": "2026-01-05", "amt": 1.0}])
with pytest.raises(ValueError, match="date_tolerance_days"):
reconcile(left, right, ReconcileOptions(
left_amount="amount", right_amount="amt",
left_date="date", right_date="posted",
date_tolerance_days=-1,
))
def test_desc_min_score_out_of_range_rejected(self):
left = pd.DataFrame([{"date": "2026-01-05", "amount": 1.0}])
right = pd.DataFrame([{"posted": "2026-01-05", "amt": 1.0}])
with pytest.raises(ValueError, match="desc_min_score"):
reconcile(left, right, ReconcileOptions(
left_amount="amount", right_amount="amt",
desc_min_score=150,
))
class TestImmutability:
"""The engine must NOT mutate the caller's DataFrames — callers
rely on holding onto their input frames after the call (the GUI
Reconciler page re-renders previews from them)."""
def test_left_df_columns_unchanged(self):
left = _bank([("2026-01-05", 100.00, "ACME")])
right = _ledger([("2026-01-05", 100.00, "Acme Inc")])
before_cols = list(left.columns)
before = left.copy(deep=True)
reconcile(left, right, ReconcileOptions(
left_amount="amount", right_amount="amt",
left_date="date", right_date="posted",
))
assert list(left.columns) == before_cols
# ``id(left)`` cannot change from inside the call, so compare the
# frame's contents against a snapshot taken beforehand instead.
pd.testing.assert_frame_equal(left, before)
def test_amounts_preserved_when_invert_right_sign_set(self):
# Even with --invert-right-sign, the original right amounts
# must come back unchanged in the result.
left = _bank([("2026-01-05", 100.00, "A")])
right = _ledger([("2026-01-05", -100.00, "X")])
original_right_amts = right["amt"].tolist()
reconcile(left, right, ReconcileOptions(
left_amount="amount", right_amount="amt",
left_date="date", right_date="posted",
invert_right_sign=True,
))
assert right["amt"].tolist() == original_right_amts
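An immutability test needs a content snapshot rather than an identity check, because `id(obj)` stays the same no matter what a callee does to the object. The sketch below shows the snapshot-and-compare pattern with plain lists of dicts standing in for DataFrames; `flip_signs_in_place`, `flip_signs_copy`, and `mutates_input` are hypothetical names used only for illustration.

```python
import copy

Rows = list[dict[str, float]]


def flip_signs_in_place(rows: Rows) -> Rows:
    """Badly behaved stand-in: negates the caller's own rows."""
    for row in rows:
        row["amt"] = -row["amt"]
    return rows


def flip_signs_copy(rows: Rows) -> Rows:
    """Well behaved stand-in: builds new rows and leaves the input alone."""
    return [{**row, "amt": -row["amt"]} for row in rows]


def mutates_input(func, rows: Rows) -> bool:
    """Report whether ``func`` changed ``rows`` by comparing to a deep copy."""
    snapshot = copy.deepcopy(rows)
    before_id = id(rows)
    func(rows)
    # Identity is unchanged even when the contents were rewritten,
    # so this check alone proves nothing about mutation.
    assert id(rows) == before_id
    return rows != snapshot


assert mutates_input(flip_signs_in_place, [{"amt": -100.0}]) is True
assert mutates_input(flip_signs_copy, [{"amt": -100.0}]) is False
```

With pandas the equivalent comparison is `pd.testing.assert_frame_equal` against a `copy(deep=True)` snapshot.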


@@ -0,0 +1,178 @@
"""Tests for src.gui.tools_registry — the per-tool manifest.
The registry is loaded at import time by the home page sidebar nav,
the home grid, and the findings panel's "Open Tool" links. A broken
entry would surface as a sidebar disappearance, a missing card, or a
``KeyError`` in the findings rendering. We pin the invariants those
call sites rely on:
- Every page_slug points at a file that actually exists.
- Every tool_id is unique (the analyzer keys findings on it).
- Every section is one of the declared literals.
- ``tool_by_id`` round-trips, ``display_name`` falls back gracefully.
- ``section_label`` resolves localized labels.
"""
from __future__ import annotations
from pathlib import Path
from typing import get_args
import pytest
from src.gui.tools_registry import (
SECTION_LABELS,
TOOLS,
Section,
Tier,
Tool,
display_name,
section_label,
tool_by_id,
tool_description,
tool_name,
tools_for_tier,
tools_in_section,
)
PAGES_DIR = Path(__file__).resolve().parent.parent / "src" / "gui" / "pages"
class TestRegistryInvariants:
def test_all_tool_ids_are_unique(self):
ids = [t.tool_id for t in TOOLS]
assert len(ids) == len(set(ids)), (
f"duplicate tool_id in TOOLS: {sorted(ids)}"
)
def test_all_page_slugs_point_at_real_files(self):
for tool in TOOLS:
page_file = PAGES_DIR / f"{tool.page_slug}.py"
assert page_file.exists(), (
f"{tool.tool_id}: {tool.page_slug}.py does not exist"
)
def test_all_sections_are_declared_literals(self):
valid = set(get_args(Section))
for tool in TOOLS:
assert tool.section in valid, (
f"{tool.tool_id} has unknown section {tool.section!r}; "
f"valid: {sorted(valid)}"
)
def test_all_tiers_are_declared_literals(self):
valid = set(get_args(Tier))
for tool in TOOLS:
assert tool.tier in valid, (
f"{tool.tool_id} has unknown tier {tool.tier!r}; "
f"valid: {sorted(valid)}"
)
def test_every_section_has_a_display_label(self):
for section in get_args(Section):
assert section in SECTION_LABELS, (
f"section {section!r} has no SECTION_LABELS entry"
)
def test_no_orphan_section_labels(self):
# The other direction: a SECTION_LABELS key that isn't a
# declared Section literal is dead config.
valid = set(get_args(Section))
for key in SECTION_LABELS:
assert key in valid, (
f"SECTION_LABELS has stray key {key!r} not in Section"
)
class TestToolLookups:
def test_tool_by_id_round_trips_every_entry(self):
for tool in TOOLS:
found = tool_by_id(tool.tool_id)
assert found is tool, (
f"tool_by_id({tool.tool_id!r}) returned {found!r}"
)
def test_tool_by_id_returns_none_for_unknown(self):
assert tool_by_id("not_a_real_tool_id") is None
def test_display_name_falls_back_to_id(self):
# Documented behavior: unknown id returns the id itself so the
# bug is visible in the UI rather than crashing.
assert display_name("not_a_real_tool_id") == "not_a_real_tool_id"
def test_display_name_resolves_known_tool(self):
# Pick a tool we know ships in every build.
assert display_name("02_text_cleaner") == "Clean Text"
class TestTierAndSectionFilters:
def test_tools_for_tier_empty_returns_all(self):
assert tools_for_tier() == list(TOOLS)
def test_tools_for_tier_filters(self):
# Every tool is tier="core" today, so an explicit core filter
# should still match the full set. A "pro"-only call should
# return an empty list.
assert tools_for_tier("core") == list(TOOLS)
assert tools_for_tier("pro") == []
def test_tools_in_section_preserves_registry_order(self):
cleaners = tools_in_section("cleaners")
in_full_order = [t for t in TOOLS if t.section == "cleaners"]
assert cleaners == in_full_order
@pytest.mark.parametrize("section", list(get_args(Section)))
def test_every_section_has_at_least_one_tool(self, section):
assert tools_in_section(section), (
f"section {section!r} has zero tools — sidebar group would be empty"
)
class TestLocalizedAccessors:
def test_tool_name_falls_back_to_registry_default(self):
# An unknown tool id should return the literal id, not crash.
assert tool_name("not_a_real_tool_id") == "not_a_real_tool_id"
def test_tool_name_returns_localized_when_pack_has_key(self):
# The lang packs ship a "tools.{id}.name" key for every shipped
# tool. We don't assert the exact translation here (the lang
# pack parity test pins that); we just check the helper returns
# something non-empty and not the literal lookup key.
name = tool_name("02_text_cleaner")
assert name and name != "tools.02_text_cleaner.name"
def test_tool_description_returns_localized_or_fallback(self):
desc = tool_description("02_text_cleaner")
assert desc and desc != "tools.02_text_cleaner.description"
def test_tool_description_for_unknown_returns_empty(self):
# Unknown ids return the registry fallback (""), not a
# lookup-key string. The home grid avoids rendering empty
# descriptions, so this contract matters.
assert tool_description("not_a_real_tool_id") == ""
@pytest.mark.parametrize("section", list(get_args(Section)))
def test_section_label_returns_non_empty(self, section):
label = section_label(section)
assert label and label != f"nav.section_{section}"
class TestReconcilerAndPdfArePresent:
"""The two newest pages were the most likely to be forgotten in
the registry — pin them explicitly so a regression flagging
"Ready" tools as missing from nav is loud."""
def test_pdf_extractor_present(self):
tool = tool_by_id("10_pdf_extractor")
assert tool is not None
assert tool.page_slug == "10_PDF_Extractor"
assert tool.status == "Ready"
def test_reconciler_present(self):
tool = tool_by_id("11_reconciler")
assert tool is not None
assert tool.page_slug == "11_Reconciler"
assert tool.status == "Ready"
# The new "analysis" section was introduced with this tool;
# if the section disappears, the sidebar group goes empty.
assert tool.section == "analysis"
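The invariants pinned in this file (unique ids, sections drawn from a `Literal`, identity-preserving lookup, display-name fallback) can be illustrated with a very small registry. This is a sketch under stated assumptions, not the project's `tools_registry` module: the two entries, the `2_Text_Cleaner` slug, the `name` field, and the helper `invalid_sections` are placeholders.

```python
from dataclasses import dataclass
from typing import Literal, Optional, get_args

# Only two of the section names seen in the tests; the real list may be longer.
Section = Literal["cleaners", "analysis"]


@dataclass(frozen=True)
class Tool:
    tool_id: str
    page_slug: str
    section: Section
    name: str


# Placeholder entries for illustration only.
TOOLS = (
    Tool("02_text_cleaner", "2_Text_Cleaner", "cleaners", "Clean Text"),
    Tool("11_reconciler", "11_Reconciler", "analysis", "Reconcile"),
)


def tool_by_id(tool_id: str) -> Optional[Tool]:
    """Return the registry entry itself (not a copy), or None if unknown."""
    return next((t for t in TOOLS if t.tool_id == tool_id), None)


def display_name(tool_id: str) -> str:
    """Fall back to the raw id so a bad id is visible rather than fatal."""
    tool = tool_by_id(tool_id)
    return tool.name if tool else tool_id


def invalid_sections() -> list[str]:
    """List tool ids whose section is not one of the Literal's values."""
    valid = set(get_args(Section))
    return [t.tool_id for t in TOOLS if t.section not in valid]


ids = [t.tool_id for t in TOOLS]
assert len(ids) == len(set(ids))
assert tool_by_id("11_reconciler") is TOOLS[1]
assert display_name("not_a_real_tool_id") == "not_a_real_tool_id"
assert invalid_sections() == []
```

`get_args(Section)` is what lets a test enumerate the allowed values at runtime, since type checkers enforce `Literal` only statically.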