RECOVERY.md - Full Project Recovery Guide
Creator-only document. Do not ship to buyers.
Version: 1.6 Last updated: April 28, 2026
If the project is ever lost, this guide plus the source ZIP is enough to rebuild it 100%.
1. What's in the Project
project-root/
├── README.md
├── BUSINESS.md # Creator only
├── TECHNICAL.md # Creator only
├── DECISIONS.md # Creator only - locked criteria, rationale, GUI framework decision
├── USER-GUIDE.md # Ships to buyers
├── RECOVERY.md # Creator only (this file)
│
├── scripts/ # The 9 .py source files (CLI entry points)
│   ├── 01_deduplicator.py # Working
│   ├── 02_text_cleaner.py
│   ├── 03_format_standardizer.py
│   ├── 04_missing_value_handler.py
│   ├── 05_column_mapper_enforcer.py
│   ├── 06_outlier_detector.py
│   ├── 07_multi_file_merger.py
│   ├── 08_validator_reporter.py
│   └── 09_master_orchestrator.py
│
├── src/
│   ├── core/ # Shared business logic - both CLI and GUI call into this
│   ├── cli.py # Typer CLI front-end
│   └── gui/ # Streamlit GUI front-end
│       ├── app.py # Streamlit entry point
│       ├── pages/ # One Streamlit page per script in the bundle
│       └── components.py # Shared widgets
│
├── samples/
│   ├── messy_sales.csv
│   └── bank_export.xlsx
│
├── demo/
│   └── streamlit_app.py # Constrained version for Streamlit Community Cloud
│
├── build/
│   ├── pyinstaller.spec # Cross-platform build spec (handles GUI launcher + CLI binaries)
│   ├── launcher.py # Starts local Streamlit server, opens default browser
│   ├── windows/
│   │   └── installer.iss # Inno Setup wrapper
│   ├── macos/
│   │   ├── entitlements.plist
│   │   └── dmg_settings.py
│   └── linux/
│       └── AppImage/ # AppImage build assets
│
├── ci/
│   └── build.yml # GitHub Actions cross-platform build
│
├── tests/
│
└── requirements.txt
2. Rebuild Steps
From a complete ZIP backup
- Unzip into a clean directory.
- Push to a GitHub repository.
- The CI pipeline (`ci/build.yml`) builds Windows, macOS, and Linux artifacts on tagged releases.
- Connect the repo to Streamlit Community Cloud and point it at `demo/streamlit_app.py` to redeploy the hosted demo.
- For local builds: see Section 3.
- Done.
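Before pushing a restored ZIP anywhere, it may be worth confirming that the backup is actually complete. The following is a minimal sketch that assumes the layout shown in Section 1; the function name `missing_paths` and the choice of which paths count as required are illustrative, not part of the project.

```python
from pathlib import Path

# Paths taken from the tree in Section 1. Which of them count as
# "required" is an assumption made for this sketch.
REQUIRED_PATHS = (
    "README.md",
    "DECISIONS.md",
    "TECHNICAL.md",
    "RECOVERY.md",
    "requirements.txt",
    "scripts",
    "src/core",
    "src/cli.py",
    "src/gui/app.py",
    "demo/streamlit_app.py",
    "build/pyinstaller.spec",
    "build/launcher.py",
    "ci/build.yml",
)


def missing_paths(project_root: str, required: tuple = REQUIRED_PATHS) -> list:
    """Return the required paths that do not exist under project_root."""
    root = Path(project_root)
    return [rel for rel in required if not (root / rel).exists()]
```

Calling `missing_paths("path/to/unzipped/project")` returns an empty list when everything is present; any entries it does return are the parts that would need to be rebuilt from documentation.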
From documentation only (worst case)
- Read `DECISIONS.md` to understand why the project is what it is. Section 4c locks the GUI framework as Streamlit; Section 4b locks the UX standards. These are non-negotiable.
- Read `TECHNICAL.md` Sections 2-3 for the build pipeline architecture, including the Streamlit launcher pattern in Section 3.4.
- Read `BUSINESS.md` for product strategy, which bundles to build, and the hosted demo as a marketing asset.
- Recreate scripts using the spec in `USER-GUIDE.md` Section 2 (script table), `TECHNICAL.md` Section 7 (per-bundle technical notes), `TECHNICAL.md` Section 9 (boundary between scripts 04 and 06 - do not relitigate this), and `TECHNICAL.md` Section 10 (per-script functional requirements; Section 10.1 is the v1 launch target for the deduplicator).
- Set up the cross-platform build pipeline (Section 3 below).
- Recreate installer configs per `TECHNICAL.md` Section 3.
- Build the constrained `demo/streamlit_app.py` for hosted deployment. Constraints: row limit, watermark, sample data only or strict file-size cap.
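The row-limit and file-size constraints on the hosted demo can be enforced before any processing happens. Below is a minimal sketch using only the standard library. The limits (`MAX_UPLOAD_BYTES`, `MAX_ROWS`) and the names `check_demo_upload` and `DemoLimitError` are hypothetical placeholders, not values or identifiers from the project, and the watermark is left out.

```python
import csv
import io

# Hypothetical limits for the hosted demo; the real values are a product decision.
MAX_UPLOAD_BYTES = 1_000_000  # strict file-size cap
MAX_ROWS = 500                # row limit (data rows, header excluded)


class DemoLimitError(ValueError):
    """Raised when an upload exceeds the hosted demo's size cap."""


def check_demo_upload(raw: bytes, max_bytes: int = MAX_UPLOAD_BYTES,
                      max_rows: int = MAX_ROWS) -> list:
    """Validate a CSV upload; return the header plus at most max_rows data rows."""
    if len(raw) > max_bytes:
        raise DemoLimitError(
            f"File is {len(raw)} bytes; the demo accepts at most {max_bytes}."
        )
    # utf-8-sig tolerates the byte-order mark that Excel often writes.
    rows = list(csv.reader(io.StringIO(raw.decode("utf-8-sig"))))
    if not rows:
        return []
    header, data = rows[0], rows[1:]
    # Truncate rather than reject, so the visitor still sees a result.
    return [header] + data[:max_rows]
```

Truncating oversized row counts instead of rejecting them is a design choice in this sketch: the demo still produces output, while the full result stays behind the paid product.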
3. Local Build Setup (per platform)
All platforms (common)
- Install Python 3.11+.
- `pip install -r requirements.txt pyinstaller`
- Verify Streamlit app runs locally: `streamlit run src/gui/app.py`
- Verify CLI runs locally: `python -m src.cli --help`
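The common prerequisites can be checked in one pass before starting a platform-specific build. This is a rough sketch: the function name `preflight` and the assumption that both `pyinstaller` and `streamlit` should be on `PATH` after the install step are illustrative.

```python
import shutil
import sys
from typing import Callable, Optional

MIN_PYTHON = (3, 11)  # from the "Install Python 3.11+" step
# Commands assumed to be on PATH after the pip install step.
REQUIRED_COMMANDS = ("pyinstaller", "streamlit")


def preflight(
    version: tuple = tuple(sys.version_info[:2]),
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list:
    """Return human-readable problems; an empty list means ready to build."""
    problems = []
    if tuple(version) < MIN_PYTHON:
        problems.append(
            f"Python {version[0]}.{version[1]} found; "
            f"{MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required"
        )
    for cmd in REQUIRED_COMMANDS:
        if which(cmd) is None:
            problems.append(f"'{cmd}' not found on PATH")
    return problems
```

Running `preflight()` with no arguments inspects the current interpreter and `PATH`; the `version` and `which` parameters exist only so the check can be exercised without touching the real environment.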
Windows
- Install Inno Setup: https://jrsoftware.org/isinfo.php
- Build: `pyinstaller build/pyinstaller.spec`
- Wrap in installer: open `build/windows/installer.iss` in Inno Setup, compile.
macOS
- Install Xcode command line tools: `xcode-select --install`
- Enroll in Apple Developer Program ($99/yr). Allow 1-2 weeks first time.
- Generate Developer ID Application certificate, install in Keychain.
- Generate app-specific password for `notarytool`.
- Build: `pyinstaller build/pyinstaller.spec`
- Sign: `codesign --deep --force --options runtime --sign "Developer ID Application: [Name]" dist/BundleName.app`
- Package as DMG.
- Notarize: `xcrun notarytool submit BundleName.dmg --wait`
- Staple: `xcrun stapler staple BundleName.dmg`
Linux
- Install AppImage tooling: download `appimagetool` from https://appimage.github.io
- Build: `pyinstaller build/pyinstaller.spec`
- Wrap as AppImage using `appimagetool` per the assets in `build/linux/AppImage/`.
Streamlit + PyInstaller specific notes
- A custom PyInstaller hook (`hook-streamlit.py`) is required to bundle Streamlit's data files correctly.
- Hidden imports must include `streamlit`, `altair`, `pyarrow` (and their submodules where PyInstaller fails to detect them).
- The launcher script (`build/launcher.py`) is the actual PyInstaller entry point, not the Streamlit script directly.
- Budget 1-3 days the first time getting the Streamlit-PyInstaller spec right; it's reusable across all subsequent bundles.
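The launcher pattern described in these notes could look roughly like the following. This is a sketch under assumptions, not the contents of `build/launcher.py`: the helper names (`find_free_port`, `resolve_app_path`, `streamlit_argv`), the bundled location of `app.py`, and the exact set of Streamlit flags are illustrative. It relies on `streamlit.web.cli`, the module behind the `streamlit run` command in recent Streamlit releases; check this against the installed version.

```python
import socket
import sys
import threading
import webbrowser
from pathlib import Path


def find_free_port() -> int:
    """Ask the OS for an unused TCP port on localhost."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def resolve_app_path() -> Path:
    """Locate app.py both in a PyInstaller bundle and in a source checkout."""
    if hasattr(sys, "_MEIPASS"):
        # PyInstaller unpacks bundled files under sys._MEIPASS.
        base = Path(sys._MEIPASS)
    else:
        base = Path(__file__).resolve().parent.parent  # build/ -> project root
    # Assumes the spec bundles the GUI at the same relative path.
    return base / "src" / "gui" / "app.py"


def streamlit_argv(app_path: Path, port: int) -> list:
    """Build the argv that `streamlit run` would receive."""
    return [
        "streamlit", "run", str(app_path),
        f"--server.port={port}",
        "--server.headless=true",          # the launcher opens the browser itself
        "--global.developmentMode=false",  # commonly needed in frozen builds
    ]


def main() -> None:
    # Imported late so the helpers above work without Streamlit installed.
    from streamlit.web import cli as stcli

    port = find_free_port()
    sys.argv = streamlit_argv(resolve_app_path(), port)
    # Give the server a moment to start, then open the default browser.
    url = f"http://localhost:{port}"
    threading.Timer(2.0, webbrowser.open, args=[url]).start()
    sys.exit(stcli.main())
```

In a real launcher, `main()` would be called under the usual `if __name__ == "__main__":` guard. The fixed two-second delay is the simplest option; polling the port until the server answers would be more robust.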
CI build (recommended)
- Push the repo to GitHub.
- Tag a release: `git tag v1.0.0 && git push --tags`
- GitHub Actions runs the matrix build, produces all three artifacts.
- Manual step: download artifacts from the Releases page, upload to Gumroad / Lemon Squeezy.
Hosted demo deployment (separate from desktop build)
- Connect GitHub repo to Streamlit Community Cloud (one-time, free).
- Configure the deployment to point at `demo/streamlit_app.py`.
- The demo updates automatically on git push to the configured branch.
- Custom domain optional via CNAME (verify Streamlit Community Cloud current policy at recovery time).
4. External Dependencies (re-acquire if lost)
| Item | Source | Cost |
|---|---|---|
| Python | https://python.org/downloads | Free |
| PyInstaller | `pip install pyinstaller` | Free |
| Streamlit | `pip install streamlit` | Free |
| Inno Setup (Windows) | https://jrsoftware.org/isinfo.php | Free |
| Apple Developer Program (macOS signing) | https://developer.apple.com | $99/yr |
| Xcode command line tools (macOS) | `xcode-select --install` | Free |
| appimagetool (Linux) | https://appimage.github.io | Free |
| GitHub Actions (CI) | github.com | Free tier covers all three OS runners |
| Streamlit Community Cloud (demo hosting) | streamlit.io/cloud | Free |
| Python libraries | See `requirements.txt`, `pip install -r requirements.txt` | Free |
5. Backup Recommendation
- Primary backup: GitHub repository (private). Source is the source of truth.
- Secondary backup: ZIP of the full project tree on cloud storage (Google Drive / Dropbox / S3).
- Apple Developer credentials: store certificate + app-specific password in a password manager. Losing them means regenerating both, which is inconvenient but not catastrophic.
- Streamlit Community Cloud connection: stored in Streamlit's UI as a GitHub OAuth link. Re-authorize from a new Streamlit account if lost.
- Back up after every meaningful code or doc change.
- Include this `RECOVERY.md` and `DECISIONS.md` in every backup. They contain the irreplaceable context.
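The secondary ZIP backup can be scripted so that it never silently omits the two files named above. This is a minimal sketch: the function name `make_backup`, the timestamped archive name, and the set of excluded directories are assumptions for illustration.

```python
import zipfile
from datetime import datetime
from pathlib import Path

# Directories assumed safe to leave out of the ZIP; adjust as needed.
EXCLUDED_DIRS = {".git", "__pycache__", "dist", ".venv"}
# Files this guide says must be in every backup.
MUST_INCLUDE = ("RECOVERY.md", "DECISIONS.md")


def make_backup(project_root: str, dest_dir: str) -> Path:
    """Zip the project tree into dest_dir and return the archive path."""
    root = Path(project_root).resolve()
    for name in MUST_INCLUDE:
        if not (root / name).is_file():
            raise FileNotFoundError(
                f"{name} is missing; refusing to create an incomplete backup"
            )

    # Collect files before creating the archive, so the archive cannot
    # end up inside itself if dest_dir lies within the project tree.
    files = [
        p for p in sorted(root.rglob("*"))
        if p.is_file()
        and not EXCLUDED_DIRS.intersection(p.relative_to(root).parts)
    ]

    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    archive = Path(dest_dir) / f"{root.name}-{stamp}.zip"
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in files:
            zf.write(path, path.relative_to(root).as_posix())
    return archive
```

The resulting archive can then be uploaded to whichever cloud storage is in use; the upload itself is outside the scope of this sketch.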
6. Recovery Priorities (if rebuilding under time pressure)
If you only have time to rebuild part of the project, this is the order:
- Source: `src/core/` and `scripts/`. Without these there is no product.
- DECISIONS.md. Without this you will re-litigate every settled decision (especially GUI framework, dual interface, UX standards) and probably get it wrong differently.
- TECHNICAL.md, especially Sections 9 (04/06 boundary) and 10 (per-script functional requirements). Without these you will rebuild the deduplicator with weaker fuzzy matching than the v1 launch spec demands and ship something that loses to free Excel.
- Streamlit GUI source (`src/gui/`). The primary buyer surface; without it the product reverts to CLI-only and the buyer persona will refund.
- PyInstaller spec + launcher + per-OS build configs (`build/`). Reproducing the Streamlit-PyInstaller integration from scratch is 1-3 days of work.
- Apple Developer Program enrollment. 1-2 week lead time. Start this first if Mac distribution matters.
- Hosted demo (`demo/streamlit_app.py`). Important marketing asset but not blocking for desktop sales.
- Documentation files (USER-GUIDE, BUSINESS, README). Recoverable from memory + this guide.
- CI config (`ci/build.yml`). Nice to have, not blocking.