docs: add project documentation files
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
180
docs/RECOVERY.md
Normal file
@@ -0,0 +1,180 @@
# RECOVERY.md - Full Project Recovery Guide

> **Creator-only document. Do not ship to buyers.**

**Version**: 1.6
**Last updated**: April 28, 2026

If the project is ever lost, this guide plus the source ZIP is enough to rebuild it 100%.

---

## 1. What's in the Project

```
project-root/
├── README.md
├── BUSINESS.md                   # Creator only
├── TECHNICAL.md                  # Creator only
├── DECISIONS.md                  # Creator only - locked criteria, rationale, GUI framework decision
├── USER-GUIDE.md                 # Ships to buyers
├── RECOVERY.md                   # Creator only (this file)
│
├── scripts/                      # The 9 .py source files (CLI entry points)
│   ├── 01_deduplicator.py        # Working
│   ├── 02_text_cleaner.py
│   ├── 03_format_standardizer.py
│   ├── 04_missing_value_handler.py
│   ├── 05_column_mapper_enforcer.py
│   ├── 06_outlier_detector.py
│   ├── 07_multi_file_merger.py
│   ├── 08_validator_reporter.py
│   └── 09_master_orchestrator.py
│
├── src/
│   ├── core/                     # Shared business logic - both CLI and GUI call into this
│   ├── cli.py                    # Typer CLI front-end
│   └── gui/                      # Streamlit GUI front-end
│       ├── app.py                # Streamlit entry point
│       ├── pages/                # One Streamlit page per script in the bundle
│       └── components.py         # Shared widgets
│
├── samples/
│   ├── messy_sales.csv
│   └── bank_export.xlsx
│
├── demo/
│   └── streamlit_app.py          # Constrained version for Streamlit Community Cloud
│
├── build/
│   ├── pyinstaller.spec          # Cross-platform build spec (handles GUI launcher + CLI binaries)
│   ├── launcher.py               # Starts local Streamlit server, opens default browser
│   ├── windows/
│   │   └── installer.iss         # Inno Setup wrapper
│   ├── macos/
│   │   ├── entitlements.plist
│   │   └── dmg_settings.py
│   └── linux/
│       └── AppImage/             # AppImage build assets
│
├── ci/
│   └── build.yml                 # GitHub Actions cross-platform build
│
├── tests/
│
└── requirements.txt
```
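
After restoring from a backup, it can help to confirm that the key paths from the tree above actually exist. The sketch below is illustrative only: `EXPECTED_PATHS` and `missing_paths` are hypothetical names, and the list covers a subset of the tree that you should extend to match the real project. Calling `missing_paths("path/to/restored/project")` returns the relative paths that are absent, so an empty list means the layout matches.

```python
from pathlib import Path

# A subset of the paths shown in the tree above (assumed layout; extend as needed).
EXPECTED_PATHS = [
    "README.md",
    "DECISIONS.md",
    "TECHNICAL.md",
    "RECOVERY.md",
    "requirements.txt",
    "scripts/01_deduplicator.py",
    "scripts/09_master_orchestrator.py",
    "src/core",
    "src/cli.py",
    "src/gui/app.py",
    "demo/streamlit_app.py",
    "build/pyinstaller.spec",
    "build/launcher.py",
    "ci/build.yml",
]


def missing_paths(root, expected=EXPECTED_PATHS):
    """Return the expected paths (relative to root) that do not exist."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]
```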

---

## 2. Rebuild Steps

### From a complete ZIP backup
1. Unzip into a clean directory.
2. Push to a GitHub repository.
3. The CI pipeline (`ci/build.yml`) builds Windows, macOS, and Linux artifacts on tagged releases.
4. Connect the repo to Streamlit Community Cloud and point it at `demo/streamlit_app.py` to redeploy the hosted demo.
5. For local builds: see Section 3.
6. Done.
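
Step 1 can be scripted so that a restore never lands on top of stale files. This is a minimal sketch using only the standard library; `extract_backup` is a hypothetical helper, not part of the project.

```python
import zipfile
from pathlib import Path


def extract_backup(zip_path, dest):
    """Extract zip_path into dest, refusing to reuse a non-empty directory."""
    dest = Path(dest)
    if dest.exists() and any(dest.iterdir()):
        raise FileExistsError(f"{dest} is not empty; choose a clean directory")
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        bad = archive.testzip()  # name of the first corrupt member, or None
        if bad is not None:
            raise zipfile.BadZipFile(f"corrupt member in backup: {bad}")
        archive.extractall(dest)
    # Return the restored files as sorted POSIX-style relative paths.
    return sorted(
        p.relative_to(dest).as_posix() for p in dest.rglob("*") if p.is_file()
    )
```

The returned file list can be compared against the tree in Section 1 before pushing to GitHub.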

### From documentation only (worst case)
1. Read `DECISIONS.md` to understand *why* the project is what it is. Section 4c locks the GUI framework as Streamlit; Section 4b locks the UX standards. These are non-negotiable.
2. Read `TECHNICAL.md` Sections 2-3 for the build pipeline architecture, including the Streamlit launcher pattern in Section 3.4.
3. Read `BUSINESS.md` for product strategy, which bundles to build, and the hosted demo as a marketing asset.
4. Recreate the scripts using these specs:
   - `USER-GUIDE.md` Section 2 (script table)
   - `TECHNICAL.md` Section 7 (per-bundle technical notes)
   - `TECHNICAL.md` Section 9 (boundary between scripts 04 and 06 - do not relitigate this)
   - `TECHNICAL.md` Section 10 (per-script functional requirements; Section 10.1 is the v1 launch target for the deduplicator)
5. Set up the cross-platform build pipeline (Section 3 below).
6. Recreate installer configs per `TECHNICAL.md` Section 3.
7. Build the constrained `demo/streamlit_app.py` for hosted deployment. Constraints: a row limit, a watermark, and either sample data only or a strict file-size cap.
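
The demo constraints in step 7 could be enforced with a few small helpers like the ones below. Everything here is an assumption made for illustration: the names `DemoLimits`, `check_upload`, `apply_row_limit`, and `watermark` are hypothetical, and the numeric limits are placeholders rather than the project's real values. In the Streamlit demo, `check_upload` would run on the uploaded file's size before parsing, and `apply_row_limit` on the parsed rows before processing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DemoLimits:
    # Placeholder values, not the project's real limits.
    max_rows: int = 500
    max_upload_bytes: int = 1_000_000


def check_upload(size_bytes, limits=DemoLimits()):
    """Return an error message if the upload is too large, else None."""
    if size_bytes > limits.max_upload_bytes:
        return f"Demo uploads are capped at {limits.max_upload_bytes:,} bytes."
    return None


def apply_row_limit(rows, limits=DemoLimits()):
    """Return (kept_rows, was_truncated) after enforcing the row limit."""
    rows = list(rows)
    return rows[: limits.max_rows], len(rows) > limits.max_rows


def watermark(text, label="DEMO OUTPUT - not for production use"):
    """Prefix exported text with a watermark comment line."""
    return f"# {label}\n{text}"
```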

---

## 3. Local Build Setup (per platform)

### All platforms (common)
- Install Python 3.11+.
- `pip install -r requirements.txt pyinstaller`
- Verify Streamlit app runs locally: `streamlit run src/gui/app.py`
- Verify CLI runs locally: `python -m src.cli --help`
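
Before running the platform-specific steps, a quick preflight check can confirm the common prerequisites above. This is a sketch under stated assumptions: `preflight` is a hypothetical helper, and `REQUIRED_MODULES` lists only the two packages named in this section (the full set lives in `requirements.txt`).

```python
import importlib.util
import sys

MIN_PYTHON = (3, 11)
# Import names, not pip names: "PyInstaller" is the import name of pyinstaller.
REQUIRED_MODULES = ("streamlit", "PyInstaller")


def preflight(version=None, modules=REQUIRED_MODULES):
    """Return a list of human-readable problems; empty means ready to build."""
    version = tuple(version or sys.version_info[:2])
    problems = []
    if version < MIN_PYTHON:
        problems.append(f"Python {version[0]}.{version[1]} found; 3.11+ required")
    for name in modules:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            problems.append(f"module not installed: {name}")
    return problems
```

Calling `preflight()` with no arguments checks the running interpreter; the `version` and `modules` parameters exist mainly so the check can be exercised in tests.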

### Windows
- Install Inno Setup: https://jrsoftware.org/isinfo.php
- Build: `pyinstaller build/pyinstaller.spec`
- Wrap in installer: open `build/windows/installer.iss` in Inno Setup, compile.

### macOS
- Install Xcode command line tools: `xcode-select --install`
- Enroll in Apple Developer Program ($99/yr). Allow 1-2 weeks first time.
- Generate Developer ID Application certificate, install in Keychain.
- Generate app-specific password for `notarytool`.
- Build: `pyinstaller build/pyinstaller.spec`
- Sign: `codesign --deep --force --options runtime --sign "Developer ID Application: [Name]" dist/BundleName.app`
- Package as DMG.
- Notarize: `xcrun notarytool submit BundleName.dmg --wait`, passing credentials either with `--keychain-profile` (saved once via `xcrun notarytool store-credentials`) or with `--apple-id`, `--team-id`, and `--password`.
- Staple: `xcrun stapler staple BundleName.dmg`
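
The sign, notarize, and staple commands above can be assembled in one place so the order and arguments stay consistent between releases. The sketch below only builds the argument lists; `macos_release_commands` and the `keychain_profile` value are hypothetical, and the sketch assumes notarization credentials were saved beforehand with `xcrun notarytool store-credentials`. DMG packaging is left out because it depends on the project's own tooling.

```python
def macos_release_commands(app_path, dmg_path, identity, keychain_profile):
    """Return the signing-related commands as argv lists, keyed by step name."""
    sign = [
        "codesign", "--deep", "--force", "--options", "runtime",
        "--sign", identity, app_path,
    ]
    notarize = [
        "xcrun", "notarytool", "submit", dmg_path,
        "--keychain-profile", keychain_profile, "--wait",
    ]
    staple = ["xcrun", "stapler", "staple", dmg_path]
    return {"sign": sign, "notarize": notarize, "staple": staple}
```

Run `sign` first, package the DMG, then run `notarize` and `staple`, for example with `subprocess.run(cmds["sign"], check=True)` so that a failing step stops the release.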

### Linux
- Install AppImage tooling: download `appimagetool` from https://appimage.github.io
- Build: `pyinstaller build/pyinstaller.spec`
- Wrap as AppImage using `appimagetool` per the assets in `build/linux/AppImage/`.

### Streamlit + PyInstaller specific notes
- A custom PyInstaller hook (`hook-streamlit.py`) is required to bundle Streamlit's data files correctly.
- Hidden imports must include `streamlit`, `altair`, `pyarrow` (and their submodules where PyInstaller fails to detect them).
- The launcher script (`build/launcher.py`) is the actual PyInstaller entry point, not the Streamlit script directly.
- Budget 1-3 days the first time getting the Streamlit-PyInstaller spec right; it's reusable across all subsequent bundles.
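
If `build/launcher.py` itself is lost, the launcher pattern described above (start a local Streamlit server, then open the default browser) might look roughly like the sketch below. It is not the project's actual launcher. The helper names, the two-second browser delay, and the bundled location `src/gui/app.py` are assumptions, and the path in particular depends on how the PyInstaller spec lays out data files. The sketch relies on Streamlit's `streamlit.web.cli` module and on the `sys._MEIPASS` attribute that PyInstaller sets in frozen builds.

```python
import socket
import sys
import threading
import webbrowser
from pathlib import Path


def find_free_port():
    """Ask the OS for an unused local TCP port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]


def bundled_path(relative):
    """Resolve a file inside a PyInstaller bundle, or next to the script otherwise."""
    base = getattr(sys, "_MEIPASS", None) or Path(sys.argv[0]).resolve().parent
    return Path(base) / relative


def streamlit_argv(app_path, port):
    """Build the argv that `streamlit run` would receive from the command line."""
    return [
        "streamlit", "run", str(app_path),
        f"--server.port={port}",
        "--server.headless=true",          # the launcher opens the browser itself
        "--global.developmentMode=false",  # commonly needed in frozen builds
    ]


def main():
    # Imported lazily so the helpers above work without Streamlit installed.
    from streamlit.web import cli as stcli

    port = find_free_port()
    sys.argv = streamlit_argv(bundled_path("src/gui/app.py"), port)
    # Open the browser shortly after the server starts (delay is a guess).
    threading.Timer(2.0, webbrowser.open, args=(f"http://localhost:{port}",)).start()
    sys.exit(stcli.main())
```

In a real launcher, `main()` would be called from an `if __name__ == "__main__":` guard at the bottom of the file.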

### CI build (recommended)
- Push the repo to GitHub. GitHub Actions only runs workflow files located under `.github/workflows/`, so make sure `ci/build.yml` is copied or moved there.
- Tag a release: `git tag v1.0.0 && git push --tags`
- GitHub Actions runs the matrix build and produces all three artifacts.
- Manual step: download artifacts from the Releases page, upload to Gumroad / Lemon Squeezy.

### Hosted demo deployment (separate from desktop build)
- Connect GitHub repo to Streamlit Community Cloud (one-time, free).
- Configure the deployment to point at `demo/streamlit_app.py`.
- The demo updates automatically on git push to the configured branch.
- Custom domain optional via CNAME (verify Streamlit Community Cloud current policy at recovery time).

---

## 4. External Dependencies (re-acquire if lost)

| Item | Source | Cost |
|---|---|---|
| Python | https://python.org/downloads | Free |
| PyInstaller | `pip install pyinstaller` | Free |
| Streamlit | `pip install streamlit` | Free |
| Inno Setup (Windows) | https://jrsoftware.org/isinfo.php | Free |
| Apple Developer Program (macOS signing) | https://developer.apple.com | $99/yr |
| Xcode command line tools (macOS) | `xcode-select --install` | Free |
| appimagetool (Linux) | https://appimage.github.io | Free |
| GitHub Actions (CI) | github.com | Free tier covers all three OS runners (private repos draw on a monthly minutes quota; verify current limits) |
| Streamlit Community Cloud (demo hosting) | streamlit.io/cloud | Free |
| Python libraries | See `requirements.txt`, `pip install -r requirements.txt` | Free |

---

## 5. Backup Recommendation

- **Primary backup**: GitHub repository (private). Source is the source of truth.
- **Secondary backup**: ZIP of the full project tree on cloud storage (Google Drive / Dropbox / S3).
- **Apple Developer credentials**: store certificate + app-specific password in a password manager. Losing these means regenerating them, which is tedious but not catastrophic.
- **Streamlit Community Cloud connection**: stored in Streamlit's UI as a GitHub OAuth link. Re-authorize from a new Streamlit account if lost.
- Back up after every meaningful code or doc change.
- Include this `RECOVERY.md` and `DECISIONS.md` in every backup. They contain the irreplaceable context.
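
The secondary ZIP backup can be produced with a short script, which makes "back up after every meaningful change" cheap to follow. This is a minimal sketch: `make_backup` is a hypothetical helper and the archive naming scheme is an assumption. It refuses to run if `RECOVERY.md` or `DECISIONS.md` is missing, in line with the last bullet above. Keep `backup_dir` outside the project tree so the archive does not end up inside itself, then upload the resulting file to cloud storage.

```python
import shutil
from datetime import datetime
from pathlib import Path


def make_backup(project_root, backup_dir, now=None):
    """Zip the whole project tree into backup_dir and return the archive path."""
    project_root = Path(project_root).resolve()
    backup_dir = Path(backup_dir).resolve()
    for required in ("RECOVERY.md", "DECISIONS.md"):
        if not (project_root / required).is_file():
            raise FileNotFoundError(f"{required} missing; backup would be incomplete")
    backup_dir.mkdir(parents=True, exist_ok=True)
    # Timestamped name so earlier backups are never overwritten.
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    base = backup_dir / f"{project_root.name}-{stamp}"
    return Path(shutil.make_archive(str(base), "zip", root_dir=project_root))
```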

---

## 6. Recovery Priorities (if rebuilding under time pressure)

If you only have time to rebuild part of the project, this is the order:

1. **Source: `src/core/` and `scripts/`**. Without these there is no product.
2. **DECISIONS.md**. Without this you will re-litigate every settled decision (especially GUI framework, dual interface, UX standards) and probably get it wrong differently.
3. **TECHNICAL.md**, especially Sections 9 (04/06 boundary) and 10 (per-script functional requirements). Without these you will rebuild the deduplicator with weaker fuzzy matching than the v1 launch spec demands and ship something that loses to free Excel.
4. **Streamlit GUI source (`src/gui/`)**. The primary buyer surface; without it the product reverts to CLI-only and the buyer persona will refund.
5. **PyInstaller spec + launcher + per-OS build configs** (`build/`). Reproducing the Streamlit-PyInstaller integration from scratch is 1-3 days of work.
6. **Apple Developer Program enrollment**. Enrollment has a 1-2 week lead time, so if Mac distribution matters, start it on day one in parallel with the items above.
7. **Hosted demo (`demo/streamlit_app.py`)**. Important marketing asset but not blocking for desktop sales.
8. Documentation files (USER-GUIDE, BUSINESS, README). Recoverable from memory + this guide.
9. CI config (`ci/build.yml`). Nice to have, not blocking.