Reported: every page renders empty in the main body even after the audit-log defensive-wrap commit (59c6d0f). The Close button also doesn't trigger shutdown; that page is blank too. Sidebar nav still renders, so the chrome path that runs on every page is the suspect.

Three chrome additions landed together and are temporarily turned off so the user can see whether bare chrome restores rendering:

1. **Sticky footer (``render_sticky_footer``)**: short-circuited with ``return`` at the top of the function. The CSS-injection + components-html iframe mechanic is the highest-suspicion item: if the iframe script throws or the CSS interacts badly with the user's Streamlit / Python build, the side effects can be page-killing on theirs while invisible on ours. The original body is preserved as ``_render_sticky_footer_DISABLED`` so re-enabling is a one-line change.
2. **Diagnostics sidebar (``_render_diagnostics_sidebar``)**: call site in ``hide_streamlit_chrome`` is gated by ``if False:``. Wrapping in try/except (the previous commit) caught exceptions but didn't help: silent partial renders inside ``with st.sidebar: with st.expander: ...`` can still leave the render stack in a bad state on some Streamlit versions.
3. **Compact-spacing CSS layer**: the ``gap: 0.5rem !important;`` on ``stVerticalBlock`` / ``stHorizontalBlock``, the slim heading margins, and the slim hr / caption / expander / button / metric rules are all stripped back to the pre-compact ``_HIDE_CHROME_CSS``. The ``gap`` rule in particular is a suspect: if the user's Streamlit version doesn't render stVerticalBlock as a flex container, the rule is harmless; if it does and interacts badly with overflow, content could be clipped.

What's deliberately KEPT enabled:

- The audit-log calls (already wrapped in 59c6d0f).
- ``log_page_open`` calls in tool pages (already wrapped internally).
- All UI changes pre-compact (the unified tool-page layout, the download-button helper, etc.).

If pages render after this commit, we know it's one of the three disabled items above and can bisect further. If they still don't render, the cause is in code that pre-dated the audit-log work and the bisection has to keep going.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
🌐 Language: English · Español
DataTools
Local CSV / Excel cleaning. CLI + browser GUI, no cloud, no install ceremony. GUI ships with English and Spanish language packs.
Tools
| # | Tool | Status |
|---|---|---|
| 01 | Find Duplicates — exact + fuzzy match, 5 normalizers, survivor rules, audit | Ready |
| 02 | Clean Text — whitespace, smart chars, BOM, line endings, case ops | Ready |
| 03 | Standardize Formats — dates, phones, emails, addresses, names, currencies, booleans | Ready |
| 04 | Fix Missing Values — disguised-null detection, profile, mean/median/mode/ffill/bfill/interpolate, drop strategies | Ready |
| 05 | Map Columns — fuzzy auto-rename, target schema with type coercion, required fields with defaults, drop/reorder | Ready |
| 06 | Find Unusual Values | Coming Soon |
| 07 | Combine Files | Coming Soon |
| 08 | Quality Check | Coming Soon |
| 09 | Automated Workflows — chain tools with recommended (not forced) order, save/load JSON, automate weekly cleanups | Ready |
Download (non-technical users)
Pre-built installers — no Python required:
| Platform | Download | First-launch note |
|---|---|---|
| macOS | DataTools-X.Y.Z-mac.dmg | Drag DataTools.app into /Applications, then double-click. |
| Windows | DataTools-X.Y.Z-win-setup.exe | Run the installer; launches from Start Menu. |
| Linux | DataTools-X.Y.Z-linux-x86_64.AppImage | chmod +x the file, then double-click. |
Latest release: see GitHub Releases (or the Gumroad listing). The installers are ~150–200 MB; the launcher boots a local server at http://127.0.0.1:8501 and opens your browser. Nothing is sent to the cloud.
Install from source (developers)
pip install -r requirements.txt
Python 3.10+ required.
Run
GUI (recommended):
streamlit run src/gui/app.py
CLI — seven entry points:
python -m src.cli customers.csv [--apply] # dedup
python -m src.cli_text_clean messy.csv [--apply] # text clean
python -m src.cli_format intl.csv [--apply] # format standardize (auto-streams >100 MB)
python -m src.cli_missing holes.csv [--apply] # missing values
python -m src.cli_column_map vendor.csv [--apply] # column mapper
python -m src.cli_pipeline any_file.csv [--apply] # chain tools end-to-end
python -m src.cli_analyze any_file.csv [--json] # scan only
Every CLI runs preview-only by default; add --apply to write output.
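The preview-by-default behaviour can be sketched as follows. This is an illustration only, not the project's code: it uses the standard-library `argparse` instead of `typer` to stay dependency-free, the tool name `demo` and the function names are hypothetical, and the "cleaning" step is a plain byte copy.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: one positional input plus an opt-in --apply flag.
    parser = argparse.ArgumentParser(description="Preview-by-default sketch")
    parser.add_argument("input", type=Path, help="CSV file to inspect")
    parser.add_argument(
        "--apply",
        action="store_true",
        help="write the output file instead of only previewing",
    )
    return parser


def run(argv: list[str]) -> str:
    args = build_parser().parse_args(argv)
    # Output lands next to the input; the input itself is never opened for writing.
    output = args.input.with_name(f"{args.input.stem}_demo.csv")
    if not args.apply:
        # Preview mode touches nothing on disk.
        return f"preview: would write {output.name}"
    # Placeholder for real cleaning logic: copy the bytes unchanged.
    output.write_bytes(args.input.read_bytes())
    return f"wrote {output.name}"
```

Calling `run(["customers.csv"])` only reports what would be written; `run(["customers.csv", "--apply"])` creates `customers_demo.csv` beside the input.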
Language
The GUI sidebar has a language picker. Packs ship for English and Español (src/i18n/packs/); the choice persists for the session. Adding a language: drop a <code>.json next to en.json mirroring its key tree, then list it in LANGUAGES. See Developer Guide §i18n.
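Before listing a new pack, it can help to confirm that it really mirrors the key tree of en.json. The sketch below assumes packs are nested JSON objects whose leaves are strings; the keys shown (`nav.home`, `upload.title`, and so on) and the function names are made up for illustration and are not taken from the shipped packs.

```python
import json


def key_paths(tree: dict, prefix: str = "") -> set[str]:
    """Collect dotted paths to every leaf value in a nested dict."""
    paths: set[str] = set()
    for key, value in tree.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            paths |= key_paths(value, path)
        else:
            paths.add(path)
    return paths


def compare_packs(reference: dict, candidate: dict) -> dict[str, list[str]]:
    """Report keys the candidate pack lacks or adds relative to the reference."""
    ref, cand = key_paths(reference), key_paths(candidate)
    return {"missing": sorted(ref - cand), "extra": sorted(cand - ref)}


# Hypothetical pack contents, inlined so the example is self-contained.
en = json.loads('{"nav": {"home": "Home", "close": "Close"}, "upload": {"title": "Upload a file"}}')
fr = json.loads('{"nav": {"home": "Accueil"}, "upload": {"title": "Importer un fichier", "hint": "CSV"}}')

report = compare_packs(en, fr)
print(report)
```

An empty `missing` list and an empty `extra` list mean the new pack has the same shape as the reference.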
Review & Normalize gate
Every uploaded file passes through a CSV-normalization gate before any tool sees it. The analyzer flags ~15 issue types (whitespace, NBSP / zero-width chars, BOM, encoding, smart punct, dirty headers, null sentinels, mojibake, …) tagged by confidence (high / medium / low) and fix action. The GUI shows each finding with Auto-fix / Skip / Customize, a live before/after preview, and an encoding-override picker. Tool pages refuse to load until the gate passes.
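To make the idea of tagged findings concrete, here is a minimal sketch of a scanner covering four of the issue types named above (BOM, NBSP, zero-width characters, trailing whitespace). The `Finding` class, the issue labels, and the confidence assigned to each check are assumptions for illustration; the real analyzer's data model and rules may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    issue: str       # short label for the issue type
    confidence: str  # "high", "medium", or "low"
    count: int       # occurrences (or affected lines)


# Zero-width space, non-joiner, joiner, and word joiner.
ZERO_WIDTH = ("\u200b", "\u200c", "\u200d", "\u2060")


def scan_text(text: str) -> list[Finding]:
    findings: list[Finding] = []
    # A leading U+FEFF is a byte-order mark left over from decoding.
    if text.startswith("\ufeff"):
        findings.append(Finding("bom", "high", 1))
    nbsp = text.count("\u00a0")
    if nbsp:
        findings.append(Finding("nbsp", "high", nbsp))
    zero_width = sum(text.count(ch) for ch in ZERO_WIDTH)
    if zero_width:
        findings.append(Finding("zero_width", "high", zero_width))
    # Count lines that end in a space or tab.
    trailing = sum(1 for line in text.splitlines() if line != line.rstrip(" \t"))
    if trailing:
        findings.append(Finding("trailing_whitespace", "medium", trailing))
    return findings


sample = "\ufeffname,city\nAda\u00a0L,Paris \nBob\u200b,Rome\n"
for finding in scan_text(sample):
    print(finding)
```

Each finding carries enough information for a UI to offer a per-issue choice such as Auto-fix or Skip.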
Output
Every run writes:
- {input}_<tool>.csv — the cleaned data
- {input}_changes.csv (text cleaner) or {input}_match_groups.csv (dedup) — audit trail
- logs/<tool>_YYYYMMDD_HHMMSS.log — debug-level run log
Original input file is never modified.
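The naming scheme can be sketched as a small helper. It assumes that `{input}` means the input file's stem, that cleaned output is written beside the input, and that `logs/` is relative to the working directory; none of these details are confirmed here, and `output_paths` and the tool name `dedup` are hypothetical.

```python
from datetime import datetime
from pathlib import Path


def output_paths(input_path: Path, tool: str, now: datetime) -> dict[str, Path]:
    """Derive output locations without ever pointing at the input file itself."""
    stem = input_path.stem
    return {
        # {input}_<tool>.csv, placed next to the input (assumption).
        "cleaned": input_path.parent / f"{stem}_{tool}.csv",
        # logs/<tool>_YYYYMMDD_HHMMSS.log
        "log": Path("logs") / f"{tool}_{now:%Y%m%d_%H%M%S}.log",
    }


paths = output_paths(Path("data/customers.csv"), "dedup", datetime(2024, 1, 2, 3, 4, 5))
print(paths["cleaned"])
print(paths["log"])
```

Because the tool suffix is always appended to the stem, the cleaned path can never collide with the original file.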
Docs
- User Guide — install, GUI workflow, gate
- CLI Reference — every flag with recipes
- Requirements — file sizes, encodings, detectors, perf targets
- Technical — architecture, gate internals, fix registry
- Developer Guide — adding fixes / detectors / standardizers
Dependencies
pandas, openpyxl, rapidfuzz, phonenumbers, typer, loguru, charset-normalizer, streamlit. Optional: ftfy for mojibake repair.
License
Proprietary.