User hit ``AttributeError: module 'streamlit.elements.image' has no attribute 'image_to_url'`` on first PDF import.

Root cause: ``streamlit-drawable-canvas`` 0.9.3 (last upstream release 2023) calls a Streamlit internal that was relocated in Streamlit ~1.30+. The function moved from ``streamlit.elements.image`` to ``streamlit.elements.lib.image_utils`` AND its signature changed: the second positional argument is now a ``LayoutConfig`` dataclass instead of a plain ``int`` width.

Three remedies considered:

1. Downgrade Streamlit. Reverses unrelated improvements + security fixes; not on the table.
2. Fork drawable-canvas. The maintenance hit isn't worth it for a one-line internal API change.
3. **Ship a compatibility shim.** Re-attach a wrapper at the old import path that adapts the old call shape to the new function. This is the standard workaround the wider Streamlit community has converged on for this exact regression.

``src/gui/_drawable_canvas_compat.py`` does (3). The ``install()`` helper is idempotent, opt-in (not auto-run at module import; a grep for ``_install_canvas_compat`` shows every call site), and no-ops if Streamlit hasn't moved the function OR if the new function isn't where we expect (lets the canvas surface a real error rather than papering over a different bug).

The page calls ``_install_canvas_compat()`` once at module top before any ``st_canvas`` invocation; Streamlit's script-rerun model means this fires every page load but the ``_PATCHED`` guard makes re-runs free.

The shim wraps the old ``width=int`` arg into a default-constructed ``LayoutConfig()``. The old ``width=-1`` sentinel meant "use the image's natural width", which is also what an unconfigured LayoutConfig produces. Confirmed by inspecting Streamlit 1.57.0's ``image_utils.py``.

4 new tests pin the shim contract:

- ``install()`` attaches ``image_to_url`` to the old path on modern Streamlit
- Idempotent: calling twice doesn't double-wrap
- Doesn't clobber a future Streamlit that restores the original at the old path
- Translates ``(image, -1, False, "RGB", "PNG", "id")`` into a proper call to the new function with a ``LayoutConfig`` instance

If a future Streamlit upgrade moves ``image_to_url`` AGAIN, the shim's silent-no-op fallback means the canvas error surfaces again and points at where to look. The shim doesn't paper over mysteries; it only patches the one specific relocation we know about.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
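The patching pattern described in this commit can be sketched without importing Streamlit at all. This is a minimal illustration, not the contents of ``src/gui/_drawable_canvas_compat.py``: ``old_module`` and ``new_module`` are hypothetical stand-ins for the two Streamlit modules, the ``LayoutConfig`` class here is a placeholder whose field is an assumption, and the fake ``image_to_url`` simply echoes its arguments so the translation is visible.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class LayoutConfig:
    # Placeholder for Streamlit's dataclass; this field is an assumption.
    width: object = None


def _new_image_to_url(image, layout_config, clamp, channels, output_format, image_id):
    # Stand-in for the relocated function: echo the arguments back.
    return (image, layout_config, clamp, channels, output_format, image_id)


# Hypothetical stand-ins for the old and new Streamlit module objects.
old_module = SimpleNamespace()
new_module = SimpleNamespace(image_to_url=_new_image_to_url)

_PATCHED = False


def install(old=old_module, new=new_module) -> bool:
    """Attach a wrapper at the old path; return True only if a patch was applied."""
    global _PATCHED
    if _PATCHED:
        return False  # idempotent: never wrap twice
    if hasattr(old, "image_to_url"):
        return False  # the original is (still or again) at the old path
    target = getattr(new, "image_to_url", None)
    if target is None:
        return False  # function moved somewhere unexpected: let the real error surface

    def image_to_url(image, width, clamp, channels, output_format, image_id):
        # The old int width is dropped in favour of a default LayoutConfig.
        return target(image, LayoutConfig(), clamp, channels, output_format, image_id)

    old.image_to_url = image_to_url
    _PATCHED = True
    return True


first = install()
second = install()
result = old_module.image_to_url("img", -1, False, "RGB", "PNG", "id")
```

In this sketch ``first`` is ``True``, ``second`` is ``False``, and ``result`` carries a ``LayoutConfig`` instance in the second position where the caller passed ``-1``.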
🌐 Language: English · Español
DataTools
Local CSV / Excel cleaning. CLI + browser GUI, no cloud, no install ceremony. GUI ships with English and Spanish language packs.
Tools
| # | Tool | Status |
|---|---|---|
| 01 | Find Duplicates — exact + fuzzy match, 5 normalizers, survivor rules, audit | Ready |
| 02 | Clean Text — whitespace, smart chars, BOM, line endings, case ops | Ready |
| 03 | Standardize Formats — dates, phones, emails, addresses, names, currencies, booleans | Ready |
| 04 | Fix Missing Values — disguised-null detection, profile, mean/median/mode/ffill/bfill/interpolate, drop strategies | Ready |
| 05 | Map Columns — fuzzy auto-rename, target schema with type coercion, required fields with defaults, drop/reorder | Ready |
| 06 | Find Unusual Values | Coming Soon |
| 07 | Combine Files | Coming Soon |
| 08 | Quality Check | Coming Soon |
| 09 | Automated Workflows — chain tools with recommended (not forced) order, save/load JSON, automate weekly cleanups | Ready |
Download (non-technical users)
Pre-built installers — no Python required:
| Platform | Download | First-launch note |
|---|---|---|
| macOS | DataTools-X.Y.Z-mac.dmg | Drag DataTools.app into /Applications, then double-click. |
| Windows | DataTools-X.Y.Z-win-setup.exe | Run the installer; launches from Start Menu. |
| Linux | DataTools-X.Y.Z-linux-x86_64.AppImage | chmod +x the file, then double-click. |
Latest release: see GitHub Releases (or the Gumroad listing). The installers are ~150–200 MB; the launcher boots a local server at http://127.0.0.1:8501 and opens your browser. Nothing is sent to the cloud.
Install from source (developers)
pip install -r requirements.txt
Python 3.10+ required.
Run
GUI (recommended):
streamlit run src/gui/app.py
CLI — seven entry points:
python -m src.cli customers.csv [--apply] # dedup
python -m src.cli_text_clean messy.csv [--apply] # text clean
python -m src.cli_format intl.csv [--apply] # format standardize (auto-streams >100 MB)
python -m src.cli_missing holes.csv [--apply] # missing values
python -m src.cli_column_map vendor.csv [--apply] # column mapper
python -m src.cli_pipeline any_file.csv [--apply] # chain tools end-to-end
python -m src.cli_analyze any_file.csv [--json] # scan only
Every CLI that accepts --apply runs preview-only by default; add --apply to write output. cli_analyze only scans and never writes cleaned data.
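The preview-by-default convention can be sketched with a small standard-library `argparse` parser. This is an illustrative assumption, not the project's code (the dependency list suggests the real CLIs are built on typer); `build_parser` and `run` are hypothetical names.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="cli_sketch")
    parser.add_argument("input_file")
    # store_true means the flag is False unless given, so preview is the default.
    parser.add_argument("--apply", action="store_true",
                        help="write output; without it, only preview")
    return parser


def run(argv: list[str]) -> str:
    args = build_parser().parse_args(argv)
    mode = "apply" if args.apply else "preview"
    return f"{mode}: {args.input_file}"
```

Making the destructive path opt-in means a mistyped or incomplete command can only ever produce a preview.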
Language
The GUI sidebar has a language picker. Packs ship for English and Español (src/i18n/packs/); the choice persists for the session. Adding a language: drop a <code>.json next to en.json mirroring its key tree, then list it in LANGUAGES. See Developer Guide §i18n.
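A new pack is only useful if it mirrors the key tree of en.json, and a check along these lines can catch drift before release. This is a sketch under assumptions: `key_paths` and `pack_diff` are hypothetical helpers, and the sample keys are invented for illustration rather than taken from the real packs.

```python
def key_paths(tree: dict, prefix: str = "") -> set[str]:
    """Flatten a nested dict into dotted key paths for its leaf values."""
    paths: set[str] = set()
    for key, value in tree.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            paths |= key_paths(value, path)
        else:
            paths.add(path)
    return paths


def pack_diff(reference: dict, candidate: dict) -> tuple[set[str], set[str]]:
    """Return (keys missing from candidate, keys candidate adds)."""
    ref, cand = key_paths(reference), key_paths(candidate)
    return ref - cand, cand - ref


# Invented sample packs; in practice these would come from json.load().
en = {"sidebar": {"language": "Language"},
      "gate": {"skip": "Skip", "auto_fix": "Auto-fix"}}
es = {"sidebar": {"language": "Idioma"},
      "gate": {"skip": "Omitir"}}

missing, extra = pack_diff(en, es)
```

Here `missing` contains `"gate.auto_fix"` and `extra` is empty, so the Spanish sample lacks one translation.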
Review & Normalize gate
Every uploaded file passes through a CSV-normalization gate before any tool sees it. The analyzer flags ~15 issue types (whitespace, NBSP / zero-width chars, BOM, encoding, smart punct, dirty headers, null sentinels, mojibake, …) tagged by confidence (high / medium / low) and fix action. The GUI shows each finding with Auto-fix / Skip / Customize, a live before/after preview, and an encoding-override picker. Tool pages refuse to load until the gate passes.
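A few of the issue types listed above can be detected with plain string checks. The following is a minimal sketch, not the project's analyzer: the `Finding` shape, the issue names, the fix-action names, and the confidence assigned to each check are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    issue: str
    confidence: str  # "high" / "medium" / "low"
    fix: str         # hypothetical fix-action name


ZERO_WIDTH = {"\u200b", "\u200c", "\u200d"}


def analyze(text: str) -> list[Finding]:
    findings: list[Finding] = []
    if text.startswith("\ufeff"):
        findings.append(Finding("bom", "high", "strip_bom"))
    if "\u00a0" in text:
        findings.append(Finding("nbsp", "high", "replace_with_space"))
    if any(ch in ZERO_WIDTH for ch in text):
        findings.append(Finding("zero_width", "high", "remove"))
    # Trailing spaces or tabs on any line; rated lower because they may be intended.
    if any(line != line.rstrip(" \t") for line in text.splitlines()):
        findings.append(Finding("trailing_whitespace", "medium", "strip"))
    return findings


sample = "\ufeffname,city\nAna\u00a0Maria,Lima \n"
issues = [f.issue for f in analyze(sample)]
```

For this sample, `issues` is `["bom", "nbsp", "trailing_whitespace"]`; a gate built on such findings could block tool pages until each one is fixed or explicitly skipped.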
Output
Every run writes:
- {input}_<tool>.csv: the cleaned data
- {input}_changes.csv (text cleaner) or {input}_match_groups.csv (dedup): audit trail
- logs/<tool>_YYYYMMDD_HHMMSS.log: debug-level run log
Original input file is never modified.
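The naming scheme can be sketched with `pathlib`. This is an illustration under assumptions: `output_paths` is a hypothetical helper, `{input}` is taken to mean the input file's stem, outputs are assumed to sit next to the input, and "dedup" is used as an example tool name.

```python
from datetime import datetime
from pathlib import Path


def output_paths(input_file: str, tool: str, now: datetime) -> dict[str, Path]:
    src = Path(input_file)
    stamp = now.strftime("%Y%m%d_%H%M%S")
    return {
        # A new file beside the input; the input path itself is never a target.
        "cleaned": src.with_name(f"{src.stem}_{tool}.csv"),
        "log": Path("logs") / f"{tool}_{stamp}.log",
    }


paths = output_paths("data/customers.csv", "dedup", datetime(2025, 1, 2, 3, 4, 5))
```

With these inputs, `paths["cleaned"]` is `data/customers_dedup.csv` and `paths["log"]` is `logs/dedup_20250102_030405.log`. Because every derived name carries a suffix, none of them can collide with the original file.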
Docs
- User Guide — install, GUI workflow, gate
- CLI Reference — every flag with recipes
- Requirements — file sizes, encodings, detectors, perf targets
- Technical — architecture, gate internals, fix registry
- Developer Guide — adding fixes / detectors / standardizers
Dependencies
pandas, openpyxl, rapidfuzz, phonenumbers, typer, loguru, charset-normalizer, streamlit. Optional: ftfy for mojibake repair.
License
Proprietary.