docs(i18n): document language packs across user, dev, and marketing docs

README + USER-GUIDE describe the sidebar picker and current coverage
(home + shared chrome, per-tool bodies pending). DEVELOPER gains a
how-to for adding packs and keys with the parity-test guarantee.
TECHNICAL §10b records the in-house-JSON architecture and locks in the
no-gettext decision (also logged in DECISIONS). REQUIREMENTS reflects
the new interface surface and updated test count. COPY.md adds a
"Language claim" slot so landing/email work can pick it up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-05-13 15:16:24 +00:00
parent c4ce86bd64
commit 38011872e1
7 changed files with 73 additions and 4 deletions

View File

@@ -38,6 +38,9 @@ src/
app.py # Streamlit entry point
pages/ # One page per tool
components/ # shared, dedup_review, findings, gate, _legacy
i18n/ # GUI language packs (JSON-backed, in-house lookup)
__init__.py # t() · current_language() · render_language_selector()
packs/ # en.json, es.json, … (one file per language)
build/ # PyInstaller spec, launcher, OS-specific configs
demo/ # Constrained Streamlit Community Cloud version
tests/ # pytest; targets core/, not UI
@@ -220,6 +223,22 @@ Deliberately separate. Confluent original spec was wrong.
- `-999` sentinel — 04 converts to `NaN` first; 06 then computes stats.
- Suspicious-but-plausible (age 110) — 06 territory.
## 10b. GUI internationalization (i18n)
The GUI uses an in-house, JSON-backed translation layer at `src/i18n/`. **No** `gettext` / `babel` / `.po` pipeline — the surface is small enough that a 100-line module + per-language JSON file is a better fit than a build-time toolchain.
**Resolution model**: `t(key, lang=None, **fmt)` walks a dotted key (`home.title`, `tools.01_deduplicator.name`) through a nested dict. Fallback chain: requested lang → English (canonical) → the literal key. Missing format placeholders return the raw template rather than raising so a translation file cannot crash the UI.
**Active language** is stored in `st.session_state["ui_lang"]`. Reading it outside a Streamlit run (tests, scripts) silently falls back to English, keeping the module importable without Streamlit context.
**Picker placement**: `hide_streamlit_chrome()` calls `render_language_selector()` on every page that hides Streamlit's default chrome — i.e., the entire app. One mount point, every page picks it up.
**Pack parity** is a tested invariant: `tests/test_lang_packs.py::TestPackParity` fails CI when `en.json` and another pack diverge in either direction. This catches translation drift at PR time rather than from buyer reports.
**Farewell overlay**: the shutdown screen's JS payload interpolates pack strings into an `innerHTML` inside a JS single-quoted string. `_js_html_safe()` in `components/_legacy.py` escapes both the JS string terminator (`'`) and HTML special chars (`< > &`). The test `TestFarewellEscape` pins this; never bypass it.
**Why not gettext**: zero compiled artifacts in the PyInstaller bundle, no build step before tests run, no `.po`/`.mo` round-trip for translators (anyone can edit JSON), and the same lookup works in unit tests without process state. Locked in because the surface won't grow large enough to need the alternative, and the alternative breaks the "drop a file, run pytest, ship" loop.
## 11. Per-script functional specs
Specs live in this section as scripts enter active build. Each follows the Tier 1/2/3 structure with explicit strategic framing (what's the market gap given some of this is free elsewhere).