Files
datatools-dev/src/gui/pages/7_Multi_File_Merger.py
Michael ac94208d8f chore: production-readiness sweep on the help-popover wave
- Drop unused 'from src.i18n import t' from pages 1-9 (the swap to
  render_tool_header(tool_id) means no page calls t() directly anymore).
  Pages 10, 11 and the underscore-prefixed pages were already clean or
  legitimately use t().

- Rewrite PDF Extractor help_md (en + es). The original prose described
  features the tool does NOT have — template drawing, per-source saved
  templates, automatic reuse. The actual tool is a heuristic batch
  scanner (per its own docstring: "No templates, no per-bank
  configuration"). New copy: scan → uncheck → pick date format → enable
  OCR if needed → download. Spanish version tagged with
  '<!-- TODO: review Spanish -->' since the prose is best-effort.

- Document why both stSidebarNavSectionHeader (legacy, streamlit~=1.35)
  and stNavSectionHeader (current, 1.57) testids appear in the chrome
  CSS — requirements floor is streamlit>=1.35,<2 so dropping the legacy
  selector would silently break the lower bound.

- Pin the t()-returns-key-on-miss contract that render_tool_header's
  fallback path depends on, with a comment at the call site.

- Pin the demo's intentional skip of hide_streamlit_chrome (so the
  +/- sidebar swap JS doesn't ever try to load there) with a load-
  bearing comment in app_demo.py.

- Confirmed i18n parity: every tool id has page_title / page_caption /
  description / name / help_md in BOTH packs; help.button_label and
  help.missing_body in both.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-02 18:07:33 +00:00

97 lines
3.2 KiB
Python

"""DataTools Combine Files — stub page."""
from __future__ import annotations
import sys
from pathlib import Path
import streamlit as st
_project_root = Path(__file__).resolve().parent.parent.parent.parent
if str(_project_root) not in sys.path:
sys.path.insert(0, str(_project_root))
from src.gui.components import (
back_to_home_link,
render_sticky_footer,
render_tool_header,
hide_streamlit_chrome,
require_feature_or_render_upgrade,
)
from src.license import FeatureFlag
hide_streamlit_chrome()
render_sticky_footer()
back_to_home_link()
from src.audit import log_page_open
log_page_open("7_Multi_File_Merger")
require_feature_or_render_upgrade(FeatureFlag.MULTI_FILE_MERGER)
# ---------------------------------------------------------------------------
# Header
# ---------------------------------------------------------------------------
render_tool_header("07_multi_file_merger")
st.info("This tool is under development.")
# ---------------------------------------------------------------------------
# What this tool will do
# ---------------------------------------------------------------------------
st.markdown("""
**Features:**
- Import multiple CSV/Excel files at once
- Automatic schema alignment (matching columns by name)
- Append mode: stack files vertically (union)
- Join mode: merge files on shared key columns
- Handle mismatched columns (fill missing with nulls or drop)
- Source file tracking column
""")
st.divider()
# ---------------------------------------------------------------------------
# Multi-file upload (functional)
# ---------------------------------------------------------------------------
uploaded_files = st.file_uploader(
"Import CSV or Excel files",
type=["csv", "tsv", "xlsx", "xls"],
accept_multiple_files=True,
help="Import multiple files to preview. Processing is not yet available.",
key="merger_file_upload",
)
if uploaded_files:
import pandas as pd
for f in uploaded_files:
try:
if f.name.endswith((".xlsx", ".xls")):
df = pd.read_excel(f)
else:
df = pd.read_csv(f)
st.subheader(f"Preview: {f.name}")
st.caption(f"{len(df)} rows, {len(df.columns)} columns — Columns: {', '.join(df.columns[:10])}{'...' if len(df.columns) > 10 else ''}")
st.dataframe(df.head(5), width="stretch")
except Exception as e:
from src.core.errors import format_for_user
st.error(
f"**Could not read `{f.name}`**\n\n"
f"```\n{format_for_user(e)}\n```"
)
# ---------------------------------------------------------------------------
# Placeholder options
# ---------------------------------------------------------------------------
st.subheader("Merge Strategy")
st.selectbox("Mode", ["Append (stack vertically)", "Join on key columns", "Schema alignment (smart merge)"], disabled=True)
st.selectbox("Mismatched columns", ["Fill with null", "Drop non-shared columns", "Error"], disabled=True)
st.checkbox("Add source filename column", value=True, disabled=True)
st.divider()
st.button("Merge Files", type="primary", width="stretch", disabled=True)