docs: update all documentation to reflect v3.0 functionality

Update README, CLI reference, and developer guide to cover delimiter selector, inline checkboxes/dropdowns, live surviving rows preview, multi-row survivors, and apply_review_decisions(). Remove dead link.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -10,7 +10,7 @@ python -m src.cli INPUT_FILE [OPTIONS]
 
 | Argument | Required | Description |
 |----------|----------|-------------|
-| `INPUT_FILE` | Yes | Path to the CSV or Excel file to deduplicate |
+| `INPUT_FILE` | Yes | Path to the CSV, delimited text, or Excel file to deduplicate |
 
 ## Options
 
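Reviewer note: the "delimited text" input named in the updated `INPUT_FILE` description can be pictured with the sketch below. It is only an illustration built on the standard library, not the project's loader; `read_delimited` and the `DELIMITERS` mapping are hypothetical names, and the real CLI may parse files differently.

```python
import csv
import io

# Hypothetical mapping from a user-facing delimiter name to its character.
DELIMITERS = {"comma": ",", "tab": "\t", "semicolon": ";", "pipe": "|"}


def read_delimited(text, delimiter="comma"):
    """Parse delimited text into a list of row dicts keyed by the header row.

    `delimiter` is either a key of DELIMITERS or a custom single character.
    """
    sep = DELIMITERS.get(delimiter, delimiter)
    if len(sep) != 1:
        raise ValueError(f"delimiter must be a single character, got {sep!r}")
    reader = csv.DictReader(io.StringIO(text), delimiter=sep)
    return list(reader)


# Pipe-delimited input selected by name.
rows = read_delimited("name|email\nAda|ada@example.com\n", delimiter="pipe")
```

Accepting either a named preset or a raw character mirrors the "comma, tab, semicolon, pipe, or custom" choice described in the developer guide hunk.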
@@ -90,17 +90,20 @@ Typer-based CLI with 17 options. Key responsibilities:
 ### src/gui/app.py — Streamlit GUI
 
 Single-page layout:
-- File upload with instant preview
+- File upload with instant preview and configurable delimiter (comma, tab, semicolon, pipe, or custom)
 - Advanced options expander (column selection, fuzzy, normalizers, survivor rule, merge, config profiles)
 - Find Duplicates button → runs `deduplicate()` with `progress_callback`
-- Interactive review: expandable match group cards with merge/keep/skip buttons
+- Interactive review via `st.data_editor` with inline checkboxes and column dropdowns
+- Batch actions: Accept All, Reject All, Clear Decisions
+- Apply review decisions and download cleaned results
 - Download buttons for deduplicated CSV, removed rows, and match groups report
 
 ### src/gui/components.py — Reusable GUI Widgets
 
-- **`match_group_card()`** — expandable card showing side-by-side row comparison with diff highlighting
-- **`config_panel()`** — the advanced options expander, returns a `DeduplicationConfig`
-- **`results_summary()`** — summary stats and download buttons
+- **`match_group_card()`** — expandable card with `st.data_editor`: inline Keep checkboxes per row, `SelectboxColumn` dropdowns for differing columns, and a live surviving rows preview
+- **`config_panel()`** — the advanced options expander, returns settings dict with strategies, survivor rule, merge flag
+- **`results_summary()`** — summary metrics and download buttons
+- **`apply_review_decisions()`** — builds final DataFrames from user review decisions (merge, split, or keep-all per group) with column override support
 
 ## Data Flow
 
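Reviewer note: the behaviour documented for `apply_review_decisions()` (multi-row survivors plus column overrides) might look roughly like the sketch below. It is not the project's implementation. It uses plain dicts in place of DataFrames, and the function name `apply_decisions_sketch`, the shape of `decisions`, and the choice to apply overrides to every surviving row are all assumptions made for illustration.

```python
def apply_decisions_sketch(groups, decisions):
    """Return (kept_rows, removed_rows) given per-group review decisions.

    groups:    {group_id: [row_dict, ...]}
    decisions: {group_id: {"keep": [row indexes], "overrides": {column: value}}}
    A group with no decision is left untouched, so all of its rows are kept.
    """
    kept, removed = [], []
    for group_id, rows in groups.items():
        decision = decisions.get(group_id)
        if decision is None:
            kept.extend(rows)  # keep-all: no review decision recorded
            continue
        keep = set(decision.get("keep", []))
        overrides = decision.get("overrides", {})
        for index, row in enumerate(rows):
            if index in keep:
                # Assumption: overrides are applied to each surviving row.
                kept.append({**row, **overrides})
            else:
                removed.append(row)
    return kept, removed


groups = {
    1: [{"name": "Ada", "city": "London"}, {"name": "Ada L.", "city": "Londn"}],
    2: [{"name": "Bob", "city": "Paris"}, {"name": "Bob", "city": "Paris"}],
}
# Group 1: keep the first row but take the name from the second; group 2: undecided.
decisions = {1: {"keep": [0], "overrides": {"name": "Ada L."}}}
kept, removed = apply_decisions_sketch(groups, decisions)
```

Keeping one row corresponds to a merge, keeping several to a split with multiple survivors, and omitting the decision to keep-all.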