revert: restore audit-log kill switch — async redesign didn't help

User pulled d9e32e5 (async-writer audit log + re-enabled diagnostics
sidebar) and still sees blank pages. The synchronous-write theory
from the previous round was at most a partial explanation; something
ELSE in the audit-log code path is also taking the page render down
on the user's machine.

Restore the kill switch so the user has a working app while we
diagnose:

- ``src/audit.py``: ``_DISABLED = True`` re-introduced at module
  top; each of ``log_event`` / ``log_session_start`` /
  ``log_page_open`` / ``flush_audit_log`` now early-returns. The
  async writer thread is never started.
- ``hide_streamlit_chrome``: ``_render_diagnostics_sidebar()`` call
  re-gated behind ``if False:``.

The async writer code stays in place — easier to flip the flag back
when we identify the real cause than to rewrite a third time. The
shutdown-flush call in ``shutdown_app`` also stays; it early-returns
on the kill switch and is harmless.

Diagnostic plan for the next session: ask the user for the launcher
terminal output (the new stderr "DataTools audit: writes failing..."
message would tell us if the writer thread DID start and DID fail),
and whether ``~/.datatools/logs/`` is being created at all.
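
The directory check in this plan could be scripted roughly as below,
so the user only has to paste one word back. A sketch under
assumptions: ``probe_log_dir`` is a hypothetical helper that does not
exist in the repo; only the ``~/.datatools/logs/`` path comes from
this message.

```python
import tempfile
from pathlib import Path


def probe_log_dir(log_dir: Path) -> str:
    """Classify a directory as 'missing', 'unwritable', or 'writable'."""
    if not log_dir.is_dir():
        return "missing"
    try:
        # Create and immediately delete a throwaway file to test write access.
        with tempfile.NamedTemporaryFile(dir=log_dir):
            pass
    except OSError:
        return "unwritable"
    return "writable"


if __name__ == "__main__":
    print(probe_log_dir(Path.home() / ".datatools" / "logs"))
```

Following the reasoning above, "missing" would point toward the
writer never getting as far as creating the directory, while
"unwritable" would point toward the writer starting and then failing
to write. Neither result is conclusive on its own.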

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 02:44:23 +00:00
parent d9e32e578b
commit 65c85107b6
2 changed files with 29 additions and 11 deletions

@@ -55,6 +55,15 @@ _LOG_PATH: Path | None = None
 _SESSION_ID: str | None = None
 _SESSION_STARTED: bool = False
+# RESTORED kill switch — the async-writer redesign still triggers the
+# blank-pages symptom on the user's machine despite no synchronous
+# file I/O on the request path. Cause is not yet identified; keep all
+# log_* calls as no-ops while we diagnose so the GUI keeps working.
+# Diagnostic plan: ask the user for the launcher terminal output AND
+# whether anything appears in ``~/.datatools/logs/`` at all — that
+# bisects "writer thread starts" from "writer thread can't write."
+_DISABLED: bool = True
 # Bounded in-memory queue. ``deque(maxlen=N)`` drops the OLDEST entry
 # when full so the most-recent events are always the ones that
 # survive on disk — diagnostic-most-valuable at the moment of a
@@ -205,6 +214,8 @@ def log_event(
     Failures inside this function are swallowed; a broken audit log
     must never take the GUI down.
     """
+    if _DISABLED:
+        return
     try:
         try:
             ts = datetime.now(tz=timezone.utc).isoformat(timespec="milliseconds")
@@ -234,6 +245,8 @@ def log_event(
 def log_session_start() -> None:
     """Idempotent session-start banner with platform info."""
+    if _DISABLED:
+        return
     global _SESSION_STARTED
     with _LOCK:
         if _SESSION_STARTED:
@@ -260,6 +273,8 @@ def log_session_start() -> None:
 def log_page_open(slug: str) -> None:
     """Emit a deduplicated 'page open' nav event."""
+    if _DISABLED:
+        return
     try:
         try:
             import streamlit as st
@@ -298,6 +313,8 @@ def flush_audit_log(timeout_s: float = 0.5) -> None:
     stuck disk can never delay shutdown — events still in the queue
     when the timer expires are dropped.
     """
+    if _DISABLED:
+        return
     global _SHUTDOWN_REQUESTED
     deadline = time.monotonic() + max(0.0, timeout_s)
     with _QUEUE_COND:

@@ -158,11 +158,12 @@ def hide_streamlit_chrome(*, gate_license: bool = True) -> None:
         require_license_or_render_activation,
     )
     render_license_status_sidebar()
-    # Diagnostics sidebar re-enabled now that the audit log is async
-    # and ``audit_log_path()`` is a pure path computation (no mkdir
-    # on the request path). Still wrapped in try/except defensively;
-    # a render error here prints to stderr instead of taking down
-    # the page body.
+    # Diagnostics sidebar is DISABLED — the async-writer redesign
+    # didn't actually fix the blank-pages symptom on the user's
+    # machine. The sidebar calls ``audit_log_path()`` which is pure
+    # now, so the failure mode must be elsewhere; keep this off
+    # while we diagnose so the user has a working GUI.
+    if False:
-    try:
+        try:
-        _render_diagnostics_sidebar()
+            _render_diagnostics_sidebar()
-    except Exception:
+        except Exception: