feat(monitor): implement code review fixes and real-time WebSocket monitoring

Backend Improvements (11 fixes applied):

Critical Fixes:
- Add lock protection for browser pool access in monitor stats
- Ensure async track_janitor_event across all call sites
- Improve error handling in monitor request tracking (already in place)
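
A minimal sketch of what lock-protected pool access could look like, assuming an asyncio-based pool. `BrowserPool`, `add`, and `snapshot` are hypothetical names used for illustration only; they are not taken from this commit.

```python
import asyncio


class BrowserPool:
    """Hypothetical stand-in for the real browser pool."""

    def __init__(self):
        # Create the pool inside a running event loop so the lock
        # is bound to that loop on older Python versions.
        self.lock = asyncio.Lock()
        self.browsers = {}

    async def add(self, signature, info):
        # Writers take the lock before mutating the shared dict.
        async with self.lock:
            self.browsers[signature] = info

    async def snapshot(self):
        # Readers (for example a stats endpoint) copy the data under the
        # same lock, so they never observe a half-updated pool.
        async with self.lock:
            return {sig: dict(info) for sig, info in self.browsers.items()}
```

The lock matters once a critical section contains an `await`; returning a copy means callers can read the snapshot without holding the lock.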

Important Fixes:
- Replace fire-and-forget Redis with background persistence worker
- Add time-based expiry for completed requests/errors (5min cleanup)
- Implement input validation for monitor route parameters
- Add 4s timeout to timeline updater to prevent hangs
- Add warning when killing browsers with active requests
- Implement monitor cleanup on shutdown with final persistence
- Document memory estimates with TODO for actual tracking
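
The background persistence worker mentioned above could be sketched roughly as follows. This is an illustration under assumptions, not the code from this commit: `PersistenceWorker`, `request_save`, and the injected `save` coroutine are hypothetical names, and the real worker writes to Redis.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


class PersistenceWorker:
    """Hypothetical sketch: funnel save requests onto one background task."""

    def __init__(self, save, maxsize=10):
        self._save = save          # async callable that writes stats out
        self._maxsize = maxsize
        self._queue = None
        self._task = None

    def start(self):
        # Must be called from inside a running event loop.
        self._queue = asyncio.Queue(maxsize=self._maxsize)
        self._task = asyncio.create_task(self._run())

    def request_save(self):
        # Non-blocking: if the queue is full, drop the request, because
        # an already-queued save will persist the latest state anyway.
        try:
            self._queue.put_nowait(True)
        except asyncio.QueueFull:
            pass

    async def _run(self):
        while True:
            await self._queue.get()
            try:
                await self._save()
            except Exception as e:
                # A failed save is logged and the worker keeps running.
                logger.warning("Persistence failed: %s", e)

    async def stop(self):
        # Cancel the worker, then perform one final save on shutdown.
        if self._task is not None:
            self._task.cancel()
            try:
                await self._task
            except asyncio.CancelledError:
                pass
        await self._save()
```

Compared with fire-and-forget tasks, a single worker keeps at most one write in flight and gives shutdown a clear place to flush state.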

Frontend Enhancements:

WebSocket Real-time Updates:
- Add WebSocket endpoint at /monitor/ws for live monitoring
- Implement auto-reconnect with exponential backoff (max 5 attempts)
- Add graceful fallback to HTTP polling on WebSocket failure
- Send comprehensive updates every 2 seconds (health, requests, browsers, timeline, events)
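
The reconnect-then-fallback behaviour can be sketched as below. The dashboard client itself runs in the browser, so this Python version only illustrates the control flow. The 1-second base delay, the doubling factor, and the names `backoff_delays`, `connect_or_poll`, `connect`, and `start_polling` are assumptions, not values from this commit; only the limit of 5 attempts comes from the message above.

```python
import asyncio


def backoff_delays(base=1.0, factor=2.0, max_attempts=5):
    # Delay before each retry; there is no delay after the final attempt.
    return [base * factor ** i for i in range(max_attempts - 1)]


async def connect_or_poll(connect, start_polling, max_attempts=5,
                          sleep=asyncio.sleep):
    """Try `connect` up to `max_attempts` times, then fall back to polling."""
    delays = backoff_delays(max_attempts=max_attempts)
    for attempt in range(max_attempts):
        try:
            return await connect()
        except OSError:
            # Connection-level failure: wait longer each time, then retry.
            if attempt < max_attempts - 1:
                await sleep(delays[attempt])
    # All attempts failed: switch to HTTP polling instead.
    return start_polling()
```

Passing `sleep` as a parameter keeps the sketch easy to exercise without real waiting.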

UI/UX Improvements:
- Add live connection status indicator with pulsing animation
  - Green "Live" = WebSocket connected
  - Yellow "Connecting..." = Attempting connection
  - Blue "Polling" = Fallback to HTTP polling
  - Red "Disconnected" = Connection failed
- Restore original beautiful styling for all sections
- Improve request table layout with flex-grow for URL column
- Add browser type text labels alongside emojis
- Add flex layout to browser section header

Testing:
- Add test-websocket.py for WebSocket validation
- All 7 integration tests passing successfully

Summary: 563 additions across 6 files
unclecode
2025-10-18 11:38:25 +08:00
parent aba4036ab6
commit 25507adb5b
6 changed files with 561 additions and 71 deletions

@@ -119,6 +119,7 @@ async def lifespan(_: FastAPI):
     # Initialize monitor
     monitor_module.monitor_stats = MonitorStats(redis)
     await monitor_module.monitor_stats.load_from_redis()
+    monitor_module.monitor_stats.start_persistence_worker()
     # Initialize browser pool
     await init_permanent(BrowserConfig(
@@ -135,6 +136,14 @@ async def lifespan(_: FastAPI):
     # Cleanup
     app.state.janitor.cancel()
     app.state.timeline_updater.cancel()
+    # Monitor cleanup (persist stats and stop workers)
+    from monitor import get_monitor
+    try:
+        await get_monitor().cleanup()
+    except Exception as e:
+        logger.error(f"Monitor cleanup failed: {e}")
     await close_all()
 async def _timeline_updater():
@@ -143,7 +152,9 @@ async def _timeline_updater():
     while True:
         await asyncio.sleep(5)
         try:
-            await get_monitor().update_timeline()
+            await asyncio.wait_for(get_monitor().update_timeline(), timeout=4.0)
+        except asyncio.TimeoutError:
+            logger.warning("Timeline update timeout after 4s")
         except Exception as e:
             logger.warning(f"Timeline update error: {e}")
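
The timeout pattern in the hunk above can be tried in isolation with the sketch below. `run_with_timeout` and its string return values are hypothetical, added only to make the three outcomes observable; `asyncio.wait_for` cancels the wrapped coroutine when the timeout expires, so a hung update cannot block the loop indefinitely.

```python
import asyncio


async def run_with_timeout(make_coro, timeout):
    """Run one update and report how it ended (illustrative helper)."""
    try:
        await asyncio.wait_for(make_coro(), timeout=timeout)
        return "ok"
    except asyncio.TimeoutError:
        # The inner coroutine has been cancelled by wait_for at this point.
        return "timeout"
    except Exception as e:
        return f"error: {e}"
```

Listing `asyncio.TimeoutError` before the generic `Exception` handler, as the diff does, keeps timeouts distinguishable from other failures in the logs.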