What it does
Runs every 60 minutes (launchd com.skyrun.system-hygiene). Self-heals what bash can fix; queues the rest for live-ea to handle via the scheduled-tasks MCP.
Auto-fixes (no human, no Claude Code session needed)
1. Claude Code zombies (>2h old) → `kill -TERM` then `kill -9`
2. Oversize logs (>5MB) → trim to last 1000 lines
3. MCP allowlist drift → if any of the 9 explicitly-disabled MCPs got re-enabled (Claude Desktop update), reset to {"isEnabled": false}
4. Heartbeats >30 days old → delete
5. launchd agents failing repeatedly (>5 failures in last 200 log lines) → launchctl unload
6. Stale lock files (/tmp/skyrun_*.lock >1h old) → remove
7. Fix-queue items >24h old → prune
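Three of the fixes above are plain file operations and can be sketched in a few lines of bash. This is a minimal sketch, not the contents of `system_hygiene.sh`: the function names, their parameters, and the default failure pattern in `recent_failures` are hypothetical, while the thresholds (5MB, 1000 lines, 200 lines, 1h) come from the list above.

```bash
#!/usr/bin/env bash
# Hypothetical helpers illustrating auto-fixes 2, 5 and 6; all names are assumptions.

# Fix 2: if a log exceeds max_bytes, keep only its last keep_lines lines.
trim_log() {
  local file=$1 max_bytes=${2:-5242880} keep_lines=${3:-1000}
  local size tmp
  [ -f "$file" ] || return 0
  size=$(wc -c < "$file" | tr -d ' ')
  [ "$size" -gt "$max_bytes" ] || return 0
  tmp=$(mktemp) || return 1
  # Write back through the original path so the file keeps its inode and permissions.
  tail -n "$keep_lines" "$file" > "$tmp" && cat "$tmp" > "$file"
  local rc=$?
  rm -f "$tmp"
  return "$rc"
}

# Fix 5 (detection half): count lines matching a failure pattern in the last 200 log lines.
# The default pattern is an assumption; real agent logs may need a different one.
recent_failures() {
  local log=$1 pattern=${2:-'error|fail'}
  [ -f "$log" ] || { echo 0; return 0; }
  tail -n 200 "$log" | grep -ciE "$pattern" || true
}

# Fix 6: delete skyrun_*.lock files last modified more than max_age_min minutes ago.
remove_stale_locks() {
  local dir=${1:-/tmp} max_age_min=${2:-60}
  find "$dir" -maxdepth 1 -type f -name 'skyrun_*.lock' -mmin +"$max_age_min" -delete
}
```

The unload half of fix 5 would then look something like `[ "$(recent_failures "$log")" -gt 5 ] && launchctl unload "$plist"`, with `$log` and `$plist` pointing at the failing agent's own files.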
Detects + queues for live-ea
Things requiring the scheduled-tasks MCP (which bash can't reach):
- Genuinely overdue scheduled task (cron-aware: only flags if INSIDE the task's expected active window AND past its expected next-run time) → writes to `~/Library/Application Support/SkyRun/fix_queue.jsonl` with `action: "trigger_task"`. The next live-ea run reads the entry and triggers the task via `mcp__scheduled-tasks__update_scheduled_task` with `fireAt` = now+60s.
- Self-completed task that should be disabled → queues `action: "disable_task"`. Live-ea calls update with `enabled: false`.
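The escalation step amounts to appending one JSON object per line to the fix queue. A minimal sketch, assuming a hypothetical `queue_fix` helper: only the `action` key and its two values come from this document, while the `task` and `queued_at` field names and the `FIX_QUEUE` variable are illustrative.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the escalation step; only the "action" key and its
# two values are taken from this document, the other field names are assumptions.

FIX_QUEUE="${FIX_QUEUE:-$HOME/Library/Application Support/SkyRun/fix_queue.jsonl}"

# Append one JSON object per line (JSONL).
# Assumes task names contain no quotes or backslashes, so no escaping is done.
queue_fix() {
  local action=$1 task=$2
  case "$action" in
    trigger_task|disable_task) ;;
    *) echo "unknown action: $action" >&2; return 1 ;;
  esac
  mkdir -p "$(dirname "$FIX_QUEUE")"
  printf '{"action": "%s", "task": "%s", "queued_at": %s}\n' \
    "$action" "$task" "$(date +%s)" >> "$FIX_QUEUE"
}
```

For example, `queue_fix trigger_task gmail-deep-scan` would add a single line for live-ea to pick up on its next run.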
Cron-aware staleness detection
Tasks are NOT flagged as stale during their off-hours. Specifically:
- `live-ea` only flagged if today is a weekday and current hour is 9-19 MDT
- `gmail-deep-scan` (9am/1pm/5pm/10pm) only flagged after 11am
- `transcript-scan` (8am/12pm/4pm) only flagged after 10am
- `daily-data-quality-check` only flagged after 8am
- `daily-beenverified-enrichment` only flagged after 7am
- `nightly-consolidation` only flagged 1am-11pm (giving the 11pm cron its slot)
- `grand-county-property-scout` only on Mondays after 8am
This eliminates weekend / off-hours false positives.
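The window rules above map naturally onto a `case` block. The sketch below covers only the "inside the active window" half of the check, not the comparison against the expected next-run time. The function name and signature are hypothetical, and the boundary handling is an assumption: "after 11am" is read as hour >= 11, "9-19" as inclusive on both ends, and unknown tasks as always eligible.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the per-task active-window check.
# hour: 0-23 (leading zero allowed), dow: 1=Monday .. 7=Sunday (as from `date +%u`).
in_active_window() {
  local task=$1 hour=$((10#$2)) dow=$3
  case "$task" in
    live-ea)                       [ "$dow" -le 5 ] && [ "$hour" -ge 9 ] && [ "$hour" -le 19 ] ;;
    gmail-deep-scan)               [ "$hour" -ge 11 ] ;;
    transcript-scan)               [ "$hour" -ge 10 ] ;;
    daily-data-quality-check)      [ "$hour" -ge 8 ] ;;
    daily-beenverified-enrichment) [ "$hour" -ge 7 ] ;;
    nightly-consolidation)         [ "$hour" -ge 1 ] && [ "$hour" -lt 23 ] ;;
    grand-county-property-scout)   [ "$dow" -eq 1 ] && [ "$hour" -ge 8 ] ;;
    *)                             return 0 ;;  # assumption: unknown tasks are always eligible
  esac
}
```

A caller could evaluate it in Mountain time with `in_active_window live-ea "$(TZ=America/Denver date +%H)" "$(TZ=America/Denver date +%u)"`, and skip the staleness check whenever it returns non-zero.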
Status logic
- `ok`: nothing to fix OR auto-fix succeeded ("self-healed: ...")
- `partial`: at least one item escalated to live-ea queue (notification fires)
- (would be `error` if heartbeat couldn't be written; never expected)
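The status rules reduce to a small decision function. A minimal sketch, assuming a hypothetical `hygiene_status` helper that receives the number of escalated items and a flag saying whether the heartbeat could be written:

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the status decision; the name and arguments are assumptions.
# escalated: number of items queued for live-ea; heartbeat_ok: 1 if the heartbeat was written.
hygiene_status() {
  local escalated=$1 heartbeat_ok=${2:-1}
  if [ "$heartbeat_ok" -ne 1 ]; then
    echo error
  elif [ "$escalated" -gt 0 ]; then
    echo partial
  else
    echo ok   # covers both "nothing to fix" and "self-healed"
  fi
}
```

Note that the number of auto-fixes does not enter the decision: a run that only self-healed is still `ok`.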
ntfy push policy
- Silent on auto-fix-only runs: if everything was bash-fixable, no notification (would be noise)
- Push on escalation: when items are queued for live-ea, push at `default` priority
- Push at `high` priority if zombies killed > 5 (indicates something unusual is spawning processes)
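The policy can be sketched as a priority selector plus a thin wrapper around `curl`. This is illustrative only: `push_priority`, `notify`, and the `NTFY_TOPIC_URL` variable are hypothetical names, and the sketch assumes the `high` rule fires even when nothing was escalated, which the policy above does not spell out.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the push policy; function and variable names are assumptions.

# Print the ntfy priority for this run, or "none" for a silent run.
push_priority() {
  local escalated=$1 zombies_killed=$2
  if [ "$zombies_killed" -gt 5 ]; then
    echo high      # assumption: takes precedence, even with an empty queue
  elif [ "$escalated" -gt 0 ]; then
    echo default
  else
    echo none
  fi
}

# Publish the message unless the run was silent.
# NTFY_TOPIC_URL is a placeholder for the full topic URL.
notify() {
  local priority=$1 message=$2
  [ "$priority" = "none" ] && return 0
  curl -fsS -H "Priority: $priority" -d "$message" "$NTFY_TOPIC_URL" > /dev/null
}
```

A run would then end with something like `notify "$(push_priority "$escalated" "$zombies")" "$summary"`, where the three variables are whatever counters and summary text the script keeps.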
Live-ea queue handler (Step 7a)
live-ea/SKILL.md Step 7a reads the fix queue every 20 min during business hours:
- For `action: "trigger_task"`: looks up via `mcp__scheduled-tasks__list_scheduled_tasks`, calls `update_scheduled_task` with `fireAt` = now+60s
- For `action: "disable_task"`: calls `update_scheduled_task` with `enabled: false`
- Marks processed items with `"status": "triggered"` or `"status": "disabled"`
- Caps at 3 actions per live-ea run to prevent runaway
Files
| What | Path |
|---|---|
| Script | ~/Library/Application Support/SkyRun/system_hygiene.sh |
| Launch agent | ~/Library/LaunchAgents/com.skyrun.system-hygiene.plist |
| Hygiene log | ~/Library/Logs/skyrun-system-hygiene.log |
| Fix queue | ~/Library/Application Support/SkyRun/fix_queue.jsonl |
| Heartbeats | ~/Library/Application Support/SkyRun/health/{date}_system-hygiene_*.json |
What gets surfaced in the morning brief
Section F of nightly-consolidation reads the latest system-hygiene heartbeat. If status != ok, it flags the run as YELLOW in the morning brief's Systems status block and includes the heartbeat summary. Anything killed is reported as a count.
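The heartbeat-to-brief step can be approximated in shell as follows. This is a sketch under assumptions: the heartbeat is taken to store its status as a string under a `"status"` key, `GREEN` is an assumed label for the healthy case (only YELLOW is named above), a missing heartbeat is treated as YELLOW, and all three function names are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of how a brief could derive its flag from the latest heartbeat.

# Print the most recently modified system-hygiene heartbeat in a health directory.
latest_heartbeat() {
  local dir=$1
  ls -t "$dir"/*_system-hygiene_*.json 2>/dev/null | head -n 1
}

# Extract the value of the "status" key (assumed to be a plain string).
heartbeat_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Print GREEN when the latest status is ok, YELLOW otherwise.
brief_flag() {
  local file status
  file=$(latest_heartbeat "$1")
  [ -n "$file" ] || { echo YELLOW; return 0; }  # assumption: no heartbeat is worth flagging
  status=$(heartbeat_status "$file")
  if [ "$status" = "ok" ]; then echo GREEN; else echo YELLOW; fi
}
```

Calling `brief_flag "$HOME/Library/Application Support/SkyRun/health"` prints the flag for the newest heartbeat.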
Backstory — why it exists
Discovered on the morning of 2026-04-26: 56 Claude Code processes running, 11 MCPs auto-enabled, fswatch agent failing 19+ times/day. Cause: scheduled tasks that don't exit cleanly, plus MCPs auto-installed by Claude Desktop updates, plus my fswatch experiment that couldn't get Full Disk Access (FDA). Cleanup brought it down to 6 processes / 2 MCPs / fswatch unloaded. The watchdog prevents recurrence autonomously.
Manual run
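```bash
# Run the watchdog once, outside its hourly launchd schedule
bash "/Users/josephbowens/Library/Application Support/SkyRun/system_hygiene.sh"
# Print the most recent system-hygiene heartbeat
cat "$(ls -t ~/Library/Application\ Support/SkyRun/health/*_system-hygiene_*.json | head -1)"
```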
Tuning
If the watchdog is too noisy, raise the per-task staleness tolerances in the `case` block of `system_hygiene.sh`. If the process threshold needs adjusting, change the `>20` check.
Currently disabled MCPs (re-enable via Claude Extensions Settings/{name}.json)
These were auto-installed but redundant or unused. To re-enable, set {"isEnabled": true} in the matching JSON file:
- `chrome-control`: duplicate of `Claude_in_Chrome`. Keep disabled.
- `osascript`: duplicate of bash + AppleScript. Keep disabled.
- `ms_office_word`, `ms_office_powerpoint`: handy if user wants Office automation; disabled by default.
- `notes`: Apple Notes; disabled by default.
- `pdf-server-mcp`, `pdf-filler-simple`: duplicate PDF stack; disabled.
- `polygon-mcp-server`, `revolut-x-api`: financial APIs unrelated to SkyRun; keep disabled.
Active: filesystem (Anthropic standard), fantastical-mcp (calendar — used by Section K pre-meeting briefs).
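The allowlist-drift fix can be sketched as a loop over the nine disabled names. This is illustrative, not the script's actual code: `reset_mcp_drift` and `DISABLED_MCPS` are hypothetical names, the settings directory is passed in rather than assumed, and the whole file is overwritten with `{"isEnabled": false}` as described in the auto-fix list, which would discard any other keys a settings file might hold.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the MCP allowlist-drift fix; names are assumptions.

DISABLED_MCPS="chrome-control osascript ms_office_word ms_office_powerpoint notes \
pdf-server-mcp pdf-filler-simple polygon-mcp-server revolut-x-api"

# For each MCP that should stay disabled, reset its settings file if it was re-enabled.
# Prints the number of files that were reset.
reset_mcp_drift() {
  local dir=$1 name file fixed=0
  for name in $DISABLED_MCPS; do
    file="$dir/$name.json"
    [ -f "$file" ] || continue
    if grep -Eq '"isEnabled"[[:space:]]*:[[:space:]]*true' "$file"; then
      printf '{"isEnabled": false}\n' > "$file"   # overwrites the whole file
      fixed=$((fixed + 1))
    fi
  done
  echo "$fixed"
}
```

Files for MCPs outside the list, such as the two active ones, are never touched.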
fswatch / PWA auto-rebuild — currently DISABLED
com.skyrun.pwa-autorebuild was unloaded 2026-04-26 because it failed 19+ times/day (TCC blocked /usr/bin/python3 from reading Desktop). Layer 2 (live-ea rebuilds every 20 min during business hours) is the active freshness mechanism.
To re-enable when FDA is granted: launchctl load ~/Library/LaunchAgents/com.skyrun.pwa-autorebuild.plist.
Related
- `reference_pwa_staleness.md`: the 3-layer freshness system this complements
- `reference_system_monitoring.md`: how heartbeats become morning-brief signals
- `reference_scheduled_tasks.md`: the fleet this watchdog watches