What it does
Two-phase daily scan:
1. CAPTURE — mines transcripts modified since last fire + sent Gmail (last 2d). LLM-extracts "I'll do X by Y" commitments. Adds to ledger.
2. WATCH — reviews open commitments. For each overdue/escalated: composes a draft (deliver the missing thing OR reframe with new deadline), pushes to PWA approval queue.
Catches the failure mode stalled-deal-watchdog can't: self-imposed deadline misses on deals that aren't otherwise stalled.
Files
- Skill:
~/.claude/scheduled-tasks/commitment-tracker/SKILL.md(prompt the scheduler invokes) - Spec:
~/.claude/scheduled-tasks/commitment-tracker/SPEC.md(canonical design — edit here) - Helper:
~/Library/Application Support/SkyRun/commitment_tracker.py - Ledger:
~/Library/Application Support/SkyRun/state/commitments.jsonl - Last-run state:
~/Library/Application Support/SkyRun/state/commitment_tracker_last_run.txt
Cron
30 8 * — every day at 8:30am MDT. Sequenced after stalled-deal-watchdog (8am) and morning brief (~5am), before postcard-ledger Mondays (10am) and bounce-handler (11am).
Plus on-demand from SkyRun QB.
Status lifecycle
| Status | Meaning |
|---|---|
open | captured, deadline ≥ today |
overdue | 1 day past deadline (yellow) |
escalated | 2+ days past deadline (red) |
fulfilled | sent email matched promised_action OR manual mark |
dismissed | operator override (no longer relevant) |
commitment_id (deterministic)
commitment_id = sha256(speaker + recipient + commitment_text + deadline_date)[:12]
Re-ingesting same source produces same id → idempotent. Same commitment phrased two ways across two sources will produce two ids; surfacing twice is fine, operator dismisses dups.
Outputs every fire
- Updated ledger (idempotent appends)
- Per-overdue: draft →
pending_drafts.jsonl+ flag →pending_hs_updates.jsonl(24h dedupe) - Fulfillment auto-mark via Gmail sent-search match
- Heartbeat at
~/Library/Application Support/SkyRun/health/ - Audit at
~/Desktop/SkyRun/audit/<DATE>/ - ntfy: severity ladder (📅 due-today / ⏰ overdue / 🚨 escalated)
Hard rules
- DNC check before any draft
- HS writes are proposals (never auto-write)
- Email drafts only (never auto-send)
- No Vermont references
- Same commitment_id no re-draft within 24h
- Never draft for
dismissedorfulfilled - LLM
confidence: low+ null deadline → DROP (better to miss than false-flag)
Manual operation
bash
python3 "$HOME/Library/Application Support/SkyRun/commitment_tracker.py" status
python3 "$HOME/Library/Application Support/SkyRun/commitment_tracker.py" ingest --events /path/to/events.json
python3 "$HOME/Library/Application Support/SkyRun/commitment_tracker.py" mark-fulfilled <id> --evidence "<msg-id>"
python3 "$HOME/Library/Application Support/SkyRun/commitment_tracker.py" mark-dismissed <id>
python3 "$HOME/Library/Application Support/SkyRun/commitment_tracker.py" report-overdue
python3 "$HOME/Library/Application Support/SkyRun/commitment_tracker.py" report-due-today
python3 "$HOME/Library/Application Support/SkyRun/commitment_tracker.py" report-all
Date phrase parsing (LLM-pass guidance)
| Phrase | Resolution |
|---|---|
| "today" / "later today" / "by EOD today" | run-date |
| "tomorrow" | run-date + 1 |
| "by Friday" / "by end of week" / "this week" | next Friday from run-date |
| "by Monday" / "next Monday" etc. | next occurrence of that weekday |
| "next week" | following Monday |
| "by end of next week" | following Friday |
| "by [explicit date]" | parse the date |
| "in the next few days" | run-date + 3 |
| "soon" / "shortly" | low confidence — flag, may drop |
Loop close
Joseph speaks/writes "I'll do X by Y"
↓
Transcript or sent email captures it
↓
commitment-tracker fires (8:30am next day)
↓
LLM extracts → ledger
↓
Daily status check
↓
If fulfilled (sent email matches): auto-mark fulfilled
If deadline passes:
- 1 day past → overdue → draft surfaced to PWA + HS proposal queued
- 2+ days past → escalated → high-priority ntfy + draft + HS proposal
↓
Joseph 👍 the draft → live-ea sends OR Joseph composes alternate
↓
Next fire detects fulfillment via sent-Gmail match → marks fulfilled
↓
Audit preserves full lifecycle
Tuning notes
After 4 weeks:
- Capture rate (% of obvious commitments LLM finds vs. operator manually adds) — target >80%
- False positive rate — target <10%
- Fulfillment detection accuracy — drafts shouldn't compose for already-fulfilled
- Median miss → ntfy → action — target <2h business hours
If volume is high (10+/day), tighten LLM extraction confidence threshold.
Phase 2 (when adam-bd activates)
Same pattern scaffolds into ~/.claude/scheduled-tasks/adam-bd-commitment-tracker/. Brain-only profile NEEDS this — it's voice-calibrated drafting + transcript-mining, both core to "the brain." Worth promoting to active in Phase 1 adam-bd.
Future enhancements
1. Phase 2 fulfillment detection: train LLM classifier on past commitments + sent emails for smarter match (vs current keyword-match). Defer until Phase 1 has 100+ ledger entries to train on.
2. Prospect-side commitments: also track promises others make to Joseph (different action surface — nudge them when overdue, don't draft as if Joseph fulfilling).
3. Calendar integration: deadlines auto-add to Joseph's calendar with snooze reminders.
4. PWA tile: dedicated commitments view showing all open commitments with countdown.
Origin
Built 2026-04-27 as Gap C closure in the prospecting-loop walkthrough. Surfaced explicitly during the Weber walkthrough recap when stalled-deal-watchdog couldn't catch the "I'll send today" miss because Weber's last_touch was current. Smoke-tested with Weber + Hadank cases; helper correctly flagged Weber as overdue.