reference commitment tracker

What it does

Two-phase daily scan:

1. CAPTURE — mines transcripts modified since last fire + sent Gmail (last 2d). LLM-extracts "I'll do X by Y" commitments. Adds to ledger.
2. WATCH — reviews open commitments. For each overdue/escalated: composes a draft (deliver the missing thing OR reframe with new deadline), pushes to PWA approval queue.

Catches the failure mode stalled-deal-watchdog can't: self-imposed deadline misses on deals that aren't otherwise stalled.

Files

Skill: ~/.claude/scheduled-tasks/commitment-tracker/SKILL.md (prompt the scheduler invokes)
Spec: ~/.claude/scheduled-tasks/commitment-tracker/SPEC.md (canonical design — edit here)
Helper: ~/Library/Application Support/SkyRun/commitment_tracker.py
Ledger: ~/Library/Application Support/SkyRun/state/commitments.jsonl
Last-run state: ~/Library/Application Support/SkyRun/state/commitment_tracker_last_run.txt

Cron

30 8 * — every day at 8:30am MDT. Sequenced after stalled-deal-watchdog (8am) and morning brief (~5am), before postcard-ledger Mondays (10am) and bounce-handler (11am).

Plus on-demand from SkyRun QB.

Status lifecycle

Status	Meaning
`open`	captured, deadline ≥ today
`overdue`	1 day past deadline (yellow)
`escalated`	2+ days past deadline (red)
`fulfilled`	sent email matched promised_action OR manual mark
`dismissed`	operator override (no longer relevant)

Status auto-updates each helper read; persists to ledger.

commitment_id (deterministic)

commitment_id = sha256(speaker + recipient + commitment_text + deadline_date)[:12]

Re-ingesting same source produces same id → idempotent. Same commitment phrased two ways across two sources will produce two ids; surfacing twice is fine, operator dismisses dups.

Outputs every fire

Updated ledger (idempotent appends)
Per-overdue: draft → pending_drafts.jsonl + flag → pending_hs_updates.jsonl (24h dedupe)
Fulfillment auto-mark via Gmail sent-search match
Heartbeat at ~/Library/Application Support/SkyRun/health/
Audit at ~/Desktop/SkyRun/audit/<DATE>/
ntfy: severity ladder (📅 due-today / ⏰ overdue / 🚨 escalated)

Hard rules

DNC check before any draft
HS writes are proposals (never auto-write)
Email drafts only (never auto-send)
No Vermont references
Same commitment_id no re-draft within 24h
Never draft for dismissed or fulfilled
LLM confidence: low + null deadline → DROP (better to miss than false-flag)

Manual operation

bash
python3 "$HOME/Library/Application Support/SkyRun/commitment_tracker.py" status
python3 "$HOME/Library/Application Support/SkyRun/commitment_tracker.py" ingest --events /path/to/events.json
python3 "$HOME/Library/Application Support/SkyRun/commitment_tracker.py" mark-fulfilled <id> --evidence "<msg-id>"
python3 "$HOME/Library/Application Support/SkyRun/commitment_tracker.py" mark-dismissed <id>
python3 "$HOME/Library/Application Support/SkyRun/commitment_tracker.py" report-overdue
python3 "$HOME/Library/Application Support/SkyRun/commitment_tracker.py" report-due-today
python3 "$HOME/Library/Application Support/SkyRun/commitment_tracker.py" report-all

Date phrase parsing (LLM-pass guidance)

Phrase	Resolution
"today" / "later today" / "by EOD today"	run-date
"tomorrow"	run-date + 1
"by Friday" / "by end of week" / "this week"	next Friday from run-date
"by Monday" / "next Monday" etc.	next occurrence of that weekday
"next week"	following Monday
"by end of next week"	following Friday
"by [explicit date]"	parse the date
"in the next few days"	run-date + 3
"soon" / "shortly"	low confidence — flag, may drop

Loop close


Joseph speaks/writes "I'll do X by Y"
  ↓
Transcript or sent email captures it
  ↓
commitment-tracker fires (8:30am next day)
  ↓
LLM extracts → ledger
  ↓
Daily status check
  ↓
If fulfilled (sent email matches): auto-mark fulfilled
If deadline passes:
  - 1 day past → overdue → draft surfaced to PWA + HS proposal queued
  - 2+ days past → escalated → high-priority ntfy + draft + HS proposal
  ↓
Joseph 👍 the draft → live-ea sends OR Joseph composes alternate
  ↓
Next fire detects fulfillment via sent-Gmail match → marks fulfilled
  ↓
Audit preserves full lifecycle

Tuning notes

After 4 weeks:

Capture rate (% of obvious commitments LLM finds vs. operator manually adds) — target >80%
False positive rate — target <10%
Fulfillment detection accuracy — drafts shouldn't compose for already-fulfilled
Median miss → ntfy → action — target <2h business hours

If volume is high (10+/day), tighten LLM extraction confidence threshold.

Phase 2 (when adam-bd activates)

Same pattern scaffolds into ~/.claude/scheduled-tasks/adam-bd-commitment-tracker/. Brain-only profile NEEDS this — it's voice-calibrated drafting + transcript-mining, both core to "the brain." Worth promoting to active in Phase 1 adam-bd.

Future enhancements

1. Phase 2 fulfillment detection: train LLM classifier on past commitments + sent emails for smarter match (vs current keyword-match). Defer until Phase 1 has 100+ ledger entries to train on.
2. Prospect-side commitments: also track promises others make to Joseph (different action surface — nudge them when overdue, don't draft as if Joseph fulfilling).
3. Calendar integration: deadlines auto-add to Joseph's calendar with snooze reminders.
4. PWA tile: dedicated commitments view showing all open commitments with countdown.

Origin

Built 2026-04-27 as Gap C closure in the prospecting-loop walkthrough. Surfaced explicitly during the Weber walkthrough recap when stalled-deal-watchdog couldn't catch the "I'll send today" miss because Weber's last_touch was current. Smoke-tested with Weber + Hadank cases; helper correctly flagged Weber as overdue.