---
name: SoT-canonical architecture — continuous enforcement of single source of truth
description: SoT (Grand_County_STR_Engine_v3.5_filtered.xlsx Lead Details) is the single source of truth — was always the design intent, hardwired as RULES.md R-02 + structurally enforced by sot_reconciler.py running every 15 min. HubSpot is derivative. SmartLead enrolled-leads-list is a subset of SoT. This doc explains the architecture, why prior bandaids failed, why this is permanent, and how to add new HS-writing code without breaking it.
type: reference
last_updated: 2026-05-04
originSessionId: 81f95992-93b5-4db5-8fbc-2d5da5aeb321
---
# Architecture

```
                  ┌─────────────────────────────────────┐
                  │     SoT (CANONICAL — read-only       │
                  │     to reconciler; only operator     │
                  │     + owning skills can write)        │
                  │                                       │
                  │  Grand_County_STR_Engine_v3.5_       │
                  │  filtered.xlsx · Lead Details tab    │
                  │  866 leads · 48 cols including       │
                  │  EMAIL_ALL/PHONE_ALL ledgers          │
                  └────────────────┬────────────────────┘
                                   │
                       sot_reconciler.py
                       (every 15 min, idempotent)
                                   │
                                   ▼
                  ┌─────────────────────────────────────┐
                  │   HubSpot — DERIVATIVE                │
                  │   Joseph-owned ONLY                  │
                  │   (hubspot_owner_id = 88361194)      │
                  │   ~931 contacts                       │
                  │   Catches up to SoT every 15 min     │
                  └─────────────────────────────────────┘
                                   ▲
                                   │ BCC alias auto-log
                                   │ (R-01 scope: Joseph aliases)
                                   │
                  ┌─────────────────────────────────────┐
                  │   SmartLead — DERIVATIVE              │
                  │   Enrolled leads ⊆ SoT                │
                  │   (Step 7b auto-load from BV)         │
                  └─────────────────────────────────────┘
```

# Why this is the architecture

## SoT was always meant to be canonical

The xlsx file is literally named "Source of Truth" via the SoT acronym. Every CLAUDE.md mention treats it as the master. The DESIGN was always:

1. New leads enter SoT (scout, BV, manual)
2. SoT writes propagate to HS + SL
3. Bounces / replies / engagement events flow back to SoT
4. SoT is the only store that's edited by humans

## The implementation didn't enforce it

Multiple write paths could touch HS or SL independently:
- Scout pushed to HS via `hubspot-lead-push` skill
- BV enrichment wrote to SoT then later wrote to HS (separate runs)
- SmartLead imports created HS contacts WITHOUT setting hubspot_owner_id (the 23 unowned-with-LID contacts found 2026-05-04)
- Manual HS edits bypassed SoT entirely
- Postcard-updater wrote HS without proper LSN-token prefix (flag #77)

Each write path could fail or partially-complete. Drift accumulated. My audits caught drift point-in-time + applied fixes; drift returned because the writers weren't hardened.

## The structural fix: continuous reconciliation

`sot_reconciler.py` runs every 15 minutes. Each fire:

1. Loads SoT (read-only via openpyxl `read_only=True` — physically can't modify)
2. Loads ALL Joseph-owned HS contacts (per RULES.md R-01 — never touches non-Joseph contacts)
3. Surveys unowned-with-LID contacts (read-only flag, NEVER auto-claims)
4. Reconciles 7 auto-fix drift classes:
   - **D2**: HS contact has email matching SoT but missing `ID:Rxxxxx` token → add token
   - **D3**: HS city corruption (street suffix in city field) → patch to SoT canonical
   - **D4**: HS primary email differs from SoT EMAIL_ALL ledger primary → sync
   - **D5**: HS `hs_additional_emails` missing entries from SoT EMAIL_ALL → sync
   - **D6**: HS phone differs from SoT primary phone → sync
   - **D7**: SoT lead has no HS contact → CREATE (with hubspot_owner_id=Joseph from start)
   - **D8**: HS commission attribution differs from SoT-derived expected → sync
5. Flags 3 operator-review classes (NEVER auto-fix — would destroy data without judgment):
   - **F1**: Cross-contamination (HS email NOT in SoT ledger at all)
   - **F2**: Multiple HS contacts share same `ID:Rxxxxx` token
   - **F3**: Unowned HS contacts with our LID tokens — operator decides whether to claim
6. Writes heartbeat per canonical schema; pushes ntfy on partial/error

# Why this is permanent (vs prior bandaids)

| Prior bandaid pattern | This architecture |
|---|---|
| Run audit, find drift, apply fix, drift returns next day | Reconciler runs every 15 min — max drift age = 15 min |
| Each writer hardened individually; new writers re-introduce drift | Reconciler reads live HS state every fire — new writers automatically reconciled |
| Joseph manually checks dashboards | Reconciler ntfy-pushes on partial/error — operator only sees real issues |
| "I'll fix it next time" | Idempotent — re-running is safe + no-op on already-reconciled state |
| Drift fixed only when noticed | Drift fixed continuously regardless of whether anyone is watching |

# RULES.md M-01 anti-bypass: rules cannot be reasoned around

The reconciler enforces R-01 via two layers:
1. **Search filter**: `hubspot_owner_id EQ 88361194` — only Joseph-owned contacts loaded
2. **Post-load assertion**: every loaded contact's `hubspot_owner_id` verified == "88361194"; mismatch raises `RuntimeError` and aborts before any PATCH

If either layer is silently weakened, PROOF 16 in gate-proof catches it (greps for the filter + assertion presence in the script source) and fails the gate-proof run. ntfy fires high-priority. Operator surfaces the regression before it can do harm.

# How to add new HS-writing code

**DO**: import canonical patterns + add to PROOF 16 allow-list
```python
JOSEPH_OWNER_ID = "88361194"  # R-01

# Search filter MUST include hubspot_owner_id
filterGroups: [{filters: [
    {propertyName: "trackingsource", operator: "EQ", value: lead_id},
    {propertyName: "hubspot_owner_id", operator: "EQ", value: JOSEPH_OWNER_ID},
]}]

# Pre-PATCH assertion (defense in depth)
def patch_hs_contact(cb, hs_tab, contact_id, props):
    # Re-fetch + verify owner before write
    current = hs_get_contact(contact_id)
    if current.get("hubspot_owner_id") != JOSEPH_OWNER_ID:
        raise RuntimeError(f"R-01 ABORT: contact {contact_id} owner = {current.get('hubspot_owner_id')!r}, not Joseph")
    # ... proceed with PATCH
```

**Add the new file to PROOF 16's HS-writers allow-list** in `gate_proof_runner.sh`:
```bash
hs_writers = [
    '$SR_LIB/sot_reconciler.py',
    '$SR_LIB/bv_hs_sync.py',
    '$SR_LIB/fix_hs_city_drift.py',
    '$SR_LIB/create_missing_hs_contacts.py',
    '$SR_LIB/your_new_writer.py',  # ADD HERE
]
```

PROOF 16 verifies every listed file references `88361194`. Adding a writer without the reference fails the gate.

**DON'T**: write HS via `fetch()` in skill markdown without going through the canonical helpers. Skill markdown CAN show example HS API code (PROOF 16 doesn't gate skill files because they're documentation, not executable code), but actual writes must originate from a Python helper that PROOF 16 can verify.

# Files

- `~/Library/Application Support/SkyRun/sot_reconciler.py` — the reconciler
- `~/Library/Application Support/SkyRun/gate_proof_runner.sh` — PROOF 16 (lines 705+)
- `~/.claude/scheduled-tasks/sot-reconciler/SKILL.md` — cron skill (every 15 min)
- `~/.claude/projects/-Users-josephbowens-Desktop-SkyRun/memory/RULES.md` — master rule registry
- `~/.claude/projects/-Users-josephbowens-Desktop-SkyRun/memory/feedback_hs_scope_joseph_only.md` — R-01 documentation

# State as of 2026-05-04 PM

- 866 SoT leads · 931 Joseph-owned HS contacts
- First live reconciler run: 58 patches landed (21 email primary syncs, 53 additional_emails syncs, 10 phone syncs, 11 missing-LID-token added)
- 5 HS API errors (operator-review)
- 4 F1 cross-contamination flagged
- 4 F2 dup-token flagged
- 0 F3 unowned-with-LID auto-claimed (correct — not Joseph's by R-01)
- Gate-proof: 92/92 passing including PROOF 16
- Cron: every 15 min, idempotent, max drift age = 15 min

# Joseph's bar (the standard the architecture must meet)

> "100% pristine data integrity across the entire system 100% of the time. Nothing should get lost or dropped or missed, EVER."

The reconciler is the mechanism that delivers this for the SoT↔HS axis. SmartLead axis enforcement is partial (Step 7b auto-load + bounce-handler rotation) — true 2-way SL reconciliation requires SL API access we don't have at our tier. The Step 7b CSV import path remains the bridge.
