Failure
Two concurrent cron fires from different seats both filed CMB-0036 within 6 minutes, forcing a renumber
Summary
On 2026-04-15 between 03:51Z and 03:57Z, two seats in the overnight Claude mesh independently filed CMB-0036 to the fleet feed for two different defects. CLAUDE-CLI-MacBook-Air-CAMBER-01 filed at 03:51:49Z (Shayelyn placeholder last_snippet matview defect). CLAUDE-CLI-MacBook-Air-ORA-01 filed at 03:57:54Z (outbound SMS ingestion cutoff at 2026-04-01T22:34:12Z). Both seats were running independent cron schedules with no shared ticket-number reservation mechanism. The collision was resolved by an explicit "first post wins" amendment at 04:00:32Z that renumbered ORA-01's filing to CMB-0037, but only after both diagnostics had already been published under the same ID.
Evidence
Timeline
- 03:51:49Z: CAMBER-01 fire 1 filed `CMB-0036` for the Shayelyn placeholder defect with WRITE-PROPOSED status.
- 03:57:54Z: ORA-01 fire 2 filed `CMB-0036` for the outbound SMS ingestion cutoff. ORA-01 had not yet read CAMBER-01's 03:51:49Z post when the cron fired (6-minute gap, no inter-fire pull).
- 04:00:32Z: ORA-01 fire 3 read CAMBER-01's 03:51:49Z post, recognized the collision, and posted an `AMENDMENT` that renumbered ORA-01's filing to `CMB-0037` per "first post wins the number". Both tickets carried forward under unique IDs from that point.
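The "first post wins the number" rule applied in the amendment can be sketched as a small resolver. This is a minimal illustration, not the mesh's actual tooling: the `Filing` type, the `resolve_first_post_wins` function, and the policy of bumping the later claimant to the next free number are assumptions chosen to match the outcome in the timeline above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Filing:
    seat: str
    filed_at: datetime
    ticket: int  # numeric part of CMB-NNNN as originally claimed


def resolve_first_post_wins(filings):
    """Earliest claimant keeps the number; later claimants move to the next free one."""
    taken = set()
    resolved = []
    for filing in sorted(filings, key=lambda f: f.filed_at):
        number = filing.ticket
        while number in taken:  # number already claimed by an earlier post
            number += 1
        taken.add(number)
        resolved.append((filing.seat, f"CMB-{number:04d}"))
    return resolved


def at(hms):
    """Build a UTC timestamp on the incident date from an HH:MM:SS string."""
    parsed = datetime.strptime(f"2026-04-15 {hms}", "%Y-%m-%d %H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc)


filings = [
    Filing("ORA-01", at("03:57:54"), 36),
    Filing("CAMBER-01", at("03:51:49"), 36),
]
print(resolve_first_post_wins(filings))
# [('CAMBER-01', 'CMB-0036'), ('ORA-01', 'CMB-0037')]
```

The rule is deterministic given the feed timestamps, so any seat can apply it, but it only runs after both posts are already public.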
Why it happened
- Both seats were running independent CronCreate schedules (every ~22 minutes for ORA-01, every ~6 minutes for CAMBER-01).
- Neither schedule consulted the live reducer `fleet-active-queue` before allocating the next ticket number; both seats incremented from local memory of the highest CMB-NNNN they had seen in the most recent feed pull.
- The 6-minute gap between CAMBER-01's filing and ORA-01's next pull was wider than CAMBER-01's filing window, so ORA-01's fire 2 had stale state.
- No two-phase file-then-confirm protocol existed: a seat could file a number and have no way to detect a concurrent collision before publishing.
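The stale-state race described in this list can be reproduced in a few lines. The sketch below is a toy model under stated assumptions: the `Seat` class, the in-memory `feed` list, and the starting value of 35 as the highest number both seats had seen are hypothetical stand-ins, not the real seats or feed format.

```python
from collections import Counter


class Seat:
    """Hypothetical model of a seat that numbers tickets from local memory."""

    def __init__(self, name):
        self.name = name
        self.highest_seen = 0  # refreshed only when the seat pulls the feed

    def pull(self, feed):
        self.highest_seen = max((number for _, number in feed), default=0)

    def file(self, feed):
        # Flawed step: the next number comes from local memory, not live state.
        number = self.highest_seen + 1
        feed.append((self.name, number))
        self.highest_seen = number
        return number


def collisions(feed):
    """Return ticket numbers that appear more than once in the feed."""
    counts = Counter(number for _, number in feed)
    return sorted(n for n, c in counts.items() if c > 1)


feed = [("earlier-seat", 35)]  # assumed state before either fire
camber, ora = Seat("CAMBER-01"), Seat("ORA-01")
camber.pull(feed)
ora.pull(feed)

camber.file(feed)  # files 36
ora.file(feed)     # has not pulled since CAMBER-01 filed, so also files 36
print(collisions(feed))  # [36]
```

Pulling immediately before filing would narrow the window in this model but not close it, since two seats can still pull and file in interleaved order.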
Impact
- One feed amendment cycle (one extra post at 04:00:32Z) to renumber and reconcile.
- ~6 minutes of cross-seat ambiguity during which neither seat was aware of the other's diagnostic.
- No data loss; no incorrect dispositions; both tickets ended up with unique numbers within ~9 minutes of the first filing.
- The tax was real but bounded. The cost was not the renumber — it was the asymmetric burden on whichever seat read second, which had to amend.
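The durations quoted above follow directly from the three feed timestamps. A quick check, using only the times recorded in the timeline (the helper `t` and the variable names are illustrative):

```python
from datetime import datetime


def t(hms):
    """Parse an HH:MM:SS time; all three events fall on the same UTC date."""
    return datetime.strptime(hms, "%H:%M:%S")


first_filing = t("03:51:49")   # CAMBER-01 files CMB-0036
second_filing = t("03:57:54")  # ORA-01 files CMB-0036
amendment = t("04:00:32")      # ORA-01 renumbers to CMB-0037

gap_between_filings = second_filing - first_filing
first_filing_to_amendment = amendment - first_filing

print(gap_between_filings)        # 0:06:05
print(first_filing_to_amendment)  # 0:08:43
```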
Implications
- Pattern: every concurrent multi-seat cron mesh that allocates ticket numbers from a shared namespace will collide if any individual fire window is wider than the shared-state pull interval. With two seats and ~6-minute fire windows, P(collision) is non-trivial within a 4-hour overnight cycle.
- Same class as ORA-2026-0003 / 0004 / 0006: a default that hid coordination state (in this case, "the highest CMB-NNNN I've seen") became wrong when concurrent state existed. The fix is to make the allocation explicit at the write site.
- Fix: cron-mesh ticket allocation discipline. Codified in ORA-2026-0012.
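One way to make allocation explicit at the write site is to route every number through a single atomic reservation step. The sketch below is an assumption about what such a step could look like, not the contents of ORA-2026-0012: `TicketAllocator`, `reserve`, and the in-process lock are hypothetical, and a real mesh would need a store that all seats can actually reach.

```python
import threading


class TicketAllocator:
    """Hypothetical shared allocator: reserve() is the only way to get a number."""

    def __init__(self, last_issued):
        self._last = last_issued
        self._lock = threading.Lock()
        self.reservations = {}  # ticket ID -> seat that reserved it

    def reserve(self, seat):
        # Read, increment, and record happen as one step under the lock,
        # so two concurrent fires can never receive the same number.
        with self._lock:
            self._last += 1
            ticket = f"CMB-{self._last:04d}"
            self.reservations[ticket] = seat
            return ticket


allocator = TicketAllocator(last_issued=35)  # assumed state before either fire
results = []


def fire(seat):
    results.append((seat, allocator.reserve(seat)))


threads = [threading.Thread(target=fire, args=(s,)) for s in ("CAMBER-01", "ORA-01")]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(sorted(ticket for _, ticket in results))  # ['CMB-0036', 'CMB-0037']
```

Which seat receives which number depends on scheduling; the guarantee is only that the two numbers differ, which removes the need for a later amendment.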
References
- Feed posts: `/Users/chadbarlow/Desktop/FLEET_FEED.md` lines 1205–1218 (the 04:00:32Z amendment).
- Backlog flag: ORA-01 noted at line 1218: "Filing to ORA for morning read as a doctrine candidate: cron-mesh ticket allocation discipline. Not a doctrine tonight — just a flag."