Failure

Two concurrent cron fires from different seats both filed CMB-0036 within 6 minutes, forcing a renumber

inter-agentfeed-protocol

Summary

On 2026-04-15 between 03:51Z and 03:57Z, two seats in the overnight Claude mesh independently filed CMB-0036 to the fleet feed for two different defects. CLAUDE-CLI-MacBook-Air-CAMBER-01 filed at 03:51:49Z (Shayelyn placeholder last_snippet matview defect). CLAUDE-CLI-MacBook-Air-ORA-01 filed at 03:57:54Z (outbound SMS ingestion cutoff at 2026-04-01T22:34:12Z). Both seats were running independent cron schedules with no shared ticket-number reservation mechanism. The collision was resolved by an explicit "first post wins" amendment at 04:00:32Z that renumbered ORA-01's filing to CMB-0037, but only after both diagnostics had already been published under the same ID.

Evidence

Timeline

  • 03:51:49Z — CAMBER-01 fire 1 filed CMB-0036 for the Shayelyn placeholder defect with WRITE-PROPOSED status.
  • 03:57:54Z — ORA-01 fire 2 filed CMB-0036 for the outbound SMS ingestion cutoff. ORA-01 had not yet read CAMBER-01's 03:51:49Z post when the cron fired (6-minute gap, no inter-fire pull).
  • 04:00:32Z — ORA-01 fire 3 read CAMBER-01's 03:51:49Z post, recognized the collision, posted an AMENDMENT that renumbered ORA-01's filing to CMB-0037 per "first post wins the number". Both tickets carried forward under unique IDs from that point.
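The "first post wins the number" rule applied at 04:00:32Z can be sketched as a small reconciliation pass. This is an illustrative model only: the Filing type and the first_post_wins and ticket_id helpers are hypothetical names, not part of the feed tooling, and moving the later post to the next free number is an assumption that happens to match the CMB-0037 outcome above.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone


@dataclass(frozen=True)
class Filing:
    seat: str
    filed_at: datetime
    ticket: int  # numeric part of CMB-NNNN


def ticket_id(number):
    return f"CMB-{number:04d}"


def first_post_wins(filings):
    """Return filings in time order with unique numbers; the earliest keeps its number."""
    taken = set()
    resolved = []
    for filing in sorted(filings, key=lambda f: f.filed_at):
        number = filing.ticket
        while number in taken:  # a later post moves to the next free number
            number += 1
        taken.add(number)
        resolved.append(replace(filing, ticket=number))
    return resolved


# Timestamps taken from the timeline above.
filings = [
    Filing("ORA-01", datetime(2026, 4, 15, 3, 57, 54, tzinfo=timezone.utc), 36),
    Filing("CAMBER-01", datetime(2026, 4, 15, 3, 51, 49, tzinfo=timezone.utc), 36),
]

for filing in first_post_wins(filings):
    print(filing.seat, ticket_id(filing.ticket))
```

In this model CAMBER-01 keeps CMB-0036 and ORA-01 moves to CMB-0037. Note that a bumped filing can cascade into later numbers if those are already taken, which is one reason the rule works better as a fallback than as the primary allocator.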

Why it happened

  • Both seats were running independent CronCreate schedules (every ~22 minutes for ORA-01, every ~6 minutes for CAMBER-01).
  • Neither schedule consulted the live reducer fleet-active-queue before allocating the next ticket number — both seats incremented from local memory of the highest CMB-NNNN they had seen in the most recent feed pull.
  • ORA-01's most recent feed pull predated CAMBER-01's 03:51:49Z filing, and nothing triggered a fresh pull in the roughly 6 minutes before fire 2, so fire 2 allocated from stale state.
  • No two-phase file-then-confirm protocol existed: a seat could file a number and have no way to detect a concurrent collision before publishing.
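The mechanism in the bullets above can be reproduced with a toy model. The Seat class, its highest_seen field, and the in-memory feed list are hypothetical stand-ins for the real cron fires and feed file; the starting value of 35 is assumed only so that the next number is 36.

```python
class Seat:
    """Toy model of a seat that allocates from its memory of the last feed pull."""

    def __init__(self, name):
        self.name = name
        self.highest_seen = 0

    def pull(self, feed):
        # Refresh local memory from the shared feed.
        self.highest_seen = max([self.highest_seen, *(n for _, n in feed)])

    def file(self, feed):
        # Allocate from local memory without consulting the feed first.
        number = self.highest_seen + 1
        feed.append((self.name, number))
        self.highest_seen = number
        return number


feed = [("earlier-post", 35)]  # assumed prior state of the feed
camber, ora = Seat("CAMBER-01"), Seat("ORA-01")
camber.pull(feed)
ora.pull(feed)

camber_number = camber.file(feed)  # 03:51:49Z in the timeline
ora_number = ora.file(feed)        # 03:57:54Z, no pull in between
print(camber_number, ora_number)   # both seats allocate the same number

# Pulling immediately before filing avoids this particular interleaving.
ora.pull(feed)
fresh_number = ora.file(feed)
print(fresh_number)
```

A pull immediately before filing narrows the race window but does not close it: two seats can still pull, then both file, before either sees the other's post.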

Impact

  • One feed amendment cycle (one extra post at 04:00:32Z) to renumber and reconcile.
  • ~6 minutes of cross-seat ambiguity in which the two seats were unaware of each other's diagnostics.
  • No data loss; no incorrect dispositions; both tickets ended up with unique numbers within ~9 minutes of the first filing (03:51:49Z to 04:00:32Z).
  • The cost was real but bounded. It lay less in the renumber itself than in the asymmetric burden on whichever seat read second, which had to post the amendment.

Implications

  • Pattern: any concurrent multi-seat cron mesh that allocates ticket numbers from a shared namespace can collide whenever one seat files between another seat's last shared-state pull and that seat's next filing. With two seats and ~6-minute fire windows, the chance of at least one collision within a 4-hour overnight cycle is non-trivial.
  • Same class as ORA-2026-0003 / 0004 / 0006: a default that hid coordination state (in this case, "the highest CMB-NNNN I've seen") became wrong when concurrent state existed. The fix is to make the allocation explicit at the write site.
  • Fix: cron-mesh ticket allocation discipline. Codified in ORA-2026-0012.
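One way to make allocation explicit at the write site is an atomic reservation step before publishing. The sketch below is not the content of ORA-2026-0012, which is not reproduced here. It assumes all seats can reach a shared claims directory, and the reserve_ticket function and the ".claim" file naming are hypothetical. It relies on os.open with O_CREAT | O_EXCL, which fails if the file already exists, so only one caller can create a given claim file. That guarantee holds on local filesystems and may be weaker on some network filesystems.

```python
import os
import tempfile


def reserve_ticket(claims_dir, seat, start):
    """Claim the first free CMB number at or above `start` and return it."""
    number = start
    while True:
        path = os.path.join(claims_dir, f"CMB-{number:04d}.claim")
        try:
            # Exclusive create: raises FileExistsError if another seat got here first.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            number += 1
            continue
        with os.fdopen(fd, "w") as claim:
            claim.write(seat)  # record which seat owns the number
        return number


# Both seats believe the next number is 36, as in the incident.
with tempfile.TemporaryDirectory() as claims_dir:
    camber = reserve_ticket(claims_dir, "CAMBER-01", start=36)
    ora = reserve_ticket(claims_dir, "ORA-01", start=36)
    print(camber, ora)
```

With a reservation step like this, a stale local counter costs a seat one retry rather than a published duplicate, and no seat has to amend after the fact.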

References

  • Feed posts: /Users/chadbarlow/Desktop/FLEET_FEED.md lines 1205–1218 (the 04:00:32Z amendment).
  • Backlog flag: ORA-01 noted at line 1218 — "Filing to ORA for morning read as a doctrine candidate: cron-mesh ticket allocation discipline. Not a doctrine tonight — just a flag."