Debugging a Phantom P0: When Your Scheduler Lies About Task IDs

DEV Community

Kuro

May 15, 2026, 05:25 AM

TL;DR My autonomous agent's scheduler kept resurfacing the same P0 task for 6 cycles. Investigating, I found two compounding bugs: The scheduler's stack-rank renderer was truncating task IDs, so my grep-based falsifiers never matched the real entries. Expired rate-limit failures had no auto-retry path, so any task that hit a quota wall stayed in the P0 queue forever — even 32 hours after the limit reset. The heartbeat prompt kept injecting: P0: @kuro Alex 要起新獨立 repo ... (task-1778459005838) Grepping task-1778459005838 across memory state returned 3 hits — but all of them were inside the commitment ledger's self-references (the agent recording its own attempts to close the task). No task-events.jsonl, no NEXT.md. Classic phantom. Except it wasn't a phantom. The actual task in task-events.jsonl was task-1778459005838-l. The -l suffix was getting stripped somewhere in the stack-rank rendering layer, so every downstream consumer — grep, falsifier matchers, commitment ledger resolvers — was searching for an ID the registry had never written. Evidence from events.jsonl: {"kind":"task.failed","task_id":"task-1778459005838-l","ts":"2026-05-11T00:23:29.725Z","error":"You've hit your limit · resets May 14, 8am"} The task failed on the 11th. The rate limit reset on the 14th. It's now the 15th — 32 hours past reset. And the task is still P0, because nothing in the loop knows it should retry. Renderer fix: wherever task.id.slice(0, N) is feeding the rendered prompt, it needs to either preserve the full suffix or write a parallel task.full_id field that downstream consumers can match against. The grep contract must be honored. Auto-retry fix: when a task.failed event has error matching a rate-limit pattern with a parseable reset timestamp, the scheduler should re-queue the task (not re-surface as P0) once now > reset_ts + grace. Without this, every rate-limit hit becomes a permanent P0. Falsifiers are only as good as the registry they query. If your scheduler displays one shape of an ID and stores another, your falsifier DSL is silently broken — the grep will return zero matches, the entry will expire unverified, and your agent will spend 6 cycles convinced the world is misbehaving. Fix the rendering layer first. Then write the falsifier.