PicoClaw Deep Dive: A Field Guide to Building an Ultra-Light AI Agent in Go
A comprehensive, actionable guide to the principles, techniques, and architecture behind sipeed/picoclaw, written so you can build a similar system from scratch.

What PicoClaw Is and Why It Matters
Design Philosophy
High-Level Architecture
Core Concept #1: The Agent Loop & Pipeline
Core Concept #2: Steering (Mid-Loop Message Injection)
Core Concept #3: SubTurn (Hierarchical Sub-Agents)
Core Concept #4: Sessions & JSONL Persistence
Core Concept #5: Rule-Based Model Routing
Core Concept #6: The Hook System
Core Concept #7: Channel Abstraction (18+ chat platforms)
Core Concept #8: Provider Abstraction (30+ LLMs)
Core Concept #9: Tools, Skills, and MCP
Resource-Efficiency Techniques (the

                                     + aliases │
└───────────────────────┬──────────────────────┘
                        ▼
┌──────────────────────────────────────────────┐
│             pkg/agent (the loop)             │
│                                              │
│  pipeline_setup → pipeline_llm →             │
│  pipeline_execute (tools) → pipeline_finalize│
│                                              │
│   ┌──────────┐  ┌──────────┐  ┌──────────┐   │
│   │ steering │  │ subturn  │  │  hooks   │   │
│   └──────────┘  └──────────┘  └──────────┘   │
│                                              │
│        ▲                          ▲          │
│        │ tools                    │ MCP      │
└────────┼──────────────────────────┼──────────┘
         │                          │
 ┌───────┴────────┐         ┌───────┴────────┐
 │ pkg/tools      │         │ pkg/mcp        │
 │ fs / shell /   │         │ isolated       │
 │ hardware /     │         │ command        │
 │ search ...     │         │ transport      │
 └────────────────┘         └────────────────┘

┌──────────────────────────────────────────────┐
│  pkg/providers (factory + facades)           │
│  anthropic / openai_compat / azure /         │
│  bedrock / oauth / cli ...                   │
│  cooldown · ratelimiter · fallback ·         │
│  error_classifier                            │
└──────────────────────────────────────────────┘

Three top-level binaries are produced from cmd/:

picoclaw: the agent itself (CLI + headless server)
picoclaw-launcher-tui: terminal UI launcher
membench: internal memory benchmark used to keep the

Stable, opaque, the source of truth
| Legacy | agent:main:direct:user123 | Backward compat, resolved transparently |

The JSONL backend resolves legacy aliases to canonical keys during reads and writes, so you can rename schemes without losing history.

Per session:

.jsonl: one providers.Message per line, append-only.
.meta.json: { summary, created_at, updated_at, line_count, skip_offset, scope, aliases }.

Why two files: messages are append-only and crash-safe; metadata is overwritten under a per-shard mutex but small enough that a torn write is recoverable from the JSONL.

"Designed around append-first durability and stale-over-loss recovery."

The allocator turns inbound metadata into scope values:

space → :
chat → :
topic → topic:
sender → canonicalized through identity-link mappings (so that a user's Telegram ID and Slack ID map to the same logical sender)

Special case: Telegram forum topics append / to chat values when topic is not an explicit dimension, preventing topic cross-talk by default.

A 64-shard mutex array (hash key → shard) serializes per-session writes without keeping an unbounded mutex map. This is a small but important pattern: lock striping is essentially free and fixes 99% of session-store contention bugs.

On startup the system attempts to migrate legacy JSON sessions into JSONL. If migration fails, it falls back to the legacy SessionManager rather than crash-looping the agent.

Make session keys content-addressed (sha256 over a canonical scope signature) so renaming dimensions doesn't break history.
Sidecar metadata is far simpler than embedding a header line in the JSONL.
Lock striping > one big mutex > one mutex per session.
64 shards is a good default.

pkg/routing is a two-stage pipeline:

Agent dispatch: Router picks which agent definition handles the message (rules over channel, sender, content, command-prefix, etc.).
Model routing: once an agent is chosen, the RuleClassifier decides whether to use the agent's primary (heavy) model or a globally-configured cheap light model.

    {
      "routing": {
        "enabled": true,
        "light_model": "gemini-2.0-flash",
        "threshold": 0.35
      }
    }

The classifier is intentionally language-agnostic (no keyword lists), using five structural features:

| Feature | What it measures |
|---|---|
| TokenEstimate | Approximate token count (CJK-aware rune counting) |
| CodeBlockCount | Number of fenced code blocks in latest message |
| RecentToolCalls | Tool invocations in the last 6 history entries |
| ConversationDepth | Total history length |
| HasAttachments | Media references or recognized file extensions |

| Signal | Weight |
|---|---|
| Has attachments | 1.00 |
| Code block present | 0.40 |
| Tokens > 200 | 0.35 |
| Recent tool calls > 3 | 0.25 |
| Tokens > 50 | 0.15 |
| Recent tool calls 1–3 | 0.10 |
| Conversation depth > 10 | 0.10 |

With threshold 0.35, trivial chat stays cheap; code, attachments, or active tool use trigger heavy. Long plain prompts cross at the 200-token boundary.

pkg/agent/turn_coord.go swaps the candidate provider list to agent.LightCandidates when score

to chat values when topic isn't an explicit dimension.
| Tool side effects after a user correction | Skip remaining tools on steering arrival; emit explicit skip results. |
| Orphan SubTurn results crashing parent | 16-slot result buffer + Critical: true for must-finish work. |
| context.Background() vs parent ctx confusion | Document explicitly in your SubTurn API; default to independent timeouts. |
| API keys in plaintext config | Two files: config.json + .security.yml with stricter perms. |
| Memory regressions slipping in | Ship membench and gate it in CI. |
| MIPS LE binaries refused by kernel | Patch ELF e_flags at offset 36 after build. |
| Hooks blocking turns | Per-class timeouts: observer 200ms, interceptor 5s, approval 30s. |
| Rebuilding when adding a provider | Provider config is protocol/model strings; factory dispatches at runtime. |
| Schema drift between sessions | Lazy migration in JSONL backend; never edit applied "migrations", append new ones. |
| Routing rules buried in code | Routing is data: JSON rules + features. Hot-reload friendly. |
| 30 channels each duplicating retry logic | Centralize retry/split/rate-limit in manager.go; channels send a single chunk. |
| MCP server bug killing the agent | Spawn each MCP server in an isolated process via isolated_command_transport. |
| One mutex around the session store | 64-shard mutex array on hash(key). |

If you read these files in this order, the architecture clicks fast:

1. cmd/picoclaw/main.go: the boot sequence.
2. pkg/bus/types.go: the typed message contract that flows through the whole system.
3. pkg/agent/definition.go: what an agent is as data.
4. pkg/agent/pipeline.go → pipeline_setup.go → pipeline_llm.go → pipeline_execute.go → pipeline_finalize.go: the loop.
5. pkg/agent/turn_coord.go: the brains tying routing, providers, and steering together.
6. pkg/agent/steering.go: the most copy-worthy single concept in the project.
7. pkg/agent/subturn.go: sub-agent semantics.
8. pkg/session/manager.go + jsonl_backend.go + allocator.go: durable state.
9. pkg/routing/router.go + classifier.go + features.go: cheap-first routing.
10. pkg/agent/hooks.go + hook_mount.go + hook_process.go: extensibility.
11. pkg/channels/manager.go + base.go + interfaces.go: channel abstraction.
12. pkg/providers/factory.go + cooldown.go + fallback.go + error_classifier.go: provider reliability stack.
13. pkg/tools/registry.go + toolloop.go: tool execution.
14. pkg/mcp/manager.go + isolated_command_transport.go: MCP integration.
15. pkg/skills/registry.go + installer.go: plugin marketplace.
16. Makefile: cross-compilation matrix, ELF patching, version stamping.
17. docs/architecture/*.md: official narrative for steering, subturn, sessions, routing, hooks.

Use Go. Static binaries, small RSS, uniform across architectures.
Typed message bus with first-class Peer, Sender, MessageID.
Pipelined agent loop: setup → LLM → tools → finalize, with a turn state struct.
Steering: per-session FIFO queue polled at 4 checkpoints; skipped tools get explicit results.
SubTurns with depth ≤ 3, concurrency ≤ 5, independent timeouts, Critical flag for must-finish.
Sessions: structured SessionScope → canonical sk_v1_ key, JSONL + .meta.json, 64-shard locking.
Routing: classifier with 5 structural features, weighted score, light_model below threshold.
Hooks: 5 sync points + observer events, in-process or JSON-RPC over stdio, per-class timeouts.
Channels: each in its own sub-package, embed BaseChannel, declare optional capabilities by interface, manager owns retries/splitting/rate-limit.
Providers: factory + facades + cooldown + ratelimiter + fallback + error_classifier, configured by protocol/model strings, secrets in .security.yml.
Tools / MCP / Skills: in-process tools for built-ins; MCP for untrusted external tools (isolated transport); skills as installable bundles from a registry.
Bounded queues, streaming, lazy init, -ldflags="-s -w", -trimpath, membench regression gate.
Cross-compile to amd64/arm/arm64/riscv64/mipsle + Darwin + Windows + NetBSD; patch MIPS ELF e_flags; ship a launcher that auto-picks the binary.

Build steps 1–12 from §16 in order, validate with the patterns in §17, and you have a PicoClaw-class agent.

If you found this helpful, let me know by leaving a 👍 or a comment, and if you think this post could help someone, feel free to share it! Thank you very much!
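As a closing illustration of the routing item in the checklist, the weighted score can be sketched as below. The weights and the 0.35 threshold come from the routing tables in this post; everything else is an assumption for the sake of the example, not PicoClaw's actual code: the `Features` struct layout, the `score` and `useLightModel` names, treating the token and tool-call signals as mutually exclusive tiers, and comparing with "score below threshold means light" so that a plain prompt over 200 tokens lands on the heavy model.

```go
package main

import "fmt"

// Features holds the five structural signals named in the post.
// Field names follow the feature table; the struct itself is hypothetical.
type Features struct {
	TokenEstimate     int
	CodeBlockCount    int
	RecentToolCalls   int
	ConversationDepth int
	HasAttachments    bool
}

// score sums the weights of the signals that fire.
// Assumption: only the highest matching token tier and the highest
// matching tool-call tier contribute.
func score(f Features) float64 {
	s := 0.0
	if f.HasAttachments {
		s += 1.00
	}
	if f.CodeBlockCount > 0 {
		s += 0.40
	}
	switch {
	case f.TokenEstimate > 200:
		s += 0.35
	case f.TokenEstimate > 50:
		s += 0.15
	}
	switch {
	case f.RecentToolCalls > 3:
		s += 0.25
	case f.RecentToolCalls >= 1:
		s += 0.10
	}
	if f.ConversationDepth > 10 {
		s += 0.10
	}
	return s
}

// useLightModel reports whether the cheap model should handle the turn.
func useLightModel(f Features, threshold float64) bool {
	return score(f) < threshold
}

func main() {
	const threshold = 0.35

	// Trivial chat: nothing fires, so the light model is used.
	fmt.Println(useLightModel(Features{TokenEstimate: 12}, threshold)) // true

	// A short message with one code block scores 0.40: heavy model.
	fmt.Println(useLightModel(Features{TokenEstimate: 30, CodeBlockCount: 1}, threshold)) // false

	// A long plain prompt reaches exactly 0.35: heavy model.
	fmt.Println(useLightModel(Features{TokenEstimate: 250}, threshold)) // false
}
```

Because the inputs are plain counts and a boolean, the same function works for any language of user text, which matches the "no keyword lists" design goal.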
