I built a 20 kB React hook that doesn't care which AI you use — here's how streaming actually works
Most React AI chat libraries are secretly backend libraries. They stream directly from OpenAI, or through their own cloud, or via a framework-specific server.

But here's the thing: streaming AI chat is fundamentally just three events:

```
data: {"type":"text","text":"Hello"}
```

That's it. `text`, `done`, `error`. Your React component shouldn't need to know anything more than that.

So I built react-ai-stream (https://github.com/trimooo/react-ai-stream) — a backend-agnostic React hook for streaming AI chat.

## The architecture

Here's the full picture:

```
React UI  --(SSE: {type, text})-->  your /api/chat route  -->  any LLM provider
```

The boundary in the middle is everything. The React layer speaks `{type, text}` over SSE. The server side can talk to whichever provider you want, and you can swap it without touching the frontend.

## How streaming actually works

Most tutorials skip the networking part. Here's what's actually happening.

Server-Sent Events (SSE) is a one-directional HTTP protocol: the browser makes a single request, and the server keeps the connection open, pushing plain-text events down it whenever it has something to say.

```
HTTP/1.1 200 OK
Content-Type: text/event-stream

data: {"type":"text","text":"Hello"}

data: {"type":"text","text":", world"}

data: {"type":"done"}
```

The double newline (`\n\n`) is the event delimiter. Your API route receives the user's messages, calls the LLM provider, and writes each chunk back as a `data:` event.

## The buffering problem nobody talks about

Here's where most implementations have a subtle bug. Network chunks don't align with SSE event boundaries: a single chunk can end halfway through an event. The correct pattern:

```ts
let buf = ''
// for each network chunk:
buf += chunk
const parts = buf.split('\n\n')
buf = parts.pop()!   // keep the incomplete trailing event
for (const part of parts) {
  // every entry in `parts` is a complete "data: ..." event, safe to JSON.parse
}
```

The critical invariant: `buf = parts.pop()` keeps the incomplete trailing event. If you write `buf = ''` instead, any event that was split across two chunks gets silently corrupted.

## 10 lines to a streaming chat

```tsx
'use client'

export default function Page() {
  // useAIChat({ endpoint: '/api/chat' }), then render `messages` with your own markup
}
```

The hook has no dependency on the UI package. You can wire `messages` to any component — Tailwind, a component library, or plain JSX.

## Why "backend-agnostic" is the right abstraction

Compare these two approaches:

- Coupled approach — OpenAI SDK in the browser: your LLM choice is now in your bundle.
- Decoupled approach — the hook speaks HTTP to your own endpoint: the frontend doesn't know or care what's behind it.

The server-side API route handles provider selection. It might route to Anthropic by default, fall back to another provider on errors, or switch per request with a query parameter. The React code never changes.

This also means you can run three providers simultaneously in complete isolation:

```ts
const claude = useAIChat({ endpoint: '/api/chat?provider=anthropic' })
const gpt    = useAIChat({ endpoint: '/api/chat?provider=openai' })
const llama  = useAIChat({ endpoint: '/api/chat?provider=groq' })
```

Each instance has its own message history, loading state, and abort controller. No shared context or global provider required.

## The React rendering challenge

The naive implementation of streaming into React state has a real performance problem:

```ts
for await (const token of tokens) {
  // This fires a state update — and a re-render — for every token.
  setText(prev => prev + token)
}
```

React 18 batches some updates, but each awaited chunk lands in its own task, so per-token updates aren't batched together. During fast streaming you end up re-rendering the message list dozens of times per second.

The library solves this by using Zustand's `createStore` (the vanilla, framework-agnostic version), so the store lives outside React. The mutation rate and the render rate are decoupled: the store can receive 100 tokens/second without that rate being forced onto the rest of your component tree.

This also enables true isolation. Each `useAIChat()` call creates its own store instance via a ref.

## How abort propagates end-to-end

The stop button works through a chain of signals most people don't trace all the way: user clicks Stop, the hook calls its AbortController's `abort()`, the in-flight fetch is cancelled, and the browser drops the SSE connection.

On the server side, `req.signal` reflects this abort too. Forwarding it to the upstream LLM call means the provider stops generating as soon as the user stops reading:

```ts
const upstream = await fetch(LLM_API_URL, {
  signal: req.signal,  // client abort cancels the provider request too
  // ...
})
```

That's waste reduction at the infrastructure level, not just UI polish.

## What's in the library

Three packages, all MIT, ~20 kB total: `@react-ai-stream/core`, `@react-ai-stream/react` (the hook), and the optional UI package.

Built with: TypeScript strict mode, tsup (ESM + CJS), Vitest (34 tests), Turborepo monorepo.
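If you want to see the server half of that boundary in one place, here's a minimal sketch of an SSE route. It assumes a Next.js-style route handler; the `LLM_API_URL` environment variable, the auth header, and the raw relay of upstream chunks are placeholders (a real route would parse your provider's streaming format into `text` events, and the `error` payload shape here is only illustrative).

```ts
// app/api/chat/route.ts : sketch only, provider-specific details are placeholders
export async function POST(req: Request) {
  const { messages } = await req.json()

  // Forward the client's abort signal so the provider stops generating on Stop.
  const upstream = await fetch(process.env.LLM_API_URL!, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.LLM_API_KEY}`,
    },
    body: JSON.stringify({ messages, stream: true }),
    signal: req.signal,
  })

  const encoder = new TextEncoder()
  const decoder = new TextDecoder()

  const stream = new ReadableStream({
    async start(controller) {
      const send = (event: object) =>
        controller.enqueue(encoder.encode(`data: ${JSON.stringify(event)}\n\n`))
      try {
        const reader = upstream.body!.getReader()
        while (true) {
          const { value, done } = await reader.read()
          if (done) break
          // Placeholder: relay raw chunks. Real code parses the provider's
          // stream format and emits one {type:"text"} event per token.
          send({ type: 'text', text: decoder.decode(value, { stream: true }) })
        }
        send({ type: 'done' })
      } catch (err) {
        // Illustrative error payload; the important part is the "error" event type.
        if (!req.signal.aborted) send({ type: 'error', error: String(err) })
      } finally {
        try { controller.close() } catch { /* stream may already be cancelled */ }
      }
    },
  })

  return new Response(stream, {
    headers: {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
      Connection: 'keep-alive',
    },
  })
}
```

The hook on the other side only ever sees `text`, `done`, and `error` events, which is the whole point of the boundary.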
## Try it

```bash
npm install @react-ai-stream/react
```

- Live demo (https://react-ai-stream-example.vercel.app) — three models streaming in parallel via Groq
- Docs (https://react-ai-stream-docs.vercel.app) — quickstart, provider setup, API reference
- GitHub (https://github.com/trimooo/react-ai-stream) — source, examples, architecture deep-dive

The architecture page (https://react-ai-stream-docs.vercel.app/architecture) and How streaming works (https://react-ai-stream-docs.vercel.app/concepts/streaming-explained) have the full technical detail.

## What I'd like to hear

If you've built AI chat in React, I'm curious: what was the hardest part? Provider coupling, SSE parsing, render performance, or something else entirely?
