
AI News Hub

How to convert a JSON sample to a Valibot schema (and the 3 ways the algorithm diverges from Zod)

DEV Community

When you sit down to build runtime validation for an API boundary in TypeScript, the first half-hour goes into picking a library. Zod is the default. Valibot is the 2026 upstart you start hearing about once your bundle size gets audited: the same expressive surface, but pipe-based composition and per-primitive tree-shaking instead of a monolithic chainable class. Both libraries answer the same question: given an unknown JSON value at runtime, prove its shape. And both expect you to write the schema by hand. That's the part nobody likes.

So I wrote a converter: paste a JSON sample, get back a Valibot schema you can tighten by hand. The tool is here, free, no signup, runs entirely in the browser: json-to-ts-app.netlify.app.

Internally the Valibot emitter is a sister function to the Zod emitter I wrote up last week. Same shape walk, same naming/uniquify logic, same children-first ordering. But three things have to change once you switch validators, and they're not the things you'd guess from skimming the Valibot README. This post is about those three.

## The walk is unchanged

For each node in the JSON sample:

- **Primitive** → emit the leaf schema (`v.string()`, `v.number()`, `v.boolean()`, `v.null()`).
- **Array** → recurse on every element, dedupe the resulting schemas into a union if mixed, collapse to a bare schema if uniform.
- **Object** → recurse on every value, give the object a `const NameSchema = v.object({...})` binding, push it into the output list after its children so the const order is valid.
- **Optional** → if a key is missing in any sibling sample (multi-sample input), mark its value optional.
- **Mixed types** → wrap them in a union.

Children-first ordering still matters even though we're no longer in Zod-land: `const` bindings don't hoist in JavaScript, so `const Root = v.object({ user: User })` requires `User` to already exist by the time that line runs. Same constraint, same fix. What changes is how each node emits.
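As a rough illustration of that walk, here is a minimal sketch in TypeScript. The names (`emitValibot`, `walk`, `capitalize`) and the simplifications (no multi-sample merging, no name uniquifying, primitive-only union handling) are mine, not the tool's actual internals:

```ts
// Minimal sketch of the children-first shape walk: each object pushes its
// own const binding only after recursing into its values, so children land
// earlier in the output list than their parent.
type Json = string | number | boolean | null | Json[] | { [k: string]: Json };

function emitValibot(sample: Json, rootName = "Root"): string {
  const bindings: string[] = [];

  function walk(node: Json, name: string): string {
    if (node === null) return "v.null()";
    if (typeof node === "string") return "v.string()";
    if (typeof node === "number") return "v.number()";
    if (typeof node === "boolean") return "v.boolean()";
    if (Array.isArray(node)) {
      // Dedupe element schemas; mixed arrays become a union.
      const seen = Array.from(new Set(node.map((el) => walk(el, name + "Item"))));
      if (seen.length === 0) return "v.array(v.unknown())";
      if (seen.length === 1) return "v.array(" + seen[0] + ")";
      return "v.array(v.union([" + seen.join(", ") + "]))";
    }
    // Object: recurse on values first, then register the const binding.
    const fields = Object.entries(node)
      .map(([k, val]) => "  " + k + ": " + walk(val, capitalize(k)) + ",")
      .join("\n");
    const binding = "const " + name + "Schema = v.object({\n" + fields + "\n});";
    bindings.push(binding);
    return name + "Schema";
  }

  walk(sample, rootName);
  return 'import * as v from "valibot";\n\n' + bindings.join("\n\n");
}

function capitalize(s: string): string {
  return s.charAt(0).toUpperCase() + s.slice(1);
}
```

Because the child's recursive call finishes (and pushes its binding) before the parent builds its own, the output order is valid by construction: no separate topological sort needed.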
## Three divergences from Zod, all non-obvious

### 1. `v.optional()` wraps the schema; it doesn't chain on it

In Zod, you write:

```ts
z.string().optional()
```

`.optional()` is a method on the Zod schema instance. It returns a new schema that wraps the inner one's behavior. That's why a long Zod field reads left-to-right, like a fluent builder: `z.string().min(3).max(20).optional()`.

Valibot has no such method. It has a top-level function called `optional` that takes a schema and returns a new one:

```ts
v.optional(v.string())
```

So the emitter's per-key code path can't just append `".optional()"` to whatever `schemaStr` it built. It has to wrap:

```ts
const wrapped = info.optional ? "v.optional(" + schemaStr + ")" : schemaStr;
```

This sounds like a one-line difference, and it is, but it forces a small rethink for anyone porting a Zod codebase by hand. You don't dot a method on at the end; you wrap from the outside. Multi-modifier fields (e.g. optional + nullable) compose as nested calls: `v.optional(v.nullable(v.string()))`, not `z.string().nullable().optional()`. The reading order flips inside-out.

The emitter never emits a multi-modifier field today (JSON samples don't tell you "this is nullable"; they only tell you "this is null sometimes", which becomes a union with `v.null()`). But the wrap pattern is the right primitive for the moment that changes.

### 2. The header is a namespace import

Zod's idiomatic header is destructured:

```ts
import { z } from "zod";
```

The emitter for the Zod path emits exactly that. For Valibot, the emitter switches to a namespace import:

```ts
import * as v from "valibot";
```

Why? Because Valibot's main pitch is per-primitive tree-shaking. Every combinator (`v.string`, `v.object`, `v.optional`, `v.union`) is a separate exported symbol that gets dead-code-eliminated by the bundler if you don't use it. A 5-field schema using only `v.object` and `v.string` should ship ~600 bytes of Valibot, not the 12KB you'd get from a destructured-everything import. Bundlers can in theory tree-shake destructured imports too.
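The wrap-from-the-outside rule for divergence #1 generalizes to any stack of modifiers. A small sketch of my own (the helper name `wrapModifiers` is hypothetical, not the tool's code): applying modifiers in order with a fold leaves the last one outermost, which is exactly the inside-out reading order a wrap-based API produces.

```ts
// Sketch: Valibot-style modifiers wrap rather than chain, so composing a
// stack of them is a fold over strings. The last modifier applied ends up
// outermost in the emitted schema.
function wrapModifiers(schemaStr: string, modifiers: string[]): string {
  return modifiers.reduce(
    (acc, mod) => "v." + mod + "(" + acc + ")",
    schemaStr
  );
}
```

For example, `wrapModifiers("v.string()", ["nullable", "optional"])` produces `"v.optional(v.nullable(v.string()))"`: the nested form, not a left-to-right chain.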
In practice, and Valibot's docs are explicit about this, the namespace form is the path that lines up with the analyzer. Some bundlers (esbuild and Vite are reliable; older Webpack + Babel pipelines less so) drop dead namespace properties cleanly only when the namespace is the import shape. The single-letter `v.` prefix keeps callsites tight either way: `v.object({ id: v.string() })` reads about the same as `z.object({ id: z.string() })`. The cost is paid once in the import line; the win is paid back in every byte of bundle.

### 3. `v.union([...])` is the only composition path: there is no `.or()`

Zod gives you two ways to spell a sum type:

```ts
// chain form
z.string().or(z.number())

// function form
z.union([z.string(), z.number()])
```

Both work. The Zod emitter picks the function form because it reads better at arity > 2 and matches the docs. Valibot only has the function form:

```ts
v.union([v.string(), v.number()])
```

There is no `.or()` method to chain. There is no `.and()` either; `v.intersect([...])` is its analogue. The whole API is composition-by-function-call, never method-chaining. This is part of the same design that drove divergence #1: a Valibot schema is a value, not an object with methods, so all combinators are top-level functions.

For the emitter that means there's only one branch to write. The mixed-type case collapses to:

```ts
if (union.length === 0) return "v.unknown()";
if (union.length === 1) return union[0];
return "v.union([" + union.join(", ") + "])";
```

(The `v.unknown()` for empty arrays is a deliberate echo of the Zod path: a literal `v.never()` would reject any element, which is wrong for a sample that just happened to be empty. `v.unknown()` matches the TS-side `unknown[]`.) The single-type collapse is the same trick the Zod emitter uses: a 1-arg union like `v.union([v.string()])` is degenerate, and the bare schema reads identically. Worth dropping.
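To see the dedupe-and-collapse behavior end to end, here's a small self-contained sketch for arrays of primitives (the helper names `leafSchema` and `emitArray` are mine; nested objects and multi-sample merging are out of scope here):

```ts
// Sketch: how a JSON array's elements become one schema string.
// Map each element to its leaf schema, dedupe, then apply the three-way
// branch: empty → v.unknown(), singleton → bare schema, mixed → v.union.
function leafSchema(value: unknown): string {
  if (value === null) return "v.null()";
  switch (typeof value) {
    case "string": return "v.string()";
    case "number": return "v.number()";
    case "boolean": return "v.boolean()";
    default: return "v.unknown()";
  }
}

function emitArray(sample: unknown[]): string {
  const union = Array.from(new Set(sample.map(leafSchema))); // first-seen order
  if (union.length === 0) return "v.array(v.unknown())";
  if (union.length === 1) return "v.array(" + union[0] + ")";
  return "v.array(v.union([" + union.join(", ") + "]))";
}
```

A uniform `[1, 2, 3]` dedupes to a single `v.number()` and collapses to `v.array(v.number())`; a mixed `[1, "a"]` becomes `v.array(v.union([v.number(), v.string()]))`; an empty `[]` falls back to `v.array(v.unknown())`.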
## A worked example

Paste a Stripe webhook payload into the Stripe webhook → Valibot landing page and you get something like this:

```ts
import * as v from "valibot";

const DataObjectSchema = v.object({
  id: v.string(),
  object: v.string(),
  amount: v.number(),
  currency: v.string(),
  status: v.string(),
  customer: v.optional(v.union([v.string(), v.null()])),
  metadata: v.object({}),
});

const DataSchema = v.object({
  object: DataObjectSchema,
});

const RootSchema = v.object({
  id: v.string(),
  object: v.string(),
  api_version: v.string(),
  created: v.number(),
  data: DataSchema,
  type: v.string(),
});
```

All three divergences show up in this output:

- The header is `import * as v` (divergence #2).
- The optional-and-nullable `customer` field wraps from the outside: `v.optional(v.union([v.string(), v.null()]))` (divergence #1 and divergence #3 stacked).
- Children come first: `DataObjectSchema` is declared before `DataSchema`, which is declared before `RootSchema`. Flip the order and the bundler / TS compiler complains about referencing a `const` before its declaration.

The same machinery handles an AWS Lambda event → Valibot conversion, a webhook from any other provider, or any pasted JSON sample. The output is meant to be a starting point: you'll usually tighten `v.string()` into `v.pipe(v.string(), v.email())` for the email field, narrow `v.string()` to `v.literal("payment_intent.succeeded")` for known event types, and so on. But you don't write the 80-line `v.object` shell by hand.

Two reasons people pick Valibot over Zod in 2026:

1. **Bundle size.** A medium-sized validation surface (say, 30 schemas across 8 endpoints) drops from ~14KB Zod-minified to ~2-3KB Valibot-minified, because every combinator you don't import gets dropped. For a public-facing landing page or a mobile-first app, that's the reason you switch.
2. **Pipe-based refinement.** Rather than chaining methods on a string schema, you compose with `v.pipe(v.string(), v.email(), v.minLength(5))`. The schema is built outside-in instead of inside-method-by-method.
It reads more like a function pipeline, less like a builder DSL.

If neither of those applies, and you're already deep in a Zod codebase, staying on Zod is fine. The converter has both modes; flip the toggle at the top of the tool and the same JSON sample comes out as either schema family. The point of the converter isn't to take a side; it's to skip the boring part.

The three divergences in this post are the only places where the algorithm has to actually change. The other 90% of the work (naming, ordering, deduping, multi-sample merging) is the same shape walk in both modes. Which is, in retrospect, why a "second emitter" was a Friday-afternoon ship rather than a week-long project.