AI News Hub Logo

AI News Hub

Building an Interactive Merkle Tree Visualizer β€” One Byte Changes, the Whole Root Changes, in O(log n) Hops

DEV Community
SEN LLC

"If a Merkle tree commits to a million transactions and one of them is altered, the root changes. To prove inclusion of any one transaction, you only need ~20 hashes." Most explanations stop there and leave you to imagine the picture. This article is the picture: an interactive visualizer where you edit a leaf and watch the change ripple to the root, and a proof β€” the actual sibling hashes you'd send to a verifier β€” gets generated and replayed live. 🌳 Demo: https://sen.ltd/portfolio/merkle-viz/ GitHub: https://github.com/sen-ltd/merkle-viz The whole thing is ~200 lines of pure logic + 150 lines of DOM/SVG glue, no build step, no dependencies. Hashing is crypto.subtle.digest("SHA-256", …) in both the browser and Node 18+, so the same merkle.js ships to production and runs under node --test. 17 tests cover the math. You have a list of items: transactions, log entries, file blobs, anything addressable. You want a single short fingerprint that commits to the entire list, such that: Changing any item changes the fingerprint. Given any item, you can prove to someone holding only the fingerprint that the item was in the list β€” without sending them the rest of the list. Both verification and proof generation are cheap: O(log n) in the list size. A naΓ―ve answer is "hash the concatenation of every item." That gives you property 1 but blows up property 2 β€” the verifier needs the entire list. A Merkle tree gets all three. The construction is mechanical: level 0: H(L0) H(L1) H(L2) H(L3) ← leaves: hash each item level 1: H(H(L0)||H(L1)) H(H(L2)||H(L3)) ← pair them up, hash again level 2: H( level1_left || level1_right ) ← keep going = ROOT H is SHA-256. || is byte concatenation (not hex-string concatenation β€” that's a real footgun). For odd-count levels, this implementation duplicates the last node and pairs it with itself β€” the convention used by Bitcoin's consensus rules. (Certificate Transparency carries the odd one up unchanged; the Ethereum Patricia trie is a different beast entirely. Pick a convention, document it, stick with it.) Once you've built it, the inclusion proof for leaf i is just the sibling hash at every level on the way from leaf i up to the root. For 4 leaves, every proof is 2 hashes long. For 1024 leaves, it's 10. For 1 million, it's 20. merkle.js exports four functions a verifier needs and three more for the visualizer to drive the UI. The whole thing is async because crypto.subtle.digest is async; nothing else about it is exotic. const enc = new TextEncoder(); export async function sha256Hex(bytes) { const buf = bytes instanceof Uint8Array ? bytes : enc.encode(bytes); const digest = await crypto.subtle.digest("SHA-256", buf); return bytesToHex(new Uint8Array(digest)); } export async function hashLeaf(text) { return sha256Hex(text); } export async function hashPair(leftHex, rightHex) { const buf = new Uint8Array(64); buf.set(hexToBytes(leftHex), 0); buf.set(hexToBytes(rightHex), 32); return sha256Hex(buf); } hashPair is where most ad-hoc implementations go wrong. The two child hashes need to be concatenated as bytes, not as hex strings. sha256("ab" + "cd") and sha256(bytes_of("ab") || bytes_of("cd")) produce different roots, and verifiers built against one convention will reject proofs built against the other. The unit tests pin this down explicitly: test("hashPair concatenates child bytes (not hex strings)", async () => { const left = await hashLeaf("L"); const right = await hashLeaf("R"); const got = await hashPair(left, right); const concat = Buffer.concat([ Buffer.from(left, "hex"), Buffer.from(right, "hex"), ]); assert.equal(got, createHash("sha256").update(concat).digest("hex")); }); Building the tree is a straightforward bottom-up loop: export async function buildTree(leaves) { if (!leaves.length) return { levels: [[]], root: null }; const leafHashes = await Promise.all(leaves.map(hashLeaf)); const levels = [leafHashes]; while (levels[levels.length - 1].length > 1) { const cur = levels[levels.length - 1]; const next = []; for (let i = 0; i { const before = await buildTree(["a", "b", "c", "d"]); const after = await buildTree(["a", "b", "c!", "d"]); const changed = diffLevels(before.levels, after.levels); // Expected: leaf 2, parent (1,1), root (2,0). assert.ok(changed.has("0:2")); assert.ok(changed.has("1:1")); assert.ok(changed.has("2:0")); // Untouched sibling chain. assert.ok(!changed.has("0:0")); assert.ok(!changed.has("0:1")); assert.ok(!changed.has("0:3")); assert.ok(!changed.has("1:0")); }); That's the whole "tampering propagates in O(log n)" claim, expressed as code. The SVG layer walks the changed set and paints the matching nodes red. It's the picture every Merkle-tree explainer wants but doesn't draw. merkle.js has no DOM access, no globals, no module state. Every export is pure (or async-pure). That's what lets node --test consume it directly: $ npm test βœ” sha256Hex matches node:crypto for ASCII input βœ” hashLeaf is deterministic and matches sha256(text) βœ” hashPair concatenates child bytes (not hex strings) βœ” buildTree on 4 leaves yields a 3-level tree βœ” buildTree duplicates the last leaf when count is odd (Bitcoin style) βœ” proofs verify against the real root for every leaf βœ” proof verification fails if a single character of the leaf changes βœ” proof verification fails if a sibling hash is swapped βœ” diffLevels finds exactly the path of changed nodes … β„Ή tests 17 β„Ή pass 17 β„Ή fail 0 script.js is the DOM/SVG glue: editor for the leaves, SVG renderer for the tree, panel for the proof. It owns nothing about hashing β€” every byte that becomes a hash goes through merkle.js. That's how the same module ships to production and runs under tests. The runtime story is symmetric: crypto.subtle.digest is identical in modern browsers and Node 18+, no polyfill, no shim. The async-everywhere style is the price; the payoff is that import { buildTree } from "./merkle.js" works in both contexts unchanged. Bitcoin block headers commit to a single Merkle root over every transaction in the block. SPV ("simplified payment verification") wallets use exactly this proof shape: server sends the wallet a transaction + a Merkle proof, wallet verifies inclusion against the block header it already has. The 2008 paper builds on this; the alternative β€” sending the whole block β€” is the network-bandwidth disaster the design avoids. Git tree objects hash a directory of blobs and recursive sub-trees. It's not the exact same construction (Git uses a content-addressable graph, not a balanced binary tree), but the property is the same: changing one byte of one file changes every ancestor tree's hash up to the commit. Certificate Transparency logs commit to every TLS certificate issued by a participating CA. Browsers verify, before trusting a certificate, that it was logged β€” using a Merkle inclusion proof against a public log root. The convention is slightly different (no leaf duplication), but the math is the same. Content-addressable storage in IPFS, Plan 9, and friends uses Merkle DAGs to address blobs. Same idea, different topology. The shared insight is that hashing is composable: a hash of a hash is still a fingerprint, and you can build a tree where the root is a fingerprint of the whole subtree. Once you see that, the rest is plumbing. The visualizer is intentionally small. A few things real implementations care about that this one doesn't: Domain separation β€” production Merkle trees prefix leaf hashes with a different byte than internal-node hashes (0x00 vs 0x01) to defeat second-preimage attacks where a verifier is tricked into treating an internal node as a leaf. CT and recent Bitcoin proposals do this; the demo doesn't, because adding it complicates the explanation. Streaming construction β€” buildTree here holds every level in memory. For a 100M-entry log you'd compute parent levels on-the-fly and forget the leaves once their level is finished. Proof formats β€” real-world proofs are serialized to specific byte layouts (RFC 6962 for CT, the Bitcoin protocol's compact format). The visualizer keeps proofs as plain JS objects since serialization isn't the point. If you're shipping production code: read the RFC 6962 (CT) or the relevant section of the Bitcoin developer reference instead of trusting any visualizer's defaults. The demo at https://sen.ltd/portfolio/merkle-viz/ starts with four "transaction" leaves. Click tamper next to leaf 2 to see the canonical demo: the leaf input goes red, the SVG node and its parent and the root all turn red, the root hash above the tree changes. The inclusion proof to the right still verifies β€” because it's been replayed against the new root, with the new sibling hashes that fall out of the new tree. Then open the "tamper with the leaf I'm verifying" details panel and type something different from the leaf text β€” that's when the verification fails, because now the candidate leaf doesn't actually correspond to any leaf in the tree. Source: https://github.com/sen-ltd/merkle-viz β€” MIT, ~350 lines total, 17 unit tests, no build step. Open merkle.js first; the rest is presentation. πŸ›  Built by SEN LLC as part of an ongoing series of small, focused developer tools. Browse the full portfolio for more.