
*DEV Community · 날다람쥐*

# LayerCache in Production: 5 Patterns That Actually Save You

Part 2 of the LayerCache series. If you missed Part 1, start here — it covers the core problem and the basic setup.

The first post got more traction than I expected (200+ views in two days — thank you).

> "OK, the basic setup makes sense. But what does it look like in a real service?"

This post answers that. I'll walk through five patterns I keep reaching for, and how LayerCache handles each one without you having to wire it yourself.

## 1. `wrap()`: zero-boilerplate function caching

The most tedious part of caching is key management. You write a function, then you write a cached version of that function, then you keep the two in sync forever. Bugs love that gap.

`wrap()` closes it. It decorates a function directly, deriving the cache key from the arguments automatically:

```ts
import { CacheStack, MemoryLayer, RedisLayer } from 'layercache'

const cache = new CacheStack([
  new MemoryLayer({ ttl: 60 }),
  new RedisLayer({ client: redis, ttl: 3600 }),
])

// Before: you manage the key
async function getUser(id: number) {
  return cache.get(`user:${id}`, () => db.findUser(id))
}

// After: wrap() handles the key for you
const getUser = cache.wrap(db.findUser.bind(db), {
  keyPrefix: 'user',
  ttl: 60,
  tags: ['users'],
})

// Call it exactly like the original function
const user = await getUser(123)
```

The key is derived from `keyPrefix` + `JSON.stringify(args)`. For most use cases that's exactly what you want. You can override it with a custom `keyResolver` if you need something more specific — for example, to exclude certain arguments or normalize them first.

## 2. Built-in metrics, Prometheus, and OpenTelemetry

I've shipped services where the caching layer was a black box in production. Hit rates? No idea. Was Redis actually being used? Grep the logs and hope. Was L1 evicting the wrong things? Complete mystery.
LayerCache has metrics built in, and they're dead simple to pull:

```ts
const stats = cache.getStats()
console.log(stats)
// {
//   hits: 18432,
//   misses: 241,
//   hitRate: 0.987,
//   fetches: 241,
//   staleHits: 18,
//   stampedeDedupes: 7,
//   layers: [
//     { name: 'MemoryLayer', hits: 16100, misses: 2332, avgLatencyMs: 0.006 },
//     { name: 'RedisLayer', hits: 2191, misses: 141, avgLatencyMs: 0.021 },
//   ]
// }
```

Per-layer latency is tracked using Welford's online algorithm — no memory overhead from storing every sample.

If you're running Prometheus, there's a one-liner exporter:

```ts
import { createPrometheusExporter } from 'layercache'

const exporter = createPrometheusExporter(cache)

app.get('/metrics', (req, res) => {
  res.set('Content-Type', 'text/plain')
  res.send(exporter.export())
})
```

And if you want OpenTelemetry traces showing exactly which layer served each request:

```ts
import { createOpenTelemetryPlugin } from 'layercache'
import { trace } from '@opentelemetry/api'

createOpenTelemetryPlugin(cache, trace.getTracer('my-service'))
```

This uses event hooks under the hood — no method monkey-patching. After wiring it up, you'll see spans like `layercache.get → L1 miss → L2 hit → backfill L1` in your trace explorer. Debugging a cache performance regression goes from "add logging, redeploy, wait" to just opening your trace UI.

## 3. Adaptive TTL: stop treating hot and cold keys the same

Here's a subtle production problem: your most popular page gets 50,000 hits a day. Your least popular page gets 3. With a fixed TTL, they both expire on the same schedule and both go back to the database. The unpopular page is fine — it just misses. The popular page creates a brief window where every concurrent user hits the expiry simultaneously. Stampede prevention helps, but the smarter fix is to not let hot keys expire as quickly in the first place.
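(As an aside: "stampede prevention" here boils down to single-flight deduplication, i.e. concurrent misses for the same key share one in-flight fetch. That's the general technique, not necessarily exactly how LayerCache implements it internally. A standalone sketch of the idea:)

```ts
// Minimal single-flight ("stampede dedupe") sketch: illustrative, not LayerCache's code.
// Concurrent misses for the same key share one in-flight fetch.
const inFlight = new Map<string, Promise<unknown>>()

async function dedupedFetch<T>(key: string, fetcher: () => Promise<T>): Promise<T> {
  const pending = inFlight.get(key)
  if (pending) return pending as Promise<T> // someone is already fetching this key

  const p = fetcher().finally(() => inFlight.delete(key)) // clear once settled
  inFlight.set(key, p)
  return p
}

// 50 concurrent callers hit the expiry window, but the fetcher runs only once.
async function demo(): Promise<number> {
  let fetches = 0
  const slow = async () => { fetches++; return 'value' }
  await Promise.all(Array.from({ length: 50 }, () => dedupedFetch('user:123', slow)))
  return fetches // resolves to 1
}
```

Dedupe caps the load at one fetch per key per expiry, but every caller in that window still waits on the database; adaptive TTL attacks the problem one step earlier by making the expiry rarer.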
Adaptive TTL automatically extends the TTL for hot keys, up to a ceiling you define:

```ts
new MemoryLayer({
  ttl: 30,
  adaptiveTtl: {
    enabled: true,
    maxTtl: 300,      // never cache beyond 5 minutes
    hitsPerStep: 10,  // ramp up every 10 hits
    stepMs: 30_000,   // each step adds 30 seconds
  },
})
```

A key that gets hit 100 times gradually ramps its TTL toward `maxTtl`. A key that goes cold resets back to the base TTL. You never have to profile and hardcode special TTLs for hot keys.

Pair it with `staleWhileRevalidate` and hot keys become almost invisible from the user's perspective — they always get a value immediately, and the refresh happens in the background:

```ts
new MemoryLayer({
  ttl: 60,
  staleWhileRevalidate: 600,
  adaptiveTtl: { enabled: true, maxTtl: 300, hitsPerStep: 20, stepMs: 30_000 },
})
```

## 4. Drop-in middleware for Express, Fastify, tRPC, and GraphQL

You shouldn't have to rewrite route handlers to add caching. LayerCache ships middleware for the frameworks you're probably already using.

**Express:**

```ts
import { createExpressCacheMiddleware } from 'layercache'

app.get('/api/users',
  createExpressCacheMiddleware(cache, {
    ttl: 30,
    tags: ['users'],
    keyResolver: (req) => `users:${req.url}`,
  }),
  async (req, res) => {
    res.json(await db.getUsers())
  }
)
```

Cached responses come back with an `x-cache: HIT` header — useful for debugging in staging without changing any application logic.
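If you're wondering what this kind of middleware does under the hood, the general shape is: resolve a key from the request, serve from cache with an `x-cache: HIT` header on a hit, and otherwise intercept the handler's response on the way out and store it. Here's a standalone approximation with a plain `Map` and simplified req/res types; it's the generic pattern, not LayerCache's actual implementation:

```ts
// Rough shape of response-caching middleware: an approximation, not LayerCache's code.
type Req = { url: string }
type Res = { setHeader(name: string, value: string): void; json(body: unknown): void }
type Next = () => void

function cacheMiddleware(store: Map<string, unknown>, keyResolver: (req: Req) => string) {
  return (req: Req, res: Res, next: Next) => {
    const key = keyResolver(req)
    const hit = store.get(key)
    if (hit !== undefined) {
      res.setHeader('x-cache', 'HIT')
      res.json(hit) // short-circuit: serve the cached body, handler never runs
      return
    }
    res.setHeader('x-cache', 'MISS')
    const original = res.json.bind(res)
    res.json = (body: unknown) => { // capture the handler's response on the way out
      store.set(key, body)
      original(body)
    }
    next()
  }
}
```

A real implementation also needs TTLs, tag-based invalidation, and care around streaming and non-JSON responses, which is exactly the wiring the library saves you from.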
**Fastify:**

```ts
import { createFastifyLayercachePlugin } from 'layercache'

await fastify.register(createFastifyLayercachePlugin(cache, {
  statsRoute: '/cache-stats', // optional: expose metrics endpoint
}))

fastify.get('/api/products', async (request, reply) => {
  return fastify.cache.get('products:all', () => db.getProducts())
})
```

**tRPC:**

```ts
import { createTrpcCacheMiddleware } from 'layercache'

const cachedProcedure = publicProcedure.use(
  createTrpcCacheMiddleware(cache, 'trpc', { ttl: 60 })
)

export const appRouter = router({
  getUser: cachedProcedure
    .input(z.object({ id: z.number() }))
    .query(({ input }) => db.findUser(input.id)),
})
```

**GraphQL resolver:**

```ts
import { cacheGraphqlResolver } from 'layercache'

const resolvers = {
  Query: {
    user: cacheGraphqlResolver(
      cache,
      'gql:user',
      (_, { id }) => db.findUser(id),
      { ttl: 60, tags: ['users'] }
    ),
  },
}
```

The pattern is the same across all of them: wrap the data-fetching part, leave the rest of your route/resolver untouched.

## 5. Cache warming: no more cold-start spikes

Cold starts are painful. You deploy, traffic hits, and for the first 30 seconds every request goes all the way to the database while the cache warms up organically. In a high-traffic service that's a visible latency spike right after every deploy.

Cache warming pre-populates your layers before your service starts accepting traffic:

```ts
// Define what to warm and in what order
await cache.warm([
  {
    key: 'config:global',
    fetcher: () => db.getGlobalConfig(),
    ttl: 300,
    priority: 1, // load first
  },
  {
    key: 'categories:all',
    fetcher: () => db.getAllCategories(),
    ttl: 600,
    priority: 2,
  },
  {
    // Warm a batch of known hot keys
    keys: topUserIds.map(id => `user:${id}`),
    fetcher: (key) => db.findUser(Number(key.split(':')[1])),
    ttl: 60,
    priority: 3,
  },
])

// Now the cache is warm — start accepting traffic
app.listen(3000)
```

Lower priority number = loads first. LayerCache runs each priority group before moving to the next, so your most critical data is always ready first.
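The group scheduling is easy to picture. Here's a standalone sketch of the semantics: sort the distinct priorities, run one group at a time, and don't let a single bad fetcher take the whole warm-up down. (I'm assuming entries within a group load in parallel; that detail and the code below are my approximation, not the library's implementation.)

```ts
// Sketch of priority-group warm-up semantics: illustrative, not LayerCache's code.
type WarmEntry = { key: string; fetcher: () => Promise<unknown>; priority: number }

async function warmSketch(store: Map<string, unknown>, entries: WarmEntry[]): Promise<void> {
  // Distinct priorities, lowest number first
  const priorities = [...new Set(entries.map(e => e.priority))].sort((a, b) => a - b)
  for (const p of priorities) {
    const group = entries.filter(e => e.priority === p)
    // Each group finishes before the next one starts
    await Promise.all(group.map(async e => {
      try {
        store.set(e.key, await e.fetcher())
      } catch {
        // a failed fetcher is skipped so startup isn't blocked
      }
    }))
  }
}
```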
If a fetcher fails during warm-up, it's skipped rather than crashing your startup.

## Bonus: the admin CLI

One thing I didn't cover in Part 1: the admin CLI. You don't need to write any code to inspect or manage a Redis-backed cache in a running environment.

```sh
# See overall hit/miss stats
npx layercache stats

# List keys matching a pattern
npx layercache keys --pattern "user:*"

# Invalidate all keys tagged with 'posts'
npx layercache invalidate --tag posts

# Delete a specific key
npx layercache delete user:123
```

It's saved me more than once when debugging a production issue. Instead of writing a one-off script to peek at what's in the cache, you just run the CLI.

## Wrapping up

Part 1 covered why LayerCache exists. This post covered how to actually use it past the basics:

- `wrap()` for zero-boilerplate function caching
- Built-in metrics, Prometheus, and OpenTelemetry for observability
- Adaptive TTL to stop treating hot keys and cold keys the same
- Drop-in middleware for Express, Fastify, tRPC, and GraphQL
- Cache warming so your first request after a deploy isn't your slowest

If any of this is useful, the best way to support the project is a ⭐ on GitHub — it genuinely helps other developers find it.

👉 github.com/flyingsquirrel0419/layercache

Questions, edge cases, or features you'd want to see? Drop them in the comments — the next post will probably be driven by whatever comes up there.