AI News Hub

I built a "polite scraper" Chrome extension instead of a server-side scraper. Here's why.

DEV Community
Nik G

Six weeks ago I started building SlotOwl, a Chrome extension that watches [...]

This post is about ONE design decision I made early on that turned out to [...]

If you're building anything that watches a third-party website on a user's [...]

Government appointment portals are a nightmare. US visa dropbox, Schengen [...]

The existing tools to catch a slot fall into two camps:

1. Manual: sit on F5 for hours/days
2. Sketchy paid bots: $50–200 services that ask for your portal login and run a scraper on their server farm

Camp 2 has three structural problems:

- Security: sharing your portal login with a third party is, at best, against the portal's ToS, and at worst gets your account locked
- Reliability: server-side scrapers get IP-banned constantly, breaking for hundreds of users at once
- Scale economics: every user costs CPU + bandwidth on the operator's servers

I wanted a third option. The simplest version of that idea: what if the [...]

Here's the whole thing on a napkin:

    ┌────────────────────────────────────────────────┐
    │  User's Chrome browser                         │
    │                                                │
    │  ┌─────────────┐          ┌──────────────────┐ │
    │  │ Portal tab  │  ←poll── │ Service worker   │ │
    │  │ (logged in) │          │ (background)     │ │
    │  └─────────────┘          └────────┬─────────┘ │
    │                                    │           │
    │                       "slot found" │           │
    └────────────────────────────────────┼───────────┘
                                         │
                                         ▼
                             ┌─────────────────────────┐
                             │ Firebase Cloud Func     │
                             │ alertFanout             │
                             └────┬──────┬──────┬──────┘
                                  │      │      │
                                email   push desktop

Important: the portal HTML never leaves the browser. The only thing that [...]

The user is already logged into the portal in their own browser. The [...]

If you're a security-minded user, you can audit the extension's source [...]

Server-side scrapers funnel hundreds of users through a small pool of IPs [...]

When the scraper IS the user, that pattern disappears. Each user's [...]

The polling is happening on the user's machine. My only server-side [...]

Portals often throw captchas to deter automation. A server-side scraper [...]

In my model, when the polling script hits a captcha, the page state [...]

Chrome aggressively suspends extension service workers. To keep polling [...] chrome.alarms API with a 1-min minimum, which [...]

This is reliable enough but it does mean if the user closes Chrome [...]

A server farm could in theory check the portal every 10 seconds for [...]

In practice, slot windows are 5–15 minutes wide on the portals I've [...]

Server-side scrapers can hard-code per-portal logic. I need users [...]

Solution: workflows are JSON definitions:

    {
      id: "schengen-stockholm",
      entryUrl: "https://visa.vfsglobal.com/swe/en/...",
      selectors: [
        { match: "no available slots", state: "unavailable" },
        { match: "available", state: "available" }
      ]
    }

Anyone can define a new workflow without me shipping code. (In practice [...]

- Extension: Manifest V3, vanilla JS (no React/Vue: fewer build steps, smaller bundle, faster to iterate). esbuild for bundling.
- Backend: Firebase Cloud Functions (Node 20). One function per responsibility: alertFanout, linkMintToken, linkConsumeToken, webPushSubscribe, sendEmail, getUsage, joinWaitlist, etc. Eleven functions total. Each is small enough to keep in your head.
- Database: Firestore. Workflows under users/{uid}/workflows/{id}, alert quotas under users/{uid}/usage/{yyyy-mm}.
- Email: Resend. Way cleaner API than SES or Mailgun for transactional.
- Cross-device push: Web Push API + VAPID keys. I considered Firebase Cloud Messaging but went with raw Web Push because (a) one fewer dependency, (b) when iOS Safari fully ships push to homescreen apps, Web Push will work natively. FCM would have meant another adapter.
- Marketing site: hand-rolled static site (no Next.js, no Nuxt). A build script reads partials and writes the dist folder. Total weight is ~30 KB CSS + 12 KB JS.

If I were starting over today, three things:

1. Define the workflow JSON schema even more strictly, sooner. I [...]
2. Build the alert quota system before the alert system. I built the [...]
3. Treat the privacy story as the marketing story from day 1. The [...]

What's next

SlotOwl is currently in Chrome Web Store review (3 days in, fingers [...] if you (or anyone you know) is hunting an appointment, please share.

Honest about the future: I don't know yet whether this is a $1k MRR [...]

If you're building something similar, or you've shipped a Chrome [...] @greythinkinglab.

Nik / greythinkinglab
https://greythinkinglab.com
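
A note on the workflow JSON from earlier in the post: the example lists "no available slots" before "available", and the second string is a substring of the first, so rule order presumably matters. The post doesn't say how matching is actually done. The sketch below is only one plausible reading, assuming a case-insensitive substring test against the page text where the first matching selector wins. The `classifyPage` function and the "unknown" fallback state are hypothetical names, not taken from SlotOwl.

```javascript
// Hypothetical evaluator for a workflow definition shaped like the one in the post.
// Assumptions (not confirmed by the post): `match` is a case-insensitive substring
// test against the page's visible text, and the first matching selector wins.
function classifyPage(workflow, pageText) {
  const haystack = pageText.toLowerCase();
  for (const selector of workflow.selectors) {
    if (haystack.includes(selector.match.toLowerCase())) {
      return selector.state;
    }
  }
  return "unknown"; // hypothetical fallback, e.g. for a captcha or login page
}

const workflow = {
  id: "schengen-stockholm",
  entryUrl: "https://visa.vfsglobal.com/swe/en/...",
  selectors: [
    { match: "no available slots", state: "unavailable" },
    { match: "available", state: "available" },
  ],
};

console.log(classifyPage(workflow, "Sorry, there are no available slots.")); // unavailable
console.log(classifyPage(workflow, "3 appointments available on 12 May"));   // available
```

Under this reading, swapping the two selectors would classify every "no available slots" page as available, which is one reason a stricter schema (for example, an explicit priority field) could be worth having.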
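
On the service worker point: the post mentions the chrome.alarms API with a 1-minute minimum as the way polling survives Chrome suspending the worker. A minimal sketch of that pattern might look like the following. The alarms object is passed in as a parameter so the logic can be exercised outside Chrome; in an extension it would be `chrome.alarms`. The alarm name `slot-poll` and the `checkPortal` callback are assumptions made for illustration, not SlotOwl's actual code.

```javascript
// Sketch of alarm-driven polling for a Manifest V3 service worker.
// `alarmsApi` is expected to look like `chrome.alarms` (create + onAlarm.addListener).
// `checkPortal` is a hypothetical callback that inspects the portal tab.
const POLL_ALARM = "slot-poll"; // hypothetical alarm name

function startPolling(alarmsApi, checkPortal, periodInMinutes = 1) {
  // The alarm is scheduled by the browser, so it keeps firing even after
  // the service worker itself has been suspended.
  alarmsApi.create(POLL_ALARM, { periodInMinutes });

  // The listener should be registered at the top level of the worker script
  // so it is in place when the browser wakes the worker to deliver the event.
  alarmsApi.onAlarm.addListener((alarm) => {
    if (alarm.name === POLL_ALARM) {
      checkPortal();
    }
  });
}

// In the extension's service worker this would be called as:
// startPolling(chrome.alarms, () => { /* poll the logged-in portal tab */ });
```

The trade-off the post describes follows directly from this design: the polling interval is bounded below by the alarm period, and nothing fires while Chrome is closed.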