How I auto-triage 200 emails a day with Aider and Nylas

DEV Community

Qasim Muhammad

May 4, 2026, 05:10 PM

My inbox averages 200 messages a workday. Half are noise. A quarter need a fast acknowledgement. The remainder need real work. The split is mostly stable, so the triage rules are mostly stable, which makes triage a good fit for an LLM. I wired Aider to it.

Aider is the AI pair-programming CLI: it runs in the shell, it can call commands, and it is itself a Python tool. Pairing it with the Nylas CLI gives a triage pipeline that runs on my laptop in the background and surfaces only the messages I should personally read.

Three buckets:

| Bucket | Action | What lands here |
| --- | --- | --- |
| 🔥 Action | Star, leave unread | Customer escalations, on-call pages, anything from my CEO |
| 👀 Skim | Mark read, archive | Newsletters, build notifications, "FYI" |
| 🗑 Drop | Mark read, archive, mark spam if confident | Cold sales, recruiter spam, marketing |

A modest LLM gets these right more than 95% of the time. Aider drives it.

```python
# /opt/triage/triage.py
import json
import subprocess

LABELS = ("ACTION", "SKIM", "DROP")


def llm_classify(subject: str, snippet: str, sender: str) -> str:
    prompt = f"""Classify this email into one of: ACTION, SKIM, DROP.

ACTION: needs my response or attention soon.
SKIM: informational, can wait.
DROP: spam, recruiter, marketing, newsletter.

From: {sender}
Subject: {subject}
Snippet: {snippet}

Reply with only one word."""
    try:
        out = subprocess.run(
            ["aider", "--message", prompt, "--no-auto-commits", "--yes-always"],
            capture_output=True, text=True, timeout=30,
        )
    except subprocess.TimeoutExpired:
        return "SKIM"  # a hung call is treated like an unparseable reply
    # Use the last word of the output; empty or unexpected output falls back to SKIM.
    tokens = out.stdout.split()
    label = tokens[-1].upper() if tokens else ""
    return label if label in LABELS else "SKIM"


def main():
    raw = subprocess.check_output(
        ["nylas", "email", "list", "--unread", "--limit", "50", "--json"]
    )
    msgs = json.loads(raw)
    for m in msgs:
        # Tolerate messages with a missing sender or subject.
        senders = m.get("from") or [{}]
        sender = senders[0].get("email", "")
        subject = m.get("subject") or ""
        bucket = llm_classify(subject, m.get("snippet", ""), sender)
        if bucket == "ACTION":
            subprocess.run(["nylas", "email", "mark-starred", m["id"]])
        elif bucket == "SKIM":
            subprocess.run(["nylas", "email", "mark-read", m["id"]])
        elif bucket == "DROP":
            # Note: this deletes the message rather than archiving it.
            subprocess.run(["nylas", "email", "mark-read", m["id"]])
            subprocess.run(["nylas", "email", "delete", m["id"], "--yes"])
        print(f"{bucket}: {subject[:60]}")


if __name__ == "__main__":
    main()
```

Under 50 lines of Python. Nothing clever.

```bash
# Manual trigger
python /opt/triage/triage.py

# Every 5 minutes
crontab -e
# Add:
*/5 * * * * /usr/bin/python3 /opt/triage/triage.py >> /var/log/triage.log 2>&1
```

The LLM call takes ~2 seconds per message; on a 50-message batch that is roughly 100 seconds. Cron's `*/5` schedule leaves plenty of breathing room.

Why Aider? Three reasons:

1. It treats prompts as commands. `aider --message '...'` is a one-liner. No SDK to import, no auth ceremony.
2. It is local and fast. I am calling it 200 times a day, which rules out browser-loop tools.
3. It is bring-your-own-key. I run it with Anthropic's Sonnet for triage and switch to Opus when I want it to draft replies.

If you prefer the OpenAI o1 model or a local Llama, swap the aider line for `llm "..."` (Simon Willison's tool), or call the API directly. The pipeline is provider-agnostic.

Why the Nylas CLI? It gives me the same surface across Gmail, Outlook, Exchange, Yahoo, iCloud, and IMAP without writing six different SDK integrations. Adding a second account is one `nylas auth login` command.
The script does not change. The CLI also exposes `--json` on every list command. That makes its output pipeable into Python's `json.loads` without parsing prose. No HTML-stripping, no MIME decoding, just structured data.

After 60 days running this on my main inbox:

- Time saved per workday: ~35 minutes (estimated by stopwatch on a sample week)
- Missed action emails (action mail wrongly archived, the expensive kind of error): 4 in 60 days, all when subject lines were ambiguous ("Quick question" from a sender I had not seen before)
- Noise left in the inbox (drop mail that was not dropped, the cheap kind of error): too many to count at first, mostly recruiter LinkedIn forwards. I tightened the prompt twice and they faded.

The wins compound: less time triaging means I read action mail sooner, which means faster replies, which means fewer follow-ups.

What the pipeline does not do:

- Draft replies: I tried. From a distance the replies sounded like me; up close they sounded like a chatbot. I removed it.
- Schedule meetings: handled by calendar-schedule-ai, which is a separate command and worth its own writeup.
- Deal with attachments: they pass through untouched. I read attachments manually.

The prompt I ended up with:

```
Classify this email into one of: ACTION, SKIM, DROP.

ACTION = something I personally need to do or reply to within 24 hours.
SKIM = useful but not urgent.
DROP = spam, sales outreach, recruiters, marketing, automated build/deploy notifications.

If unsure, prefer ACTION (false positive is cheap; false negative is expensive).

Sender: {sender}
Subject: {subject}
First 200 chars: {snippet}
```

The "If unsure, prefer ACTION" instruction (in effect, a confidence bar below 95% for ACTION) is the most important tuning. Send too much to ACTION and you get noise; send too little and you miss escalations. The bias toward ACTION costs you 30 seconds of skimming. The other direction costs you a customer.

Related:

- Build an AI email triage agent in Python: full reference implementation
- Build an LLM agent with email and calendar tools: broader agent surface
- Why AI agents need email: the case for agent inboxes
- Full command reference
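Returning to the tuned prompt: its "if unsure, prefer ACTION" bias can also be mirrored in the code that reads the model's reply. The sketch below is one possible way to do that, not the script I actually run. `build_prompt` and `parse_label` are hypothetical helper names, the 200-character cap is taken from the "First 200 chars" line of the prompt, and defaulting to ACTION (where `triage.py` defaults to SKIM) is an assumption about how far to carry the bias.

```python
LABELS = ("ACTION", "SKIM", "DROP")

# Template text copied from the tuned prompt in the article.
PROMPT_TEMPLATE = """Classify this email into one of: ACTION, SKIM, DROP.

ACTION = something I personally need to do or reply to within 24 hours.
SKIM = useful but not urgent.
DROP = spam, sales outreach, recruiters, marketing, automated build/deploy notifications.

If unsure, prefer ACTION (false positive is cheap; false negative is expensive).

Sender: {sender}
Subject: {subject}
First 200 chars: {snippet}"""


def build_prompt(sender: str, subject: str, snippet: str) -> str:
    """Fill the template, collapsing whitespace and capping the snippet at 200 characters."""
    clean = " ".join(snippet.split())
    return PROMPT_TEMPLATE.format(sender=sender, subject=subject, snippet=clean[:200])


def parse_label(raw_output: str, default: str = "ACTION") -> str:
    """Return the label in the last word of the output, or the default if there is none.

    Hypothetical helper: the default is ACTION so that an empty or
    unparseable reply errs on the cheap side (an extra email to skim).
    """
    words = raw_output.split()
    if not words:
        return default
    # Drop punctuation or markdown emphasis the model may wrap around the word.
    candidate = words[-1].strip(".,:;!\"'*`").upper()
    return candidate if candidate in LABELS else default


if __name__ == "__main__":
    print(build_prompt("boss@example.com", "Quick question", "Can you look at this today?"))
    print(parse_label("drop."))
    print(parse_label("I am not sure"))
```

With this default, a timeout, a blank reply, or trailing non-label text leaves the message in the inbox instead of archiving it, which matches the cost argument above: the wrong ACTION costs a glance, the wrong archive can cost a customer.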