
AI News Hub

Practicing Karpathy's Personal Knowledge Base Method with a Git Repository

DEV Community
Xunxing Mao

This article was originally published on maoxunxing.com. Follow me there for more on AI-assisted workflows, Hugo, and knowledge systems.

## Karpathy's Method in a Nutshell

Andrej Karpathy recently shared a practical approach on X/Twitter and published a complete LLM Wiki Gist: using LLMs to build personal knowledge bases for research topics. The core workflow:

1. Dump source files (articles, papers, screenshots) into a `raw/` directory
2. Use an LLM to "compile" them into structured Markdown knowledge entries
3. Browse everything in Obsidian
4. Query the knowledge base — the LLM searches and answers autonomously
5. Periodically run LLM "health checks" to fix contradictions and fill gaps

His knowledge base has grown to ~100 entries and 400K words. No RAG needed — the LLM maintains indexes and summaries to handle all queries.

In one sentence: raw materials in, structured knowledge out, LLM does the heavy lifting.

## Why a Hugo Repo Instead of Obsidian

Karpathy uses Obsidian as his viewer. But if you already have a Hugo blog repository, you don't need any extra software:

| Need | Obsidian Approach | Hugo Repo Approach |
| --- | --- | --- |
| View Markdown | Obsidian editor | `hugo server -D` local preview |
| Link knowledge | `[[]]` backlinks + graph | Hugo tags + Algolia search |
| Publish output | Requires extra export | Remove `draft: true`, push |
| Version control | Needs Obsidian Git plugin | It's already a Git repo |
| Multi-device sync | Obsidian Sync or iCloud | `git pull` |
| Search | Built-in Obsidian search | grep / Algolia / LLM |

The key advantage: knowledge refined into articles publishes directly — zero migration cost. One repo, full pipeline from collection to publication.

## Three Content Tiers

Build three content tiers inside your repository: `content/raw/` -> `content/notes/` -> `content/posts/`. Materials only get more refined, never regress.

Set up the `raw/` tier:

```shell
mkdir -p content/raw
cat > content/raw/_index.md << 'EOF'
---
title: "Raw"
description: "Knowledge inbox"
draft: true
---
EOF
```

Create `archetypes/raw.md`:

```
---
title: "{{ replace .Name "-" " " | title }}"
date: {{ .Date }}
draft: true
tags: []
source: ""
---
```

Now `hugo new raw/topic-name/index.md` auto-generates entries with the template.
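If you capture often, the scaffolding step can be wrapped in a tiny helper. The sketch below is my own addition, not part of the article's setup (the name `kb-new.sh` and the direct-write approach are assumptions): it slugifies a topic and writes a raw entry with the same front matter as the archetype. If you would rather let Hugo apply `archetypes/raw.md` itself, replace the `cat` block with `hugo new "raw/$slug/index.md"`.

```shell
#!/bin/sh
# kb-new.sh — hypothetical capture helper (not from the article).
# Mirrors the front matter of archetypes/raw.md; swap the cat block
# for `hugo new "raw/$slug/index.md"` to use Hugo's archetype instead.
set -eu

slugify() {
  # "LoRA Fine Tuning" -> "lora-fine-tuning"
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr ' ' '-' | tr -cd 'a-z0-9-'
}

new_raw_entry() {
  topic="$1"
  slug=$(slugify "$topic")
  dir="content/raw/$slug"
  mkdir -p "$dir"
  # Unquoted EOF so $topic and $(date) expand inside the heredoc.
  cat > "$dir/index.md" << EOF
---
title: "$topic"
date: $(date -u +%Y-%m-%dT%H:%M:%SZ)
draft: true
tags: []
source: ""
---
EOF
  echo "$dir/index.md"
}

# Capture a topic passed on the command line, if any.
if [ "$#" -gt 0 ]; then
  new_raw_entry "$*"
fi
```

Usage: `./kb-new.sh Interesting Topic` creates `content/raw/interesting-topic/index.md` in draft state, ready to paste material into.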
Add `raw` to the permalinks section in `config.toml`:

```toml
[permalinks]
raw = "/:slugorcontentbasename/"
```

## Step 1: Collect

See a good article or have an idea? Create a raw entry immediately:

```shell
hugo new raw/interesting-topic/index.md
```

Paste in the content. No formatting needed, no perfection required — raw state is fine.

## Step 2: Compile

This is the heart of Karpathy's method and the highest-value step. Have the LLM read multiple related materials from `raw/` and synthesize a `notes/` entry:

> "Read all raw entries tagged with AI, synthesize them into a structured knowledge entry under `content/notes/ai-fundamentals/`. Requirements: extract core concepts, add cross-references, cite sources."

## Step 3: Output

When a notes entry has accumulated enough depth:

> "Based on the knowledge entry in `content/notes/ai-fundamentals/`, write a developer-facing blog post for `content/posts/`. Requirements: include opinions, real examples, and actionable advice."

## Step 4: Health Check

Periodically audit the knowledge base:

> "Scan all entries in `content/raw/` and `content/notes/`. Find: 1) duplicate topics that should merge, 2) entries missing tags, 3) raw materials ready to compile into notes."

## One Sentence with a Qoder Skill

Take it further with a Qoder Skill — one sentence does it all:

- `/kb collect https://example.com/article` — fetch and create a raw entry
- `/kb collect I learned today that LoRA fine-tuning's key is...` — quick-capture a thought
- `/kb compile AI` — compile AI-related raw materials into a notes entry
- `/kb preview` — start local preview with all materials visible
- `/kb check` — LLM health check

The visual flow: see a great article or have an insight -> `/kb collect "content"` -> live on the web.

## Summary

The entire process:

- **Collection**: zero friction, one sentence
- **Compilation**: LLM handles the grunt work
- **Publishing**: remove `draft: true`, push to deploy
- **No extra software**: Git + Hugo + LLM, that's it

| Aspect | Karpathy's Version | This Approach |
| --- | --- | --- |
| Storage | Standalone knowledge repo | Embedded in blog repo |
| Viewer | Obsidian | `hugo server -D` |
| Raw materials | `raw/` directory | `content/raw/` (draft) |
| Compilation | LLM generates `.md` | LLM generates `notes/` |
| Output | Markdown/Marp/charts | Directly published as blog posts |
| Search | Custom search engine | grep + Algolia + LLM |
| Health checks | LLM audit | Same LLM audit |

The biggest difference: Karpathy's knowledge base is standalone — output requires manual migration. In this approach, the knowledge base and blog are unified. Collection to publication happens in one repository, with zero migration cost.

The core of Karpathy's method isn't about which tools you use — it's about establishing a "collect -> compile -> output" knowledge pipeline and letting the LLM handle compilation and maintenance. If you already have a blog repository, you can implement this method right inside it: add `content/raw/` as an inbox, use `draft: true` to control visibility, and let the LLM drive the flow from raw materials to knowledge to published articles.

No Obsidian. No Notion. No new software. One Git repo is your knowledge base.

If you're interested in AI-assisted development workflows, check out my AI Coding Playbook for tool selection and prompt templates. I also wrote AI Rewriting Workflow on how knowledge workers can adapt when AI multiplies leverage.

## References

- Andrej Karpathy's original post — X/Twitter thread on LLM Knowledge Bases — The original announcement describing the `raw/` -> wiki compilation workflow.
- LLM Wiki Gist — github.com/karpathy/442a6bf... — Karpathy's complete LLM Wiki pattern specification, defining the three-layer architecture (source materials, AI-generated wiki, configuration).
- How to Build a Personal LLM Knowledge Base (Karpathy's Method) — Step-by-step walkthrough of implementing Karpathy's method.
- How To Do PHD-Level Research with AI (Karpathy's LLM Wiki) — Deep dive into using the LLM Wiki pattern for academic-level research.
- Karpathy's LLM Wiki: The End of Forgotten Knowledge — Analysis of the LLM Wiki pattern as an alternative to traditional RAG retrieval.
- VentureBeat — Karpathy shares 'LLM Knowledge Base' architecture that bypasses RAG — Industry analysis of why this approach works without complex RAG pipelines.
- MindStudio — What Is Andrej Karpathy's LLM Wiki? — Practical guide to building an LLM Wiki with Claude Code.
- Antigravity Codes — Karpathy's LLM Knowledge Bases: The Post-Code AI Workflow — Technical breakdown of the workflow as a "post-code" paradigm.
- Reddit r/ObsidianMD — Implemented Karpathy's LLM knowledge base workflow in Obsidian — Community discussion on Obsidian-based implementations.
- DEV Community — A Personal Git Repo as a Knowledge Base Wiki — Using plain Git + Markdown as a personal wiki, the foundational approach this article builds upon.
- Hacker News — Repurposing Hugo as a wiki — Discussion on using Hugo for wiki-style knowledge management.

Felix Mao | maoxunxing.com | @maoxunxing