Three Folders and One Schema File
Andrej Karpathy's approach to personal knowledge bases has become the reference architecture for 2026: three folders — raw/, wiki/, outputs/ — and one schema file. No apps, no accounts, no database. The LLM writes and maintains all wiki data; you rarely touch it directly (from llm powered personal knowledge bases).
The architecture is disarmingly simple. Source documents go into raw/. The LLM incrementally compiles a wiki — a collection of .md files with summaries, backlinks, and categorized concepts. Outputs (answers, slide decks, visualizations) get rendered into outputs/, and the best of those get filed back into the wiki to enhance future queries. Every exploration compounds (from llm powered personal knowledge bases).
Nick Spisak's implementation guide strips this down further: flat files with a good schema outperform fancy tool stacks 90% of the time. Karpathy himself confirmed he keeps his schema "super simple and flat" in an AGENTS.md file — no database, no plugin, just a text file describing rules (from nick spisak shared link).
The schema file is the critical piece most people skip. It tells the AI what the knowledge base is about, how wiki/ is organized, that every topic gets its own .md file with one-paragraph summaries and [Topic Name](/topics/topic-name) links (from nick spisak shared link). Without it, the AI generates inconsistent structure. With it, the system self-organizes.
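A minimal schema file, with every section name and rule here illustrative rather than prescriptive, might look like:

```markdown
# AGENTS.md — schema for this knowledge base

## What this is
A personal knowledge base. Sources live in raw/, compiled knowledge
in wiki/, generated artifacts in outputs/.

## Rules
- Every topic gets its own .md file in wiki/ with a one-paragraph summary.
- Link between topics as [Topic Name](/topics/topic-name).
- Never edit files in raw/; treat them as immutable ground truth.
- File new insights under existing topics before creating new ones.
```

The exact sections matter less than their existence: the AI needs the purpose, the layout, and the rules stated somewhere it always reads.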
This works because of a property most people overlook: plain text markdown is the only format that satisfies every constraint simultaneously — human-readable, AI-readable, version-controllable, portable across tools, and free from vendor lock-in. Any AI tool that can read a filesystem (Cursor, Claude Code, OpenCode, Codex) can reason over your knowledge with zero integration work (from obsidian agentic workflows).
The Three-Layer Memory Stack
Multiple practitioners have independently converged on a three-layer architecture for AI memory. The layers are always the same; only the implementation details vary.
Layer 1: Session Persistence. A file that loads at the start of every session, teaching the agent who you are, what the system is, and what the conventions are. Karpathy uses AGENTS.md. Claude Code users use CLAUDE.md. Nyk frames it as an "exosuit" — when the agent joins a session, it puts on the accumulated knowledge of the entire organization (from nyk builderz shared link). The file should stay under 200 lines. Below it sits an auto-memory directory: a routing document (MEMORY.md) linking to topic files for debugging patterns, architecture decisions, and user preferences (from nyk builderz shared link).
Layer 2: Knowledge Graph. The wiki itself — Obsidian vault, markdown directory, or any structured file collection the agent can search. For Obsidian users, two MCP servers bridge Claude to the vault: smart-connections for semantic search (finds notes without exact titles) and qmd for structured queries and metadata operations (from nyk builderz shared link). Together they give the agent both fuzzy and precise retrieval.
Layer 3: Ingestion Pipeline. The system that feeds new knowledge into the graph. Karpathy uses Obsidian Web Clipper to capture web articles. The brain-ingest tool processes video and audio locally — no data leaves your machine — extracting 12-18 claims, 3-5 frameworks, and 5-8 techniques from a single 90-minute talk (from nyk builderz shared link). Automated nightly pipelines pull from bookmarks, feeds, and documentation.
Skip one layer and the others degrade. Session persistence without a knowledge graph means the agent knows your preferences but has no accumulated knowledge to reason over. A knowledge graph without ingestion goes stale. An ingestion pipeline without session persistence means the agent doesn't know how to use what it's learned (from nyk builderz shared link).
Why Context Delivery Is the Real Bottleneck
Tiago Forte's PARA system dominated personal knowledge management for a decade. The arrival of AI agents reframed the problem entirely. Forte Labs now calls this "Personal Context Management" — the bottleneck isn't AI capability, it's your ability to give AI the right information at the right time (from brain as context window).
Context amnesia is the fundamental agent problem. A 200K context window changes how much text the model can scan, not how much it "knows." Scanning is not knowing. Without a memory system, every session is a first date (from nyk builderz shared link).
The practical implication: every minute spent improving your knowledge base's structure multiplies every future AI interaction. The gap between generic and executive-level output is entirely a context problem, not a model capability problem (from claude cowork workspace setup system). Most people start from scratch every session. The people getting extraordinary results have pre-loaded their context.
Connecting live data sources (Slack, Gmail, Calendar, Notion) to your workspace lets Claude pull real data instead of guessing — a bigger lever than better prompting (from claude cowork workspace setup system). Six months of captured research lets you ask meta-questions about your own thinking patterns. That is your own pattern recognition, amplified by a machine that never forgets what you wrote (from obsidian claude code jarvis cyrilxbt).
The Agent Maintains the Wiki
Every previous note-taking system died the same death: maintenance. Agents fix this. The thing that killed every wiki is exactly what agents are built for — they don't get bored with maintenance. They notice contradictions between notes and specs that have drifted out of sync with codebases, and they propose structural changes when the current architecture creates drag (from nyk builderz shared link).
Karpathy's approach makes this explicit: you don't edit the wiki by hand. That's the AI's job. You read it, ask questions against it, and the AI keeps it updated (from nick spisak shared link). The Exo Brain system implements this as a self-feeding read/write loop: Claude reads the vault before sessions and writes back summaries and decisions after. The vault gets richer without you lifting a finger (from exo brain obsidian claude second brain).
But there's a critical design constraint. The "agents read, humans write" principle for the source layer keeps raw knowledge purely human-authored, preventing AI-generated text from contaminating the ground truth you reason from (from exo brain obsidian claude second brain). The separation is:
- Human-authored knowledge (sources, raw captures) — the ground truth. Sacred.
- Agent-maintained wiki (compiled topics, summaries, indexes) — derived and continuously updated by the LLM.
- Agent working memory (scratchpads, session logs) — ephemeral operational context.
Every question you ask makes the next answer better, because answers get saved back into the knowledge base (from nick spisak shared link). This is the compounding mechanism: the wiki doesn't just store knowledge, it gets smarter through use.
The Schema File: Your System Prompt for Life
The most important file in any vault-as-database system is the one that gets read first. In Claude Code, that's CLAUDE.md — the correct way to think about it is as "the system prompt for your life" (from exo brain obsidian claude second brain).
Better yet, frame it as a teaching document: "this vault is your exosuit; when you join this session, you put on the accumulated knowledge of the entire organization" (from nyk builderz shared link). This shifts the relationship from assistant to organizational intelligence.
Keep it under 200 lines. It should answer three questions:
- Who am I working with? — Your name, role, projects, communication style
- What's the system? — Directory structure, conventions, how files relate
- What are the rules? — Quality standards, boundary conditions, file formats
Below CLAUDE.md, an auto-memory directory provides structured persistence. MEMORY.md acts as a routing document (also under 200 lines, always loaded), linking to topic files: debugging.md, patterns.md, architecture.md, preferences.md. Detailed notes live in topic files; the routing document just tells the agent where to look (from nyk builderz shared link).
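A sketch of such a routing document, with the topic-file names illustrative:

```markdown
# MEMORY.md — routing document (always loaded, under 200 lines)

- Debugging patterns → memory/debugging.md
- Recurring code and writing patterns → memory/patterns.md
- Architecture decisions → memory/architecture.md
- User preferences → memory/preferences.md

Read the relevant topic file before acting. Append new lessons to the
topic file, not to this document.
```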
The compounding effect is real. Session 1, the agent knows your folders. Session 5, your projects and preferences. Session 20, it has better recall of your work than you do (from exo brain obsidian claude second brain). A three-folder context architecture (knowledge + projects + people) creates a compounding curve where by Day 90, the agent surfaces connections across your work you haven't consciously noticed (from claude code file structure system).
For teams or workspaces, layer instructions at three levels: Level 1 (universal personalization rules), Level 2 (global formatting and naming conventions), Level 3 (project-specific instructions, isolated to prevent contamination between domains) (from twitter link share hooeem).
Directory Structure as Database Schema
Your directory structure IS your database schema. Each top-level directory is a collection, each file is a record, and YAML frontmatter provides the columns.
The Karpathy Layout
```
vault/
  raw/        # Source documents — articles, papers, transcripts
  wiki/       # LLM-compiled knowledge — summaries, backlinks, concepts
  outputs/    # Generated answers, slides, visualizations
  CLAUDE.md   # Schema file — loaded every session
```
A Production Layout
For systems that distinguish source types and support richer operations:
```
vault/
  sources/            # Raw inputs from the world
    tweets/           # Captured tweets with extracted insights
    meetings/         # Meeting transcripts and summaries
    slack/            # Slack thread captures
    dictation/        # Voice-to-text captures
    captures/         # Freeform notes
  topics/             # Synthesized knowledge, one file per topic
    topic-name.md     # Simple topic
    topic-name/       # Hierarchical sub-topics when it grows
  plans/              # Structured action plans
  guides/             # Polished, curated answers
  memory/             # Agent memory directory
    MEMORY.md         # Routing document
    patterns.md       # Recurring patterns observed
    preferences.md    # User preferences learned
  CLAUDE.md           # Schema file
```
The separation between sources/ and topics/ is the most important structural decision. Sources are raw inputs — timestamped, attributed, immutable. Topics are living documents that get continuously enriched. Sources accumulate. Topics compound.
Frontmatter as Query Layer
YAML frontmatter makes the vault queryable without reading file contents:
```yaml
---
type: tweet
date: "2026-04-05"
source_url: "https://..."
author: "@handle"
topics: ["topic-slug"]
synthesized: false
tags: ["tag1", "tag2"]
---
```
The synthesized field is a state machine: false means unprocessed, true means insights extracted into topics. This lets the system know what work remains.
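As a sketch of how this state can be queried without a database — the minimal frontmatter parser below is an assumption for illustration, not a real library, and handles only flat `key: value` pairs:

```python
from pathlib import Path

def read_frontmatter(path: Path) -> dict:
    """Parse the frontmatter block between '---' fences.

    Minimal sketch: flat `key: value` pairs only, no nested YAML.
    """
    lines = path.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        if ":" in line:
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip().strip('"')
    return meta

def unsynthesized(source_dir: Path) -> list[Path]:
    """Return source files whose insights have not yet been extracted."""
    return [
        p for p in sorted(source_dir.rglob("*.md"))
        if read_frontmatter(p).get("synthesized") == "false"
    ]
```

A nightly synthesis pass can call `unsynthesized()` to find its work queue, then flip the flag as each source is processed.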
Naming Conventions
Use YY-MM-DD-<slug>.md for chronological sorting and collision prevention. Derive slugs from content, not source titles.
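A sketch of a helper implementing this convention (the word-count cutoff is an arbitrary assumption):

```python
import re
from datetime import date

def source_filename(d: date, text: str, max_words: int = 6) -> str:
    """Build a YY-MM-DD-<slug>.md filename for a capture.

    The slug is derived from the content itself (its first few
    significant words), not from the source's title.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())[:max_words]
    slug = "-".join(words) or "untitled"
    return f"{d.strftime('%y-%m-%d')}-{slug}.md"
```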
For wiki/topic files, consider prose-as-title naming: notes named as claims ("memory graphs beat giant memory files.md") rather than categories ("memory-systems.md"). Result titles alone tell the AI whether a note is relevant before reading content. Wikilinks then read as sentences, making the graph self-documenting (from nyk builderz shared link).
Progressive Disclosure: Managing What the Agent Sees
Context windows are finite. The solution is progressive disclosure — each layer loads only when needed, with cost increasing at each step.
Layer 0 — Always loaded: CLAUDE.md bootstraps every session. Free.
Layer 1 — Session context: Load relevant topic and source files for the current task. The agent discovers these via frontmatter links and wikilinks.
Layer 2 — Deep retrieval: Skills like /ask or /synthesize scan the full vault and assemble focused context packages. This is where frontmatter metadata pays off — filter by date, topic, type, and synthesized status without reading every file. At around 100 articles and 400K words, LLMs can handle complex Q&A without fancy RAG — auto-maintained index files and brief document summaries are sufficient for the model to navigate at this scale (from llm powered personal knowledge bases).
Layer 3 — External context: Live data sources via MCP connections. The cf-crawl skill creates a daily crawl job that keeps local markdown in sync with upstream docs — a self-updating context layer with zero manual work (from cf crawl scheduled knowledge base).
For skills themselves, progressive disclosure means restructuring monolithic instructions into a slim main file (table of contents) plus separate reference files loaded on demand — achieving 89% context reduction with no loss of functionality (from progressive disclosure claude skills).
Design your system so the common case is cheap and the rare case is thorough.
The Scratchpad Pattern: Operational Memory
Beyond the vault and the schema file, agents need a distinct form of working memory. Not session history (lossy), not todos (static), but a live scratchpad the agent writes to as it works (from agent scratchpad napkin pattern).
Agents that log their own mistakes, corrections, and what worked exhibit compounding improvement. By session five, the tool behaves fundamentally differently (from agent scratchpad napkin pattern). This gives three timescales of memory:
- Within-session — scratchpad writes as the agent works
- Across-sessions — memory.md log of decisions and lessons
- Long-term — synthesized patterns promoted to CLAUDE.md or permanent topic files
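The across-sessions layer can be as simple as an append-only log the agent writes after each session. A minimal sketch, assuming a `memory.md` log file (the file name and line format are illustrative):

```python
from datetime import datetime, timezone
from pathlib import Path

def log_lesson(vault: Path, kind: str, text: str) -> None:
    """Append one distilled decision or lesson to the session log.

    Append-only: the agent adds lines as it works; a later
    consolidation pass promotes recurring entries to CLAUDE.md
    or permanent topic files.
    """
    log = vault / "memory.md"
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    with log.open("a", encoding="utf-8") as f:
        f.write(f"- {stamp} [{kind}] {text}\n")
```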
The Claude Subconscious project pushes further: a background agent processes full session transcripts silently, maintaining persistent memory blocks for coding preferences, architecture decisions, recurring struggles, pending items, and active guidance. One agent brain connects across all projects simultaneously (from claude subconscious ai memory agent).
A simpler approach: effective AI memory should be curated and distilled (decisions, lessons, opinions), not a raw conversation log. The "daily drip" pattern — one thoughtful personal question per day, processed and filed — adds more useful context after six weeks than an initial onboarding interview (from shpigford hyper personalization ai).
Quality Maintenance: The Health Check
A knowledge system that compounds must also maintain quality. Without checks, errors compound alongside knowledge — LLM outputs filed back into the wiki can introduce contradictions that propagate into future answers.
The monthly health check prompt: "Review wiki/, flag contradictions between articles, find topics mentioned but never explained, list claims not backed by a source in raw/, suggest 3 new articles for gaps" (from nick spisak shared link). This prevents the most dangerous failure mode of self-maintaining systems: confident answers built on inconsistent foundations.
LLM health checks can also impute missing information with web searches, find interesting connections between articles, and suggest further questions to investigate (from llm powered personal knowledge bases). The key is making this a scheduled, recurring operation — not something you remember to do when things feel off.
The Ingestion Pipeline
A second brain is only as good as what goes into it.
Source Types and Capture Methods
| Source | Capture Method | Automation |
|---|---|---|
| Web articles | Obsidian Web Clipper → .md files (from llm powered personal knowledge bases) | Manual trigger |
| Tweets/posts | CLI tool + Claude API enrichment | Fully automated (nightly) |
| Video/audio | brain-ingest (local, 12-18 claims per 90-min talk) (from nyk builderz shared link) | Semi-automated |
| Web pages | agent-browser (82% fewer tokens than Playwright) (from nick spisak shared link) | Automated |
| Meetings | Granola MCP or transcript API | Semi-automated |
| Slack threads | Slack MCP | On-demand |
| Documentation | Crawl-to-markdown pipeline | Fully automated (daily) |
| Voice notes | Wispr Flow or dictation app | Manual trigger |
| Freeform | /capture command | Manual |
The Enrichment Step
Raw captures are low-value without enrichment. The enrichment step transforms a raw source into a structured record with:
- Extracted insights — atomic, actionable observations (typically 3-8 per source)
- Topic matching — links to existing topic files
- Metadata — author, date, engagement, source URL
- Wikilinks — inline references connecting to the knowledge graph
Each insight should be specific and actionable with an inline citation: (from [Source Slug](/topics/source-slug)). Not vague summaries. The insight format matters because topic files must read like briefing documents, not bibliographies.
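A hypothetical enriched capture, with all names and links illustrative, might look like:

```markdown
---
type: tweet
date: "2026-04-05"
author: "@handle"
topics: ["agent-memory"]
synthesized: true
---

## Insights
- Routing documents should stay under 200 lines, with detail pushed
  into topic files (from [Agent Memory](/topics/agent-memory))
- Agents that log their own corrections improve across sessions
  (from [Agent Memory](/topics/agent-memory))
```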
Synthesis: Organizing by Theme
The most critical synthesis principle: organize insights by theme, not by source. Related ideas from different sources should be adjacent.
When a topic grows large (30+ insights), split into sub-topics. The parent becomes a summary with index links; sub-topics get focused files. This is progressive disclosure applied to knowledge itself — read the parent for a broad overview, drill into sub-topics for depth.
Cross-cutting pattern detection is the highest-value synthesis operation. The Exo Brain system calls this /emerge — scanning for patterns implied by notes but never explicitly written (from exo brain obsidian claude second brain). Individual sources are data. Topics are information. Cross-cutting patterns are knowledge.
Enterprise Scale: The Single Brain Architecture
The vault-as-database pattern scales from personal to organizational. Eric Siu's Single Grain runs Dorsey's four-layer AI-native org architecture: a unified vector database — the "Single Brain" — ingests all company data every 15 minutes. Slack, CRM, Gong transcripts, Google Analytics, Search Console, client deliverables, financial data — every agent queries the same brain, so sales sees marketing performance and team capacity when evaluating leads (from shared link without context).
A fleet of specialized agents (ops, sales, SEO, content, recruiting), coordinated by an orchestrator sitting above them, all query the Single Brain. The compounding curve is steep: Month 1 was terrible (hallucinations, broken automations). Month 2, AutoResearch surfaced patterns humans missed — sales call keywords correlating with 3x close rates. Month 3, the flywheel turned as accumulated data improved every agent's output (from shared link without context).
The strategic insight: months of continuous data ingestion creates a world model competitors need years to replicate. Not because the tech is secret, but because proprietary data accumulates in ways that can't be fast-forwarded (from shared link without context). At the enterprise level, the knowledge base isn't a productivity tool — it's a moat.
This connects to a broader pattern: AI compresses the time it takes to DO things but does not compress the time it takes for things to HAPPEN. Living proprietary data — continuously generated through operations, not static datasets — is what survives the AI era (from ashugarg shared link).
Scaling: From Flat Files to SQLite
The flat-file architecture has a ceiling. Garry Tan's Karpathy-style git wiki for OpenClaw hit 2.3GB, and GitHub's recommended 5GB repository ceiling means migration is inevitable at scale (from garry tan openclaw git wiki gstack).
The scaling path:
- 0-100 articles (~400K words): Flat files work perfectly. LLMs handle Q&A without RAG. Auto-maintained index files and document summaries are sufficient (from llm powered personal knowledge bases).
- 100-1000 articles: Hierarchical topics, sub-directories, and more aggressive summarization keep things manageable. This is where progressive disclosure and smart indexing become essential.
- 1000+ articles / 2GB+: Plan for database migration. SQLite backends provide the query performance flat files can't sustain. GStack's autoplan skill can generate architectural specs for the upgrade through a single prompt (from garry tan openclaw git wiki gstack).
- Long-term trajectory: Synthetic data generation and finetuning to have the LLM "know" the data in its weights rather than relying on context windows (from llm personal knowledge base workflow).
Plan for database transitions early in long-running knowledge projects. The architecture should support the migration path, not fight it.
Cost Architecture
Running AI agents against a knowledge base has a cost structure you can optimize aggressively.
The 80/15/5 Distribution
Eighty percent of agent tasks are janitorial — file reads, status checks, formatting. Fifteen percent require moderate reasoning. Only five percent need frontier intelligence (from hierarchical model routing cost).
Hierarchical model routing achieves roughly 10x cost reduction: cheap models for scanning and metadata, mid-tier for synthesis, frontier only for cross-cutting pattern detection (from hierarchical model routing cost). An all-frontier approach costs around $225/month; hierarchical routing drops this to about $19/month.
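A sketch of the routing logic under the 80/15/5 assumption — the tier names and task categories below are illustrative, not from any particular implementation:

```python
CHEAP, MID, FRONTIER = "cheap", "mid", "frontier"

# ~80% of tasks are janitorial, ~15% need moderate reasoning,
# ~5% need frontier intelligence.
TIER_BY_TASK = {
    "file_read": CHEAP,
    "status_check": CHEAP,
    "formatting": CHEAP,
    "summarize_source": MID,
    "synthesize_topic": MID,
    "cross_cutting_patterns": FRONTIER,
}

def route(task: str) -> str:
    """Pick the cheapest tier known to handle the task; default to mid."""
    return TIER_BY_TASK.get(task, MID)
```

The design choice worth noting: unknown tasks default to the mid tier, so a miss degrades cost, not quality.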
Token Management
The 30-45 minute rule: one focused topic per context window, then start fresh. Use Sonnet 99% of the time — don't put Opus on routine tasks. Write reusable scripts instead of processing items conversationally; scripts use a fraction of the tokens (from twitter link share hooeem).
Building the System: A Prescriptive Playbook
Phase 1: Foundation (Day 1)
```sh
mkdir -p vault/{raw,wiki,outputs}
cd vault && git init
```
Write your schema file. Start with three sections: who you are, what the structure is, what the rules are. Keep it under 100 lines.
Create 3-5 seed topic files in wiki/ — just a title and empty sections. These are the starting points your system grows from.
Phase 2: Capture (Week 1)
Start manually. Every time you encounter something worth remembering, create a source file in raw/. The manual phase teaches you what your format should look like and what your topic taxonomy needs.
After 10-15 sources, run your first synthesis: pull insights into topic files, grouped by theme with inline citations.
Phase 3: Automation (Week 2-3)
Automate your highest-volume source type. Set up nightly batch ingestion, connect MCP servers, or schedule crawl-to-markdown pipelines.
Add the memory layer: memory.md as an append-only session log, with a line in CLAUDE.md telling the agent to log decisions after each session.
Phase 4: Operations (Week 3-4)
Build your first operational skills — /ask, /synthesize, /capture. Each is a markdown file describing the task, inputs, output format, and constraints. Building AI systems with Claude Code requires systems thinking, not software engineering — write detailed Markdown files describing desired behavior (from jimprosser chief of staff claude).
Phase 5: Compounding (Month 2+)
Run monthly health checks. Expand source types. Add the scratchpad for operational memory. Run consolidation to surface cross-cutting patterns.
Track the compounding metric: how many sessions before the agent anticipates your needs rather than just responding to them.
Architectural Principles
The filesystem is the database. No external databases for storage. Everything is a markdown file in a git repo. Flat files with a good schema outperform fancy tool stacks 90% of the time.
Three layers, all three required. Session persistence + knowledge graph + ingestion pipeline. Skip one and the others degrade.
Context delivery beats model capability. A well-organized vault with a mid-tier model outperforms a messy vault with a frontier model.
The agent maintains the wiki. You don't edit it by hand. The AI compiles, updates, links, and cleans. You read, ask questions, and provide raw inputs.
Progressive disclosure manages cost. Schema file loads free. Topic summaries on demand. Full sources for deep retrieval. External APIs only when the vault can't answer.
Organize by theme, not by source. Topic files are briefing documents, not bibliographies.
Health checks prevent error compounding. Monthly reviews catch contradictions, missing explanations, and unsourced claims before they propagate.
Plan for scale. Start with flat files. Know that 2GB+ means SQLite. Design the schema to survive the migration.
Automation runs overnight. Ingestion, crawl jobs, synthesis passes — all scheduled for off-hours. You wake up to a vault that's richer than when you went to sleep.
The data compounding is the moat. Months of accumulated, operational data creates a world model that can't be replicated by starting fresh. This is true for individuals and organizations alike.
Sources Cited
- nick spisak shared link — Step-by-step Karpathy method: three folders, schema file, agent-browser, monthly health checks
- llm powered personal knowledge bases — Karpathy's original LLM knowledge base post: raw/ → wiki/ → outputs/, Obsidian as IDE, Q&A at scale, health checks
- llm personal knowledge base workflow — Karpathy follow-up: idea files over code sharing, synthetic data + finetuning trajectory
- nyk builderz shared link — Three-layer memory stack, CLAUDE.md as exosuit, auto-memory directory, prose-as-title naming, brain-ingest, MCP bridges
- shared link without context — Single Brain architecture at Single Grain: unified vector DB, agent fleet, compounding data as moat
- garry tan openclaw git wiki gstack — Git wiki scaling limits at 2.3GB, SQLite migration path
- claude code file structure system — Three-folder context architecture: knowledge + projects + people, compounding by Day 90
- twitter link share hooeem — Three-level Cowork instructions, token management, scheduled tasks
- ashugarg shared link — Decision traces as enterprise moat, compounding proprietary data, write path vs read path
- obsidian claude code jarvis cyrilxbt — Vault as foundation, Claude Code as engine, pattern recognition amplified
- brain as context window — Repo-as-database, Personal Context Management reframes the bottleneck
- exo brain obsidian claude second brain — Self-maintaining vault, agents read / humans write, CLAUDE.md as system prompt, compounding sessions
- obsidian agentic workflows — File-system-as-database is agent-agnostic across Cursor, Claude Code, OpenCode
- claude cowork workspace setup system — Context engineering: 8-phase bootstrap, live data sources, first-try drafts
- cf crawl scheduled knowledge base — Crawl-to-markdown + Scheduled Tasks for self-updating knowledge bases
- hierarchical model routing cost — 80/15/5 task distribution, hierarchical model routing for 10x cost reduction
- agent scratchpad napkin pattern — Live agent scratchpad, compounding self-correction across sessions
- claude code origin story yc lightcone — Terminal simplicity, generalist-over-specialist
- obsidian claude skills framework — Obsidian + Claude Code skills: persistent memory, inline operations
- claude subconscious ai memory agent — Background Letta agent with persistent memory blocks, cross-project learning
- jimprosser chief of staff claude — Systems thinking via Markdown, architecture matters more than code
- progressive disclosure claude skills — 89% context reduction via progressive disclosure in skill files
- shpigford hyper personalization ai — Daily drip pattern, curated memory over raw logs, 6-week inflection
- obsidian vault claude code memory — obsidian-mind: purpose-built Obsidian vault for Claude Code memory persistence