AI Automation · Verified demand

Build an AI Knowledge Base Without RAG: The Markdown Second-Brain (and Codebase Memory) Approach

Knowledge management / developer tooling / operations · Build difficulty 3/5

Markdown Knowledge-Graph Second Brain / Codebase Memory (non-RAG): ingest your docs, transcripts, or codebase into a wiki of linked markdown files (optionally a knowledge graph via Graphify) with an auto-built index, so an AI agent answers questions over your knowledge accurately without embeddings, a vector database, or a chunking pipeline.

The problem

Most teams have knowledge scattered across Google Docs, Notion pages, call transcripts, Slack threads, and large or legacy codebases. Chat context is ephemeral, so AI agents forget what they were told and answer from stale or missing information. The default fix, retrieval-augmented generation (RAG), forces you to stand up an embeddings model, a vector database, and a chunking pipeline before you can ask a single question, and it still retrieves imprecise fragments that miss how documents relate to each other. For an SMB or agency that just wants a reliable internal knowledge base or an executive second brain, and for a developer who wants Claude Code or Cursor to actually understand a large codebase, that is heavy, expensive, and slow to maintain.

Who it's for

SMBs and agencies that want an internal knowledge base or executive second brain without standing up a vector database, and developers working in large or legacy codebases who want Claude Code, Cursor, or another coding agent to have durable, accurate context on the whole repo. It fits any team that produces a lot of documents, transcripts, or code but loses that knowledge to ephemeral chat sessions.

How it works

  1. Define the structure: create a raw/ folder for source material, a wiki/ folder for the linked output, and a CLAUDE.md (or AGENTS.md) schema file that tells the agent the naming convention, link style, and index format to follow. For a codebase, point Graphify at the repo root instead.

  2. Ingest the source: drop docs, exported transcripts, and notes into raw/, or use Obsidian Web Clipper to capture articles. For code, run Graphify to parse the codebase into a knowledge graph of files, functions, and their relationships.

  3. Chunk into a linked wiki: have the agent (Claude Code) read each raw source, split it into atomic markdown pages, add [[wiki-links]] between related pages, and write or update a single index.md plus a running change log so the whole library is navigable.

  4. Unify into a second brain (optional): merge the wiki pages and the Graphify graph output into one Obsidian vault so you get backlinks, graph view, and a single searchable knowledge surface for exec or team use.

  5. Query it: ask questions in Claude Code with the wiki folder in context, or point an executive-assistant agent at the wiki path so it reads the index first and follows links to the relevant pages instead of searching embeddings.

  6. Maintain it: run periodic lint or health checks that flag orphan pages, broken links, stale sections, and coverage gaps, then suggest which new sources to add so the knowledge base stays current.
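
The maintenance step is the easiest one to sketch in code. The example below is a minimal illustration, not a prescribed tool: it assumes the wiki is a flat folder of .md files, that links use the [[page-name]] form (with optional #heading or |alias parts), and that index.md is exempt from the orphan check. The function name lint_wiki is hypothetical.

```python
import re
from pathlib import Path

# Matches [[target]], [[target#heading]] and [[target|alias]]; captures only the target.
WIKI_LINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def lint_wiki(wiki_dir, exempt=("index",)):
    """Return (broken_links, orphans) for a flat folder of markdown pages."""
    wiki = Path(wiki_dir)
    pages = {p.stem: p.read_text(encoding="utf-8") for p in wiki.glob("*.md")}
    broken = []        # (source page, missing target) pairs
    linked_to = set()  # pages that receive at least one link from another page
    for name, text in sorted(pages.items()):
        for match in WIKI_LINK.finditer(text):
            target = match.group(1).strip()
            if target not in pages:
                broken.append((name, target))
            elif target != name:  # a self-link does not rescue a page from orphan status
                linked_to.add(target)
    orphans = sorted(set(pages) - linked_to - set(exempt))
    return broken, orphans
```

Calling lint_wiki("wiki") returns the two lists, which an agent or a scheduled job could turn into a short health report. Stale sections and coverage gaps need judgment about content, so those checks are better left to the agent than to a script like this.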

Tools

Claude Code · Graphify · Obsidian (with Web Clipper) · Markdown wiki (index + change log + CLAUDE.md schema)

The result

You get a self-contained, version-controllable knowledge base of plain markdown files (and, for code, a navigable knowledge graph) that an AI agent reads directly, with no embeddings model, vector database, or chunking infrastructure to run or pay for. Because the agent navigates an index and follows links to only the relevant pages rather than retrieving similarity-matched fragments, answers tend to be more precise and the team avoids RAG's setup and maintenance overhead. Practitioners report large token savings versus loading or retrieving everything, though the exact figure depends on scale. The approach works best at smaller scale (roughly under a few hundred documents, or codebases of fewer than about 60 files), where the whole index and linked structure stay manageable; very large corpora may still benefit from retrieval. The knowledge becomes durable and auditable instead of ephemeral, and a developer's coding agent gains stable context on a large or legacy repo.

FAQ

Can I build an AI knowledge base without RAG or a vector database?

Yes. Instead of embeddings and a vector database, you organize your docs into a wiki of linked markdown files with an auto-built index, and let an AI agent like Claude Code read the index and follow links to the relevant pages. There is no embeddings model, vector store, or chunking pipeline to run. This works best at smaller scale (roughly under a few hundred documents); very large corpora may still benefit from retrieval.
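
As a rough picture of what "auto-built index" can mean, here is a minimal sketch. It assumes a flat wiki folder, that each page may open with a "# Title" heading, and that the first non-heading line works as a one-line summary. The names describe and build_index are hypothetical; in practice the agent itself would usually write richer summaries.

```python
from pathlib import Path


def describe(page):
    """Return (title, summary): first '# ' heading and first non-heading line."""
    title, summary = None, ""
    for raw in page.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if line.startswith("# ") and title is None:
            title = line[2:].strip()
        elif line and not line.startswith("#"):
            summary = line
            break
    return title or page.stem, summary


def build_index(wiki_dir, index_name="index.md"):
    """Write an index that lists every page as a wiki-link with a short summary."""
    wiki = Path(wiki_dir)
    lines = ["# Index", ""]
    for page in sorted(wiki.glob("*.md")):
        if page.name == index_name:
            continue  # the index never lists itself, so reruns are stable
        title, summary = describe(page)
        entry = f"- [[{page.stem}|{title}]]"
        if summary:
            entry += f": {summary}"
        lines.append(entry)
    text = "\n".join(lines) + "\n"
    (wiki / index_name).write_text(text, encoding="utf-8")
    return text
```

Rerunning build_index("wiki") after each ingest keeps the index in step with the pages on disk.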

What is the markdown second-brain (or Karpathy-style wiki) approach to AI knowledge?

It is a pattern where you convert your source material into many small, cross-linked markdown pages plus a single index file, then point an AI agent at that folder. The agent navigates the index and the links the way a human reads a wiki, rather than retrieving similarity-matched fragments. The same idea gives a coding agent durable memory of a codebase, often via a knowledge graph of files and functions.
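
To make "navigates the index and the links" concrete, the sketch below walks the wiki mechanically: it starts at the index and follows [[wiki-links]] breadth-first, within a depth limit and a page budget. This is a simplification under stated assumptions (flat folder, page names equal to file stems). A real agent chooses which links to follow by reading the text, and the name collect_context is hypothetical.

```python
import re
from collections import deque
from pathlib import Path

# Captures the target of [[target]], [[target#heading]] or [[target|alias]].
WIKI_LINK = re.compile(r"\[\[([^\]|#]+)")


def collect_context(wiki_dir, start="index", max_depth=2, max_pages=10):
    """Breadth-first walk over wiki-links; returns visited page names in order."""
    wiki = Path(wiki_dir)
    seen, order = {start}, []
    queue = deque([(start, 0)])
    while queue and len(order) < max_pages:
        name, depth = queue.popleft()
        path = wiki / f"{name}.md"
        if not path.exists():
            continue  # broken link: skip it rather than fail
        order.append(name)
        if depth == max_depth:
            continue  # read this page, but do not follow its links
        for target in WIKI_LINK.findall(path.read_text(encoding="utf-8")):
            target = target.strip()
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return order
```

The depth and page limits are the point of the design: the agent reads a bounded neighborhood around the index instead of the whole library.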

Is RAG dead, or when should I still use a vector database?

RAG is not dead, but it is no longer the default for every use case. For internal knowledge bases and executive second brains under a few hundred documents, and for most codebases of up to a few dozen files, a linked-markdown or knowledge-graph approach is simpler, cheaper, and often more precise. Keep RAG or a vector database when your corpus is very large, changes constantly, or exceeds what an agent can navigate by index alone.

How do I give Claude Code or Cursor context on a large or legacy codebase?

Parse the repository into a knowledge graph of files, functions, and their relationships (for example with Graphify), generate a navigable index and linked summary pages, and keep a schema file (CLAUDE.md or AGENTS.md) describing the structure. The coding agent then reads the index and follows links to the relevant parts of the code instead of guessing or re-scanning everything, which gives it durable, accurate context across a large or legacy codebase.
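
The sketch below shows what a graph of files, functions, and their relationships can look like in miniature. It is not Graphify's API, which this page does not describe. It is a stand-in that uses Python's standard ast module, so it only handles Python source, and it records called names without resolving which definition they refer to. The names build_code_graph and render_page are hypothetical.

```python
import ast
from pathlib import Path


def build_code_graph(repo_root):
    """Map each Python file to the functions it defines and the names each one calls."""
    root = Path(repo_root)
    graph = {}
    for path in sorted(root.rglob("*.py")):
        # Note: ast.parse raises SyntaxError on files that do not parse.
        tree = ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        functions = {}
        for node in ast.walk(tree):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                calls = set()
                for child in ast.walk(node):
                    if isinstance(child, ast.Call):
                        func = child.func
                        if isinstance(func, ast.Name):         # e.g. load(...)
                            calls.add(func.id)
                        elif isinstance(func, ast.Attribute):  # e.g. obj.read(...)
                            calls.add(func.attr)
                functions[node.name] = sorted(calls)
        graph[path.relative_to(root).as_posix()] = functions
    return graph


def render_page(file_name, functions):
    """Render one file's entry as a markdown summary page with wiki-links."""
    lines = [f"# {file_name}", ""]
    for name, calls in functions.items():
        targets = ", ".join(f"[[{c}]]" for c in calls) or "nothing"
        lines.append(f"- `{name}` calls {targets}")
    return "\n".join(lines) + "\n"
```

Writing one rendered page per file into the wiki folder gives the coding agent linked summaries to navigate, in the same way it navigates document pages.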

How much does a non-RAG knowledge base cost to run compared to a RAG build?

The big saving is infrastructure: you avoid hosting an embeddings model and a vector database and building a chunking pipeline, so there is no per-query embedding or vector-store cost. You pay mainly for the AI agent's tokens when it reads the wiki, and practitioners report meaningfully lower token usage than loading everything, though the exact savings depend on scale. Agencies typically build internal AI knowledge bases as a fixed-fee project with an optional monthly maintenance retainer.
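
If you want a back-of-envelope number for your own wiki, a rough comparison can be sketched as below. It assumes about four characters per token, which is a crude heuristic that varies by model and content, and it assumes pages_read lists the pages the agent opened besides the index. The names estimate_tokens and compare_cost are hypothetical.

```python
from pathlib import Path

CHARS_PER_TOKEN = 4  # crude assumption; real tokenizers vary by model and content


def estimate_tokens(text):
    """Very rough token estimate from character count."""
    return len(text) // CHARS_PER_TOKEN


def compare_cost(wiki_dir, pages_read, index_name="index.md"):
    """Estimate tokens for loading every page vs. the index plus the pages actually read."""
    wiki = Path(wiki_dir)
    sizes = {
        p.stem: estimate_tokens(p.read_text(encoding="utf-8"))
        for p in wiki.glob("*.md")
    }
    index_key = Path(index_name).stem
    load_all = sum(sizes.values())
    navigated = sizes.get(index_key, 0) + sum(sizes.get(name, 0) for name in pages_read)
    return {"load_all": load_all, "navigated": navigated}
```

The output is only an estimate of input tokens for a single question. It ignores the agent's own reasoning and output tokens, so treat it as a way to compare orders of magnitude rather than a bill.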
