Traditional RAG is strongest when the system must assemble fresh context for each question at answer time. An LLM Wiki is strongest when the corpus benefits from durable synthesis performed before the next question is asked.
| Dimension | Query-time RAG | LLM Wiki incremental compilation |
|---|---|---|
| Processing trigger | Every user question retrieves and assembles context again. | Ingest reads sources once and updates durable markdown pages. |
| Memory of synthesis | Usually lost unless the app separately stores it. | Saved into wiki/syntheses/ or existing concept pages. |
| Contradiction handling | Contradictions often surface only when chunks collide in a prompt. | Contradictions are recorded during ingest and lint. |
| Human auditability | Depends on retrieval logs and chunk citations. | Markdown pages, source summaries, and log entries are inspectable. |
| Infrastructure | Often needs embeddings, indexes, rerankers, and runtime retrieval services. | Can begin with files, grep, index.md, log.md, and wiki-links. |
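The last row of the table can be made concrete. A minimal sketch of the file-based layer, assuming a hypothetical `wiki/` directory with markdown pages and a `log.md`; the page names and wiki-link format here are illustrative, not prescribed:

```python
import re
from pathlib import Path

# Hypothetical minimal layout: one markdown file per concept page,
# plus log.md recording ingest events. Nothing beyond the stdlib.
WIKI = Path("wiki")

def ingest(page: str, text: str) -> None:
    """Compile a source into a durable markdown page and log the update."""
    WIKI.mkdir(exist_ok=True)
    (WIKI / f"{page}.md").write_text(text)
    with (WIKI / "log.md").open("a") as log:
        log.write(f"- updated [[{page}]]\n")

def lookup(term: str) -> list[str]:
    """grep-style search over the compiled pages (not the raw sources)."""
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    return sorted(
        p.name
        for p in WIKI.glob("*.md")
        if p.name != "log.md" and pattern.search(p.read_text())
    )
```

A question like "what did we decide about caching?" then becomes `lookup("caching")`: a plain text search over already-synthesized pages, with `log.md` giving the audit trail for free.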
## Use both when needed
The patterns are not enemies. A mature team wiki may use markdown as the compiled knowledge layer and add BM25, vector search, or hybrid retrieval when the graph grows. The important difference is that the search layer should find compiled knowledge, not replace it.
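To show what "the search layer finds compiled knowledge" can mean in practice, here is a minimal BM25 sketch that ranks already-compiled wiki pages for a query. The page contents are hypothetical, and `k1` and `b` are the standard BM25 free parameters; a real deployment would likely use an off-the-shelf index instead:

```python
import math
from collections import Counter

def bm25_rank(query: str, pages: dict[str, str],
              k1: float = 1.5, b: float = 0.75) -> list[str]:
    """Rank compiled wiki pages by BM25 score for a whitespace-tokenized query."""
    docs = {name: text.lower().split() for name, text in pages.items()}
    n = len(docs)
    avgdl = sum(len(words) for words in docs.values()) / n
    df = Counter()  # document frequency per term
    for words in docs.values():
        df.update(set(words))
    scores = {}
    for name, words in docs.items():
        tf = Counter(words)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1)
            score += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(words) / avgdl))
        scores[name] = score
    return sorted(scores, key=scores.get, reverse=True)
```

The point of the design is the input: `pages` holds synthesized markdown, so retrieval surfaces conclusions the wiki already reached rather than re-assembling raw chunks per query.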
## Rule of thumb
If a source will be used once, retrieval may be enough. If the source changes how future answers should be formed, compile it into the wiki.