An LLM Wiki is a file-based knowledge system where a language model incrementally compiles raw sources into a durable, human-readable markdown graph. The model is not only answering questions. It is maintaining the encyclopedia it will read next time.
## The shortest useful definition
An LLM Wiki has three durable assets: original sources that do not change, synthesized markdown pages that the agent may update, and a small instruction file that defines exactly how the agent is allowed to maintain the graph. The system compounds because ingestion and useful queries leave behind better pages.
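A minimal sketch of what that instruction file might contain. Every rule below is illustrative, assuming the directory names from the skeleton in this article, not a fixed format:

```markdown
# AGENTS.md (illustrative sketch)

## Allowed operations
- Ingest each file under raw/ exactly once; record the pass in wiki/log.md.
- Create or extend pages under wiki/concepts/, wiki/entities/, wiki/sources/.
- Save strong answers into wiki/syntheses/ and link them from wiki/index.md.

## Rules
- Never edit anything under raw/.
- Every claim on a wiki page must link to a page in wiki/sources/.
- Unresolved contradictions go to wiki/open-questions.md, never silently merged.
```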
## What it is not
| It is not | Why | What to use instead |
|---|---|---|
| A transcript dump | Chat history is not structured, routed, deduplicated, or source-audited. | Save durable syntheses into wiki/syntheses/ and link them from index.md. |
| A vector database by itself | Embeddings can find similar chunks, but they do not preserve the model’s synthesis work. | Use markdown pages as the compiled graph; add search later when size demands it. |
| A fully autonomous publishing system | AI drafts can drift, overstate, or merge incompatible claims. | Keep human review, source links, and status labels in the loop. |
## Minimal project skeleton
```
my-llm-wiki/
  AGENTS.md
  raw/
    papers/
    web-clips/
    transcripts/
  wiki/
    index.md
    log.md
    concepts/
    entities/
    sources/
    syntheses/
    open-questions.md
```
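The skeleton can be scaffolded with a few lines of Python. This is a minimal sketch: the directory and file names mirror the tree above, and the seed contents are placeholder assumptions.

```python
from pathlib import Path

# Directories and seed files taken from the skeleton above.
DIRS = [
    "raw/papers", "raw/web-clips", "raw/transcripts",
    "wiki/concepts", "wiki/entities", "wiki/sources", "wiki/syntheses",
]
FILES = {
    "AGENTS.md": "# Agent instructions\n",          # placeholder seed content
    "wiki/index.md": "# Index\n",
    "wiki/log.md": "# Log\n",
    "wiki/open-questions.md": "# Open questions\n",
}

def scaffold(root: str) -> Path:
    """Create the wiki skeleton under root, leaving any existing files untouched."""
    base = Path(root)
    for d in DIRS:
        (base / d).mkdir(parents=True, exist_ok=True)
    for name, seed in FILES.items():
        path = base / name
        if not path.exists():
            path.write_text(seed, encoding="utf-8")
    return base
```

Idempotence matters here: `exist_ok=True` and the existence check mean the script can be re-run without clobbering pages the agent has already written.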
## Why the pattern compounds
- Sources are read once with care. The ingest pass extracts claims, evidence, entities, and contradictions.
- The graph is rewritten, not appended blindly. Existing concept pages are preserved and extended so the new source changes the whole map.
- Queries can become pages. A strong answer, comparison, or decision matrix is saved into wiki/syntheses/ and added to index.md.
- Linting turns maintenance into a routine. The agent regularly checks broken links, orphan pages, stale claims, and unresolved contradictions.
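The lint pass in the last point can be sketched as a small script. Assumptions: pages live under wiki/, internal links use the standard markdown [text](target) form, and "orphan" means a page nothing links to; the regex and the orphan definition are illustrative choices, not a fixed spec.

```python
import re
from pathlib import Path

# Capture markdown link targets, ignoring any #fragment suffix.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)#]+)")

def lint(wiki_root: str):
    """Report broken internal links and orphan pages under wiki_root."""
    root = Path(wiki_root)
    pages = {p.relative_to(root).as_posix(): p for p in root.rglob("*.md")}
    linked, broken = set(), []
    for rel, page in pages.items():
        for target in LINK_RE.findall(page.read_text(encoding="utf-8")):
            if target.startswith(("http://", "https://")):
                continue  # external links are out of scope for this check
            resolved = (page.parent / target).resolve()
            try:
                resolved_rel = resolved.relative_to(root.resolve()).as_posix()
            except ValueError:
                broken.append((rel, target))  # target escapes the wiki root
                continue
            if resolved_rel in pages:
                linked.add(resolved_rel)
            else:
                broken.append((rel, target))
    # Orphans: pages nothing links to, excluding the index itself.
    orphans = sorted(set(pages) - linked - {"index.md"})
    return broken, orphans
```

Stale claims and unresolved contradictions are harder to lint mechanically; in practice those checks stay with the agent, guided by the rules in AGENTS.md, while scripts like this one handle the purely structural part.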
## Decision rule
Build an LLM Wiki when all three hold: the same body of knowledge will be revisited many times, answer quality improves when prior syntheses are preserved, and a human needs to audit the sources after the model has done its work.