The three-layer architecture is the safety boundary of an LLM Wiki. Do not let the agent freely edit original sources, and do not hide operating rules inside vague chat prompts.
Layer 1: immutable raw sources
raw/ holds originals: PDFs, markdown exports, transcripts, screenshots, CSVs, meeting notes, cloned documentation, and clipped web pages. The agent may read this directory, but it should not modify or delete anything in it during normal work. If a bad synthesis happens, the raw layer lets you rebuild the wiki from preserved evidence.
Raw files need basic identity metadata somewhere nearby: source title, origin, date captured, license or access boundary if known, checksum when practical, and the reason the source belongs in the wiki. Without that, later reviewers cannot tell whether the compiled page is faithful or stale.
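One way to keep that identity metadata "nearby" is a JSON sidecar written next to each raw file. The sketch below is a minimal illustration under stated assumptions: the `SourceRecord` field names, the `.meta.json` suffix, and the `write_sidecar` helper are hypothetical choices, not a format this guide prescribes.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import date
from pathlib import Path
from typing import Optional


@dataclass
class SourceRecord:
    # Field names are illustrative assumptions, not a fixed standard.
    path: str
    title: str
    origin: str
    captured: str                      # ISO date the file was captured
    reason: str                        # why the source belongs in the wiki
    license_or_access: Optional[str]   # license or access boundary, if known
    sha256: str


def sha256_of(path: Path) -> str:
    """Hash the file in chunks so large PDFs or videos are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_sidecar(path: Path, title: str, origin: str, reason: str,
                  license_or_access: Optional[str] = None) -> Path:
    """Write <file>.meta.json beside the raw file; the raw file itself is only read."""
    record = SourceRecord(
        path=str(path),
        title=title,
        origin=origin,
        captured=date.today().isoformat(),
        reason=reason,
        license_or_access=license_or_access,
        sha256=sha256_of(path),
    )
    sidecar = path.with_name(path.name + ".meta.json")
    sidecar.write_text(json.dumps(asdict(record), indent=2), encoding="utf-8")
    return sidecar
```

A reviewer can later recompute the checksum and compare it with the stored value to confirm the raw file has not changed since capture.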
| Source class | Capture metadata | Review risk |
|---|---|---|
| Paper or article | Title, URL or DOI, author/source, capture date, access date. | Stale claims, secondary-source overreach, paywall or license limits. |
| Meeting or chat transcript | Date, participants or roles, confidentiality boundary, decision status. | Private information, unapproved decisions, incomplete context. |
| Repository or code artifact | Remote URL, branch, commit or release tag, relevant paths. | Generated files, stale branches, implementation-specific assumptions. |
| Screenshot, audio, or video | Original file, extraction method, transcript/OCR status, reviewer notes. | Bad OCR, missing context, accessibility and privacy exposure. |
Layer 2: dynamic compiled wiki
wiki/ is the mutable output. It contains concept pages, source summaries, entity pages, syntheses, contradictions, and open questions. The agent may update this layer during ingest, query, and lint operations.
| Compiled page type | Typical use | Must include |
|---|---|---|
| Source summary | Compact proxy for one raw source. | Raw path, source status, key claims, entities, and contradictions. |
| Concept page | Reusable explanation of a method, pattern, or term. | Definition, scope, source traces, related links, and limitations. |
| Entity page | Model, tool, organization, standard, product, or project record. | Identity, source dates, authority boundary, and caveats. |
| Synthesis | Reusable answer, comparison, or decision record. | Question answered, inputs used, reasoning boundary, and review status. |
| Contradiction | Visible conflict between sources or pages. | Both claims, source context, date/scope notes, and next review action. |
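
The "Must include" column can be checked mechanically once pages carry structured frontmatter. The following is a rough sketch, assuming each page has already been parsed into a dictionary; the page type identifiers and field keys are hypothetical names derived from the table, not a required schema.

```python
# Required fields per page type, paraphrased from the table above.
# The keys are illustrative assumptions; adapt them to your own frontmatter.
REQUIRED_FIELDS = {
    "source-summary": {"raw_path", "source_status", "key_claims", "entities", "contradictions"},
    "concept": {"definition", "scope", "source_traces", "related", "limitations"},
    "entity": {"identity", "source_dates", "authority_boundary", "caveats"},
    "synthesis": {"question", "inputs", "reasoning_boundary", "review_status"},
    "contradiction": {"claims", "source_context", "scope_notes", "next_action"},
}


def missing_fields(page: dict) -> list[str]:
    """Return the required fields that are absent or empty for this page's type."""
    page_type = page.get("type")
    if page_type not in REQUIRED_FIELDS:
        return [f"unknown page type: {page_type!r}"]
    # A field counts as present only if it holds a non-empty value.
    present = {key for key, value in page.items() if value not in (None, "", [], {})}
    return sorted(REQUIRED_FIELDS[page_type] - present)
```

An empty list means the page meets the minimum shape. A real source summary with no contradictions would need an explicit marker such as `"none found"`, because this sketch treats empty values as missing.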
Layer 3: algorithmic schema
AGENTS.md, CLAUDE.md, or an equivalent root instruction file is the operating contract. It tells the agent which directories exist, which files are read-only, how pages are shaped, how links are written, and which checklist must be completed before a task is done.
| Layer | Write access | Primary job | Failure if mixed |
|---|---|---|---|
| raw/ | Human and import tools only | Preserve ground truth | AI edits can destroy the evidence needed to recover. |
| wiki/ | Agent may edit under schema rules | Maintain synthesized knowledge | The graph becomes a pile of notes with no predictable shape. |
| AGENTS.md | Human-reviewed edits | Define agent behavior | The model improvises file paths, metadata, and safety rules. |
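
The write-access column can be enforced by a small guard that runs before any file write. This sketch assumes paths are given relative to the repository root. The actor labels (`human`, `import-tool`, `agent`) are hypothetical. Allowing humans to write to wiki/ and denying every path outside the three layers are also assumptions the table does not state.

```python
from pathlib import PurePosixPath

# Who may write to each top-level entry, following the table above.
WRITERS = {
    "raw": {"human", "import-tool"},
    "wiki": {"human", "agent"},
    "AGENTS.md": {"human"},
}


def may_write(actor: str, relative_path: str) -> bool:
    """Return True only if the actor may write to this repository-relative path."""
    path = PurePosixPath(relative_path)
    parts = path.parts
    # Reject empty paths, absolute paths, and any attempt to climb out with "..".
    if not parts or path.is_absolute() or ".." in parts:
        return False
    # Unknown top-level entries are denied by default.
    return actor in WRITERS.get(parts[0], set())
```

For example, `may_write("agent", "wiki/concepts/rag.md")` is allowed, while `may_write("agent", "raw/papers/a.pdf")` and `may_write("agent", "wiki/../raw/a.pdf")` are both refused.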
Recommended folder tree
raw/
  papers/
  web/
  meetings/
  attachments/
wiki/
  index.md
  log.md
  concepts/
  entities/
  sources/
  syntheses/
  contradictions.md
  open-questions.md
AGENTS.md
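
The tree can be created with a short idempotent script. This is a sketch that assumes `contradictions.md` and `open-questions.md` live under wiki/ and that AGENTS.md sits at the repository root. The `scaffold` function name is hypothetical, and files are created empty.

```python
from pathlib import Path

DIRECTORIES = [
    "raw/papers", "raw/web", "raw/meetings", "raw/attachments",
    "wiki/concepts", "wiki/entities", "wiki/sources", "wiki/syntheses",
]
FILES = [
    "wiki/index.md", "wiki/log.md",
    "wiki/contradictions.md", "wiki/open-questions.md",
    "AGENTS.md",
]


def scaffold(root: Path) -> list[Path]:
    """Create any missing directories and empty files; never overwrite existing ones."""
    created = []
    for relative in DIRECTORIES:
        target = root / relative
        if not target.exists():
            target.mkdir(parents=True)
            created.append(target)
    for relative in FILES:
        target = root / relative
        if not target.exists():
            target.parent.mkdir(parents=True, exist_ok=True)
            target.touch()
            created.append(target)
    return created
```

Because existing paths are skipped, running the script a second time changes nothing and returns an empty list.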
Interface contracts
The layers should communicate through explicit records rather than hidden memory. A source summary points back to raw evidence. A concept page points to source summaries and related concepts. The index points to every compiled page. The log records operations. The schema defines which of those files can be changed by which workflow.
| Operation | Reads from | Writes to | Never writes to |
|---|---|---|---|
| Ingest | raw/, wiki/index.md, relevant wiki pages | Source summaries, target pages, index, log, contradictions | Original raw files |
| Query | Index, selected compiled pages, raw sources only when needed | Synthesis page only after approval, index, log | Unrelated pages or raw files |
| Lint | Entire compiled graph, index, log, schema | Warnings, status labels, fix patches, log entries | Evidence files or unapproved deletions |
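
The operations table also supports an after-the-fact audit: given the set of paths an operation wrote, report anything the contract forbids or omits. The sketch below is illustrative only. The `audit_writes` helper is hypothetical, and it encodes just two rules: no operation writes under raw/, and an ingest must touch the index and the log.

```python
RAW_PREFIX = "raw/"

# Files an operation is expected to update. Only ingest is modelled here;
# other operations could be added in the same way.
REQUIRED_WRITES = {
    "ingest": {"wiki/index.md", "wiki/log.md"},
}


def audit_writes(operation: str, written: set[str]) -> list[str]:
    """Return human-readable contract violations for one completed operation."""
    problems = [
        f"wrote to raw layer: {path}"
        for path in sorted(written)
        if path.startswith(RAW_PREFIX)
    ]
    for required in sorted(REQUIRED_WRITES.get(operation, set()) - written):
        problems.append(f"{operation} did not update {required}")
    return problems
```

An empty result means the operation stayed within these two rules. It does not prove the written pages are well formed.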
Permission checklist
- Raw files are read-only during agent operations.
- Every compiled page has frontmatter, source status, and related links.
- Every ingest updates wiki/index.md and wiki/log.md.
- Every contradiction is preserved until a human resolves it.
- Every destructive cleanup requires explicit human approval.
- Every high-impact model, benchmark, legal, safety, or provider claim includes a dated source or a visible source-needed label.
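
The first checklist item can be verified rather than trusted: snapshot checksums of raw/ before an agent run and compare them afterwards. This is a minimal sketch. The `snapshot` and `raw_changes` names are hypothetical, and it reads each file fully into memory, which is acceptable for small collections but not for large media.

```python
import hashlib
from pathlib import Path


def snapshot(raw_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to raw_dir) to its SHA-256 checksum."""
    return {
        path.relative_to(raw_dir).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(raw_dir.rglob("*"))
        if path.is_file()
    }


def raw_changes(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """Compare two snapshots and list files that were deleted, modified, or added."""
    shared = before.keys() & after.keys()
    return {
        "deleted": sorted(before.keys() - after.keys()),
        "modified": sorted(name for name in shared if before[name] != after[name]),
        "added": sorted(after.keys() - before.keys()),
    }
```

If the agent run was well behaved, `deleted` and `modified` are both empty. Entries under `added` may be legitimate when a human or import tool captured new sources in the meantime.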