A practical system for reading books with LLMs
I've been using a small system for reading books with Claude. Each book gets a folder with the full text, a slash command that knows where I am, and a wiki that only contains what I've already read. Here's why, and how to set it up.
Why
Reading The Lord of the Rings and A Song of Ice and Fire last year, I leaned heavily on chapter-by-chapter summaries and analyses written by real people, the careful close-reads you find in fandom communities for popular books. They were fantastic for comprehension and retention. It's easy to forget something when reading a long novel, so skimming a companion before the next chapter kept me oriented.
When I started The Count of Monte Cristo a few months ago, I couldn't find anything comparable. So I tried the obvious substitute: asking Claude to summarize chapter 2 from memory. The result wasn't usable: plot points from the wrong chapters, details that didn't belong, occasional hallucinations. Even for books the model has seen in training, recall is lossy. What did work was a fresh chat session with the chapter pasted in, followed by a request for summary and analysis.
That takes care of per-chapter summaries and analyses, but the other thing I wanted was reference — a place to review what I've read about the characters, events, locations, etc. Reference wikis solve that, but they occasionally spoil future sections of the book. A wiki built forward from zero, containing only what I've already read, fixes this by construction.
The setup
You need the book in plaintext or markdown. LLMs can't work reliably from PDF pages or EPUB files at chapter-by-chapter granularity. For EPUB, a small conversion script (Calibre's ebook-convert, or a short Python script using ebooklib) splits the book into one markdown file per chapter. PDFs are harder. Scholarly editions with footnotes often need manual cleanup, and depending on your translator and publisher you may not find a clean source file at all. This is the most annoying part of the system. But if you find a DRM-free ebook, and iterate on the conversion script, you'll likely get something good enough.
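As a rough illustration of the splitting step, here is a minimal sketch that works on text already converted to plaintext. It assumes chapters begin on a line like "Chapter 12" or "CHAPTER XII. Title"; the heading pattern, the function names, and the zero-padded file names are all assumptions of this sketch, and real editions usually need the pattern adjusted.

```python
import re
from pathlib import Path

# Assumption: a chapter starts on a line such as "Chapter 12" or "CHAPTER XII. Title".
CHAPTER_HEADING = re.compile(
    r"^chapter[ \t]+([0-9]+|[ivxlcdm]+)\b.*$", re.IGNORECASE | re.MULTILINE
)


def split_chapters(text: str) -> list[tuple[str, str]]:
    """Return (heading, body) pairs, one per chapter heading found in text."""
    matches = list(CHAPTER_HEADING.finditer(text))
    chapters = []
    for i, match in enumerate(matches):
        # A chapter body runs until the next heading, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        body = text[match.end():end].strip()
        chapters.append((match.group(0).strip(), body))
    return chapters


def write_chapters(text: str, out_dir: Path) -> list[Path]:
    """Write one markdown file per chapter, numbered in reading order."""
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for number, (heading, body) in enumerate(split_chapters(text), start=1):
        path = out_dir / f"{number:03d}.md"
        path.write_text(f"# {heading}\n\n{body}\n", encoding="utf-8")
        paths.append(path)
    return paths
```

Anything before the first heading (title page, preface) is dropped by this sketch, and a prose line that happens to start with "Chapter 3" would be treated as a heading, so the output is worth a quick check by eye.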
Once the book is split into files, the rest is just directories and a prompt.
The structure (Hamlet as example)
My Hamlet repo looks like this:
~/hamlet/
├── scenes/          # the text, one file per scene
├── analyses/        # per-scene analysis, written by Claude
├── wiki/
│   ├── characters/  # one file per character
│   ├── themes.md
│   ├── events.md
│   └── glossary.md
├── marginalia/      # my personal takes per scene
└── progress.md      # where I am
Four kinds of files, plus a progress marker. Each has a job.
scenes/ (or chapters/) — the source text. The agent reads from here when analyzing. I never edit these.
analyses/ — per-scene analyses written by Claude. Summary, close reading, connections to prior scenes, quotable passages. One file per scene. The exact shape varies by book (see "Variations by genre" below).
A caveat: expect factual drift more than interpretive mistakes, such as a plot point attached to the wrong chapter or a minor character confused with another. A GPT review pass will catch a lot of this, though I rarely bother for book wikis.
wiki/ — accumulating reference. The wiki only ever contains what I've read. When Claude analyzes a new scene, it updates the wiki with anything new. Hamlet's wiki has characters, themes, events, glossary. Other books get different subdirectories (more on that below).
marginalia/ — my personal takes per scene. These come from the back-and-forth after Claude delivers its analysis. I'll push back, connect it to something, riff on a line. Generally I'll write this while I'm reading the scene, then paste it once I'm done. That discussion gets distilled into a per-scene marginalia file. Analyses are what Claude wrote; marginalia is what I said.
progress.md — one line. Last scene analyzed: 3.4. This is the spoiler barrier. The slash command reads it first.
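The skeleton can be created by hand, but a short script makes it repeatable across books. This is a sketch under assumptions: the function name `scaffold_book`, the `text_dir` parameter, and the "none" placeholder in progress.md are invented for illustration; only the directory and file names come from the Hamlet layout described here.

```python
from pathlib import Path

# Wiki layout from the Hamlet example; other books would swap these entries.
WIKI_DIRS = ["characters"]
WIKI_FILES = ["themes.md", "events.md", "glossary.md"]


def scaffold_book(root: Path, text_dir: str = "scenes") -> None:
    """Create the empty folder skeleton for a book without overwriting anything."""
    for name in (text_dir, "analyses", "marginalia"):
        (root / name).mkdir(parents=True, exist_ok=True)
    for name in WIKI_DIRS:
        (root / "wiki" / name).mkdir(parents=True, exist_ok=True)
    for name in WIKI_FILES:
        (root / "wiki" / name).touch(exist_ok=True)
    progress = root / "progress.md"
    if not progress.exists():
        # "none" is this sketch's placeholder for "nothing read yet".
        progress.write_text("Last scene analyzed: none\n", encoding="utf-8")
```

For a novel, `scaffold_book(Path.home() / "monte-cristo", text_dir="chapters")` would give the same skeleton with chapters/ in place of scenes/. Running it twice is safe, since existing files and the current progress line are left alone.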
The slash command
/hamlet is a prompt file that lives in my workspace. When I run /hamlet next, the agent:
- Reads progress.md to see where I am.
- Reads the next scene's text.
- Writes a fresh analysis file to analyses/.
- Updates any wiki entries that changed (new character, new theme development, new glossary term).
- Refuses to discuss anything past the current scene.
That last instruction matters because the model has almost certainly seen enough of the book in training to spoil it: the full text for well-known works, extensive summaries and discussion for most others. Without the refusal, it'll volunteer what's ahead. The forward-built wiki handles the reference side of spoilers; the refusal instruction handles the model side.
The whole thing is a markdown prompt file, and you only need an agent harness to run it (like Claude Code or Codex). Beyond that, no framework, no custom infrastructure. The agent follows the instructions like a junior editor with very specific conventions.
If I want to look at the wiki, I just open the files directly. After the analysis, I usually keep chatting. That's where marginalia comes from.
Concretely: say progress.md reads 3.3. I run /hamlet next. The agent opens scenes/3.4.md, writes a fresh analyses/3.4.md covering the closet scene, updates the Gertrude and Hamlet character files with what changed, and appends a new entry to wiki/events.md. Then I keep chatting — I might ask why Hamlet spared Claudius in the prayer scene, and the agent answers from what's happened so far, being careful not to telegraph how that choice pays off later.
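In the system as described, the agent works out the next scene by following the prompt. The same lookup can be written as a script, which is handy for checking that progress.md and the scene files agree. This sketch assumes scene files are named act.scene.md (as in scenes/3.4.md) and that progress.md contains a position like 3.3 somewhere in its one line; the function names are hypothetical.

```python
import re
from pathlib import Path
from typing import Optional

# Assumption: progress.md contains a position written as "<act>.<scene>", e.g. "3.3".
POSITION = re.compile(r"(\d+)\.(\d+)")


def read_progress(root: Path) -> Optional[tuple[int, int]]:
    """Return the (act, scene) recorded in progress.md, or None if none is recorded."""
    match = POSITION.search((root / "progress.md").read_text(encoding="utf-8"))
    return (int(match.group(1)), int(match.group(2))) if match else None


def scene_key(path: Path) -> tuple[int, int]:
    """Turn a file name like '3.4.md' into (3, 4) so that 3.10 sorts after 3.9."""
    act, scene = path.stem.split(".")
    return int(act), int(scene)


def next_scene(root: Path) -> Optional[Path]:
    """Return the first scene file after the recorded position, or None at the end."""
    current = read_progress(root)
    for path in sorted((root / "scenes").glob("*.md"), key=scene_key):
        if current is None or scene_key(path) > current:
            return path
    return None
```

Comparing positions as integer pairs rather than strings is the one design choice that matters here: a plain string sort would place 3.10 before 3.4.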
Variations by genre
The four-directory skeleton is invariant. What changes is the internal shape of the wiki and the register of the analyses, which is where genre shows up.
- Shakespeare (Hamlet): characters, themes, events, glossary. Organized by scene, with analyses as close-reads of the language.
- Long adventure novel (The Count of Monte Cristo): characters, places, plot threads, chapter summaries. Analyses lean on plot-thread tracking to keep a huge cast and twenty-year revenge arc legible.
- Dostoevsky novel (Crime and Punishment): characters, ideas, psychology, events, setting (the city as a character), glossary. The psychology/ directory was specific to this book, and analyses lean heavily into the interior/ideological layer.
- Political theory (Lenin's The State and Revolution): concepts, thinkers, cases, arc, glossary, and a running "gap ledger" tracking where the author's programmatic claims diverge from the historical record. Analyses are argument-maps more than summaries.
- Spiritual memoir (Tiago Faleiro's In Search of the Infinite): concepts, thinkers, arc, glossary, plus a films.md file because each chapter revolves around a film the author watched. Analyses track both the film read and the concept development.
The shapes of the wiki and analyses are genre-specific design decisions. I spend a little time at the start of each book thinking about what structure it wants. Sometimes that becomes obvious after a few chapters and I refactor.
Marginalia and cross-book tangents
The wiki is what the book is about. Marginalia is what I am about, in conversation with the book.
Per-chapter marginalia files hold threads anchored to a specific chapter: my claims, connections, half-formed essay seeds, questions the analysis missed.
Capture happens two ways: the agent can distill a thread inline during the session, or a scheduled task runs a few times a day, picks up idle book sessions, and writes marginalia in batch. Either path produces the same file format, so they compose.
Some threads drift. A tangent that starts under one book keeps surfacing under another. For those, I keep a shared directory of cross-book tangent files — topics that recur across multiple reading contexts and accumulate entries from each. One of the more interesting things about running this system for a while is watching which attractors keep showing up.
How far it can go
One of my current reading projects is feeding an essay I'm writing. Alongside the normal analysis, Claude tags passages with labels that route them into sections of the piece. The tags are just strings in the analysis file, nothing fancy, but at the end of the book I'll have a pre-sorted pile of quotes, paraphrases, and structural cases organized by essay section. Reading becomes, among other things, extraction.
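Since the tags are plain strings, gathering them at the end of the book takes only a few lines. The tag format is not specified above, so this sketch invents one purely for illustration: a line starting with `[essay:<section>]` followed by the passage. The function name `collect_tagged` is likewise hypothetical.

```python
import re
from collections import defaultdict
from pathlib import Path

# Hypothetical tag format: a line beginning "[essay:<section>]", then the passage.
TAG_LINE = re.compile(r"^\[essay:([\w-]+)\][ \t]*(.+)$", re.MULTILINE)


def collect_tagged(analyses_dir: Path) -> dict[str, list[tuple[str, str]]]:
    """Group tagged passages by essay section, keeping each source file name."""
    by_section = defaultdict(list)
    # Files are visited in file-name order, not necessarily reading order.
    for path in sorted(analyses_dir.glob("*.md")):
        text = path.read_text(encoding="utf-8")
        for section, passage in TAG_LINE.findall(text):
            by_section[section].append((path.name, passage.strip()))
    return dict(by_section)
```

Calling `collect_tagged(Path("analyses"))` returns a dictionary keyed by section name, where each value is a list of (file name, passage) pairs ready to paste into a draft.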
That's it
A folder, a plaintext book, a markdown prompt file. Any capable LLM can build this from a description of what you want, which is the other reason it's worth writing out: the design is the whole product.
You get a reading companion that knows where you are, can't spoil what's ahead, and remembers what you've said to it.