How to Manage a Literature Review Across 200+ Papers
By Norbert Wlodarczyk
You’re 80 papers into a literature review. You remember reading something about transfer learning improving small-sample performance in a medical imaging context. You know you highlighted it. You know you wrote a note about it. You just don’t know where.
You search Zotero. Seven results, none of them right. You search your notes app. Four partial matches, two of which are about transfer learning in NLP, not imaging. You open your “Literature Review” folder and scroll through 60 files. Fifteen minutes later, you find the note - misfiled under “Model Architectures” instead of “Clinical Applications.”
This is the standard experience of anyone doing a literature review past 100 papers. The tools are fine individually. The problem is that your knowledge is split across three or four systems, none of which talk to each other, and the connections between papers live nowhere except your memory.
Why literature reviews break at scale
A small literature review - 30 papers on a narrow topic - is manageable with any system. A spreadsheet, a folder of annotated PDFs, a Notion database. At this scale, you can hold the structure in your head. You remember which papers cite which, which findings contradict each other, which methods are novel versus incremental.
Past 100 papers, three things happen simultaneously.
Your memory stops being the index. You can’t remember which of your 150 paper notes contains the statistic you need. You can’t remember whether you’ve already read a newly discovered paper or just skimmed the abstract three weeks ago. The mental model of your literature that felt so clear at 50 papers is now full of gaps and ghosts.
Cross-paper connections multiply faster than you can track them. At 50 papers, maybe 200 meaningful connections exist between them. At 200 papers, that number is in the thousands. Paper A’s methodology was extended by Paper B, contradicted by Paper C, and applied to a different domain by Paper D. Paper D cites Paper E, which you haven’t read yet but probably should because it bridges two clusters of your literature. No human tracks this manually at scale. You track the connections you notice and miss the ones you don’t.
The tools fragment your knowledge. Zotero (or Mendeley, or Paperpile) holds your PDFs and bibliographic data. Your notes app holds your reading notes and synthesis. Your word processor holds your draft. A spreadsheet might hold your paper tracking matrix. Each tool has a piece of the picture. None of them have the whole picture. And the most valuable information - how papers relate to each other - lives in none of them.
The spreadsheet tracker: necessary but insufficient
Most researchers start with a spreadsheet. Columns for author, year, title, methodology, key findings, and relevance to your research questions. This is good practice. It gives you a scannable overview of what you’ve read.
But a spreadsheet is a flat data structure. Every paper is a row. Rows don’t connect to other rows. You can sort by methodology or filter by year, but you can’t ask “which papers in my review contradict each other?” or “what’s the chain of influence from this 2019 paper to the three 2025 papers that build on it?”
Some researchers add columns like “Related papers” or “Contradicts” to capture connections. This works at 50 papers. At 200, you have a cell containing “Smith 2022, Chen 2023, Patel 2024, see also Kumar 2021 methods section” - a mini-document inside a spreadsheet cell that nobody, including you, will parse six months from now.
The spreadsheet tells you what you’ve read. It doesn’t tell you how what you’ve read fits together.
The Zettelkasten approach: right idea, wrong labor model
Researchers who discover Zettelkasten-style note-taking often feel like they’ve found the answer. One atomic note per idea. Backlinks between related notes. Build understanding by connecting small pieces rather than writing long summaries.
The method is sound. Luhmann published 70 books and 400 articles using this approach. But Luhmann also spent his entire career maintaining his 90,000-note system. For a PhD student with a three-year timeline and a literature review due in six months, the manual labor of linking every note to every related note is a luxury you can’t afford.
In practice, Zettelkasten for literature reviews plays out like this: you read a paper, create an atomic note for each key idea, link it to a few related notes you remember, and move on. The links you create are the ones you thought of in the moment. The links you didn’t think of - the connection between a methodology note from February and a findings note from August - never get created. Your graph of knowledge has all the nodes but is missing half the edges.
This is the same scaling problem that hits every personal knowledge management system. Manual connection doesn’t scale past a few hundred notes because it depends on you remembering what you’ve already written.
What a literature review system actually needs
Strip away the tools and methods, and a literature review requires four capabilities:
Capture with context. When you read a paper, capture not just the key findings but the metadata that makes it findable later: methodology type, domain, sample size, whether it supports or contradicts your hypotheses. This is the part most tools handle reasonably well.
Automatic relationship detection. When Paper B extends Paper A’s methodology, or when Paper C’s findings contradict Paper D’s, the system should surface those relationships - not wait for you to remember they exist and manually create a link. At 200 papers, the connections you miss are often the most interesting ones.
Synthesis across papers. You need to ask questions that span your entire literature: “What methods have been used to address X?” “Which findings support hypothesis Y and which contradict it?” “Where are the gaps - what questions has nobody studied?” These queries require traversing relationships between papers, not just searching within individual notes.
Staleness and coverage tracking. Which areas of your literature have you read deeply versus skimmed? Are there citation clusters you’ve only partially explored? Has a new paper appeared that contradicts something you wrote in your synthesis last month? Without these signals, your review has blind spots you can’t see.
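As a rough illustration of what coverage tracking could look like, here is a minimal sketch in Python. The record fields (`cluster`, `depth`, `last_reviewed`) and the 90-day staleness threshold are assumptions for the example, not features of any particular tool:

```python
from datetime import date, timedelta

# Hypothetical per-paper tracking records: how deeply each paper was read
# and when its note was last touched. Field names are illustrative.
papers = [
    {"id": "smith2022", "cluster": "transfer-learning", "depth": "deep",    "last_reviewed": date(2025, 1, 10)},
    {"id": "chen2023",  "cluster": "transfer-learning", "depth": "skimmed", "last_reviewed": date(2025, 6, 2)},
    {"id": "patel2024", "cluster": "augmentation",      "depth": "skimmed", "last_reviewed": date(2024, 11, 20)},
]

def coverage_report(papers, today, stale_after=timedelta(days=90)):
    """Flag notes that have gone stale and clusters you have only skimmed."""
    stale = [p["id"] for p in papers if today - p["last_reviewed"] > stale_after]
    clusters = {}
    for p in papers:
        clusters.setdefault(p["cluster"], []).append(p["depth"])
    shallow = [c for c, depths in clusters.items() if "deep" not in depths]
    return {"stale_notes": stale, "shallow_clusters": shallow}

report = coverage_report(papers, today=date(2025, 7, 1))
# -> stale_notes: ["smith2022", "patel2024"], shallow_clusters: ["augmentation"]
```

Even a crude signal like this surfaces blind spots: an entire cluster with no deep read, or a note untouched since before your last synthesis pass.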
From flat notes to connected knowledge
The difference between a pile of paper notes and a usable literature review is structure. Specifically, it’s the structure of relationships between papers.
Consider what you actually need to know for a synthesis chapter. Not “what did Paper A find” - you can look that up in your notes. You need to know: Paper A found X using method M. Papers B and C confirmed X with variations of M. Paper D found the opposite using method N, but their sample was smaller and their domain was different. Paper E proposed a theoretical framework that explains why M and N produce different results. Nobody has tested E’s framework empirically yet - that’s your gap.
This is a graph. Papers are nodes. “Confirms,” “contradicts,” “extends,” “applies to different domain,” “proposes explanation for” - these are typed edges. The synthesis chapter writes itself when you can see this structure. It’s nearly impossible to write when the structure lives only in your head.
Knowledge graphs make this structure explicit. Instead of storing paper notes as isolated documents and hoping you’ll remember the connections, a graph-based system models the entities (papers, authors, methods, findings, theories) and the relationships between them. You can traverse the graph to answer synthesis questions directly: “Show me all papers that contradict Finding X” returns an actual list with the reasoning chain, not a keyword search that might miss papers using different terminology.
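To make the typed-edge idea concrete, here is a minimal sketch using plain Python tuples rather than any particular graph library. The paper IDs, finding IDs, and relation names are illustrative placeholders:

```python
# Typed edges: (source, relation, target). Papers and findings are nodes;
# "confirms", "contradicts", "extends" are the edge types.
edges = [
    ("paperB", "confirms",    "findingX"),
    ("paperC", "confirms",    "findingX"),
    ("paperD", "contradicts", "findingX"),
    ("paperB", "extends",     "paperA"),
    ("paperE", "explains",    "findingX"),
]

def query(edges, relation, target):
    """Return every source node linked to `target` by `relation`."""
    return [src for src, rel, dst in edges if rel == relation and dst == target]

query(edges, "contradicts", "findingX")  # -> ["paperD"]
query(edges, "confirms", "findingX")     # -> ["paperB", "paperC"]
```

A real system would add provenance (which note asserted the edge, and why), but even this toy version answers a synthesis question that no keyword search over flat notes can.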
A practical workflow for 200+ papers
Whether or not you use a graph-based tool, the structural principles apply.
Phase 1: Capture (ongoing). For each paper, create one note with: citation, research question, methodology, key findings (2-3 sentences each), and your assessment (what’s strong, what’s weak, how it relates to your work). Keep it atomic - one paper, one note. Don’t combine papers into topic summaries yet.
Phase 2: Tag relationships, not topics. Instead of filing papers into topic folders, tag the relationships between them. “Paper B extends Paper A” is more useful than “Paper B is about topic X.” If your tool supports typed links, use them. If not, a consistent notation in your notes works: “EXTENDS: Paper A,” “CONTRADICTS: Paper C,” “SAME-METHOD: Paper D.”
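If you use the plain-text notation, the payoff is that the links stay machine-readable. A short sketch of extracting them from a note (the tag vocabulary is whatever you standardize on; the note text here is invented):

```python
import re

# Matches lines like "EXTENDS: Kumar 2021" anywhere in a note.
TAG = re.compile(r"^(EXTENDS|CONTRADICTS|CONFIRMS|SAME-METHOD):\s*(.+)$", re.MULTILINE)

note = """Smith 2022 - transfer learning for chest X-rays.
EXTENDS: Kumar 2021
CONTRADICTS: Patel 2020
"""

def parse_links(text):
    """Return (relation, target) pairs found in a note's text."""
    return [(m.group(1), m.group(2).strip()) for m in TAG.finditer(text)]

parse_links(note)  # -> [("EXTENDS", "Kumar 2021"), ("CONTRADICTS", "Patel 2020")]
```

Run over a folder of notes, this turns your consistent notation into exactly the edge list a graph needs, with no extra bookkeeping beyond the tagging you already did while reading.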
Phase 3: Cluster by question, not by topic. Group papers around your research questions, not around subject areas. A paper about transfer learning and a paper about data augmentation might be in different “topic” folders but answer the same research question (“How do we handle small training sets?”). Research questions are the natural organizing principle for a literature review.
Phase 4: Synthesize incrementally. Don’t wait until you’ve read everything to start synthesizing. After every 20-30 papers, write a paragraph summarizing what you know so far about each research question. These paragraphs become your draft. When new papers add or contradict something, update the relevant paragraph. This is far less painful than writing 15,000 words from scratch after reading 200 papers.
Phase 5: Find the gaps. The most valuable part of a literature review is identifying what hasn’t been studied. This requires seeing the full structure of your literature: which methods have been applied to which domains, which findings lack replication, which theoretical frameworks lack empirical testing. If you’ve been tracking relationships, these gaps become visible. If your notes are a flat pile, the gaps are invisible.
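One mechanical version of gap-finding is enumerating method-domain combinations and listing the ones nobody has tried. A sketch, with invented methods and domains standing in for your own:

```python
# Each pair records that some paper applied this method in this domain.
studied = {
    ("transfer-learning", "chest-xray"),
    ("transfer-learning", "dermatology"),
    ("augmentation", "chest-xray"),
}

methods = {m for m, _ in studied}
domains = {d for _, d in studied}

# Untried combinations: the cross product minus what's been studied.
gaps = sorted((m, d) for m in methods for d in domains if (m, d) not in studied)
# -> [("augmentation", "dermatology")]
```

Not every untried combination is a meaningful gap, but the list gives you candidates to evaluate, which is exactly what a flat pile of notes can never produce.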
When the structural work is automated
The workflow above works. It also requires significant manual discipline - tagging every relationship, maintaining consistent notation, periodically reviewing your structure for gaps. At 200 papers, this is dozens of hours of bookkeeping on top of the actual reading and thinking.
This is where the tools need to catch up. A system that reads your paper notes, identifies the entities and relationships within them, and builds a navigable structure automatically would eliminate the bookkeeping without sacrificing the structure. You’d still do the reading. You’d still do the thinking. But the work of connecting Paper 47 to Paper 183 - two papers you read five months apart that turn out to address the same gap from different angles - would happen without you having to remember both exist.
That’s the difference between a literature review that takes six months of manual labor and one where the structural work is handled by software that understands what your papers are about.
NexaLink reads your notes, extracts concepts and relationships, and builds a knowledge graph you can query: “Which papers contradict this finding?” “What methods have been used for X?” “Where are the gaps?” No manual tagging. No spreadsheet maintenance. Your literature, connected automatically. See how it works.