System · advanced · Upgraded third-party

I need to give my agent a knowledge base it can actually retrieve from.

Pasting your whole knowledge base into the prompt stops working the moment it outgrows the window — and a vector store you bolt on too early hands back the wrong three paragraphs with total confidence. The job isn't 'store the documents,' it's 'find the right passage and only the right passage, on demand.' Three ways to make a corpus retrievable, simplest first.

• 3 ways to retrieve
• ~15 min to a working first version
• 0 vector databases needed to start

Ch. 01 What it is



Ch. 02 The three ways to build it


Simplest path first. Every tier carries its real setup time and its honest trade-off — the cost is the part most write-ups leave out.

  1. Tier 1 · simplest path

    Markdown corpus + plain search

    Setup: ~15 min

    • plain markdown
    • ripgrep or full-text search

    Keep the knowledge as flat markdown files, one topic per file, with a short index file that lists what exists and where. When the agent needs a fact, it searches the corpus by keyword (ripgrep, a built-in file search, whatever's at hand), reads the handful of files that match, and answers from them. No embeddings, no database, no chunking. The whole knowledge base is text you can open and read yourself, so when the agent retrieves the wrong file you can see exactly why and fix it by renaming a heading. On a corpus of dozens to low hundreds of files, this often beats a vector store: keyword search is exact, it never 'almost' matches, and there's no embedding index to go stale or drift out of sync with the files. The trade-off is that it only finds the words you actually wrote: a question phrased differently from the document matches nothing.
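The keyword lookup described above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not a replacement for ripgrep: the function name `search_corpus`, the "every term must appear" rule, and ranking by raw match count are all assumptions made for the example.

```python
import re
from pathlib import Path


def search_corpus(root: Path, query: str, limit: int = 5) -> list[tuple[str, int]]:
    """Rank markdown files under `root` by how often the query terms appear.

    Hypothetical helper for illustration; a real setup would more likely
    shell out to ripgrep or use the agent's built-in file search.
    """
    terms = re.findall(r"\w+", query.lower())
    hits = []
    for path in sorted(root.rglob("*.md")):
        words = re.findall(r"\w+", path.read_text(encoding="utf-8").lower())
        counts = [words.count(term) for term in terms]
        # Assumed rule: every query term must appear as a whole word.
        # Matching is exact, so a paraphrase of the query finds nothing.
        if terms and all(counts):
            hits.append((path.relative_to(root).as_posix(), sum(counts)))
    # Most matches first; ties broken by file name so the order is stable.
    hits.sort(key=lambda hit: (-hit[1], hit[0]))
    return hits[:limit]
```

Calling `search_corpus(Path("kb"), "refund invoice")` on a hypothetical `kb/` directory would return the matching file paths with their hit counts, which the agent can then open and read in full.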

  2. Tier 2

    Chunk + embed + vector store

    Setup: ~half day

    • an embedding model
    • a vector store (LanceDB, Chroma, pgvector)

    Now retrieve by meaning. Split each document into passages a few paragraphs long (chunks small enough to be one idea, large enough to stand alone) and run each through an embedding model that turns it into a vector. Store the vectors. At query time, embed the question the same way, pull back the handful of chunks closest to it in meaning, and hand only those to the agent. A search for 'churn' can then find the passage about cancellations, because the two sit near each other in meaning-space even though they share no words. You're no longer feeding the model the whole corpus and hoping the answer's in there; you're feeding it the three passages that actually bear on the question, which is cheaper, faster, and far less prone to the model latching onto an irrelevant aside. The trade-off is that nearest is not the same as right: embeddings can miss exact strings such as names, codes, and versions, and the store returns its closest chunks even when none of them answers the question.
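The chunk, embed, and retrieve loop above can be sketched without any external service. One loud caveat: `toy_embed` below is a placeholder that uses normalized word counts, so it is not semantic and would not match 'churn' to 'cancellations'. It only stands in for a real embedding model so the rest of the pipeline is runnable. All function names are hypothetical, and the in-memory list stands in for a vector store.

```python
import math
import re
from collections import Counter

Vector = dict[str, float]  # sparse vector: dimension name -> weight


def chunk_paragraphs(text: str, max_paragraphs: int = 3) -> list[str]:
    """Split a document into chunks of up to `max_paragraphs` paragraphs."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    return [
        "\n\n".join(paragraphs[i:i + max_paragraphs])
        for i in range(0, len(paragraphs), max_paragraphs)
    ]


def toy_embed(text: str) -> Vector:
    """Placeholder embedding (assumption): unit-length word counts.

    NOT semantic. Swap in a real embedding model to retrieve by meaning.
    """
    counts = Counter(re.findall(r"\w+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {word: c / norm for word, c in counts.items()}


def cosine(a: Vector, b: Vector) -> float:
    # Both vectors are unit length, so the dot product equals the cosine.
    return sum(weight * b.get(dim, 0.0) for dim, weight in a.items())


def build_index(docs: list[str], embed=toy_embed) -> list[tuple[str, Vector]]:
    """Chunk every document and store each chunk next to its vector."""
    return [(chunk, embed(chunk)) for doc in docs for chunk in chunk_paragraphs(doc)]


def retrieve(index: list[tuple[str, Vector]], question: str, k: int = 3,
             embed=toy_embed) -> list[str]:
    """Embed the question the same way and return the k closest chunks."""
    query_vector = embed(question)
    ranked = sorted(index, key=lambda item: cosine(query_vector, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]
```

The shape is the part that carries over to a real setup: the same `embed` function must be used for chunks and for questions, and only the top `k` chunks reach the agent.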

  3. Tier 3

    Hybrid retrieval + re-rank + promotion/expiry

    Setup: ~1–2 days to wire

    • dense (embedding) + keyword (BM25) retrieval
    • a re-ranker
    • a freshness/promotion policy

    Run both retrievers and let each cover the other's blind spot. A keyword pass (BM25) catches the exact strings (names, codes, versions) that embeddings miss; a dense pass catches the meaning that keywords miss; you fuse the two result lists so a passage that scores well on either route surfaces. Then add the step that often does the most for accuracy per dollar: a re-ranker. The first pass casts a wide, cheap net and pulls, say, twenty candidates; the re-ranker reads the question against each candidate properly and reorders them, so the four passages the agent actually sees are the four most relevant, not merely the four nearest. Finally, give the corpus a clock: tag entries with a source and a freshness date, promote a passage that keeps getting retrieved and confirmed into a trusted tier, and expire or down-rank what's gone stale, so retrieval prefers what's current and proven over what's merely present.
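A rough sketch of the fuse, re-rank, and freshness steps follows. The text above does not say how to fuse the two lists; reciprocal rank fusion is used here as one common choice, with its conventional constant of 60. The re-ranker is passed in as a plain scoring function (`score_fn`, hypothetical), and the half-life decay used for down-ranking stale entries is an assumption, not a prescribed policy. Promotion into a trusted tier is left out.

```python
from datetime import date
from typing import Callable


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of passage ids into one ranking.

    Each list contributes 1 / (k + rank) per passage, so a passage that
    ranks well on either route ends up near the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, passage_id in enumerate(ranking, start=1):
            scores[passage_id] = scores.get(passage_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda pid: (-scores[pid], pid))


def freshness_weight(last_verified: date, today: date,
                     half_life_days: float = 180.0) -> float:
    """Assumed policy: a passage's weight halves every `half_life_days`."""
    age_days = max((today - last_verified).days, 0)
    return 0.5 ** (age_days / half_life_days)


def rerank(question: str, candidates: list[str],
           score_fn: Callable[[str, str], float],
           last_verified: dict[str, date], today: date,
           top_n: int = 4) -> list[str]:
    """Score each candidate against the question, down-rank stale ones,
    and keep the best `top_n`. `score_fn` stands in for a real re-ranker."""
    scored = [
        (score_fn(question, pid) * freshness_weight(last_verified[pid], today), pid)
        for pid in candidates
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [pid for _, pid in scored[:top_n]]
```

In use, the keyword and dense passes each return a ranked list of passage ids, `reciprocal_rank_fusion` merges them into the candidate pool, and `rerank` trims that pool to the few passages the agent actually sees.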

Ch. 03 The detail



Category: Knowledge-ops · Retrieval & RAG
Format: System
Level: advanced
Provenance: Upgraded third-party
Tags: rag, retrieval, knowledge-base, embeddings, re-ranking, knowledge-ops, agents