Open-source RAG toolkit

Build AI that reads your documents — and answers honestly.

Parsing, chunking, retrieval, grounded answers, and evaluation — all in one place. Usable from any programming language. Runs on your own machine. Free and open source.

View on GitHub
Apache-2.0 · Runs locally · Any language via HTTP
The problem

Building RAG is messy

You have a pile of documents. You want AI to answer questions from them — accurately. Doing it yourself means gluing together six fragile tools.

Documents get mangled

Tables and code get chopped in half, so the AI reads broken pieces and gives wrong answers.

You can't trust the answers

The AI sometimes makes things up, with no sources you can check.

AI agents burn money

When several agents talk to each other, every message costs tokens, and it adds up fast.

What goes wrong

The broken pipeline most people end up with

Skip any one of these stages and the next one inherits the mess.

Bad parse: tables broken
Wrong chunks: context lost
Made-up answer: no sources
$$$ wasted: tokens burned
How it works

One workshop for everything RAG

Watch your question travel through the forge — each stage hands clean data to the next.

Your documents (PDF · Word · HTML)
Parse: read files
Chunk: split smart
Embed: store vectors
Search: hybrid + rerank
Answer: with sources
Grounded answer with cited sources

Everything is exposed over a simple HTTP API, so you can use it from any language — not just Python.

Featured

Don't migrate blind

When a better embedding model comes out, should you even switch? RAGForge answers that before you spend a fortune re-embedding. A model that wins on public benchmarks can still lose on your domain.

01. Test on your corpus

Freeze your real queries as a golden set, then score the new embedding model against your current one on recall@k, precision@k, and MRR — on YOUR corpus.

02. Gate the cutover

The migration is blocked automatically if the new model regresses. It only proceeds when the new model wins (or ties within a margin you set) on your real queries — so you never waste a full re-embed on a worse model.

03. Migrate the hot set first

Re-embed the chunks your queries actually hit first, then backfill the cold tail, instead of re-embedding everything up front.
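
To make the metrics from step 01 concrete, here is a minimal sketch of recall@k, precision@k, and MRR over a golden set. It is an illustration, not RAGForge's internal code: the function names, the `golden` and run dictionaries, and the chunk IDs are all made up for this example.

```python
from typing import Dict, List, Set


def recall_at_k(retrieved: List[str], relevant: Set[str], k: int) -> float:
    """Share of the relevant chunks that show up in the top k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def precision_at_k(retrieved: List[str], relevant: Set[str], k: int) -> float:
    """Share of the top k result slots filled by relevant chunks."""
    if k <= 0:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / k


def reciprocal_rank(retrieved: List[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, chunk_id in enumerate(retrieved, start=1):
        if chunk_id in relevant:
            return 1.0 / rank
    return 0.0


def score_run(golden: Dict[str, Set[str]], run: Dict[str, List[str]], k: int) -> Dict[str, float]:
    """Average each metric over every query in the golden set."""
    totals = {"recall_at_k": 0.0, "precision_at_k": 0.0, "mrr": 0.0}
    if not golden:
        return totals
    for query, relevant in golden.items():
        retrieved = run.get(query, [])  # a missing query scores zero
        totals["recall_at_k"] += recall_at_k(retrieved, relevant, k)
        totals["precision_at_k"] += precision_at_k(retrieved, relevant, k)
        totals["mrr"] += reciprocal_rank(retrieved, relevant)
    return {name: total / len(golden) for name, total in totals.items()}


# Hypothetical golden set: query -> IDs of the chunks that should come back.
golden = {"q1": {"a", "b"}, "q2": {"c"}}
# Hypothetical ranked results from the current and the candidate model.
old_run = {"q1": ["a", "x", "b"], "q2": ["y", "c", "z"]}
new_run = {"q1": ["x", "a", "y"], "q2": ["z", "y", "c"]}

print("old:", score_run(golden, old_run, k=2))
print("new:", score_run(golden, new_run, k=2))
```

Both runs are scored against the same frozen queries, so the two result dictionaries can be compared metric by metric.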

How the safe migration flows
Your queries: real logs
Golden set: frozen truth
Compare: old vs new
Hot-set first: cheap cutover
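
One way to picture the hot-set-first ordering is as a sort over query logs. The sketch below assumes you can list, for each logged query, the chunk IDs it retrieved. `plan_reembedding` and its inputs are hypothetical names for this example, not part of RAGForge's API.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def plan_reembedding(
    all_chunk_ids: Iterable[str],
    query_hits: Iterable[List[str]],
) -> Tuple[List[str], List[str]]:
    """Split chunks into a hot list (most-hit first) and a cold tail."""
    known = list(all_chunk_ids)
    known_set = set(known)
    # Count how often each chunk was returned across the logged queries.
    hit_counts = Counter(chunk_id for hits in query_hits for chunk_id in hits)
    # Hot set: chunks that were hit at least once, most frequent first.
    hot = [cid for cid, _ in hit_counts.most_common() if cid in known_set]
    hot_set = set(hot)
    # Cold tail: everything else, kept in its original order.
    cold = [cid for cid in known if cid not in hot_set]
    return hot, cold


chunks = ["c1", "c2", "c3", "c4", "c5"]
logged_hits = [["c3", "c1"], ["c3"], ["c5", "c3"]]

hot, cold = plan_reembedding(chunks, logged_hits)
print("re-embed first:", hot)   # ['c3', 'c1', 'c5']
print("backfill later:", cold)  # ['c2', 'c4']
```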

After cutover, a smoke test replays golden queries against the new index to confirm the migration actually worked — not just that a command returned OK.
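
A smoke test of this kind can be sketched in a few lines. This is an assumption about what such a check could look like, not RAGForge's actual test: `smoke_test`, the `search` callable, and the `min_hit_rate` threshold are invented for illustration.

```python
from typing import Callable, Dict, List, Set, Tuple


def smoke_test(
    golden: Dict[str, Set[str]],
    search: Callable[[str, int], List[str]],
    k: int = 5,
    min_hit_rate: float = 0.9,
) -> Tuple[bool, List[str]]:
    """Replay golden queries and report which ones found nothing relevant."""
    if not golden:
        return False, []  # nothing replayed means nothing was verified
    misses = [
        query
        for query, relevant in golden.items()
        if not relevant.intersection(search(query, k)[:k])
    ]
    hit_rate = 1.0 - len(misses) / len(golden)
    return hit_rate >= min_hit_rate, misses


# Stand-in for the freshly migrated index (hypothetical data).
new_index = {"refund policy": ["c7", "c2"], "shipping time": ["c9", "c4"]}


def search_new_index(query: str, k: int) -> List[str]:
    return new_index.get(query, [])[:k]


golden = {"refund policy": {"c2"}, "shipping time": {"c1"}}
ok, misses = smoke_test(golden, search_new_index, k=2, min_hit_rate=1.0)
print(ok, misses)  # False ['shipping time']
```

The check fails here even though every search call returned results, which is the difference between "the command ran" and "the index still answers the golden queries".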

ragforge migrate gate my-kb golden.json --old default --new openai --metric recall_at_k
Real benchmark
NO_GO: migration blocked
Baseline: all-MiniLM-L6-v2 · Candidate: paraphrase-MiniLM-L3-v2
recall@5: 0.738 (baseline) vs 0.575 (candidate)
recall@10: 0.783 (baseline) vs 0.649 (candidate)
MRR: 0.600 (baseline) vs 0.468 (candidate)

Real run on SciFact (5,183 docs, 300 labelled queries). We compared all-MiniLM-L6-v2 against a smaller candidate (paraphrase-MiniLM-L3-v2). The candidate regressed on every metric (recall@5 fell about 16 points, from 0.738 to 0.575), so RAGForge's gate returned NO_GO and blocked the migration before any full re-embed. That's the point: the gate turns "is this model actually better on our data?" into a measured, automatic decision.
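
The decision rule itself is small. Here is a minimal sketch of a gate with a tie margin, fed with the SciFact numbers reported above. The `gate` function, the `margin` parameter, and the `GO` label are assumptions for illustration. Only `NO_GO` and the scores come from the run described here.

```python
from typing import Dict


def gate(baseline: Dict[str, float], candidate: Dict[str, float],
         metric: str, margin: float = 0.0) -> str:
    """Allow the cutover only if the candidate wins or ties within the margin."""
    if candidate[metric] >= baseline[metric] - margin:
        return "GO"  # hypothetical label for the passing case
    return "NO_GO"


# Scores from the SciFact run above.
baseline = {"recall@5": 0.738, "recall@10": 0.783, "mrr": 0.600}
candidate = {"recall@5": 0.575, "recall@10": 0.649, "mrr": 0.468}

for metric in baseline:
    drop_points = (baseline[metric] - candidate[metric]) * 100
    decision = gate(baseline, candidate, metric, margin=0.01)
    print(f"{metric}: {decision} (candidate is {drop_points:.1f} points lower)")
```

With a one-point margin the candidate fails on all three metrics. A candidate scoring 0.733 on recall@5 would pass, because it ties within the margin.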

Everything inside

Nine building blocks

Use what you need. Ignore the rest.

Parsing

Read PDFs, Word, HTML, and more.

Chunking

Split smartly; keep tables and code intact.

Retrieval

Hybrid search (dense + BM25) with reranking.

Answers

Grounded responses that cite sources and refuse to guess.

Evaluation

Score retrieval and answer quality; A/B compare setups.

Quantization

Shrink embeddings to cut storage and cost.

Migration

Swap embedding models safely, with quality validation.

Multi-Agent

Coordinate agents through shared state, not expensive direct messaging.

Dashboard

Local UI to trace pipelines, run evaluations, and chat with your KB.
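
As one example from the list above, here is what shrinking an embedding can look like. This sketch uses symmetric int8 scalar quantization, a common technique. Whether RAGForge uses this exact scheme is an assumption, and the function names are made up for this example.

```python
from typing import List, Tuple


def quantize_int8(vector: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] plus one scale factor."""
    max_abs = max((abs(x) for x in vector), default=0.0)
    if max_abs == 0.0:
        return [0] * len(vector), 1.0
    scale = max_abs / 127.0
    # Clamp to guard against floating-point overshoot at the extremes.
    codes = [max(-127, min(127, round(x / scale))) for x in vector]
    return codes, scale


def dequantize_int8(codes: List[int], scale: float) -> List[float]:
    """Recover approximate floats from the stored integers."""
    return [code * scale for code in codes]


embedding = [0.12, -1.0, 0.56, 0.0]
codes, scale = quantize_int8(embedding)
restored = dequantize_int8(codes, scale)

print(codes)  # [15, -127, 71, 0]
print(max(abs(a - b) for a, b in zip(embedding, restored)))
```

If the original values are stored as 32-bit floats, one byte per dimension plus a single scale per vector is roughly a quarter of the storage. The cost is a rounding error of at most half a scale step per value.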

Multi-agent, plain English

How agents stop wasting your money

Instead of agents repeating each other through chat, they read and write to one shared board.

Agents: share a board
Blackboard: common state
Reuse work: no re-asking
Lower cost: fewer tokens
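
The shared-board idea can be sketched as a small cache that every agent reads before doing expensive work. This is a toy illustration under stated assumptions: the `Blackboard` class, its method names, and the fake model call are hypothetical and not RAGForge's implementation.

```python
from typing import Callable, Dict


class Blackboard:
    """Shared state that agents read from and write to instead of messaging."""

    def __init__(self) -> None:
        self._entries: Dict[str, str] = {}
        self.computed = 0  # results produced by running the expensive step
        self.reused = 0    # results served from the board

    def get_or_compute(self, key: str, compute: Callable[[], str]) -> str:
        if key in self._entries:
            self.reused += 1
            return self._entries[key]
        value = compute()
        self._entries[key] = value
        self.computed += 1
        return value


calls = {"llm": 0}


def summarize_refund_policy() -> str:
    # Stand-in for an expensive model call; the text is placeholder data.
    calls["llm"] += 1
    return "Refunds are issued within 14 days."


board = Blackboard()
# Two agents need the same summary; only the first one pays for it.
researcher_view = board.get_or_compute("summary:refund-policy", summarize_refund_policy)
writer_view = board.get_or_compute("summary:refund-policy", summarize_refund_policy)

print(calls["llm"], board.computed, board.reused)  # 1 1 1
```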
Why RAGForge

No secret algorithms

Just everything you need, made simple.

All-in-one

One install instead of six libraries to stitch together.

Any language

It's a simple HTTP API — call it from Python, JavaScript, Go, anything.

Local & private

Runs on your own machine; your documents never have to leave.

Free & open source

Apache-2.0. Use it, change it, build on top of it.

Any language

Plain HTTP. Any client.

It's just a JSON API. Call it from anywhere.

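import requests

# Ask the local server a question against the "my-kb" knowledge base.
r = requests.post(
    "http://localhost:8000/query",
    json={
        "knowledge": "my-kb",
        "question": "How do refunds work?",
        "top_k": 5,
        "generate": True,
    },
    timeout=30,  # fail instead of hanging if the server is unreachable
)
r.raise_for_status()  # surface HTTP errors before reading the body
print(r.json()["answer"])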

Built. Working. Open source.

Try it. Break it. Tell us what's missing.

Samsul Jahith

Built by a developer who got tired of tool sprawl

RAGForge is built and maintained by Samsul Jahith, a developer working on open-source RAG tooling, who built it to bring the messy parts of RAG into one place. Feedback and contributions are welcome.