Architecture

This page covers how the modules connect, how the plugin registry works, and where data is stored. It is sourced from README.md and core/registry.py.

Module layout

ragforge/
├── core/           # Shared models (Document, Chunk) + plugin registry
├── parsing/        # File → Document (txt, md, html, pdf, docling)
├── chunking/       # Document → Chunks (fixed, structure, docling)
├── pipeline/       # Embed + store + retrieve + generate
│   ├── embeddings.py
│   ├── store.py
│   ├── bm25.py
│   ├── retriever.py
│   ├── generation.py
│   └── knowledge.py
├── evaluation/     # Measure retrieval + generation quality
├── quantization/   # Compress embeddings + measure tradeoff
├── migration/      # Swap embedding models safely
├── coordination/   # Multi-agent blackboard
├── tracing.py      # SQLite-backed trace store
├── api/            # FastAPI HTTP endpoints
│   └── routes/
├── cli.py          # Command-line interface
└── ui_static/      # Pre-built dashboard assets

Plugin registry

Every capability registers itself with the @register(kind, name) decorator. The CLI and API look components up by name with get(kind, name) and never import implementations directly, so adding a new parser, chunker, or metric means adding one new file. Current registry kinds: "parser", "chunker", "embedder", "metric".

from ragforge.core.registry import register, get, available

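# Register the class under kind "chunker" with the name "fixed".
@register("chunker", "fixed")
class FixedChunker(...):
    ...

# Look the class up by kind and name; callers never import the implementation.
chunker_cls = get("chunker", "fixed")
# List every name registered under one kind.
print(available("chunker"))  # ['fixed', 'structure', ...]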
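
To make the lookup pattern concrete, here is a minimal, self-contained sketch of a decorator-based registry. It is not the code in core/registry.py: the `_REGISTRY` dictionary, the duplicate-name check, the error messages, and the toy `FixedChunker` body are all assumptions chosen for illustration. Only the `register`, `get`, and `available` names come from the text above.

```python
from collections import defaultdict
from typing import Any, Callable, TypeVar

T = TypeVar("T")

# kind -> name -> registered object (hypothetical internal state)
_REGISTRY: dict[str, dict[str, Any]] = defaultdict(dict)


def register(kind: str, name: str) -> Callable[[T], T]:
    """Return a decorator that stores the decorated object under (kind, name)."""
    def decorator(obj: T) -> T:
        if name in _REGISTRY[kind]:
            raise ValueError(f"{kind} {name!r} is already registered")
        _REGISTRY[kind][name] = obj
        return obj  # returned unchanged, so the class stays usable directly
    return decorator


def available(kind: str) -> list[str]:
    """List the names registered for one kind, sorted for stable output."""
    return sorted(_REGISTRY.get(kind, {}))


def get(kind: str, name: str) -> Any:
    """Look up a component by kind and name, with a readable error on a miss."""
    try:
        return _REGISTRY[kind][name]
    except KeyError:
        raise KeyError(f"no {kind} named {name!r}; available: {available(kind)}") from None


@register("chunker", "fixed")
class FixedChunker:
    """Toy chunker used only to exercise the registry."""

    def __init__(self, size: int = 200) -> None:
        self.size = size

    def chunk(self, text: str) -> list[str]:
        return [text[i:i + self.size] for i in range(0, len(text), self.size)]


chunker_cls = get("chunker", "fixed")
pieces = chunker_cls(size=4).chunk("abcdefghij")
```

Because the decorator runs at import time, a design like this only sees components whose modules have been imported, so a real registry usually pairs it with some form of package discovery.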

Request flow — ingest

CLI: ragforge knowledge build my-kb ./docs/
       ↓
pipeline.build_knowledge_base()
       ↓
  for each source file:
    parsing.parse_file()        → Document
    chunking.chunk_document()   → list[Chunk]
       ↓
  pipeline.embeddings.encode(chunks)   → vectors
       ↓
  pipeline.store.InMemoryStore.add(chunks, vectors)
  pipeline.bm25.BM25Index.add(chunks)
       ↓
  persist to ~/.ragforge/knowledge_bases/<name>/
    vectors.json, bm25_index.json, meta.json
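
The ingest steps above can be sketched end to end with toy stand-ins. Everything here is an assumption made for illustration: `chunk_text`, `toy_embed`, and `build_kb` are hypothetical helpers rather than ragforge functions, the JSON layout of vectors.json and meta.json is invented, and the BM25 index is left out to keep the example short. Only the step order and the output file names follow the flow described above.

```python
import json
import tempfile
from pathlib import Path


def chunk_text(text: str, size: int = 40) -> list[str]:
    """Fixed-size character chunking; a stand-in for the chunking step."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def toy_embed(chunk: str, dim: int = 8) -> list[float]:
    """Toy embedding that buckets character codes into `dim` bins. Not a real embedder."""
    vec = [0.0] * dim
    for ch in chunk:
        vec[ord(ch) % dim] += 1.0
    return vec


def build_kb(name: str, source_dir: Path, root: Path) -> Path:
    """Parse -> chunk -> embed -> persist, in the same order as the flow above."""
    chunks: list[dict] = []
    for path in sorted(source_dir.glob("*.txt")):  # "parse" step: plain text only
        text = path.read_text(encoding="utf-8")
        for i, piece in enumerate(chunk_text(text)):
            chunks.append({"id": f"{path.name}#{i}", "text": piece})

    vectors = [toy_embed(c["text"]) for c in chunks]  # embed once all chunks exist

    kb_dir = root / "knowledge_bases" / name
    kb_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical schemas: the real file contents are not documented here.
    (kb_dir / "vectors.json").write_text(json.dumps({"chunks": chunks, "vectors": vectors}))
    (kb_dir / "meta.json").write_text(json.dumps({"name": name, "num_chunks": len(chunks)}))
    return kb_dir


# Run against a temporary directory so nothing is written to ~/.ragforge.
with tempfile.TemporaryDirectory() as tmp:
    docs = Path(tmp) / "docs"
    docs.mkdir()
    (docs / "refunds.txt").write_text("Refunds are issued within 14 days. " * 3, encoding="utf-8")
    kb_dir = build_kb("my-kb", docs, root=Path(tmp) / ".ragforge")
    meta = json.loads((kb_dir / "meta.json").read_text())
    written = sorted(p.name for p in kb_dir.iterdir())
```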

Request flow — query

CLI/API: ragforge query my-kb "How do refunds work?"
       ↓
pipeline.query_knowledge_base()
       ↓
  KnowledgeBase.load("my-kb")   ← reads from ~/.ragforge/
       ↓
  kb.query(question, mode="hybrid")
    ↓ dense: embedder.encode_single(question) → store.search()
    ↓ bm25:  bm25_index.search(question)
    ↓ hybrid: RRF fusion of both ranked lists
       ↓
  [optional] cross-encoder reranking
       ↓
  [optional] generation.LLMProvider.generate(prompt, context_chunks)
       ↓
  tracing.Tracer records each step → ~/.ragforge/traces.db
       ↓
  return { chunks, answer }
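
The hybrid step fuses the dense and BM25 rankings with reciprocal rank fusion (RRF). The sketch below shows the standard formulation, where each document scores the sum of 1 / (k + rank) over the lists it appears in. The function name `rrf_fuse`, the constant k = 60 (a commonly used default), the tie-breaking rule, and the chunk ids are assumptions; the values ragforge actually uses are not stated here.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked id lists: score(d) = sum of 1 / (k + rank), with rank starting at 1."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


dense_hits = ["c3", "c1", "c7"]  # hypothetical chunk ids from the dense search
bm25_hits = ["c1", "c9", "c3"]   # hypothetical chunk ids from the BM25 search
fused = rrf_fuse([dense_hits, bm25_hits])
top_ids = [doc_id for doc_id, _ in fused]
```

RRF uses only rank positions, so cosine similarities and BM25 scores never need to be normalized onto a common scale before fusion.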

Data storage

Data	Path
Knowledge bases	`~/.ragforge/knowledge_bases/<name>/`
Vector store	`vectors.json`
BM25 index	`bm25_index.json`
KB metadata	`meta.json`
Migration backup	`vectors_backup.json`
Traces	`~/.ragforge/traces.db`
Blackboards	`~/.ragforge/<board>.db`
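
As a quick illustration of this layout, the paths can be derived with `pathlib`. The helper names (`kb_paths`, `traces_db`, `blackboard_db`) are hypothetical and not part of ragforge. Placing vectors_backup.json inside the knowledge base directory is also an assumption, since the table lists only the file name.

```python
from pathlib import Path

RAGFORGE_HOME = Path.home() / ".ragforge"


def kb_paths(name: str, home: Path = RAGFORGE_HOME) -> dict[str, Path]:
    """Return the files that make up one knowledge base."""
    kb_dir = home / "knowledge_bases" / name
    return {
        "dir": kb_dir,
        "vectors": kb_dir / "vectors.json",
        "bm25": kb_dir / "bm25_index.json",
        "meta": kb_dir / "meta.json",
        # Assumption: the migration backup sits next to vectors.json.
        "backup": kb_dir / "vectors_backup.json",
    }


def traces_db(home: Path = RAGFORGE_HOME) -> Path:
    """Path of the SQLite trace store."""
    return home / "traces.db"


def blackboard_db(board: str, home: Path = RAGFORGE_HOME) -> Path:
    """Path of one blackboard database, named after the board."""
    return home / f"{board}.db"


paths = kb_paths("my-kb", home=Path("/tmp/rf"))
```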

Environment & config

There is no dedicated .env template; LLM providers read their own SDK environment variables (OPENAI_API_KEY, ANTHROPIC_API_KEY).
Ollama defaults to http://localhost:11434.
Server defaults: --host 0.0.0.0 --port 8000.
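
Since configuration comes from the environment, a small pre-flight check can report which hosted providers have a key set. This is a sketch only: `configured_providers` and `PROVIDER_ENV_VARS` are hypothetical names, not ragforge APIs. The two variable names are the ones listed above.

```python
import os
from collections.abc import Mapping

# Provider name -> API-key variable, as listed above.
PROVIDER_ENV_VARS = {"openai": "OPENAI_API_KEY", "anthropic": "ANTHROPIC_API_KEY"}


def configured_providers(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the providers whose API-key variable is set and non-empty."""
    return [name for name, var in PROVIDER_ENV_VARS.items() if env.get(var)]


# Pass a plain dict instead of os.environ to keep the example deterministic.
found = configured_providers({"OPENAI_API_KEY": "sk-test", "ANTHROPIC_API_KEY": ""})
```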
