HTTP API reference
Start the server with ragforge serve. Interactive Swagger UI at /docs. Route source: ragforge/api/routes/.
Health
GET
/healthServer status and version.
Capabilities
GET
/capabilitiesList all registered parsers, chunkers, embedders, and metrics.
Parse
POST
/parseParse a file path on the server, or raw text, into a Document. Provide either path or text.
Request
json
{
"path": "optional/server/path/to/file.pdf",
"text": "# Optional raw text content",
"doc_type": "txt | md | html",
"source": "api-input",
"parser": "text | html | pdf | docling | null (auto-detect)"
}Response
json
{
"id": "a1b2c3d4",
"text": "Extracted plain text…",
"source": "api-input",
"doc_type": "md",
"metadata": {},
"token_count": 142
}Chunk
POST
/chunkjson
{
"document": { /* DocumentResponse object */ },
"strategy": "structure | fixed | docling",
"options": { "max_tokens": 384 }
}Returns an array of chunk objects (id, text, doc_id, index, metadata, token_count).
Knowledge
POST
/knowledgeBuild and index a knowledge base from source paths.
Request
json
{
"name": "my-kb",
"sources": ["./docs/", "faq.md"],
"embedding_model": "default | sentence-transformers | openai",
"chunk_strategy": "structure | fixed",
"chunk_options": { "max_tokens": 384 }
}Response
json
{
"name": "my-kb",
"status": "built",
"num_documents": 12,
"num_chunks": 87,
"embedding_model": "default"
}Query
POST
/queryRetrieve chunks and optionally generate an LLM answer.
Request
json
{
"knowledge": "my-kb",
"question": "How do refunds work?",
"top_k": 5,
"mode": "hybrid | dense | bm25",
"rerank": false,
"generate": false,
"llm": "openai | anthropic | ollama | null",
"model": null
}Response
json
{
"question": "How do refunds work?",
"knowledge": "my-kb",
"chunks": [
{
"id": "a1b2c3d4",
"text": "All purchases are eligible…",
"doc_id": "b2c3d4e5",
"index": 0,
"metadata": { "section": "Refund Policy" },
"score": 0.9234
}
],
"answer": "Based on the provided context, refunds are…",
"llm": "ollama"
}Evaluate
POST
/evaluatejson
{
"knowledge": "my-kb",
"golden_dataset": [
{
"question": "What is the refund window?",
"expected_answer": "30 days",
"relevant_chunk_ids": ["a1b2c3d4"],
"relevant_sources": ["refund_policy.md"],
"notes": ""
}
],
"metrics": ["hit_rate", "precision_at_k", "recall_at_k", "mrr"],
"top_k": 5,
"mode": "hybrid",
"rerank": false,
"generate": false,
"llm": null
}Quantize
POST
/quantizejson
{
"target": "model-name",
"knowledge": "my-kb",
"options": { "bits": 8, "method": "scalar" }
}Returns a before/after report with size, latency, quality delta, and cost reduction.
Migrate
POST
/migratejson
{
"knowledge": "my-kb",
"from_model": "default",
"to_model": "openai",
"run_validation": true,
"options": {}
}See Migration for the full process.
Traces
GET
/traces?limit=50&offset=0List recent pipeline traces.
GET
/traces/{run_id}Full trace detail (retrieval, rerank, prompt, response) with timing data.
Coordination
| Method + Path | Description |
|---|---|
POST /coordination/boards | Create a named blackboard ({ name, persist }). |
GET /coordination/boards/{name} | Get current board state. |
POST /coordination/boards/{name}/write | Write an entry ({ key, value, author, tags }). |
GET /coordination/boards/{name}/history | Full write history. |
DELETE /coordination/boards/{name} | Clear/delete a board. |
POST /coordination/run | Run an inline multi-agent orchestration task. |
GET /coordination/run/{run_id} | Trace + cost summary of a run. |