RAG over research papers that treats a knowledge graph as a first-class retrieval path, not metadata. It answers both semantic questions ("what does the results section claim?") and relational ones ("what does this build on?", "which papers benchmark on dataset X?") by routing between Qdrant vector search and a Neo4j graph with a LangGraph multi-agent pipeline. Answers are grounded, cited, evaluated, and traced.
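To make the routing idea concrete, here is a minimal sketch of a router that labels a question `local`, `graph`, or `hybrid`. The keyword heuristic, the `route` function, and both cue lists are assumptions made for illustration; the project's actual router is not shown in this README and may well be an LLM call.

```python
# Hypothetical cue lists, not taken from the project.
# GRAPH_CUES hint at relational questions, LOCAL_CUES at questions about paper content.
GRAPH_CUES = ("build on", "builds on", "cited by", "which papers", "benchmark on")
LOCAL_CUES = ("claim", "report", "results", "section", "method")


def route(question: str) -> str:
    """Return 'local', 'graph', or 'hybrid' for a question (illustrative heuristic)."""
    q = question.lower()
    wants_graph = any(cue in q for cue in GRAPH_CUES)
    wants_local = any(cue in q for cue in LOCAL_CUES)
    if wants_graph and wants_local:
        return "hybrid"  # needs both relations and passage content
    if wants_graph:
        return "graph"   # relational question: traverse the knowledge graph
    return "local"       # assumed default: plain vector search over chunks
```

Under this heuristic a question that mixes both kinds of cue, such as asking which papers benchmark on a dataset and what score they report, falls through to `hybrid`.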
flowchart LR
subgraph Ingestion[Ingestion · offline]
PDF[PDF] --> DOC[Docling]
DOC --> CH[Chunker]
CH --> EMB[Local bge] --> QD[(Qdrant)]
CH --> EX[LLM extractor] --> NEO[(Neo4j)]
DOC --> CITE[OpenAlex citations] --> NEO
end
subgraph Query[Query · online]
Q[Query] --> R{Router}
R -->|local| RET[Retriever]
R -->|graph| GA[Graph agent]
R -->|hybrid| RET & GA
RET --> QD
GA --> NEO
RET & GA --> SYN[Synthesis] --> CR{Critic} --> A[Answer + citations]
CR -->|fail| R
end
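The query path in the diagram can be sketched in plain Python as a bounded retry loop: route, gather evidence, synthesize, then let a critic accept or reject the draft. This is not the project's LangGraph code. Every name here (`answer_query`, `cited_ids`, `critic_ok`, `max_attempts`) is hypothetical, the critic is reduced to a citation check that assumes inline `[chunk_id]` markers, and the retry cap is an assumption added so that a failing critic cannot loop forever.

```python
import re
from typing import Callable, Optional

Chunk = dict  # hypothetical evidence shape: {"id": str, "text": str}


def cited_ids(answer: str) -> set:
    """Collect ids that appear as [chunk_id] markers in the answer."""
    return set(re.findall(r"\[([^\[\]]+)\]", answer))


def critic_ok(answer: str, chunks: list) -> bool:
    """Pass only if the answer cites something and every cited id was retrieved."""
    ids = cited_ids(answer)
    return bool(ids) and ids <= {c["id"] for c in chunks}


def answer_query(
    question: str,
    route: Callable[[str, int], str],        # returns "local", "graph", or "hybrid"
    retrieve: Callable[[str], list],         # stand-in for the vector retriever
    graph_agent: Callable[[str], list],      # stand-in for the graph agent
    synthesize: Callable[[str, list], str],  # stand-in for the synthesis step
    max_attempts: int = 3,                   # assumed cap on critic-triggered retries
) -> Optional[str]:
    for attempt in range(max_attempts):
        path = route(question, attempt)
        chunks: list = []
        if path in ("local", "hybrid"):
            chunks += retrieve(question)
        if path in ("graph", "hybrid"):
            chunks += graph_agent(question)
        draft = synthesize(question, chunks)
        if critic_ok(draft, chunks):
            return draft
        # Critic failed: go back to the router for another attempt.
    return None  # no draft passed the critic within the cap
```

Passing the attempt number to `route` is one simple way to let a retry pick a different path, for example widening from `local` to `hybrid` after a rejected draft.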
cp .env.example .env # add ANTHROPIC_API_KEY
make install
make up # Qdrant + Neo4j
make migrate
make test

make up && make migrate
make ingest path=papers/attention.pdf # Docling -> chunks -> bge -> Qdrant
make ask q="What BLEU does the Transformer (big) reach on WMT14 EN-DE?"
# -> grounded answer with [chunk_id] citations (page-located)

- Phase 0 — Foundations & skeleton
- Phase 1 — Ingestion + vector slice
- Phase 2 — Knowledge graph
- Phase 3 — Multi-agent orchestration
- Phase 4 — Hardening + eval-in-CI
- Phase 5 — UI wiring
See the specs in docs/specs/ (start with 00-overview.md) and the decision records in docs/adr/.
Python 3.12 · uv · Qdrant · Neo4j · LangGraph · FastAPI · Docling · OpenAlex · bge (local) · Claude via LiteLLM · RAGAS · Langfuse.