LightRAG is a research project from the University of Hong Kong (HKU) that reimagines retrieval-augmented generation (RAG) using knowledge graphs. Accepted at EMNLP 2025, it replaces the traditional flat vector store approach with a graph-based architecture that extracts entities and their relationships from documents, enabling dramatically better context understanding for LLM applications.
Where conventional RAG systems retrieve isolated document chunks by embedding similarity, LightRAG builds a structured knowledge graph from your documents – entities become nodes, relationships become edges. When a query arrives, it performs dual-level retrieval across this graph: low-level retrieval for specific factual answers, high-level retrieval for broader thematic summaries. The result is retrieval that understands not just what words appear together, but how concepts are actually connected.
The project has garnered significant attention from the open-source community not just for its novel approach, but for being practical. The codebase is lightweight, well-documented, and includes features like incremental updates (no full rebuild when adding new documents), interactive graph visualization, and multiple backend support (OpenAI, Ollama, Hugging Face).
Repository: github.com/HKUDS/LightRAG
How Does LightRAG’s Graph-Based Retrieval Work?
Traditional RAG encodes documents as flat vectors and retrieves chunks by cosine similarity – a process fundamentally limited because it has no understanding of how chunks relate to one another. LightRAG solves this by constructing a knowledge graph during the indexing phase.
```mermaid
graph TD
A[Input Documents] --> B[Entity Extraction]
A --> C[Relation Extraction]
B --> D[Knowledge Graph Construction]
C --> D
D --> E[Community Detection]
D --> F[Entity Embeddings]
D --> G[Relation Embeddings]
E --> H[High-Level Retrieval\nSummaries]
F --> I[Low-Level Retrieval\nSpecific Facts]
G --> I
H --> J[Aggregated Answer]
I --> J
J --> K[LLM Response]
```

The pipeline works in three phases:
- Graph Construction: Documents are processed to extract entities (people, companies, concepts) and their relationships. These form a directed graph stored as structured data.
- Community Detection: The graph is partitioned into communities of related entities using hierarchical clustering. Each community generates a summary embedding for high-level retrieval.
- Dual-Level Retrieval: Queries are matched against both entity-level and community-level embeddings, then the retrieved graph context is fed to the LLM for answer generation.
This architecture produces answers that reflect the actual structure of knowledge rather than arbitrary document boundaries.
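To make these phases concrete, here is a minimal sketch of the indexing flow. It is illustrative only: networkx stands in for LightRAG's graph storage, and `extract_entities_and_relations` / `summarize_community` are hypothetical placeholders for the LLM prompts the library actually uses.

```python
# Illustrative sketch of LightRAG-style indexing; not the library's internals.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities


def extract_entities_and_relations(doc: str) -> list[tuple[str, str, str]]:
    # Placeholder for the LLM extraction prompt: returns (head, relation, tail) triples.
    return [("Apple", "acquired", "Startup X"), ("Apple", "developed", "M1")]


def build_graph(docs: list[str]) -> nx.DiGraph:
    # Phase 1: entities become nodes, relationships become directed edges.
    g = nx.DiGraph()
    for doc in docs:
        for head, rel, tail in extract_entities_and_relations(doc):
            g.add_edge(head, tail, relation=rel)
    return g


def detect_communities(g: nx.DiGraph) -> list:
    # Phase 2: cluster related entities (greedy modularity as an illustrative choice).
    return list(greedy_modularity_communities(g.to_undirected()))


def summarize_community(g: nx.DiGraph, community) -> str:
    # Placeholder for the LLM summary that backs high-level retrieval.
    return "; ".join(
        f"{u} -[{d['relation']}]-> {v}"
        for u, v, d in g.edges(data=True)
        if u in community and v in community
    )


graph = build_graph(["Apple acquired startup X in 2020. Apple developed the M1 chip."])
for community in detect_communities(graph):
    print(summarize_community(graph, community))  # Phase 3 matches queries against these
```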
What Are the Advantages of Dual-Level Retrieval?
The dual-level design is LightRAG’s defining innovation. To understand why it matters, consider the difference between two types of questions:
| Query Type | Traditional RAG | LightRAG Dual-Level |
|---|---|---|
| “What did Apple acquire in 2020?” | Finds chunks mentioning “Apple” and “2020” and “acquisition” | Low-level retrieval: directly follows the [Apple] --acquired--> [Startup X] edge in the graph |
| “What is Apple’s acquisition strategy?” | Scattershot across unrelated chunks | High-level retrieval: returns community summary of the entire “Apple acquisitions” cluster |
| “How does Apple’s AI strategy compare to Google’s?” | Struggles to connect across separate chunk groups | Traverses both clusters and returns a combined graph context |
| “Explain the timeline of Apple chip development” | May miss chronological connections between sparse chunks | Low-level retrieval follows temporal relation edges: [Apple] --developed--> [A14], [Apple] --developed--> [M1] |
The key insight is that graphs naturally capture relationships in a way flat vector stores cannot. Dual-level retrieval lets LightRAG automatically select the right granularity for each query – a capability that would require manual configuration in traditional RAG systems.
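Under the hood, the LightRAG paper describes this automatic granularity selection as keyword extraction: an LLM prompt splits each query into low-level keywords (specific entities) and high-level keywords (themes), which are matched against entity and community embeddings respectively. A rough, hypothetical sketch of the control flow:

```python
# Hypothetical sketch of dual-level routing; LightRAG itself does this with an
# LLM keyword-extraction prompt plus vector similarity search, not string matching.

def extract_keywords(query: str) -> tuple[list[str], list[str]]:
    # Placeholder for the LLM prompt that separates specific entities
    # (low-level keywords) from broad themes (high-level keywords).
    if "strategy" in query:
        return ["Apple"], ["acquisition strategy"]
    return ["Apple", "2020"], []


def dual_level_retrieve(query: str, entity_index: dict, community_index: dict) -> list[str]:
    low_kw, high_kw = extract_keywords(query)
    context: list[str] = []
    for kw in low_kw:            # low-level: entity nodes and their incident edges
        context.extend(entity_index.get(kw, []))
    for kw in high_kw:           # high-level: community summaries
        context.extend(community_index.get(kw, []))
    return context


entity_index = {"Apple": ["Apple -[acquired]-> Startup X (2020)"]}
community_index = {"acquisition strategy": ["Summary of the Apple-acquisitions community"]}
print(dual_level_retrieve("What is Apple's acquisition strategy?",
                          entity_index, community_index))
```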
What Are the Key Feature Specifications?
| Feature | Description | Status |
|---|---|---|
| Graph Construction | Entity and relation extraction from documents | Stable |
| Dual-Level Retrieval | Low-level (facts) and high-level (summary) queries | Stable |
| Incremental Updates | Merge new documents without full graph rebuild | Stable |
| Graph Visualization | Interactive entity-relation graph explorer | Stable |
| Multiple Backends | OpenAI, Ollama, Hugging Face, custom API | Supported |
| API Access | Python SDK + RESTful API | Stable |
| Embedding Models | Configurable, multiple model support | Supported |
How Do Incremental Updates Work?
One of the most practical features of LightRAG is its support for incremental knowledge base updates without rebuilding the entire graph. This is critical for real-world deployments where data changes continuously.
```mermaid
flowchart LR
A[New Document] --> B[Extract Entities\nand Relations]
B --> C{Entities exist\nin graph?}
C -->|Yes| D[Add new relations\nand edges]
C -->|No| E[Create new\nentity nodes]
D --> F[Update community\ndetection]
E --> F
F --> G[Recalculate\naffected embeddings]
G --> H[Update retrieval\nindexes]
H --> I[Ready for queries]
```

The incremental update process is efficient because it only processes the new document’s entities and relations, merges them into the existing graph structure, and recalculates embeddings only for affected nodes and communities. This avoids the O(n) cost of full graph reconstruction and makes LightRAG practical for continuously growing knowledge bases.
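A minimal sketch of why this merge is cheap, again with networkx as a stand-in for the real graph store and a hypothetical `reembed` callback for recomputing a node's embedding:

```python
# Illustrative incremental merge; not LightRAG's actual update code.
import networkx as nx


def incremental_update(g: nx.DiGraph, new_triples, reembed) -> set:
    touched = set()
    for head, rel, tail in new_triples:
        g.add_edge(head, tail, relation=rel)  # creates missing nodes, merges existing ones
        touched.update((head, tail))
    for node in touched:                      # re-embed only the affected nodes,
        reembed(node)                         # not the whole graph
    return touched


g = nx.DiGraph()
g.add_edge("Apple", "Startup X", relation="acquired")
changed = incremental_update(g, [("Apple", "developed", "M1")], reembed=lambda n: None)
print(changed)  # {'Apple', 'M1'} – only these nodes need new embeddings
```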
What Backends Does LightRAG Support?
| Backend | Use Case | Configuration |
|---|---|---|
| OpenAI (GPT-4o, GPT-4o-mini) | Production deployments with cloud LLMs | LIGHTRAG_LLM_BACKEND=openai |
| Ollama | Local/private deployments with open models | LIGHTRAG_LLM_BACKEND=ollama |
| Hugging Face | Custom models from the HF hub | LIGHTRAG_LLM_BACKEND=huggingface |
| Custom Endpoint | Enterprise proxies or self-hosted APIs | LIGHTRAG_LLM_BACKEND=custom |
LightRAG’s backend abstraction means you can prototype with OpenAI, then switch to a local model via Ollama for production – or run a hybrid setup where embeddings are computed locally and generation uses the cloud.
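In the Python library, the same choice can also be expressed directly through constructor arguments. The sketch below follows the Ollama pattern from the repository's README at the time of writing; the import paths and helper names (`ollama_model_complete`, `ollama_embedding`, `EmbeddingFunc`) have moved between releases, so treat this as a version-dependent sketch rather than a stable API.

```python
from lightrag import LightRAG
from lightrag.llm import ollama_model_complete, ollama_embedding  # early-release paths
from lightrag.utils import EmbeddingFunc

# Fully local deployment: Ollama serves both generation and embeddings.
rag = LightRAG(
    working_dir="./my_knowledge_base",
    llm_model_func=ollama_model_complete,
    llm_model_name="llama3",  # any model already pulled into local Ollama
    embedding_func=EmbeddingFunc(
        embedding_dim=768,      # must match the embedding model's output size
        max_token_size=8192,
        func=lambda texts: ollama_embedding(texts, embed_model="nomic-embed-text"),
    ),
)
```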
How to Install and Use LightRAG
Installation is straightforward via pip:
```bash
pip install lightrag-hku
```
Basic usage requires just a few lines of Python:
```python
from lightrag import LightRAG, QueryParam

rag = LightRAG(working_dir="./my_knowledge_base")

# Insert documents
rag.insert("Apple acquired startup X in 2020 to expand its AI capabilities.")
rag.insert("Google launched its Gemini model in 2023, competing with GPT-4.")

# Hybrid mode combines low-level and high-level (dual-level) retrieval
answer = rag.query(
    "What is Apple's acquisition strategy?",
    param=QueryParam(mode="hybrid"),
)

# Local mode performs low-level retrieval for specific facts
fact = rag.query(
    "What did Apple acquire in 2020?",
    param=QueryParam(mode="local"),
)
```
The system also supports batch document insertion from files, custom embedding configurations, and integration with existing document pipelines.
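For instance, batch ingestion can reuse the `rag` instance from the snippet above; per the project's examples, `insert()` also accepts a list of strings (reading the files is plain Python, not a LightRAG API):

```python
from pathlib import Path

# Read every .txt file in a folder and insert the documents in one batch.
docs = [p.read_text(encoding="utf-8") for p in Path("./docs").glob("*.txt")]
rag.insert(docs)  # insert() accepts a single string or a list of strings
```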
FAQ
**What is LightRAG and how is it different from traditional RAG?** LightRAG is an EMNLP 2025 research project from HKU that replaces flat vector database retrieval with a knowledge graph approach. Unlike traditional RAG, which retrieves document chunks by embedding similarity, LightRAG extracts entities and relations from documents into a graph structure, enabling dual-level retrieval (low-level for specific facts, high-level for global summaries).
**How does dual-level retrieval work in LightRAG?** LightRAG’s dual-level retrieval operates at two granularities. Low-level retrieval finds specific entity relationships, while high-level retrieval captures broader summaries. The graph-based approach automatically determines which level is appropriate based on the query, eliminating manual configuration.
**Does LightRAG support incremental updates to the knowledge base?** Yes, LightRAG supports incremental knowledge base updates through partial graph reconstruction. When new documents are added, LightRAG extracts entities and relations from the new text, merges them into the existing graph without a full rebuild, and updates embeddings only for affected nodes.
**What graph visualization capabilities does LightRAG offer?** LightRAG provides interactive graph visualizations that display entities as nodes and relations as edges, color-coded by community cluster. Users can explore the knowledge graph, click on entities to see detailed context, trace relationship paths, and visually inspect retrieval connections.
**What backends and APIs does LightRAG support?** LightRAG supports multiple LLM backends including OpenAI, Ollama, Hugging Face, and custom endpoints. It provides both a Python API and a RESTful API for integration with external applications.
Further Reading
- LightRAG GitHub Repository – Official source code, documentation, and examples
- LightRAG Research Paper (EMNLP 2025) – The academic paper describing the architecture and evaluations
- Graph RAG by Microsoft – An alternative graph-based RAG approach with different design trade-offs
- Neo4j Graph Database – Graph database platform that can serve as LightRAG’s storage backend
- Ollama Local LLMs – Run LLMs locally for a fully private LightRAG deployment