Daniel.y committed on
Commit b379f14 · unverified · 2 Parent(s): 2f1dc48 b1739da

Merge pull request #1329 from earayu/fix_readme_grammar_typo

Files changed (1)
  1. lightrag/api/README.md +126 -132
lightrag/api/README.md CHANGED
@@ -1,6 +1,6 @@
1
  # LightRAG Server and WebUI
2
 
3
- The LightRAG Server is designed to provide Web UI and API support. The Web UI facilitates document indexing, knowledge graph exploration, and a simple RAG query interface. LightRAG Server also provide an Ollama compatible interfaces, aiming to emulate LightRAG as an Ollama chat model. This allows AI chat bot, such as Open WebUI, to access LightRAG easily.
4
 
5
  ![image-20250323122538997](./README.assets/image-20250323122538997.png)
6
 
@@ -8,17 +8,17 @@ The LightRAG Server is designed to provide Web UI and API support. The Web UI fa
8
 
9
  ![image-20250323123011220](./README.assets/image-20250323123011220.png)
10
 
11
- ## Getting Start
12
 
13
  ### Installation
14
 
15
- * Install from PyPI
16
 
17
  ```bash
18
  pip install "lightrag-hku[api]"
19
  ```
20
 
21
- * Installation from Source
22
 
23
  ```bash
24
  # Clone the repository
@@ -27,7 +27,7 @@ git clone https://github.com/HKUDS/lightrag.git
27
  # Change to the repository directory
28
  cd lightrag
29
 
30
- # create a Python virtual enviroment if neccesary
31
  # Install in editable mode with API support
32
  pip install -e ".[api]"
33
  ```
@@ -36,23 +36,23 @@ pip install -e ".[api]"
36
 
37
  LightRAG necessitates the integration of both an LLM (Large Language Model) and an Embedding Model to effectively execute document indexing and querying operations. Prior to the initial deployment of the LightRAG server, it is essential to configure the settings for both the LLM and the Embedding Model. LightRAG supports binding to various LLM/Embedding backends:
38
 
39
- * ollama
40
- * lollms
41
- * openai or openai compatible
42
- * azure_openai
43
 
44
  It is recommended to use environment variables to configure the LightRAG Server. There is an example environment variable file named `env.example` in the root directory of the project. Please copy this file to the startup directory and rename it to `.env`. After that, you can modify the parameters related to the LLM and Embedding models in the `.env` file. It is important to note that the LightRAG Server will load the environment variables from `.env` into the system environment variables each time it starts. Since the LightRAG Server will prioritize the settings in the system environment variables, if you modify the `.env` file after starting the LightRAG Server via the command line, you need to execute `source .env` to make the new settings take effect.
45
 
46
- Here are some examples of common settings for LLM and Embedding models
47
 
48
- * OpenAI LLM + Ollama Embedding
49
 
50
  ```
51
  LLM_BINDING=openai
52
  LLM_MODEL=gpt-4o
53
  LLM_BINDING_HOST=https://api.openai.com/v1
54
  LLM_BINDING_API_KEY=your_api_key
55
- ### Max tokens send to LLM (less than model context size)
56
  MAX_TOKENS=32768
57
 
58
  EMBEDDING_BINDING=ollama
@@ -62,14 +62,14 @@ EMBEDDING_DIM=1024
62
  # EMBEDDING_BINDING_API_KEY=your_api_key
63
  ```
64
 
65
- * Ollama LLM + Ollama Embedding
66
 
67
  ```
68
  LLM_BINDING=ollama
69
  LLM_MODEL=mistral-nemo:latest
70
  LLM_BINDING_HOST=http://localhost:11434
71
  # LLM_BINDING_API_KEY=your_api_key
72
- ### Max tokens send to LLM (base on your Ollama Server capacity)
73
  MAX_TOKENS=8192
74
 
75
  EMBEDDING_BINDING=ollama
@@ -82,12 +82,12 @@ EMBEDDING_DIM=1024
82
  ### Starting LightRAG Server
83
 
84
  The LightRAG Server supports two operational modes:
85
- * The simple and efficient Uvicorn mode
86
 
87
  ```
88
  lightrag-server
89
  ```
90
- * The multiprocess Gunicorn + Uvicorn mode (production mode, not supported on Windows environments)
91
 
92
  ```
93
  lightrag-gunicorn --workers 4
@@ -96,44 +96,44 @@ The `.env` file **must be placed in the startup directory**.
96
 
97
  Upon launching, the LightRAG Server will create a documents directory (default is `./inputs`) and a data directory (default is `./rag_storage`). This allows you to initiate multiple instances of LightRAG Server from different directories, with each instance configured to listen on a distinct network port.
98
 
99
- Here are some common used startup parameters:
100
 
101
- - `--host`: Server listening address (default: 0.0.0.0)
102
- - `--port`: Server listening port (default: 9621)
103
- - `--timeout`: LLM request timeout (default: 150 seconds)
104
- - `--log-level`: Logging level (default: INFO)
105
- - --input-dir: specifying the directory to scan for documents (default: ./input)
106
 
107
- > The requirement for the .env file to be in the startup directory is intentionally designed this way. The purpose is to support users in launching multiple LightRAG instances simultaneously. Allow different .env files for different instances.
108
 
109
  ### Auto scan on startup
110
 
111
  When starting any of the servers with the `--auto-scan-at-startup` parameter, the system will automatically:
112
 
113
- 1. Scan for new files in the input directory
114
- 2. Indexing new documents that aren't already in the database
115
- 3. Make all content immediately available for RAG queries
116
 
117
- > The `--input-dir` parameter specify the input directory to scan for. You can trigger input diretory scan from webui.
118
 
119
  ### Multiple workers for Gunicorn + Uvicorn
120
 
121
- The LightRAG Server can operate in the `Gunicorn + Uvicorn` preload mode. Gunicorn's Multiple Worker (multiprocess) capability prevents document indexing tasks from blocking RAG queries. Using CPU-exhaustive document extraction tools, such as docling, can lead to the entire system being blocked in pure Uvicorn mode.
122
 
123
- Though LightRAG Server uses one workers to process the document indexing pipeline, with aysnc task supporting of Uvicorn, multiple files can be processed in parallell. The bottleneck of document indexing speed mainly lies with the LLM. If your LLM supports high concurrency, you can accelerate document indexing by increasing the concurrency level of the LLM. Below are several environment variables related to concurrent processing, along with their default values:
124
 
125
  ```
126
- ### Num of worker processes, not greater then (2 x number_of_cores) + 1
127
  WORKERS=2
128
- ### Num of parallel files to process in one batch
129
  MAX_PARALLEL_INSERT=2
130
- ### Max concurrency requests of LLM
131
  MAX_ASYNC=4
132
  ```
133
 
134
- ### Install Lightrag as a Linux Service
135
 
136
- Create a your service file `lightrag.sevice` from the sample file : `lightrag.sevice.example`. Modified the WorkingDirectoryand EexecStart in the service file:
137
 
138
  ```text
139
  Description=LightRAG Ollama Service
@@ -141,7 +141,7 @@ WorkingDirectory=<lightrag installed directory>
141
  ExecStart=<lightrag installed directory>/lightrag/api/lightrag-api
142
  ```
143
 
144
- Modify your service startup script: `lightrag-api`. Change you python virtual environment activation command as needed:
145
 
146
  ```shell
147
  #!/bin/bash
@@ -164,21 +164,21 @@ sudo systemctl enable lightrag.service
164
 
165
  ## Ollama Emulation
166
 
167
- We provide an Ollama-compatible interfaces for LightRAG, aiming to emulate LightRAG as an Ollama chat model. This allows AI chat frontends supporting Ollama, such as Open WebUI, to access LightRAG easily.
168
 
169
  ### Connect Open WebUI to LightRAG
170
 
171
- After starting the lightrag-server, you can add an Ollama-type connection in the Open WebUI admin pannel. And then a model named `lightrag:latest` will appear in Open WebUI's model management interface. Users can then send queries to LightRAG through the chat interface. You'd better install LightRAG as service for this use case.
172
 
173
- Open WebUI's use LLM to do the session title and session keyword generation task. So the Ollama chat chat completion API detects and forwards OpenWebUI session-related requests directly to underlying LLM. Screen shot from Open WebUI:
174
 
175
  ![image-20250323194750379](./README.assets/image-20250323194750379.png)
176
 
177
  ### Choose Query mode in chat
178
 
179
- The defautl query mode is `hybrid` if you send a message(query) from Ollama interface of LightRAG. You can select query mode by sending a message with query prefix.
180
 
181
- A query prefix in the query string can determines which LightRAG query mode is used to generate the respond for the query. The supported prefixes include:
182
 
183
  ```
184
  /local
@@ -196,30 +196,28 @@ A query prefix in the query string can determines which LightRAG query mode is u
196
  /mixcontext
197
  ```
198
 
199
- For example, chat message "/mix What's LightRag" will trigger a mix mode query for LighRAG. A chat message without query prefix will trigger a hybrid mode query by default.
200
 
201
- "/bypass" not a LightRAG query mode, it will tell API Server to pass the query directly to the underlying LLM with chat history. So user can use LLM to answer question base on the chat history. If you are using Open WebUI as front end, you can just switch the model to a normal LLM instead of using /bypass prefix.
202
 
203
- "/context" is not a LightRAG query mode neither, it will tell LightRAG to return only the context information prepared for LLM. You can check the context if it's want you want, or process the conext by your self.
204
 
 
205
 
 
206
 
207
- ## API-Key and Authentication
208
-
209
- By default, the LightRAG Server can be accessed without any authentication. We can configure the server with an API-Key or account credentials to secure it.
210
-
211
- * API-KEY
212
 
213
  ```
214
  LIGHTRAG_API_KEY=your-secure-api-key-here
215
  WHITELIST_PATHS=/health,/api/*
216
  ```
217
 
218
- > Health check and Ollama emuluation endpoins is exclude from API-KEY check by default.
219
 
220
- * Account credentials (the web UI requires login before access)
221
 
222
- LightRAG API Server implements JWT-based authentication using HS256 algorithm. To enable secure access control, the following environment variables are required:
223
 
224
  ```bash
225
  # For jwt auth
@@ -230,16 +228,14 @@ TOKEN_EXPIRE_HOURS=4
230
 
231
  > Currently, only the configuration of an administrator account and password is supported. A comprehensive account system is yet to be developed and implemented.
232
 
233
- If Account credentials are not configured, the web UI will access the system as a Guest. Therefore, even if only API-KEY is configured, all API can still be accessed through the Guest account, which remains insecure. Hence, to safeguard the API, it is necessary to configure both authentication methods simultaneously.
234
-
235
-
236
 
237
  ## For Azure OpenAI Backend
238
 
239
  Azure OpenAI API can be created using the following commands in Azure CLI (you need to install Azure CLI first from [https://docs.microsoft.com/en-us/cli/azure/install-azure-cli](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli)):
240
 
241
  ```bash
242
- # Change the resource group name, location and OpenAI resource name as needed
243
  RESOURCE_GROUP_NAME=LightRAG
244
  LOCATION=swedencentral
245
  RESOURCE_NAME=LightRAG-OpenAI
@@ -257,7 +253,7 @@ az cognitiveservices account keys list --name $RESOURCE_NAME -g $RESOURCE_GROUP_
257
  The output of the last command will give you the endpoint and the key for the OpenAI API. You can use these values to set the environment variables in the `.env` file.
258
 
259
  ```
260
- # Azure OpenAI Configuration in .env
261
  LLM_BINDING=azure_openai
262
  LLM_BINDING_HOST=your-azure-endpoint
263
  LLM_MODEL=your-model-deployment-name
@@ -265,91 +261,89 @@ LLM_BINDING_API_KEY=your-azure-api-key
265
  ### API version is optional, defaults to latest version
266
  AZURE_OPENAI_API_VERSION=2024-08-01-preview
267
 
268
- ### if using Azure OpenAI for embeddings
269
  EMBEDDING_BINDING=azure_openai
270
  EMBEDDING_MODEL=your-embedding-deployment-name
271
  ```
272
 
273
-
274
-
275
  ## LightRAG Server Configuration in Detail
276
 
277
- API Server can be config in three way (highest priority first):
278
 
279
- * Command line arguments
280
- * Enviroment variables or .env file
281
- * Config.ini (Only for storage configuration)
282
 
283
- Most of the configurations come with a default settings, check out details in sample file: `.env.example`. Datastorage configuration can be also set by config.ini. A sample file `config.ini.example` is provided for your convenience.
284
 
285
  ### LLM and Embedding Backend Supported
286
 
287
  LightRAG supports binding to various LLM/Embedding backends:
288
 
289
- * ollama
290
- * lollms
291
- * openai & openai compatible
292
- * azure_openai
293
 
294
- Use environment variables `LLM_BINDING` or CLI argument `--llm-binding` to select LLM backend type. Use environment variables `EMBEDDING_BINDING` or CLI argument `--embedding-binding` to select LLM backend type.
295
 
296
  ### Entity Extraction Configuration
297
- * ENABLE_LLM_CACHE_FOR_EXTRACT: Enable LLM cache for entity extraction (default: true)
298
 
299
- It's very common to set `ENABLE_LLM_CACHE_FOR_EXTRACT` to true for test environment to reduce the cost of LLM calls.
300
 
301
  ### Storage Types Supported
302
 
303
- LightRAG uses 4 types of storage for difference purposes:
304
 
305
- * KV_STORAGE:llm response cache, text chunks, document information
306
- * VECTOR_STORAGE:entities vectors, relation vectors, chunks vectors
307
- * GRAPH_STORAGE:entity relation graph
308
- * DOC_STATUS_STORAGE:documents indexing status
309
 
310
- Each storage type have servals implementations:
311
 
312
- * KV_STORAGE supported implement-name
313
 
314
  ```
315
- JsonKVStorage JsonFile(default)
316
  PGKVStorage Postgres
317
  RedisKVStorage Redis
318
- MongoKVStorage MogonDB
319
  ```
320
 
321
- * GRAPH_STORAGE supported implement-name
322
 
323
  ```
324
- NetworkXStorage NetworkX(defualt)
325
  Neo4JStorage Neo4J
326
  PGGraphStorage Postgres
327
  AGEStorage AGE
328
  ```
329
 
330
- * VECTOR_STORAGE supported implement-name
331
 
332
  ```
333
- NanoVectorDBStorage NanoVector(default)
334
  PGVectorStorage Postgres
335
- MilvusVectorDBStorge Milvus
336
  ChromaVectorDBStorage Chroma
337
  FaissVectorDBStorage Faiss
338
  QdrantVectorDBStorage Qdrant
339
  MongoVectorDBStorage MongoDB
340
  ```
341
 
342
- * DOC_STATUS_STORAGE:supported implement-name
343
 
344
  ```
345
- JsonDocStatusStorage JsonFile(default)
346
  PGDocStatusStorage Postgres
347
  MongoDocStatusStorage MongoDB
348
  ```
349
 
350
- ### How Select Storage Implementation
351
 
352
- You can select storage implementation by environment variables. Your can set the following environmental variables to a specific storage implement-name before the your first start of the API Server:
353
 
354
  ```
355
  LIGHTRAG_KV_STORAGE=PGKVStorage
@@ -358,30 +352,30 @@ LIGHTRAG_GRAPH_STORAGE=PGGraphStorage
358
  LIGHTRAG_DOC_STATUS_STORAGE=PGDocStatusStorage
359
  ```
360
 
361
- You can not change storage implementation selection after you add documents to LightRAG. Data migration from one storage implementation to anthor is not supported yet. For further information please read the sample env file or config.ini file.
362
-
363
- ### LightRag API Server Comand Line Options
364
-
365
- | Parameter | Default | Description |
366
- |-----------|---------|-------------|
367
- | --host | 0.0.0.0 | Server host |
368
- | --port | 9621 | Server port |
369
- | --working-dir | ./rag_storage | Working directory for RAG storage |
370
- | --input-dir | ./inputs | Directory containing input documents |
371
- | --max-async | 4 | Maximum async operations |
372
- | --max-tokens | 32768 | Maximum token size |
373
- | --timeout | 150 | Timeout in seconds. None for infinite timeout(not recommended) |
374
- | --log-level | INFO | Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL) |
375
- | --verbose | - | Verbose debug output (True, Flase) |
376
- | --key | None | API key for authentication. Protects lightrag server against unauthorized access |
377
- | --ssl | False | Enable HTTPS |
378
- | --ssl-certfile | None | Path to SSL certificate file (required if --ssl is enabled) |
379
- | --ssl-keyfile | None | Path to SSL private key file (required if --ssl is enabled) |
380
- | --top-k | 50 | Number of top-k items to retrieve; corresponds to entities in "local" mode and relationships in "global" mode. |
381
- | --cosine-threshold | 0.4 | The cossine threshold for nodes and relations retrieval, works with top-k to control the retrieval of nodes and relations. |
382
- | --llm-binding | ollama | LLM binding type (lollms, ollama, openai, openai-ollama, azure_openai) |
383
- | --embedding-binding | ollama | Embedding binding type (lollms, ollama, openai, azure_openai) |
384
- | auto-scan-at-startup | - | Scan input directory for new files and start indexing |
385
 
386
  ### .env Examples
387
 
@@ -427,20 +421,20 @@ EMBEDDING_BINDING_HOST=http://localhost:11434
427
 
428
  ## API Endpoints
429
 
430
- All servers (LoLLMs, Ollama, OpenAI and Azure OpenAI) provide the same REST API endpoints for RAG functionality. When API Server is running, visit:
431
 
432
- - Swagger UI: http://localhost:9621/docs
433
- - ReDoc: http://localhost:9621/redoc
434
 
435
  You can test the API endpoints using the provided curl commands or through the Swagger UI interface. Make sure to:
436
 
437
- 1. Start the appropriate backend service (LoLLMs, Ollama, or OpenAI)
438
- 2. Start the RAG server
439
- 3. Upload some documents using the document management endpoints
440
- 4. Query the system using the query endpoints
441
- 5. Trigger document scan if new files is put into inputs directory
442
 
443
- ### Query Endpoints
444
 
445
  #### POST /query
446
  Query the RAG system with options for different search modes.
@@ -448,7 +442,7 @@ Query the RAG system with options for different search modes.
448
  ```bash
449
  curl -X POST "http://localhost:9621/query" \
450
  -H "Content-Type: application/json" \
451
- -d '{"query": "Your question here", "mode": "hybrid", ""}'
452
  ```
453
 
454
  #### POST /query/stream
@@ -460,7 +454,7 @@ curl -X POST "http://localhost:9621/query/stream" \
460
  -d '{"query": "Your question here", "mode": "hybrid"}'
461
  ```
462
 
463
- ### Document Management Endpoints
464
 
465
  #### POST /documents/text
466
  Insert text directly into the RAG system.
@@ -491,13 +485,13 @@ curl -X POST "http://localhost:9621/documents/batch" \
491
 
492
  #### POST /documents/scan
493
 
494
- Trigger document scan for new files in the Input directory.
495
 
496
  ```bash
497
  curl -X POST "http://localhost:9621/documents/scan" --max-time 1800
498
  ```
499
 
500
- > Ajust max-time according to the estimated index time for all new files.
501
 
502
  #### DELETE /documents
503
 
@@ -507,7 +501,7 @@ Clear all documents from the RAG system.
507
  curl -X DELETE "http://localhost:9621/documents"
508
  ```
509
 
510
- ### Ollama Emulation Endpoints
511
 
512
  #### GET /api/version
513
 
@@ -519,7 +513,7 @@ curl http://localhost:9621/api/version
519
 
520
  #### GET /api/tags
521
 
522
- Get Ollama available models.
523
 
524
  ```bash
525
  curl http://localhost:9621/api/tags
@@ -527,20 +521,20 @@ curl http://localhost:9621/api/tags
527
 
528
  #### POST /api/chat
529
 
530
- Handle chat completion requests. Routes user queries through LightRAG by selecting query mode based on query prefix. Detects and forwards OpenWebUI session-related requests (for meta data generation task) directly to underlying LLM.
531
 
532
  ```shell
533
  curl -N -X POST http://localhost:9621/api/chat -H "Content-Type: application/json" -d \
534
  '{"model":"lightrag:latest","messages":[{"role":"user","content":"猪八戒是谁"}],"stream":true}'
535
  ```
536
 
537
- > For more information about Ollama API pls. visit : [Ollama API documentation](https://github.com/ollama/ollama/blob/main/docs/api.md)
538
 
539
  #### POST /api/generate
540
 
541
- Handle generate completion requests. For compatibility purpose, the request is not processed by LightRAG, and will be handled by underlying LLM model.
542
 
543
- ### Utility Endpoints
544
 
545
  #### GET /health
546
  Check server health and configuration.
 
1
  # LightRAG Server and WebUI
2
 
3
+ The LightRAG Server is designed to provide a Web UI and API support. The Web UI facilitates document indexing, knowledge graph exploration, and a simple RAG query interface. LightRAG Server also provides an Ollama-compatible interface, aiming to emulate LightRAG as an Ollama chat model. This allows AI chat bots, such as Open WebUI, to access LightRAG easily.
4
 
5
  ![image-20250323122538997](./README.assets/image-20250323122538997.png)
6
 
 
8
 
9
  ![image-20250323123011220](./README.assets/image-20250323123011220.png)
10
 
11
+ ## Getting Started
12
 
13
  ### Installation
14
 
15
+ * Install from PyPI
16
 
17
  ```bash
18
  pip install "lightrag-hku[api]"
19
  ```
20
 
21
+ * Installation from Source
22
 
23
  ```bash
24
  # Clone the repository
 
27
  # Change to the repository directory
28
  cd lightrag
29
 
30
+ # create a Python virtual environment if necessary
31
  # Install in editable mode with API support
32
  pip install -e ".[api]"
33
  ```
 
36
 
37
  LightRAG necessitates the integration of both an LLM (Large Language Model) and an Embedding Model to effectively execute document indexing and querying operations. Prior to the initial deployment of the LightRAG server, it is essential to configure the settings for both the LLM and the Embedding Model. LightRAG supports binding to various LLM/Embedding backends:
38
 
39
+ * ollama
40
+ * lollms
41
+ * openai or openai compatible
42
+ * azure_openai
43
 
44
  It is recommended to use environment variables to configure the LightRAG Server. There is an example environment variable file named `env.example` in the root directory of the project. Please copy this file to the startup directory and rename it to `.env`. After that, you can modify the parameters related to the LLM and Embedding models in the `.env` file. It is important to note that the LightRAG Server will load the environment variables from `.env` into the system environment variables each time it starts. Since the LightRAG Server will prioritize the settings in the system environment variables, if you modify the `.env` file after starting the LightRAG Server via the command line, you need to execute `source .env` to make the new settings take effect.
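
For instance, a minimal sketch of that workflow, run from the startup directory and using only the files and commands described above, might be:

```bash
# Copy the sample environment file into the startup directory and rename it
cp env.example .env

# Edit the LLM and Embedding settings in .env, then reload them into the
# current shell if the server was already started from this command line
source .env

# Start (or restart) the server so the new settings take effect
lightrag-server
```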
45
 
46
+ Here are some examples of common settings for LLM and Embedding models:
47
 
48
+ * OpenAI LLM + Ollama Embedding:
49
 
50
  ```
51
  LLM_BINDING=openai
52
  LLM_MODEL=gpt-4o
53
  LLM_BINDING_HOST=https://api.openai.com/v1
54
  LLM_BINDING_API_KEY=your_api_key
55
+ ### Max tokens sent to LLM (less than model context size)
56
  MAX_TOKENS=32768
57
 
58
  EMBEDDING_BINDING=ollama
 
62
  # EMBEDDING_BINDING_API_KEY=your_api_key
63
  ```
64
 
65
+ * Ollama LLM + Ollama Embedding:
66
 
67
  ```
68
  LLM_BINDING=ollama
69
  LLM_MODEL=mistral-nemo:latest
70
  LLM_BINDING_HOST=http://localhost:11434
71
  # LLM_BINDING_API_KEY=your_api_key
72
+ ### Max tokens sent to LLM (based on your Ollama Server capacity)
73
  MAX_TOKENS=8192
74
 
75
  EMBEDDING_BINDING=ollama
 
82
  ### Starting LightRAG Server
83
 
84
  The LightRAG Server supports two operational modes:
85
+ * The simple and efficient Uvicorn mode:
86
 
87
  ```
88
  lightrag-server
89
  ```
90
+ * The multiprocess Gunicorn + Uvicorn mode (production mode, not supported on Windows environments):
91
 
92
  ```
93
  lightrag-gunicorn --workers 4
 
96
 
97
  Upon launching, the LightRAG Server will create a documents directory (default is `./inputs`) and a data directory (default is `./rag_storage`). This allows you to initiate multiple instances of LightRAG Server from different directories, with each instance configured to listen on a distinct network port.
98
 
99
+ Here are some commonly used startup parameters:
100
 
101
+ - `--host`: Server listening address (default: 0.0.0.0)
102
+ - `--port`: Server listening port (default: 9621)
103
+ - `--timeout`: LLM request timeout (default: 150 seconds)
104
+ - `--log-level`: Logging level (default: INFO)
105
+ - `--input-dir`: Directory to scan for documents (default: ./inputs)
106
 
107
+ > The requirement for the .env file to be in the startup directory is intentionally designed this way. The purpose is to support users in launching multiple LightRAG instances simultaneously, allowing different .env files for different instances.
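
A sketch of that multi-instance setup, using two hypothetical directories `instance1` and `instance2` that each hold their own `.env`:

```bash
# Instance 1 reads ./instance1/.env and listens on the default port
(cd instance1 && lightrag-server --port 9621) &

# Instance 2 reads ./instance2/.env and listens on a different port
(cd instance2 && lightrag-server --port 9622) &
```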
108
 
109
  ### Auto scan on startup
110
 
111
  When starting any of the servers with the `--auto-scan-at-startup` parameter, the system will automatically:
112
 
113
+ 1. Scan for new files in the input directory
114
+ 2. Index new documents that aren't already in the database
115
+ 3. Make all content immediately available for RAG queries
116
 
117
+ > The `--input-dir` parameter specifies the input directory to scan. You can trigger the input directory scan from the Web UI.
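
For example, auto scan can be combined with a custom input directory:

```bash
# Scan ./inputs for new files and start indexing them at startup
lightrag-server --auto-scan-at-startup --input-dir ./inputs
```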
118
 
119
  ### Multiple workers for Gunicorn + Uvicorn
120
 
121
+ The LightRAG Server can operate in the `Gunicorn + Uvicorn` preload mode. Gunicorn's multiple worker (multiprocess) capability prevents document indexing tasks from blocking RAG queries. Using CPU-exhaustive document extraction tools, such as docling, can lead to the entire system being blocked in pure Uvicorn mode.
122
 
123
+ Though LightRAG Server uses one worker to process the document indexing pipeline, with the async task support of Uvicorn, multiple files can be processed in parallel. The bottleneck of document indexing speed mainly lies with the LLM. If your LLM supports high concurrency, you can accelerate document indexing by increasing the concurrency level of the LLM. Below are several environment variables related to concurrent processing, along with their default values:
124
 
125
  ```
126
+ ### Number of worker processes, not greater than (2 x number_of_cores) + 1
127
  WORKERS=2
128
+ ### Number of parallel files to process in one batch
129
  MAX_PARALLEL_INSERT=2
130
+ ### Max concurrent requests to the LLM
131
  MAX_ASYNC=4
132
  ```
133
 
134
+ ### Install LightRAG as a Linux Service
135
 
136
+ Create your service file `lightrag.service` from the sample file: `lightrag.service.example`. Modify the `WorkingDirectory` and `ExecStart` in the service file:
137
 
138
  ```text
139
  Description=LightRAG Ollama Service
 
141
  ExecStart=<lightrag installed directory>/lightrag/api/lightrag-api
142
  ```
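
As a fuller sketch of what such a unit file might look like (the `[Unit]`/`[Service]`/`[Install]` layout, `After=`, `Restart=`, and `WantedBy=` values here are assumptions beyond the sample file, so adapt them to your setup):

```text
[Unit]
Description=LightRAG Ollama Service
After=network.target

[Service]
WorkingDirectory=<lightrag installed directory>
ExecStart=<lightrag installed directory>/lightrag/api/lightrag-api
Restart=on-failure

[Install]
WantedBy=multi-user.target
```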
143
 
144
+ Modify your service startup script: `lightrag-api`. Change your Python virtual environment activation command as needed:
145
 
146
  ```shell
147
  #!/bin/bash
 
164
 
165
  ## Ollama Emulation
166
 
167
+ We provide Ollama-compatible interfaces for LightRAG, aiming to emulate LightRAG as an Ollama chat model. This allows AI chat frontends supporting Ollama, such as Open WebUI, to access LightRAG easily.
168
 
169
  ### Connect Open WebUI to LightRAG
170
 
171
+ After starting the lightrag-server, you can add an Ollama-type connection in the Open WebUI admin panel. And then a model named `lightrag:latest` will appear in Open WebUI's model management interface. Users can then send queries to LightRAG through the chat interface. You should install LightRAG as a service for this use case.
172
 
173
+ Open WebUI uses an LLM to do the session title and session keyword generation task. So the Ollama chat completion API detects and forwards OpenWebUI session-related requests directly to the underlying LLM. Screenshot from Open WebUI:
174
 
175
  ![image-20250323194750379](./README.assets/image-20250323194750379.png)
176
 
177
  ### Choose Query mode in chat
178
 
179
+ The default query mode is `hybrid` if you send a message (query) from the Ollama interface of LightRAG. You can select query mode by sending a message with a query prefix.
180
 
181
+ A query prefix in the query string can determine which LightRAG query mode is used to generate the response for the query. The supported prefixes include:
182
 
183
  ```
184
  /local
 
196
  /mixcontext
197
  ```
198
 
199
+ For example, the chat message `/mix What's LightRAG?` will trigger a mix mode query for LightRAG. A chat message without a query prefix will trigger a hybrid mode query by default.
200
 
201
+ `/bypass` is not a LightRAG query mode; it will tell the API Server to pass the query directly to the underlying LLM, including the chat history. So the user can use the LLM to answer questions based on the chat history. If you are using Open WebUI as a front end, you can just switch the model to a normal LLM instead of using the `/bypass` prefix.
202
 
203
+ `/context` is also not a LightRAG query mode; it will tell LightRAG to return only the context information prepared for the LLM. You can check the context if it's what you want, or process the context by yourself.
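
As a sketch of how a prefix is used in practice, the same Ollama-compatible chat endpoint shown later in this document can carry a prefixed message:

```bash
# "/local" selects the local query mode; omit the prefix for the default hybrid mode
curl -N -X POST http://localhost:9621/api/chat -H "Content-Type: application/json" -d \
  '{"model":"lightrag:latest","messages":[{"role":"user","content":"/local What is LightRAG?"}],"stream":true}'
```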
204
 
205
+ ## API Key and Authentication
206
 
207
+ By default, the LightRAG Server can be accessed without any authentication. We can configure the server with an API Key or account credentials to secure it.
208
 
209
+ * API Key:
 
 
 
 
210
 
211
  ```
212
  LIGHTRAG_API_KEY=your-secure-api-key-here
213
  WHITELIST_PATHS=/health,/api/*
214
  ```
215
 
216
+ > Health check and Ollama emulation endpoints are excluded from API Key check by default.
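
A hedged sketch of calling a protected endpoint with the key follows; the `X-API-Key` header name is an assumption, so verify it against your server version:

```bash
# Assumes the server reads the API key from an X-API-Key header
curl -X POST "http://localhost:9621/query" \
     -H "Content-Type: application/json" \
     -H "X-API-Key: your-secure-api-key-here" \
     -d '{"query": "Your question here", "mode": "hybrid"}'
```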
217
 
218
+ * Account credentials (the Web UI requires login before access can be granted):
219
 
220
+ LightRAG API Server implements JWT-based authentication using the HS256 algorithm. To enable secure access control, the following environment variables are required:
221
 
222
  ```bash
223
  # For jwt auth
 
228
 
229
  > Currently, only the configuration of an administrator account and password is supported. A comprehensive account system is yet to be developed and implemented.
230
 
231
+ If Account credentials are not configured, the Web UI will access the system as a Guest. Therefore, even if only an API Key is configured, all APIs can still be accessed through the Guest account, which remains insecure. Hence, to safeguard the API, it is necessary to configure both authentication methods simultaneously.
 
 
232
 
233
  ## For Azure OpenAI Backend
234
 
235
  Azure OpenAI API can be created using the following commands in Azure CLI (you need to install Azure CLI first from [https://docs.microsoft.com/en-us/cli/azure/install-azure-cli](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli)):
236
 
237
  ```bash
238
+ # Change the resource group name, location, and OpenAI resource name as needed
239
  RESOURCE_GROUP_NAME=LightRAG
240
  LOCATION=swedencentral
241
  RESOURCE_NAME=LightRAG-OpenAI
 
253
  The output of the last command will give you the endpoint and the key for the OpenAI API. You can use these values to set the environment variables in the `.env` file.
254
 
255
  ```
256
+ # Azure OpenAI Configuration in .env:
257
  LLM_BINDING=azure_openai
258
  LLM_BINDING_HOST=your-azure-endpoint
259
  LLM_MODEL=your-model-deployment-name
 
261
  ### API version is optional, defaults to latest version
262
  AZURE_OPENAI_API_VERSION=2024-08-01-preview
263
 
264
+ ### If using Azure OpenAI for embeddings
265
  EMBEDDING_BINDING=azure_openai
266
  EMBEDDING_MODEL=your-embedding-deployment-name
267
  ```
268
 
 
 
269
  ## LightRAG Server Configuration in Detail
270
 
271
+ The API Server can be configured in three ways (highest priority first):
272
 
273
+ * Command line arguments
274
+ * Environment variables or .env file
275
+ * Config.ini (Only for storage configuration)
276
 
277
+ Most of the configurations come with default settings; check out the details in the sample file: `.env.example`. Data storage configuration can also be set by config.ini. A sample file `config.ini.example` is provided for your convenience.
278
 
279
  ### LLM and Embedding Backend Supported
280
 
281
  LightRAG supports binding to various LLM/Embedding backends:
282
 
283
+ * ollama
284
+ * lollms
285
+ * openai & openai compatible
286
+ * azure_openai
287
 
288
+ Use environment variables `LLM_BINDING` or CLI argument `--llm-binding` to select the LLM backend type. Use environment variables `EMBEDDING_BINDING` or CLI argument `--embedding-binding` to select the Embedding backend type.
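
For example, either of the following selects the bindings (values taken from the list above):

```bash
# Via environment variables (for example in .env)
LLM_BINDING=openai
EMBEDDING_BINDING=ollama

# Or via command line arguments
lightrag-server --llm-binding openai --embedding-binding ollama
```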
289
 
290
  ### Entity Extraction Configuration
291
+ * ENABLE_LLM_CACHE_FOR_EXTRACT: Enable LLM cache for entity extraction (default: true)
292
 
293
+ It's very common to set `ENABLE_LLM_CACHE_FOR_EXTRACT` to true for a test environment to reduce the cost of LLM calls.
294
 
295
  ### Storage Types Supported
296
 
297
+ LightRAG uses 4 types of storage for different purposes:
298
 
299
+ * KV_STORAGE: llm response cache, text chunks, document information
300
+ * VECTOR_STORAGE: entities vectors, relation vectors, chunks vectors
301
+ * GRAPH_STORAGE: entity relation graph
302
+ * DOC_STATUS_STORAGE: document indexing status
303
 
304
+ Each storage type has several implementations:
305
 
306
+ * KV_STORAGE supported implementations:
307
 
308
  ```
309
+ JsonKVStorage JsonFile (default)
310
  PGKVStorage Postgres
311
  RedisKVStorage Redis
312
+ MongoKVStorage MongoDB
313
  ```
314
 
315
+ * GRAPH_STORAGE supported implementations:
316
 
317
  ```
318
+ NetworkXStorage NetworkX (default)
319
  Neo4JStorage Neo4J
320
  PGGraphStorage Postgres
321
  AGEStorage AGE
322
  ```
323
 
324
+ * VECTOR_STORAGE supported implementations:
325
 
326
  ```
327
+ NanoVectorDBStorage NanoVector (default)
328
  PGVectorStorage Postgres
329
+ MilvusVectorDBStorage Milvus
330
  ChromaVectorDBStorage Chroma
331
  FaissVectorDBStorage Faiss
332
  QdrantVectorDBStorage Qdrant
333
  MongoVectorDBStorage MongoDB
334
  ```
335
 
336
+ * DOC_STATUS_STORAGE supported implementations:
337
 
338
  ```
339
+ JsonDocStatusStorage JsonFile (default)
340
  PGDocStatusStorage Postgres
341
  MongoDocStatusStorage MongoDB
342
  ```
343
 
344
+ ### How to Select Storage Implementation
345
 
346
+ You can select storage implementation by environment variables. You can set the following environment variables to a specific storage implementation name before the first start of the API Server:
347
 
348
  ```
349
  LIGHTRAG_KV_STORAGE=PGKVStorage
 
352
  LIGHTRAG_DOC_STATUS_STORAGE=PGDocStatusStorage
353
  ```
354
 
355
+ You cannot change storage implementation selection after adding documents to LightRAG. Data migration from one storage implementation to another is not supported yet. For further information, please read the sample env file or config.ini file.
356
+
357
+ ### LightRAG API Server Command Line Options
358
+
359
+ | Parameter | Default | Description |
360
+ | --------------------- | ------------- | ------------------------------------------------------------------------------------------------------------------------------- |
361
+ | --host | 0.0.0.0 | Server host |
362
+ | --port | 9621 | Server port |
363
+ | --working-dir | ./rag_storage | Working directory for RAG storage |
364
+ | --input-dir | ./inputs | Directory containing input documents |
365
+ | --max-async | 4 | Maximum number of async operations |
366
+ | --max-tokens | 32768 | Maximum token size |
367
+ | --timeout | 150 | Timeout in seconds. None for infinite timeout (not recommended) |
368
+ | --log-level | INFO | Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL) |
369
+ | --verbose | - | Verbose debug output (True, False) |
370
+ | --key | None | API key for authentication. Protects the LightRAG server against unauthorized access |
371
+ | --ssl | False | Enable HTTPS |
372
+ | --ssl-certfile | None | Path to SSL certificate file (required if --ssl is enabled) |
373
+ | --ssl-keyfile | None | Path to SSL private key file (required if --ssl is enabled) |
374
+ | --top-k | 50 | Number of top-k items to retrieve; corresponds to entities in "local" mode and relationships in "global" mode. |
375
+ | --cosine-threshold | 0.4 | The cosine threshold for nodes and relation retrieval, works with top-k to control the retrieval of nodes and relations. |
376
+ | --llm-binding | ollama | LLM binding type (lollms, ollama, openai, openai-ollama, azure_openai) |
377
+ | --embedding-binding | ollama | Embedding binding type (lollms, ollama, openai, azure_openai) |
378
+ | --auto-scan-at-startup| - | Scan input directory for new files and start indexing |
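
For instance, a typical secured HTTPS launch using only the options in the table above (the certificate and key paths are placeholders):

```bash
lightrag-server --port 9621 --working-dir ./rag_storage --input-dir ./inputs \
  --key your-secure-api-key-here --ssl --ssl-certfile ./cert.pem --ssl-keyfile ./key.pem
```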
379
 
380
  ### .env Examples
381
 
 
421
 
422
  ## API Endpoints
423
 
424
+ All servers (LoLLMs, Ollama, OpenAI and Azure OpenAI) provide the same REST API endpoints for RAG functionality. When the API Server is running, visit:
425
 
426
+ - Swagger UI: http://localhost:9621/docs
427
+ - ReDoc: http://localhost:9621/redoc
428
 
429
  You can test the API endpoints using the provided curl commands or through the Swagger UI interface. Make sure to:
430
 
431
+ 1. Start the appropriate backend service (LoLLMs, Ollama, or OpenAI)
432
+ 2. Start the RAG server
433
+ 3. Upload some documents using the document management endpoints
434
+ 4. Query the system using the query endpoints
435
+ 5. Trigger document scan if new files are put into the inputs directory
436
 
437
+ ### Query Endpoints:
438
 
439
  #### POST /query
440
  Query the RAG system with options for different search modes.
 
442
  ```bash
443
  curl -X POST "http://localhost:9621/query" \
444
  -H "Content-Type: application/json" \
445
+ -d '{"query": "Your question here", "mode": "hybrid"}'
446
  ```
447
 
448
  #### POST /query/stream
 
454
  -d '{"query": "Your question here", "mode": "hybrid"}'
455
  ```
456
 
457
+ ### Document Management Endpoints:
458
 
459
  #### POST /documents/text
460
  Insert text directly into the RAG system.
 
485
 
486
  #### POST /documents/scan
487
 
488
+ Trigger document scan for new files in the input directory.
489
 
490
  ```bash
491
  curl -X POST "http://localhost:9621/documents/scan" --max-time 1800
492
  ```
493
 
494
+ > Adjust max-time according to the estimated indexing time for all new files.
495
 
496
  #### DELETE /documents
497
 
 
501
  curl -X DELETE "http://localhost:9621/documents"
502
  ```
503
 
504
+ ### Ollama Emulation Endpoints:
505
 
506
  #### GET /api/version
507
 
 
513
 
514
  #### GET /api/tags
515
 
516
+ Get available Ollama models.
517
 
518
  ```bash
519
  curl http://localhost:9621/api/tags
 
521
 
522
  #### POST /api/chat
523
 
524
+ Handle chat completion requests. Routes user queries through LightRAG by selecting query mode based on query prefix. Detects and forwards OpenWebUI session-related requests (for metadata generation task) directly to the underlying LLM.
525
 
526
  ```shell
527
  curl -N -X POST http://localhost:9621/api/chat -H "Content-Type: application/json" -d \
528
  '{"model":"lightrag:latest","messages":[{"role":"user","content":"猪八戒是谁"}],"stream":true}'
529
  ```
530
 
531
+ > For more information about Ollama API, please visit: [Ollama API documentation](https://github.com/ollama/ollama/blob/main/docs/api.md)
532
 
533
  #### POST /api/generate
534
 
535
+ Handle generate completion requests. For compatibility purposes, the request is not processed by LightRAG, and will be handled by the underlying LLM model.
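
A sketch of such a request, assuming the standard Ollama generate payload described in the Ollama API documentation linked above:

```bash
curl -X POST http://localhost:9621/api/generate -H "Content-Type: application/json" -d \
  '{"model":"lightrag:latest","prompt":"Hello, what can you do?","stream":false}'
```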
536
 
537
+ ### Utility Endpoints:
538
 
539
  #### GET /health
540
  Check server health and configuration.
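
For example:

```bash
curl http://localhost:9621/health
```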