Commit 3e4b84a by yangdx
Parent(s): 492b91c

Update sample env file and documentation

- Change COSINE_THRESHOLD to 0.4
- Adjust TOP_K to 50
- Enhance API README details

Files changed:
- .env.example (+2 -2)
- README.md (+6 -3)
- lightrag/api/README.md (+16 -10)
.env.example
CHANGED
@@ -14,8 +14,8 @@ MAX_EMBED_TOKENS=8192
 #HISTORY_TURNS=3
 #CHUNK_SIZE=1200
 #CHUNK_OVERLAP_SIZE=100
-#COSINE_THRESHOLD=0.2
-#TOP_K=50
+#COSINE_THRESHOLD=0.4  # 0.2 when not running the API server
+#TOP_K=50  # 60 when not running the API server
 
 # LLM Configuration (Use valid host. For local services, you can use host.docker.internal)
 # Ollama example
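As a quick illustration of how these two settings are consumed, the sketch below (not part of the commit) loads a `.env` copied from the example above and resolves the retrieval knobs with the library-side fallbacks mentioned in the comments; `python-dotenv` is an assumed helper here, not a LightRAG dependency.

```python
import os

from dotenv import load_dotenv  # assumed helper: pip install python-dotenv

# Read the .env file shown above from the current working directory.
load_dotenv()

# Fallbacks mirror the comments in the diff: the bare library uses 0.2 / 60,
# while the API server ships 0.4 / 50 to cut down on irrelevant retrievals.
cosine_threshold = float(os.getenv("COSINE_THRESHOLD", "0.2"))
top_k = int(os.getenv("TOP_K", "60"))

print(f"cosine_better_than_threshold={cosine_threshold}, top_k={top_k}")
```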
README.md
CHANGED
@@ -360,6 +360,8 @@ class QueryParam:
     max_token_for_local_context: int = 4000
 ```
 
+> The default value of top_k can be changed via the environment variable TOP_K.
+
 ### Batch Insert
 
 ```python
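The note added above means the `top_k` default is resolved from the environment when a `QueryParam` is created. The fragment below is a hypothetical, trimmed-down illustration of that behaviour, not LightRAG's actual class; only the field name `top_k` and the env var `TOP_K` come from the README.

```python
import os
from dataclasses import dataclass, field


@dataclass
class QueryParamSketch:
    mode: str = "global"
    # Default follows the TOP_K environment variable; 60 is the
    # library-side fallback mentioned in .env.example.
    top_k: int = field(default_factory=lambda: int(os.getenv("TOP_K", "60")))


print(QueryParamSketch().top_k)   # 60 unless TOP_K is set
os.environ["TOP_K"] = "50"
print(QueryParamSketch().top_k)   # 50, picked up at construction time
```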
@@ -730,10 +732,10 @@ if __name__ == "__main__":
 | **embedding\_func\_max\_async** | `int` | Maximum number of concurrent asynchronous embedding processes | `16` |
 | **llm\_model\_func** | `callable` | Function for LLM generation | `gpt_4o_mini_complete` |
 | **llm\_model\_name** | `str` | LLM model name for generation | `meta-llama/Llama-3.2-1B-Instruct` |
-| **llm\_model\_max\_token\_size** | `int` | Maximum token size for LLM generation (affects entity relation summaries) | `32768` |
-| **llm\_model\_max\_async** | `int` | Maximum number of concurrent asynchronous LLM processes | `16` |
+| **llm\_model\_max\_token\_size** | `int` | Maximum token size for LLM generation (affects entity relation summaries) | `32768` (default can be changed via env var MAX_TOKENS) |
+| **llm\_model\_max\_async** | `int` | Maximum number of concurrent asynchronous LLM processes | `16` (default can be changed via env var MAX_ASYNC) |
 | **llm\_model\_kwargs** | `dict` | Additional parameters for LLM generation | |
-| **vector\_db\_storage\_cls\_kwargs** | `dict` | Additional parameters for vector database | |
+| **vector\_db\_storage\_cls\_kwargs** | `dict` | Additional parameters for vector database, such as setting the threshold for node and relation retrieval | cosine_better_than_threshold: 0.2 (default can be changed via env var COSINE_THRESHOLD) |
 | **enable\_llm\_cache** | `bool` | If `TRUE`, stores LLM results in cache; repeated prompts return cached responses | `TRUE` |
 | **enable\_llm\_cache\_for\_entity\_extract** | `bool` | If `TRUE`, stores LLM results in cache for entity extraction; Good for beginners to debug your application | `TRUE` |
 | **addon\_params** | `dict` | Additional parameters, e.g., `{"example_number": 1, "language": "Simplified Chinese", "entity_types": ["organization", "person", "geo", "event"], "insert_batch_size": 10}`: sets example limit, output language, and batch size for document processing | `example_number: all examples, language: English, insert_batch_size: 10` |
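For readers wiring these table entries together, here is a minimal construction sketch. The parameter names (`llm_model_max_token_size`, `llm_model_max_async`, `vector_db_storage_cls_kwargs`) come from the table above; the env-var fallbacks and the import path of the default LLM function are assumptions based on this commit's documentation and may differ in your LightRAG version.

```python
import os

from lightrag import LightRAG
from lightrag.llm import gpt_4o_mini_complete  # default LLM function named in the table (import path assumed)

rag = LightRAG(
    working_dir="./rag_storage",
    llm_model_func=gpt_4o_mini_complete,
    # Mirrors MAX_TOKENS / MAX_ASYNC from the API server's .env.
    llm_model_max_token_size=int(os.getenv("MAX_TOKENS", "32768")),
    llm_model_max_async=int(os.getenv("MAX_ASYNC", "16")),
    # Only nodes and relations scoring above this cosine threshold are retrieved.
    vector_db_storage_cls_kwargs={
        "cosine_better_than_threshold": float(os.getenv("COSINE_THRESHOLD", "0.2"))
    },
)
```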
@@ -741,6 +743,7 @@ if __name__ == "__main__":
 | **embedding\_cache\_config** | `dict` | Configuration for question-answer caching. Contains three parameters:<br>- `enabled`: Boolean value to enable/disable cache lookup functionality. When enabled, the system will check cached responses before generating new answers.<br>- `similarity_threshold`: Float value (0-1), similarity threshold. When a new question's similarity with a cached question exceeds this threshold, the cached answer will be returned directly without calling the LLM.<br>- `use_llm_check`: Boolean value to enable/disable LLM similarity verification. When enabled, LLM will be used as a secondary check to verify the similarity between questions before returning cached answers. | Default: `{"enabled": False, "similarity_threshold": 0.95, "use_llm_check": False}` |
 
 ### Error Handling
+
 <details>
 <summary>Click to view error handling details</summary>
 
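The `embedding_cache_config` entry in the table is just a plain dict; a brief sketch of enabling it with the documented defaults follows (the surrounding constructor arguments are illustrative, complementing the sketch above).

```python
from lightrag import LightRAG

rag = LightRAG(
    working_dir="./rag_storage",
    # Question-answer caching as described in the table: return a cached answer
    # when a new question is at least 95% similar, without a secondary LLM check.
    embedding_cache_config={
        "enabled": True,
        "similarity_threshold": 0.95,
        "use_llm_check": False,
    },
)
```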
lightrag/api/README.md
CHANGED
@@ -98,6 +98,8 @@ After starting the lightrag-server, you can add an Ollama-type connection in the
 
 LightRAG can be configured using either command-line arguments or environment variables. When both are provided, command-line arguments take precedence over environment variables.
 
+For better performance, the API server's default values for TOP_K and COSINE_THRESHOLD are set to 50 and 0.4 respectively. If COSINE_THRESHOLD were left at LightRAG's default of 0.2, many irrelevant entities and relations would be retrieved and sent to the LLM.
+
 ### Environment Variables
 
 You can configure LightRAG using environment variables by creating a `.env` file in your project root directory. Here's a complete example of available environment variables:
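To make the precedence rule concrete, here is an illustrative snippet (not the server's actual source) showing how a command-line flag can override the corresponding environment variable while falling back to the API server defaults named above; only the flag and variable names are taken from this README.

```python
import argparse
import os

parser = argparse.ArgumentParser()
# CLI flag wins; otherwise the env var applies; otherwise the server default.
parser.add_argument("--top-k", type=int,
                    default=int(os.getenv("TOP_K", "50")))
parser.add_argument("--cosine-threshold", type=float,
                    default=float(os.getenv("COSINE_THRESHOLD", "0.4")))

args = parser.parse_args()  # e.g. `--top-k 100` overrides TOP_K from .env
print(args.top_k, args.cosine_threshold)
```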
@@ -111,6 +113,17 @@ PORT=9621
 WORKING_DIR=/app/data/rag_storage
 INPUT_DIR=/app/data/inputs
 
+# RAG Configuration
+MAX_ASYNC=4
+MAX_TOKENS=32768
+EMBEDDING_DIM=1024
+MAX_EMBED_TOKENS=8192
+#HISTORY_TURNS=3
+#CHUNK_SIZE=1200
+#CHUNK_OVERLAP_SIZE=100
+#COSINE_THRESHOLD=0.4
+#TOP_K=50
+
 # LLM Configuration
 LLM_BINDING=ollama
 LLM_BINDING_HOST=http://localhost:11434
@@ -124,14 +137,8 @@ EMBEDDING_BINDING=ollama
 EMBEDDING_BINDING_HOST=http://localhost:11434
 EMBEDDING_MODEL=bge-m3:latest
 
-# RAG Configuration
-MAX_ASYNC=4
-MAX_TOKENS=32768
-EMBEDDING_DIM=1024
-MAX_EMBED_TOKENS=8192
-
 # Security
-LIGHTRAG_API_KEY=
+#LIGHTRAG_API_KEY=your-api-key-for-accessing-LightRAG
 
 # Logging
 LOG_LEVEL=INFO
@@ -186,10 +193,9 @@ PORT=7000 python lightrag.py
 | --ssl | False | Enable HTTPS |
 | --ssl-certfile | None | Path to SSL certificate file (required if --ssl is enabled) |
 | --ssl-keyfile | None | Path to SSL private key file (required if --ssl is enabled) |
+| --top-k | 50 | Number of top-k items to retrieve; corresponds to entities in "local" mode and relationships in "global" mode. |
+| --cosine-threshold | 0.4 | Cosine threshold for node and relation retrieval; works together with top-k to control how many nodes and relations are retrieved. |
 
-
-
-For protecting the server using an authentication key, you can also use an environment variable named `LIGHTRAG_API_KEY`.
 ### Example Usage
 
 #### Running a Lightrag server with ollama default local server as llm and embedding backends