yangdx committed
Commit 3e4b84a · Parent: 492b91c

Update sample env file and documentation


- Change COSINE_THRESHOLD to 0.4
- Adjust TOP_K to 50
- Enhance API README details

Files changed (3)
  1. .env.example +2 -2
  2. README.md +6 -3
  3. lightrag/api/README.md +16 -10
.env.example CHANGED
@@ -14,8 +14,8 @@ MAX_EMBED_TOKENS=8192
 #HISTORY_TURNS=3
 #CHUNK_SIZE=1200
 #CHUNK_OVERLAP_SIZE=100
-#COSINE_THRESHOLD=0.2
-#TOP_K=50
+#COSINE_THRESHOLD=0.4 # 0.2 while not running API server
+#TOP_K=50 # 60 while not running API server
 
 # LLM Configuration (Use valid host. For local services, you can use host.docker.internal)
 # Ollama example
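
The comments added above record that 0.4 / 50 are the API server's tuned values, while 0.2 / 60 remain the library defaults when the variables are unset. A minimal sketch of that fallback, assuming the settings are read with `os.getenv` (illustrative only, not the project's actual loading code):

```python
import os

# Illustrative only: read the two tuning knobs, falling back to the
# library defaults (0.2 / 60) when the variables are not set, as the
# comments in .env.example describe.
cosine_threshold = float(os.getenv("COSINE_THRESHOLD", "0.2"))
top_k = int(os.getenv("TOP_K", "60"))

print(f"COSINE_THRESHOLD={cosine_threshold}, TOP_K={top_k}")
```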
README.md CHANGED
@@ -360,6 +360,8 @@ class QueryParam:
     max_token_for_local_context: int = 4000
 ```
 
+> The default value of top_k can be changed via the environment variable TOP_K.
+
 ### Batch Insert
 
 ```python
@@ -730,10 +732,10 @@ if __name__ == "__main__":
 | **embedding\_func\_max\_async** | `int` | Maximum number of concurrent asynchronous embedding processes | `16` |
 | **llm\_model\_func** | `callable` | Function for LLM generation | `gpt_4o_mini_complete` |
 | **llm\_model\_name** | `str` | LLM model name for generation | `meta-llama/Llama-3.2-1B-Instruct` |
-| **llm\_model\_max\_token\_size** | `int` | Maximum token size for LLM generation (affects entity relation summaries) | `32768` |
-| **llm\_model\_max\_async** | `int` | Maximum number of concurrent asynchronous LLM processes | `16` |
+| **llm\_model\_max\_token\_size** | `int` | Maximum token size for LLM generation (affects entity relation summaries) | `32768` (default can be overridden by env var MAX_TOKENS) |
+| **llm\_model\_max\_async** | `int` | Maximum number of concurrent asynchronous LLM processes | `16` (default can be overridden by env var MAX_ASYNC) |
 | **llm\_model\_kwargs** | `dict` | Additional parameters for LLM generation | |
-| **vector\_db\_storage\_cls\_kwargs** | `dict` | Additional parameters for vector database (currently not used) | |
+| **vector\_db\_storage\_cls\_kwargs** | `dict` | Additional parameters for vector database, such as setting the threshold for node and relation retrieval | `cosine_better_than_threshold: 0.2` (default can be overridden by env var COSINE_THRESHOLD) |
 | **enable\_llm\_cache** | `bool` | If `TRUE`, stores LLM results in cache; repeated prompts return cached responses | `TRUE` |
 | **enable\_llm\_cache\_for\_entity\_extract** | `bool` | If `TRUE`, stores LLM results in cache for entity extraction; Good for beginners to debug your application | `TRUE` |
 | **addon\_params** | `dict` | Additional parameters, e.g., `{"example_number": 1, "language": "Simplified Chinese", "entity_types": ["organization", "person", "geo", "event"], "insert_batch_size": 10}`: sets example limit, output language, and batch size for document processing | `example_number: all examples, language: English, insert_batch_size: 10` |
@@ -741,6 +743,7 @@ if __name__ == "__main__":
 | **embedding\_cache\_config** | `dict` | Configuration for question-answer caching. Contains three parameters:<br>- `enabled`: Boolean value to enable/disable cache lookup functionality. When enabled, the system will check cached responses before generating new answers.<br>- `similarity_threshold`: Float value (0-1), similarity threshold. When a new question's similarity with a cached question exceeds this threshold, the cached answer will be returned directly without calling the LLM.<br>- `use_llm_check`: Boolean value to enable/disable LLM similarity verification. When enabled, LLM will be used as a secondary check to verify the similarity between questions before returning cached answers. | Default: `{"enabled": False, "similarity_threshold": 0.95, "use_llm_check": False}` |
 
 ### Error Handling
+
 <details>
 <summary>Click to view error handling details</summary>
 
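To illustrate the updated table rows, here is a hedged sketch of passing the same parameters to the `LightRAG` constructor and raising `top_k` per query. It assumes the `from lightrag import LightRAG, QueryParam` import used in the README's other examples; the working directory and question are placeholders.

```python
from lightrag import LightRAG, QueryParam

# Sketch only: parameter names mirror the table above; values repeat the
# documented defaults, each of which can also be overridden by the env vars
# MAX_TOKENS, MAX_ASYNC, and COSINE_THRESHOLD.
rag = LightRAG(
    working_dir="./rag_storage",              # placeholder path
    llm_model_max_token_size=32768,           # env var: MAX_TOKENS
    llm_model_max_async=16,                   # env var: MAX_ASYNC
    vector_db_storage_cls_kwargs={
        "cosine_better_than_threshold": 0.2,  # env var: COSINE_THRESHOLD
    },
)

# top_k can likewise be raised per query (or globally via TOP_K).
result = rag.query(
    "What are the main themes?",
    param=QueryParam(mode="hybrid", top_k=50),
)
```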
lightrag/api/README.md CHANGED
@@ -98,6 +98,8 @@ After starting the lightrag-server, you can add an Ollama-type connection in the
 
 LightRAG can be configured using either command-line arguments or environment variables. When both are provided, command-line arguments take precedence over environment variables.
 
+For better performance, the API server's default values for TOP_K and COSINE_THRESHOLD are set to 50 and 0.4 respectively. If COSINE_THRESHOLD were left at LightRAG's default of 0.2, many irrelevant entities and relations would be retrieved and sent to the LLM.
+
 ### Environment Variables
 
 You can configure LightRAG using environment variables by creating a `.env` file in your project root directory. Here's a complete example of available environment variables:
@@ -111,6 +113,17 @@ PORT=9621
 WORKING_DIR=/app/data/rag_storage
 INPUT_DIR=/app/data/inputs
 
+# RAG Configuration
+MAX_ASYNC=4
+MAX_TOKENS=32768
+EMBEDDING_DIM=1024
+MAX_EMBED_TOKENS=8192
+#HISTORY_TURNS=3
+#CHUNK_SIZE=1200
+#CHUNK_OVERLAP_SIZE=100
+#COSINE_THRESHOLD=0.4
+#TOP_K=50
+
 # LLM Configuration
 LLM_BINDING=ollama
 LLM_BINDING_HOST=http://localhost:11434
@@ -124,14 +137,8 @@ EMBEDDING_BINDING=ollama
 EMBEDDING_BINDING_HOST=http://localhost:11434
 EMBEDDING_MODEL=bge-m3:latest
 
-# RAG Configuration
-MAX_ASYNC=4
-MAX_TOKENS=32768
-EMBEDDING_DIM=1024
-MAX_EMBED_TOKENS=8192
-
 # Security
-LIGHTRAG_API_KEY=
+#LIGHTRAG_API_KEY=your-api-key-for-accessing-LightRAG
 
 # Logging
 LOG_LEVEL=INFO
@@ -186,10 +193,9 @@ PORT=7000 python lightrag.py
 | --ssl | False | Enable HTTPS |
 | --ssl-certfile | None | Path to SSL certificate file (required if --ssl is enabled) |
 | --ssl-keyfile | None | Path to SSL private key file (required if --ssl is enabled) |
+| --top-k | 50 | Number of top-k items to retrieve; corresponds to entities in "local" mode and to relationships in "global" mode. |
+| --cosine-threshold | 0.4 | The cosine threshold for node and relation retrieval; works together with top-k to control how many nodes and relations are retrieved. |
 
-
-
-For protecting the server using an authentication key, you can also use an environment variable named `LIGHTRAG_API_KEY`.
 ### Example Usage
 
 #### Running a Lightrag server with ollama default local server as llm and embedding backends
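
As a rough illustration of the precedence rule stated above (command-line arguments override environment variables) applied to the two new flags, here is a standalone sketch; it is not the actual lightrag-server argument parser.

```python
import argparse
import os

# Illustrative only: env vars supply the defaults, explicit flags win,
# mirroring the precedence described in this README.
parser = argparse.ArgumentParser()
parser.add_argument("--top-k", type=int,
                    default=int(os.getenv("TOP_K", "50")))
parser.add_argument("--cosine-threshold", type=float,
                    default=float(os.getenv("COSINE_THRESHOLD", "0.4")))
args = parser.parse_args()

print(f"top_k={args.top_k}, cosine_threshold={args.cosine_threshold}")
```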