yangdx committed on
Commit
c8ee1ad
·
1 Parent(s): 8e3f210

Update README

Files changed (2)
  1. README-zh.md +114 -122
  2. README.md +109 -115
README-zh.md CHANGED
@@ -93,9 +93,11 @@ python examples/lightrag_openai_demo.py
93
 
94
  **注意事项**:运行demo程序时需要注意,不同的测试程序可能使用不同的embedding模型,更换embedding模型时需要清空数据目录(`./dickens`),否则程序执行会出错。如果你想保留LLM缓存,可以在清除数据目录时保留`kv_store_llm_response_cache.json`文件。
95
 
96
- ## 查询
97
 
98
- 使用以下Python代码片段(在脚本中)初始化LightRAG并执行查询:
 
 
99
 
100
  ```python
101
  import os
@@ -107,6 +109,7 @@ from lightrag.utils import setup_logger
107
 
108
  setup_logger("lightrag", level="INFO")
109
 
 
110
  if not os.path.exists(WORKING_DIR):
111
  os.mkdir(WORKING_DIR)
112
 
@@ -120,23 +123,24 @@ async def initialize_rag():
120
  await initialize_pipeline_status()
121
  return rag
122
 
123
- def main():
124
  try:
125
- # Initialize RAG instance
126
  rag = await initialize_rag()
127
- rag.insert("Your text")
 
128
 
129
- # Perform hybrid search
130
- mode="hybrid"
131
  print(
132
- await rag.query(
133
- "What are the top themes in this story?",
134
- param=QueryParam(mode=mode)
135
- )
136
  )
137
 
138
  except Exception as e:
139
- print(f"An error occurred: {e}")
140
  finally:
141
  if rag:
142
  await rag.finalize_storages()
@@ -145,8 +149,54 @@ if __name__ == "__main__":
145
  asyncio.run(main())
146
  ```
147
148
  ### 查询参数
149
 
 
 
150
  ```python
151
  class QueryParam:
152
  mode: Literal["local", "global", "hybrid", "naive", "mix"] = "global"
@@ -421,54 +471,6 @@ if __name__ == "__main__":
421
 
422
  </details>
423
 
424
- ### Token统计功能
425
- <details>
426
- <summary> <b>概述和使用</b> </summary>
427
-
428
- LightRAG提供了TokenTracker工具来跟踪和管理大模型的token消耗。这个功能对于控制API成本和优化性能特别有用。
429
-
430
- #### 使用方法
431
-
432
- ```python
433
- from lightrag.utils import TokenTracker
434
-
435
- # 创建TokenTracker实例
436
- token_tracker = TokenTracker()
437
-
438
- # 方法1:使用上下文管理器(推荐)
439
- # 适用于需要自动跟踪token使用的场景
440
- with token_tracker:
441
- result1 = await llm_model_func("你的问题1")
442
- result2 = await llm_model_func("你的问题2")
443
-
444
- # 方法2:手动添加token使用记录
445
- # 适用于需要更精细控制token统计的场景
446
- token_tracker.reset()
447
-
448
- rag.insert()
449
-
450
- rag.query("你的问题1", param=QueryParam(mode="naive"))
451
- rag.query("你的问题2", param=QueryParam(mode="mix"))
452
-
453
- # 显示总token使用量(包含插入和查询操作)
454
- print("Token usage:", token_tracker.get_usage())
455
- ```
456
-
457
- #### 使用建议
458
- - 在长会话或批量操作中使用上下文管理器,可以自动跟踪所有token消耗
459
- - 对于需要分段统计的场景,使用手动模式并适时调用reset()
460
- - 定期检查token使用情况,有助于及时发现异常消耗
461
- - 在开发测试阶段积极使用此功能,以便优化生产环境的成本
462
-
463
- #### 实际应用示例
464
- 您可以参考以下示例来实现token统计:
465
- - `examples/lightrag_gemini_track_token_demo.py`:使用Google Gemini模型的token统计示例
466
- - `examples/lightrag_siliconcloud_track_token_demo.py`:使用SiliconCloud模型的token统计示例
467
-
468
- 这些示例展示了如何在不同模型和场景下有效地使用TokenTracker功能。
469
-
470
- </details>
471
-
472
  ### 对话历史
473
 
474
  LightRAG现在通过对话历史功能支持多轮对话。以下是使用方法:
@@ -619,7 +621,7 @@ custom_kg = {
619
  rag.insert_custom_kg(custom_kg)
620
  ```
621
 
622
- ## 插入
623
 
624
  <details>
625
  <summary> <b> 基本插入 </b></summary>
@@ -718,7 +720,9 @@ rag.insert(documents, file_paths=file_paths)
718
 
719
  </details>
720
 
721
- ## 存储
 
 
722
 
723
  <details>
724
  <summary> <b>使用Neo4J进行存储</b> </summary>
@@ -846,16 +850,6 @@ rag = LightRAG(
846
 
847
  </details>
848
 
849
- ## 删除
850
-
851
- ```python
852
- # 删除实体:通过实体名称删除实体
853
- rag.delete_by_entity("Project Gutenberg")
854
-
855
- # 删除文档:通过文档ID删除与文档相关的实体和关系
856
- rag.delete_by_doc_id("doc_id")
857
- ```
858
-
859
  ## 编辑实体和关系
860
 
861
  LightRAG现在支持全面的知识图谱管理功能,允许您在知识图谱中创建、编辑和删除实体和关系。
@@ -926,6 +920,54 @@ updated_relation = rag.edit_relation("Google", "Google Mail", {
926
 
927
  这些操作在图数据库和向量数据库组件之间保持数据一致性,确保您的知识图谱保持连贯。
928
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
929
  ## 数据导出功能
930
 
931
  ### 概述
@@ -1082,56 +1124,6 @@ rag.clear_cache(modes=["local"])
1082
 
1083
  </details>
1084
 
1085
- ## LightRAG初始化参数
1086
-
1087
- <details>
1088
- <summary> 参数 </summary>
1089
-
1090
- | **参数** | **类型** | **说明** | **默认值** |
1091
- |--------------|----------|-----------------|-------------|
1092
- | **working_dir** | `str` | 存储缓存的目录 | `lightrag_cache+timestamp` |
1093
- | **kv_storage** | `str` | Storage type for documents and text chunks. Supported types: `JsonKVStorage`,`PGKVStorage`,`RedisKVStorage`,`MongoKVStorage` | `JsonKVStorage` |
1094
- | **vector_storage** | `str` | Storage type for embedding vectors. Supported types: `NanoVectorDBStorage`,`PGVectorStorage`,`MilvusVectorDBStorage`,`ChromaVectorDBStorage`,`FaissVectorDBStorage`,`MongoVectorDBStorage`,`QdrantVectorDBStorage` | `NanoVectorDBStorage` |
1095
- | **graph_storage** | `str` | Storage type for graph edges and nodes. Supported types: `NetworkXStorage`,`Neo4JStorage`,`PGGraphStorage`,`AGEStorage` | `NetworkXStorage` |
1096
- | **doc_status_storage** | `str` | Storage type for documents process status. Supported types: `JsonDocStatusStorage`,`PGDocStatusStorage`,`MongoDocStatusStorage` | `JsonDocStatusStorage` |
1097
- | **chunk_token_size** | `int` | 拆分文档时每个块的最大令牌大小 | `1200` |
1098
- | **chunk_overlap_token_size** | `int` | 拆分文档时两个块之间的重叠令牌大小 | `100` |
1099
- | **tokenizer** | `Tokenizer` | 用于将文本转换为 tokens(数字)以及使用遵循 TokenizerInterface 协议的 .encode() 和 .decode() 函数将 tokens 转换回文本的函数。 如果您不指定,它将使用默认的 Tiktoken tokenizer。 | `TiktokenTokenizer` |
1100
- | **tiktoken_model_name** | `str` | 如果您使用的是默认的 Tiktoken tokenizer,那么这是要使用的特定 Tiktoken 模型的名称。如果您提供自己的 tokenizer,则忽略此设置。 | `gpt-4o-mini` |
1101
- | **entity_extract_max_gleaning** | `int` | 实体提取过程中的循环次数,附加历史消息 | `1` |
1102
- | **entity_summary_to_max_tokens** | `int` | 每个实体摘要的最大令牌大小 | `500` |
1103
- | **node_embedding_algorithm** | `str` | 节点嵌入算法(当前未使用) | `node2vec` |
1104
- | **node2vec_params** | `dict` | 节点嵌入的参数 | `{"dimensions": 1536,"num_walks": 10,"walk_length": 40,"window_size": 2,"iterations": 3,"random_seed": 3,}` |
1105
- | **embedding_func** | `EmbeddingFunc` | 从文本生成嵌入向量的函数 | `openai_embed` |
1106
- | **embedding_batch_num** | `int` | 嵌入过程的最大批量大小(每批发送多个文本) | `32` |
1107
- | **embedding_func_max_async** | `int` | 最大并发异步嵌入进程数 | `16` |
1108
- | **llm_model_func** | `callable` | LLM生成的函数 | `gpt_4o_mini_complete` |
1109
- | **llm_model_name** | `str` | 用于生成的LLM模型名称 | `meta-llama/Llama-3.2-1B-Instruct` |
1110
- | **llm_model_max_token_size** | `int` | LLM生成的最大令牌大小(影响实体关系摘要) | `32768`(默认值由环境变量MAX_TOKENS更改) |
1111
- | **llm_model_max_async** | `int` | 最大并发异步LLM进程数 | `4`(默认值由环境变量MAX_ASYNC更改) |
1112
- | **llm_model_kwargs** | `dict` | LLM生成的附加参数 | |
1113
- | **vector_db_storage_cls_kwargs** | `dict` | 向量数据库的附加参数,如设置节点和关系检索的阈值 | cosine_better_than_threshold: 0.2(默认值由环境变量COSINE_THRESHOLD更改) |
1114
- | **enable_llm_cache** | `bool` | 如果为`TRUE`,将LLM结果存储在缓存中;重复的提示返回缓存的响应 | `TRUE` |
1115
- | **enable_llm_cache_for_entity_extract** | `bool` | 如果为`TRUE`,将实体提取的LLM结果存储在缓存中;适合初学者调试应用程序 | `TRUE` |
1116
- | **addon_params** | `dict` | 附加参数,例如`{"example_number": 1, "language": "Simplified Chinese", "entity_types": ["organization", "person", "geo", "event"]}`:设置示例限制、输出语言和文档处理的批量大小 | `example_number: 所有示例, language: English` |
1117
- | **convert_response_to_json_func** | `callable` | 未使用 | `convert_response_to_json` |
1118
- | **embedding_cache_config** | `dict` | 问答缓存的配置。包含三个参数:`enabled`:布尔值,启用/禁用缓存查找功能。启用时,系统将在生成新答案之前检查缓存的响应。`similarity_threshold`:浮点值(0-1),相似度阈值。当新问题与缓存问题的相似度超过此阈值时,将直接返回缓存的答案而不调用LLM。`use_llm_check`:布尔值,启用/禁用LLM相似度验证。启用时,在返回缓存答案之前,将使用LLM作为二次检查来验证问题之间的相似度。 | 默认:`{"enabled": False, "similarity_threshold": 0.95, "use_llm_check": False}` |
1119
-
1120
- </details>
1121
-
1122
- ## 错误处理
1123
-
1124
- <details>
1125
- <summary>点击查看错误处理详情</summary>
1126
-
1127
- API包括全面的错误处理:
1128
-
1129
- - 文件未找到错误(404)
1130
- - 处理错误(500)
1131
- - 支持多种文件编码(UTF-8和GBK)
1132
-
1133
- </details>
1134
-
1135
  ## LightRAG API
1136
 
1137
  LightRAG服务器旨在提供Web UI和API支持。**有关LightRAG服务器的更多信息,请参阅[LightRAG服务器](./lightrag/api/README.md)。**
 
93
 
94
  **注意事项**:运行demo程序时需要注意,不同的测试程序可能使用不同的embedding模型,更换embedding模型时需要清空数据目录(`./dickens`),否则程序执行会出错。如果你想保留LLM缓存,可以在清除数据目录时保留`kv_store_llm_response_cache.json`文件。
95
 
96
+ ## 使用LightRAG Core进行编程
97
 
98
+ ### 一个简单程序
99
+
100
+ 以下Python代码片段演示了如何初始化LightRAG、插入文本并进行查询:
101
 
102
  ```python
103
  import os
 
109
 
110
  setup_logger("lightrag", level="INFO")
111
 
112
+ WORKING_DIR = "./rag_storage"
113
  if not os.path.exists(WORKING_DIR):
114
  os.mkdir(WORKING_DIR)
115
 
 
123
  await initialize_pipeline_status()
124
  return rag
125
 
126
+ async def main():
127
  try:
128
+ # 初始化RAG实例
129
  rag = await initialize_rag()
130
+ # 插入文本
131
+ await rag.insert("Your text")
132
 
133
+ # 执行混合检索
134
+ mode = "hybrid"
135
  print(
136
+ await rag.query(
137
+ "这个故事的主要主题是什么?",
138
+ param=QueryParam(mode=mode)
139
+ )
140
  )
141
 
142
  except Exception as e:
143
+ print(f"发生错误: {e}")
144
  finally:
145
  if rag:
146
  await rag.finalize_storages()
 
149
  asyncio.run(main())
150
  ```
151
 
152
+ 重要说明:
153
+ - 运行脚本前请先导出你的OPENAI_API_KEY环境变量。
154
+ - 该程序使用LightRAG的默认存储设置,所有数据将持久化在WORKING_DIR/rag_storage目录下。
155
+ - 该示例仅展示了初始化LightRAG对象的最简单方式:注入embedding和LLM函数,并在创建LightRAG对象后初始化存储和管道状态。
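作为参考,上述代码片段的一个较完整的示意版本如下。该示例假设使用LightRAG自带的OpenAI默认函数 `gpt_4o_mini_complete` 和 `openai_embed`,并使用异步方法 `ainsert` 和 `aquery`;仅供示意,具体导入路径和方法名请以你所用的LightRAG版本为准:

```python
import os
import asyncio
from lightrag import LightRAG, QueryParam
from lightrag.llm.openai import gpt_4o_mini_complete, openai_embed
from lightrag.kg.shared_storage import initialize_pipeline_status
from lightrag.utils import setup_logger

setup_logger("lightrag", level="INFO")

WORKING_DIR = "./rag_storage"
if not os.path.exists(WORKING_DIR):
    os.mkdir(WORKING_DIR)

async def initialize_rag():
    # 注入embedding和LLM函数,然后初始化存储和管道状态
    rag = LightRAG(
        working_dir=WORKING_DIR,
        embedding_func=openai_embed,
        llm_model_func=gpt_4o_mini_complete,
    )
    await rag.initialize_storages()
    await initialize_pipeline_status()
    return rag

async def main():
    rag = None  # 保证初始化失败时finally块也能安全执行
    try:
        rag = await initialize_rag()
        # 插入文本
        await rag.ainsert("Your text")

        # 执行混合检索
        print(
            await rag.aquery(
                "这个故事的主要主题是什么?",
                param=QueryParam(mode="hybrid"),
            )
        )
    except Exception as e:
        print(f"发生错误: {e}")
    finally:
        if rag:
            await rag.finalize_storages()

if __name__ == "__main__":
    asyncio.run(main())
```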
156
+
157
+ ### LightRAG初始化参数
158
+
159
+ 以下是完整的LightRAG对象初始化参数清单:
160
+
161
+ <details>
162
+ <summary> 参数 </summary>
163
+
164
+ | **参数** | **类型** | **说明** | **默认值** |
165
+ |--------------|----------|-----------------|-------------|
166
+ | **working_dir** | `str` | 存储缓存的目录 | `lightrag_cache+timestamp` |
167
+ | **kv_storage** | `str` | Storage type for documents and text chunks. Supported types: `JsonKVStorage`,`PGKVStorage`,`RedisKVStorage`,`MongoKVStorage` | `JsonKVStorage` |
168
+ | **vector_storage** | `str` | Storage type for embedding vectors. Supported types: `NanoVectorDBStorage`,`PGVectorStorage`,`MilvusVectorDBStorage`,`ChromaVectorDBStorage`,`FaissVectorDBStorage`,`MongoVectorDBStorage`,`QdrantVectorDBStorage` | `NanoVectorDBStorage` |
169
+ | **graph_storage** | `str` | Storage type for graph edges and nodes. Supported types: `NetworkXStorage`,`Neo4JStorage`,`PGGraphStorage`,`AGEStorage` | `NetworkXStorage` |
170
+ | **doc_status_storage** | `str` | Storage type for documents process status. Supported types: `JsonDocStatusStorage`,`PGDocStatusStorage`,`MongoDocStatusStorage` | `JsonDocStatusStorage` |
171
+ | **chunk_token_size** | `int` | 拆分文档时每个块的最大令牌大小 | `1200` |
172
+ | **chunk_overlap_token_size** | `int` | 拆分文档时两个块之间的重叠令牌大小 | `100` |
173
+ | **tokenizer** | `Tokenizer` | 用于将文本转换为 tokens(数字)以及使用遵循 TokenizerInterface 协议的 .encode() 和 .decode() 函数将 tokens 转换回文本的函数。 如果您不指定,它将使用默认的 Tiktoken tokenizer。 | `TiktokenTokenizer` |
174
+ | **tiktoken_model_name** | `str` | 如果您使用的是默认的 Tiktoken tokenizer,那么这是要使用的特定 Tiktoken 模型的名称。如果您提供自己的 tokenizer,则忽略此设置。 | `gpt-4o-mini` |
175
+ | **entity_extract_max_gleaning** | `int` | 实体提取过程中的循环次数,附加历史消息 | `1` |
176
+ | **entity_summary_to_max_tokens** | `int` | 每个实体摘要的最大令牌大小 | `500` |
177
+ | **node_embedding_algorithm** | `str` | 节点嵌入算法(当前未使用) | `node2vec` |
178
+ | **node2vec_params** | `dict` | 节点嵌入的参数 | `{"dimensions": 1536,"num_walks": 10,"walk_length": 40,"window_size": 2,"iterations": 3,"random_seed": 3,}` |
179
+ | **embedding_func** | `EmbeddingFunc` | 从文本生成嵌入向量的函数 | `openai_embed` |
180
+ | **embedding_batch_num** | `int` | 嵌入过程的最大批量大小(每批发送多个文本) | `32` |
181
+ | **embedding_func_max_async** | `int` | 最大并发异步嵌入进程数 | `16` |
182
+ | **llm_model_func** | `callable` | LLM生成的函数 | `gpt_4o_mini_complete` |
183
+ | **llm_model_name** | `str` | 用于生成的LLM模型名称 | `meta-llama/Llama-3.2-1B-Instruct` |
184
+ | **llm_model_max_token_size** | `int` | LLM生成的最大令牌大小(影响实体关系摘要) | `32768`(默认值由环境变量MAX_TOKENS更改) |
185
+ | **llm_model_max_async** | `int` | 最大并发异步LLM进程数 | `4`(默认值由环境变量MAX_ASYNC更改) |
186
+ | **llm_model_kwargs** | `dict` | LLM生成的附加参数 | |
187
+ | **vector_db_storage_cls_kwargs** | `dict` | 向量数据库的附加参数,如设置节点和关系检索的阈值 | cosine_better_than_threshold: 0.2(默认值由环境变量COSINE_THRESHOLD更改) |
188
+ | **enable_llm_cache** | `bool` | 如果为`TRUE`,将LLM结果存储在缓存中;重复的提示返回缓存的响应 | `TRUE` |
189
+ | **enable_llm_cache_for_entity_extract** | `bool` | 如果为`TRUE`,将实体提取的LLM结果存储在缓存中;适合初学者调试应用程序 | `TRUE` |
190
+ | **addon_params** | `dict` | 附加参数,例如`{"example_number": 1, "language": "Simplified Chinese", "entity_types": ["organization", "person", "geo", "event"]}`:设置示例限制、输出语言和文档处理的批量大小 | `example_number: 所有示例, language: English` |
191
+ | **convert_response_to_json_func** | `callable` | 未使用 | `convert_response_to_json` |
192
+ | **embedding_cache_config** | `dict` | 问答缓存的配置。包含三个参数:`enabled`:布尔值,启用/禁用缓存查找功能。启用时,系统将在生成新答案之前检查缓存的响应。`similarity_threshold`:浮点值(0-1),相似度阈值。当新问题与缓存问题的相似度超过此阈值时,将直接返回缓存的答案而不调用LLM。`use_llm_check`:布尔值,启用/禁用LLM相似度验证。启用时,在返回缓存答案之前,将使用LLM作为二次检查来验证问题之间的相似度。 | 默认:`{"enabled": False, "similarity_threshold": 0.95, "use_llm_check": False}` |
193
+
194
+ </details>
195
+
196
  ### 查询参数
197
 
198
+ 使用QueryParam控制你的查询行为:
199
+
200
  ```python
201
  class QueryParam:
202
  mode: Literal["local", "global", "hybrid", "naive", "mix"] = "global"
 
471
 
472
  </details>
473
474
  ### 对话历史
475
 
476
  LightRAG现在通过对话历史功能支持多轮对话。以下是使用方法:
 
621
  rag.insert_custom_kg(custom_kg)
622
  ```
623
 
624
+ ### 插入
625
 
626
  <details>
627
  <summary> <b> 基本插入 </b></summary>
 
720
 
721
  </details>
722
 
723
+ ### 存储
724
+
725
+ LightRAG使用到4种类型的存储,每一种存储都有多种实现方案。在初始化LightRAG的时候可以通过参数设定这四类存储的实现方案。详情请参看前面的LightRAG初始化参数。
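例如,下面是为四类存储分别指定实现方案的一个最小示意(参数名与取值来自前面的初始化参数表,这里展示的均为默认值,OpenAI辅助函数仅为示例):

```python
from lightrag import LightRAG
from lightrag.llm.openai import gpt_4o_mini_complete, openai_embed

rag = LightRAG(
    working_dir="./rag_storage",
    kv_storage="JsonKVStorage",                 # 文档与文本块存储
    vector_storage="NanoVectorDBStorage",       # 向量存储
    graph_storage="NetworkXStorage",            # 图存储(节点与边)
    doc_status_storage="JsonDocStatusStorage",  # 文档处理状态存储
    embedding_func=openai_embed,
    llm_model_func=gpt_4o_mini_complete,
)
```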
726
 
727
  <details>
728
  <summary> <b>使用Neo4J进行存储</b> </summary>
 
850
 
851
  </details>
852
853
  ## 编辑实体和关系
854
 
855
  LightRAG现在支持全面的知识图谱管理功能,允许您在知识图谱中创建、编辑和删除实体和关系。
 
920
 
921
  这些操作在图数据库和向量数据库组件之间保持数据一致性,确保您的知识图谱保持连贯。
922
 
923
+ ## Token统计功能
924
+ <details>
925
+ <summary> <b>概述和使用</b> </summary>
926
+
927
+ LightRAG提供了TokenTracker工具来跟踪和管理大模型的token消耗。这个功能对于控制API成本和优化性能特别有用。
928
+
929
+ ### 使用方法
930
+
931
+ ```python
932
+ from lightrag.utils import TokenTracker
933
+
934
+ # 创建TokenTracker实例
935
+ token_tracker = TokenTracker()
936
+
937
+ # 方法1:使用上下文管理器(推荐)
938
+ # 适用于需要自动跟踪token使用的场景
939
+ with token_tracker:
940
+ result1 = await llm_model_func("你的问题1")
941
+ result2 = await llm_model_func("你的问题2")
942
+
943
+ # 方法2:手动添加token使用记录
944
+ # 适用于需要更精细控制token统计的场景
945
+ token_tracker.reset()
946
+
947
+ rag.insert()
948
+
949
+ rag.query("你的问题1", param=QueryParam(mode="naive"))
950
+ rag.query("你的问题2", param=QueryParam(mode="mix"))
951
+
952
+ # 显示总token使用量(包含插入和查询操作)
953
+ print("Token usage:", token_tracker.get_usage())
954
+ ```
955
+
956
+ ### 使用建议
957
+ - 在长会话或批量操作中使用上下文管理器,可以自动跟踪所有token消耗
958
+ - 对于需要分段统计的场景,使用手动模式并适时调用reset()
959
+ - 定期检查token使用情况,有助于及时发现异常消耗
960
+ - 在开发测试阶段积极使用此功能,以便优化生产环境的成本
961
+
962
+ ### 实际应用示例
963
+ 您可以参考以下示例来实现token统计:
964
+ - `examples/lightrag_gemini_track_token_demo.py`:使用Google Gemini模型的token统计示例
965
+ - `examples/lightrag_siliconcloud_track_token_demo.py`:使用SiliconCloud模型的token统计示例
966
+
967
+ 这些示例展示了如何在不同模型和场景下有效地使用TokenTracker功能。
968
+
969
+ </details>
970
+
971
  ## 数据导出功能
972
 
973
  ### 概述
 
1124
 
1125
  </details>
1126
1127
  ## LightRAG API
1128
 
1129
  LightRAG服务器旨在提供Web UI和API支持。**有关LightRAG服务器的更多信息,请参阅[LightRAG服务器](./lightrag/api/README.md)。**
README.md CHANGED
@@ -129,9 +129,13 @@ For a streaming response implementation example, please see `examples/lightrag_o
129
 
130
  **Note**: When running the demo program, please be aware that different test scripts may use different embedding models. If you switch to a different embedding model, you must clear the data directory (`./dickens`); otherwise, the program may encounter errors. If you wish to retain the LLM cache, you can preserve the `kv_store_llm_response_cache.json` file while clearing the data directory.
131
 
132
- ## Query
133
 
134
- Use the below Python snippet (in a script) to initialize LightRAG and perform queries:
 
 
 
 
135
 
136
  ```python
137
  import os
@@ -143,6 +147,7 @@ from lightrag.utils import setup_logger
143
 
144
  setup_logger("lightrag", level="INFO")
145
 
 
146
  if not os.path.exists(WORKING_DIR):
147
  os.mkdir(WORKING_DIR)
148
 
@@ -156,11 +161,11 @@ async def initialize_rag():
156
  await initialize_pipeline_status()
157
  return rag
158
 
159
- def main():
160
  try:
161
  # Initialize RAG instance
162
  rag = await initialize_rag()
163
- rag.insert("Your text")
164
 
165
  # Perform hybrid search
166
  mode="hybrid"
@@ -181,8 +186,55 @@ if __name__ == "__main__":
181
  asyncio.run(main())
182
  ```
183
184
  ### Query Param
185
 
 
 
186
  ```python
187
  class QueryParam:
188
  mode: Literal["local", "global", "hybrid", "naive", "mix"] = "global"
@@ -460,55 +512,6 @@ if __name__ == "__main__":
460
 
461
  </details>
462
 
463
- ### Token Usage Tracking
464
-
465
- <details>
466
- <summary> <b>Overview and Usage</b> </summary>
467
-
468
- LightRAG provides a TokenTracker tool to monitor and manage token consumption by large language models. This feature is particularly useful for controlling API costs and optimizing performance.
469
-
470
- #### Usage
471
-
472
- ```python
473
- from lightrag.utils import TokenTracker
474
-
475
- # Create TokenTracker instance
476
- token_tracker = TokenTracker()
477
-
478
- # Method 1: Using context manager (Recommended)
479
- # Suitable for scenarios requiring automatic token usage tracking
480
- with token_tracker:
481
- result1 = await llm_model_func("your question 1")
482
- result2 = await llm_model_func("your question 2")
483
-
484
- # Method 2: Manually adding token usage records
485
- # Suitable for scenarios requiring more granular control over token statistics
486
- token_tracker.reset()
487
-
488
- rag.insert()
489
-
490
- rag.query("your question 1", param=QueryParam(mode="naive"))
491
- rag.query("your question 2", param=QueryParam(mode="mix"))
492
-
493
- # Display total token usage (including insert and query operations)
494
- print("Token usage:", token_tracker.get_usage())
495
- ```
496
-
497
- #### Usage Tips
498
- - Use context managers for long sessions or batch operations to automatically track all token consumption
499
- - For scenarios requiring segmented statistics, use manual mode and call reset() when appropriate
500
- - Regular checking of token usage helps detect abnormal consumption early
501
- - Actively use this feature during development and testing to optimize production costs
502
-
503
- #### Practical Examples
504
- You can refer to these examples for implementing token tracking:
505
- - `examples/lightrag_gemini_track_token_demo.py`: Token tracking example using Google Gemini model
506
- - `examples/lightrag_siliconcloud_track_token_demo.py`: Token tracking example using SiliconCloud model
507
-
508
- These examples demonstrate how to effectively use the TokenTracker feature with different models and scenarios.
509
-
510
- </details>
511
-
512
  ### Conversation History Support
513
 
514
 
@@ -612,7 +615,7 @@ rag.query_with_separate_keyword_extraction(
612
 
613
  </details>
614
 
615
- ## Insert
616
 
617
  <details>
618
  <summary> <b> Basic Insert </b></summary>
@@ -775,7 +778,9 @@ rag.insert(documents, file_paths=file_paths)
775
 
776
  </details>
777
 
778
- ## Storage
 
 
779
 
780
  <details>
781
  <summary> <b>Using Neo4J for Storage</b> </summary>
@@ -904,16 +909,6 @@ rag = LightRAG(
904
 
905
  </details>
906
 
907
- ## Delete
908
-
909
- ```python
910
- # Delete Entity: Deleting entities by their names
911
- rag.delete_by_entity("Project Gutenberg")
912
-
913
- # Delete Document: Deleting entities and relationships associated with the document by doc id
914
- rag.delete_by_doc_id("doc_id")
915
- ```
916
-
917
  ## Edit Entities and Relations
918
 
919
  LightRAG now supports comprehensive knowledge graph management capabilities, allowing you to create, edit, and delete entities and relationships within your knowledge graph.
@@ -984,6 +979,55 @@ These operations maintain data consistency across both the graph database and ve
984
 
985
  </details>
986
987
  ## Data Export Functions
988
 
989
  ### Overview
@@ -1148,56 +1192,6 @@ Valid modes are:
1148
 
1149
  </details>
1150
 
1151
- ## LightRAG init parameters
1152
-
1153
- <details>
1154
- <summary> Parameters </summary>
1155
-
1156
- | **Parameter** | **Type** | **Explanation** | **Default** |
1157
- |--------------|----------|-----------------|-------------|
1158
- | **working_dir** | `str` | Directory where the cache will be stored | `lightrag_cache+timestamp` |
1159
- | **kv_storage** | `str` | Storage type for documents and text chunks. Supported types: `JsonKVStorage`,`PGKVStorage`,`RedisKVStorage`,`MongoKVStorage` | `JsonKVStorage` |
1160
- | **vector_storage** | `str` | Storage type for embedding vectors. Supported types: `NanoVectorDBStorage`,`PGVectorStorage`,`MilvusVectorDBStorage`,`ChromaVectorDBStorage`,`FaissVectorDBStorage`,`MongoVectorDBStorage`,`QdrantVectorDBStorage` | `NanoVectorDBStorage` |
1161
- | **graph_storage** | `str` | Storage type for graph edges and nodes. Supported types: `NetworkXStorage`,`Neo4JStorage`,`PGGraphStorage`,`AGEStorage` | `NetworkXStorage` |
1162
- | **doc_status_storage** | `str` | Storage type for documents process status. Supported types: `JsonDocStatusStorage`,`PGDocStatusStorage`,`MongoDocStatusStorage` | `JsonDocStatusStorage` |
1163
- | **chunk_token_size** | `int` | Maximum token size per chunk when splitting documents | `1200` |
1164
- | **chunk_overlap_token_size** | `int` | Overlap token size between two chunks when splitting documents | `100` |
1165
- | **tokenizer** | `Tokenizer` | The function used to convert text into tokens (numbers) and back using .encode() and .decode() functions following `TokenizerInterface` protocol. If you don't specify one, it will use the default Tiktoken tokenizer. | `TiktokenTokenizer` |
1166
- | **tiktoken_model_name** | `str` | If you're using the default Tiktoken tokenizer, this is the name of the specific Tiktoken model to use. This setting is ignored if you provide your own tokenizer. | `gpt-4o-mini` |
1167
- | **entity_extract_max_gleaning** | `int` | Number of loops in the entity extraction process, appending history messages | `1` |
1168
- | **entity_summary_to_max_tokens** | `int` | Maximum token size for each entity summary | `500` |
1169
- | **node_embedding_algorithm** | `str` | Algorithm for node embedding (currently not used) | `node2vec` |
1170
- | **node2vec_params** | `dict` | Parameters for node embedding | `{"dimensions": 1536,"num_walks": 10,"walk_length": 40,"window_size": 2,"iterations": 3,"random_seed": 3,}` |
1171
- | **embedding_func** | `EmbeddingFunc` | Function to generate embedding vectors from text | `openai_embed` |
1172
- | **embedding_batch_num** | `int` | Maximum batch size for embedding processes (multiple texts sent per batch) | `32` |
1173
- | **embedding_func_max_async** | `int` | Maximum number of concurrent asynchronous embedding processes | `16` |
1174
- | **llm_model_func** | `callable` | Function for LLM generation | `gpt_4o_mini_complete` |
1175
- | **llm_model_name** | `str` | LLM model name for generation | `meta-llama/Llama-3.2-1B-Instruct` |
1176
- | **llm_model_max_token_size** | `int` | Maximum token size for LLM generation (affects entity relation summaries) | `32768`(default value changed by env var MAX_TOKENS) |
1177
- | **llm_model_max_async** | `int` | Maximum number of concurrent asynchronous LLM processes | `4`(default value changed by env var MAX_ASYNC) |
1178
- | **llm_model_kwargs** | `dict` | Additional parameters for LLM generation | |
1179
- | **vector_db_storage_cls_kwargs** | `dict` | Additional parameters for vector database, like setting the threshold for nodes and relations retrieval | cosine_better_than_threshold: 0.2(default value changed by env var COSINE_THRESHOLD) |
1180
- | **enable_llm_cache** | `bool` | If `TRUE`, stores LLM results in cache; repeated prompts return cached responses | `TRUE` |
1181
- | **enable_llm_cache_for_entity_extract** | `bool` | If `TRUE`, stores LLM results in cache for entity extraction; Good for beginners to debug your application | `TRUE` |
1182
- | **addon_params** | `dict` | Additional parameters, e.g., `{"example_number": 1, "language": "Simplified Chinese", "entity_types": ["organization", "person", "geo", "event"]}`: sets example limit, entiy/relation extraction output language | `example_number: all examples, language: English` |
1183
- | **convert_response_to_json_func** | `callable` | Not used | `convert_response_to_json` |
1184
- | **embedding_cache_config** | `dict` | Configuration for question-answer caching. Contains three parameters: `enabled`: Boolean value to enable/disable cache lookup functionality. When enabled, the system will check cached responses before generating new answers. `similarity_threshold`: Float value (0-1), similarity threshold. When a new question's similarity with a cached question exceeds this threshold, the cached answer will be returned directly without calling the LLM. `use_llm_check`: Boolean value to enable/disable LLM similarity verification. When enabled, LLM will be used as a secondary check to verify the similarity between questions before returning cached answers. | Default: `{"enabled": False, "similarity_threshold": 0.95, "use_llm_check": False}` |
1185
-
1186
- </details>
1187
-
1188
- ## Error Handling
1189
-
1190
- <details>
1191
- <summary>Click to view error handling details</summary>
1192
-
1193
- The API includes comprehensive error handling:
1194
-
1195
- - File not found errors (404)
1196
- - Processing errors (500)
1197
- - Supports multiple file encodings (UTF-8 and GBK)
1198
-
1199
- </details>
1200
-
1201
  ## LightRAG API
1202
 
1203
  The LightRAG Server is designed to provide Web UI and API support. **For more information about LightRAG Server, please refer to [LightRAG Server](./lightrag/api/README.md).**
 
129
 
130
  **Note**: When running the demo program, please be aware that different test scripts may use different embedding models. If you switch to a different embedding model, you must clear the data directory (`./dickens`); otherwise, the program may encounter errors. If you wish to retain the LLM cache, you can preserve the `kv_store_llm_response_cache.json` file while clearing the data directory.
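One way to do this is sketched below; it is only an illustration (the `./dickens` path and the cache file name are the ones mentioned in the note), so adapt it to your own layout:

```python
import os
import shutil

DATA_DIR = "./dickens"                       # demo data directory from the note above
KEEP = "kv_store_llm_response_cache.json"    # LLM response cache file to preserve

# Delete everything in the data directory except the LLM cache file
for name in os.listdir(DATA_DIR):
    if name == KEEP:
        continue
    path = os.path.join(DATA_DIR, name)
    if os.path.isdir(path):
        shutil.rmtree(path)
    else:
        os.remove(path)
```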
131
 
132
+ To integrate LightRAG into your project, program directly against the LightRAG core object.
133
 
134
+ ## Programming with LightRAG Core
135
+
136
+ ### A Simple Program
137
+
138
+ Use the Python snippet below to initialize LightRAG, insert text into it, and perform queries:
139
 
140
  ```python
141
  import os
 
147
 
148
  setup_logger("lightrag", level="INFO")
149
 
150
+ WORKING_DIR = "./rag_storage"
151
  if not os.path.exists(WORKING_DIR):
152
  os.mkdir(WORKING_DIR)
153
 
 
161
  await initialize_pipeline_status()
162
  return rag
163
 
164
+ async def main():
165
  try:
166
  # Initialize RAG instance
167
  rag = await initialize_rag()
168
+ rag.insert("Your text")
169
 
170
  # Perform hybrid search
171
  mode="hybrid"
 
186
  asyncio.run(main())
187
  ```
188
 
189
+ Important notes for the above snippet:
190
+
191
+ - Export your OPENAI_API_KEY environment variable before running the script.
192
+ - This program uses the default storage settings for LightRAG, so all data will be persisted to WORKING_DIR/rag_storage.
193
+ - This program demonstrates only the simplest way to initialize a LightRAG object: injecting the embedding and LLM functions, then initializing the storage and pipeline status once the object has been created.
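For reference, a complete version of the snippet above could look like the following minimal sketch. It assumes the default OpenAI helpers `gpt_4o_mini_complete` and `openai_embed` that ship with LightRAG and uses the asynchronous `ainsert`/`aquery` methods; check your LightRAG version for the exact import paths and method names, and swap in your own model functions as needed.

```python
import os
import asyncio
from lightrag import LightRAG, QueryParam
from lightrag.llm.openai import gpt_4o_mini_complete, openai_embed
from lightrag.kg.shared_storage import initialize_pipeline_status
from lightrag.utils import setup_logger

setup_logger("lightrag", level="INFO")

WORKING_DIR = "./rag_storage"
if not os.path.exists(WORKING_DIR):
    os.mkdir(WORKING_DIR)

async def initialize_rag():
    # Inject the embedding and LLM functions, then initialize storages and pipeline status
    rag = LightRAG(
        working_dir=WORKING_DIR,
        embedding_func=openai_embed,
        llm_model_func=gpt_4o_mini_complete,
    )
    await rag.initialize_storages()
    await initialize_pipeline_status()
    return rag

async def main():
    rag = None  # so the finally block is safe even if initialization fails
    try:
        rag = await initialize_rag()
        # Insert text
        await rag.ainsert("Your text")

        # Perform a hybrid retrieval over the inserted text
        print(
            await rag.aquery(
                "What are the top themes in this story?",
                param=QueryParam(mode="hybrid"),
            )
        )
    except Exception as e:
        print(f"An error occurred: {e}")
    finally:
        if rag:
            await rag.finalize_storages()

if __name__ == "__main__":
    asyncio.run(main())
```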
194
+
195
+ ### LightRAG init parameters
196
+
197
+ A full list of LightRAG init parameters:
198
+
199
+ <details>
200
+ <summary> Parameters </summary>
201
+
202
+ | **Parameter** | **Type** | **Explanation** | **Default** |
203
+ |--------------|----------|-----------------|-------------|
204
+ | **working_dir** | `str` | Directory where the cache will be stored | `lightrag_cache+timestamp` |
205
+ | **kv_storage** | `str` | Storage type for documents and text chunks. Supported types: `JsonKVStorage`,`PGKVStorage`,`RedisKVStorage`,`MongoKVStorage` | `JsonKVStorage` |
206
+ | **vector_storage** | `str` | Storage type for embedding vectors. Supported types: `NanoVectorDBStorage`,`PGVectorStorage`,`MilvusVectorDBStorage`,`ChromaVectorDBStorage`,`FaissVectorDBStorage`,`MongoVectorDBStorage`,`QdrantVectorDBStorage` | `NanoVectorDBStorage` |
207
+ | **graph_storage** | `str` | Storage type for graph edges and nodes. Supported types: `NetworkXStorage`,`Neo4JStorage`,`PGGraphStorage`,`AGEStorage` | `NetworkXStorage` |
208
+ | **doc_status_storage** | `str` | Storage type for documents process status. Supported types: `JsonDocStatusStorage`,`PGDocStatusStorage`,`MongoDocStatusStorage` | `JsonDocStatusStorage` |
209
+ | **chunk_token_size** | `int` | Maximum token size per chunk when splitting documents | `1200` |
210
+ | **chunk_overlap_token_size** | `int` | Overlap token size between two chunks when splitting documents | `100` |
211
+ | **tokenizer** | `Tokenizer` | The function used to convert text into tokens (numbers) and back using .encode() and .decode() functions following `TokenizerInterface` protocol. If you don't specify one, it will use the default Tiktoken tokenizer. | `TiktokenTokenizer` |
212
+ | **tiktoken_model_name** | `str` | If you're using the default Tiktoken tokenizer, this is the name of the specific Tiktoken model to use. This setting is ignored if you provide your own tokenizer. | `gpt-4o-mini` |
213
+ | **entity_extract_max_gleaning** | `int` | Number of loops in the entity extraction process, appending history messages | `1` |
214
+ | **entity_summary_to_max_tokens** | `int` | Maximum token size for each entity summary | `500` |
215
+ | **node_embedding_algorithm** | `str` | Algorithm for node embedding (currently not used) | `node2vec` |
216
+ | **node2vec_params** | `dict` | Parameters for node embedding | `{"dimensions": 1536,"num_walks": 10,"walk_length": 40,"window_size": 2,"iterations": 3,"random_seed": 3,}` |
217
+ | **embedding_func** | `EmbeddingFunc` | Function to generate embedding vectors from text | `openai_embed` |
218
+ | **embedding_batch_num** | `int` | Maximum batch size for embedding processes (multiple texts sent per batch) | `32` |
219
+ | **embedding_func_max_async** | `int` | Maximum number of concurrent asynchronous embedding processes | `16` |
220
+ | **llm_model_func** | `callable` | Function for LLM generation | `gpt_4o_mini_complete` |
221
+ | **llm_model_name** | `str` | LLM model name for generation | `meta-llama/Llama-3.2-1B-Instruct` |
222
+ | **llm_model_max_token_size** | `int` | Maximum token size for LLM generation (affects entity relation summaries) | `32768`(default value changed by env var MAX_TOKENS) |
223
+ | **llm_model_max_async** | `int` | Maximum number of concurrent asynchronous LLM processes | `4`(default value changed by env var MAX_ASYNC) |
224
+ | **llm_model_kwargs** | `dict` | Additional parameters for LLM generation | |
225
+ | **vector_db_storage_cls_kwargs** | `dict` | Additional parameters for vector database, like setting the threshold for nodes and relations retrieval | cosine_better_than_threshold: 0.2(default value changed by env var COSINE_THRESHOLD) |
226
+ | **enable_llm_cache** | `bool` | If `TRUE`, stores LLM results in cache; repeated prompts return cached responses | `TRUE` |
227
+ | **enable_llm_cache_for_entity_extract** | `bool` | If `TRUE`, stores LLM results in cache for entity extraction; Good for beginners to debug your application | `TRUE` |
228
+ | **addon_params** | `dict` | Additional parameters, e.g., `{"example_number": 1, "language": "Simplified Chinese", "entity_types": ["organization", "person", "geo", "event"]}`: sets the example limit and the output language for entity/relation extraction | `example_number: all examples, language: English` |
229
+ | **convert_response_to_json_func** | `callable` | Not used | `convert_response_to_json` |
230
+ | **embedding_cache_config** | `dict` | Configuration for question-answer caching. Contains three parameters: `enabled`: Boolean value to enable/disable cache lookup functionality. When enabled, the system will check cached responses before generating new answers. `similarity_threshold`: Float value (0-1), similarity threshold. When a new question's similarity with a cached question exceeds this threshold, the cached answer will be returned directly without calling the LLM. `use_llm_check`: Boolean value to enable/disable LLM similarity verification. When enabled, LLM will be used as a secondary check to verify the similarity between questions before returning cached answers. | Default: `{"enabled": False, "similarity_threshold": 0.95, "use_llm_check": False}` |
231
+
232
+ </details>
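As an illustration of how these parameters are passed, the sketch below overrides a handful of them at construction time. The values are just the defaults repeated from the table (not recommendations), and the OpenAI helper functions are assumed from the table's defaults.

```python
from lightrag import LightRAG
from lightrag.llm.openai import gpt_4o_mini_complete, openai_embed

rag = LightRAG(
    working_dir="./rag_storage",
    chunk_token_size=1200,               # max tokens per chunk
    chunk_overlap_token_size=100,        # overlap between adjacent chunks
    embedding_func=openai_embed,
    embedding_batch_num=32,              # texts per embedding batch
    llm_model_func=gpt_4o_mini_complete,
    llm_model_max_async=4,               # concurrent LLM calls
    enable_llm_cache=True,
    embedding_cache_config={
        "enabled": True,
        "similarity_threshold": 0.95,
        "use_llm_check": False,
    },
)
# As in the simple program above, call `await rag.initialize_storages()` and
# `await initialize_pipeline_status()` before using the instance.
```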
233
+
234
  ### Query Param
235
 
236
+ Use QueryParam to control the behavior of your query:
237
+
238
  ```python
239
  class QueryParam:
240
  mode: Literal["local", "global", "hybrid", "naive", "mix"] = "global"
 
512
 
513
  </details>
514
515
  ### Conversation History Support
516
 
517
 
 
615
 
616
  </details>
617
 
618
+ ### Insert
619
 
620
  <details>
621
  <summary> <b> Basic Insert </b></summary>
 
778
 
779
  </details>
780
 
781
+ ### Storage
782
+
783
+ LightRAG uses four types of storage, each of which has multiple implementation options. When initializing LightRAG, the implementation for each of these four storage types can be selected through init parameters. For details, please refer to the LightRAG initialization parameters above.
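For example, a minimal sketch of selecting an implementation for each of the four storage types (parameter names and values are taken from the init-parameter table above and shown here with their defaults; the OpenAI helpers are illustrative):

```python
from lightrag import LightRAG
from lightrag.llm.openai import gpt_4o_mini_complete, openai_embed

rag = LightRAG(
    working_dir="./rag_storage",
    kv_storage="JsonKVStorage",                 # documents and text chunks
    vector_storage="NanoVectorDBStorage",       # embedding vectors
    graph_storage="NetworkXStorage",            # graph nodes and edges
    doc_status_storage="JsonDocStatusStorage",  # document processing status
    embedding_func=openai_embed,
    llm_model_func=gpt_4o_mini_complete,
)
```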
784
 
785
  <details>
786
  <summary> <b>Using Neo4J for Storage</b> </summary>
 
909
 
910
  </details>
911
 
 
 
 
 
 
 
 
 
 
 
912
  ## Edit Entities and Relations
913
 
914
  LightRAG now supports comprehensive knowledge graph management capabilities, allowing you to create, edit, and delete entities and relationships within your knowledge graph.
 
979
 
980
  </details>
981
 
982
+ ## Token Usage Tracking
983
+
984
+ <details>
985
+ <summary> <b>Overview and Usage</b> </summary>
986
+
987
+ LightRAG provides a TokenTracker tool to monitor and manage token consumption by large language models. This feature is particularly useful for controlling API costs and optimizing performance.
988
+
989
+ ### Usage
990
+
991
+ ```python
992
+ from lightrag.utils import TokenTracker
993
+
994
+ # Create TokenTracker instance
995
+ token_tracker = TokenTracker()
996
+
997
+ # Method 1: Using context manager (Recommended)
998
+ # Suitable for scenarios requiring automatic token usage tracking
999
+ with token_tracker:
1000
+ result1 = await llm_model_func("your question 1")
1001
+ result2 = await llm_model_func("your question 2")
1002
+
1003
+ # Method 2: Manually adding token usage records
1004
+ # Suitable for scenarios requiring more granular control over token statistics
1005
+ token_tracker.reset()
1006
+
1007
+ rag.insert()
1008
+
1009
+ rag.query("your question 1", param=QueryParam(mode="naive"))
1010
+ rag.query("your question 2", param=QueryParam(mode="mix"))
1011
+
1012
+ # Display total token usage (including insert and query operations)
1013
+ print("Token usage:", token_tracker.get_usage())
1014
+ ```
1015
+
1016
+ ### Usage Tips
1017
+ - Use context managers for long sessions or batch operations to automatically track all token consumption
1018
+ - For scenarios requiring segmented statistics, use manual mode and call reset() when appropriate
1019
+ - Regular checking of token usage helps detect abnormal consumption early
1020
+ - Actively use this feature during development and testing to optimize production costs
1021
+
1022
+ ### Practical Examples
1023
+ You can refer to these examples for implementing token tracking:
1024
+ - `examples/lightrag_gemini_track_token_demo.py`: Token tracking example using Google Gemini model
1025
+ - `examples/lightrag_siliconcloud_track_token_demo.py`: Token tracking example using SiliconCloud model
1026
+
1027
+ These examples demonstrate how to effectively use the TokenTracker feature with different models and scenarios.
1028
+
1029
+ </details>
1030
+
1031
  ## Data Export Functions
1032
 
1033
  ### Overview
 
1192
 
1193
  </details>
1194
1195
  ## LightRAG API
1196
 
1197
  The LightRAG Server is designed to provide Web UI and API support. **For more information about LightRAG Server, please refer to [LightRAG Server](./lightrag/api/README.md).**