gzdaniel committed on
Commit
1e0e33e
·
1 Parent(s): ea77ef9

Update README.md

Files changed (2)
  1. README-zh.md +152 -186
  2. README.md +150 -200
README-zh.md CHANGED
@@ -260,6 +260,11 @@ class QueryParam:
260
  If provided, this will be used instead of the global model function.
261
  This allows using different models for different query modes.
262
  """
 
 
 
 
 
263
  ```
264
 
265
  > top_k的默认值可以通过环境变量TOP_K更改。
@@ -527,128 +532,23 @@ response = rag.query(
527
  )
528
  ```
529
 
530
- ### 自定义提示词
531
 
532
- LightRAG现在支持自定义提示,以便对系统行为进行精细控制。以下是使用方法:
533
 
534
  ```python
535
  # 创建查询参数
536
  query_param = QueryParam(
537
- mode="hybrid", # 或其他模式:"local"、"global"、"hybrid"、"mix"和"naive"
 
538
  )
539
 
540
- # 示例1:使用默认系统提示
541
  response_default = rag.query(
542
- "可再生能源的主要好处是什么?",
543
  param=query_param
544
  )
545
  print(response_default)
546
-
547
- # 示例2:使用自定义提示
548
- custom_prompt = """
549
- 您是环境科学领域的专家助手。请提供详细且结构化的答案,并附带示例。
550
- ---对话历史---
551
- {history}
552
-
553
- ---知识库---
554
- {context_data}
555
-
556
- ---响应规则---
557
-
558
- - 目标格式和长度:{response_type}
559
- """
560
- response_custom = rag.query(
561
- "可再生能源的主要好处是什么?",
562
- param=query_param,
563
- system_prompt=custom_prompt # 传递自定义提示
564
- )
565
- print(response_custom)
566
- ```
567
-
568
- ### 关键词提取
569
-
570
- 我们引入了新函数`query_with_separate_keyword_extraction`来增强关键词提取功能。该函数将关键词提取过程与用户提示分开,专注于查询以提高提取关键词的相关性。
571
-
572
- * 工作原理
573
-
574
- 该函数将输入分为两部分:
575
-
576
- - `用户查询`
577
- - `提示`
578
-
579
- 然后仅对`用户查询`执行关键词提取。这种分离确保提取过程是集中和相关的,不受`提示`中任何额外语言的影响。它还允许`提示`纯粹用于响应格式化,保持用户原始问题的意图和清晰度。
580
-
581
- * 使用示例
582
-
583
- 这个`示例`展示了如何为教育内容定制函数,专注于为高年级学生提供详细解释。
584
-
585
- ```python
586
- rag.query_with_separate_keyword_extraction(
587
- query="解释重力定律",
588
- prompt="提供适合学习物理的高中生的详细解释。",
589
- param=QueryParam(mode="hybrid")
590
- )
591
- ```
592
-
593
- ### 插入自定义知识
594
-
595
- ```python
596
- custom_kg = {
597
- "chunks": [
598
- {
599
- "content": "Alice和Bob正在合作进行量子计算研究。",
600
- "source_id": "doc-1"
601
- }
602
- ],
603
- "entities": [
604
- {
605
- "entity_name": "Alice",
606
- "entity_type": "person",
607
- "description": "Alice是一位专门研究量子物理的研究员。",
608
- "source_id": "doc-1"
609
- },
610
- {
611
- "entity_name": "Bob",
612
- "entity_type": "person",
613
- "description": "Bob是一位数学家。",
614
- "source_id": "doc-1"
615
- },
616
- {
617
- "entity_name": "量子计算",
618
- "entity_type": "technology",
619
- "description": "量子计算利用量子力学现象进行计算。",
620
- "source_id": "doc-1"
621
- }
622
- ],
623
- "relationships": [
624
- {
625
- "src_id": "Alice",
626
- "tgt_id": "Bob",
627
- "description": "Alice和Bob是研究伙伴。",
628
- "keywords": "合作 研究",
629
- "weight": 1.0,
630
- "source_id": "doc-1"
631
- },
632
- {
633
- "src_id": "Alice",
634
- "tgt_id": "量子计算",
635
- "description": "Alice进行量子计算研究。",
636
- "keywords": "研究 专业",
637
- "weight": 1.0,
638
- "source_id": "doc-1"
639
- },
640
- {
641
- "src_id": "Bob",
642
- "tgt_id": "量子计算",
643
- "description": "Bob研究量子计算。",
644
- "keywords": "研究 应用",
645
- "weight": 1.0,
646
- "source_id": "doc-1"
647
- }
648
- ]
649
- }
650
-
651
- rag.insert_custom_kg(custom_kg)
652
  ```
653
 
654
  ### 插入
@@ -934,23 +834,160 @@ updated_relation = rag.edit_relation("Google", "Google Mail", {
934
  })
935
  ```
936
 
 
 
937
  </details>
938
 
939
- 所有操作都有同步和异步版本。异步版本带有前缀"a"(例如,`acreate_entity`,`aedit_relation`)。
 
940
 
941
- #### 实体操作
942
 
943
  - **create_entity**:创建具有指定属性的新实体
944
  - **edit_entity**:更新现有实体的属性或重命名它
945
 
946
- #### 关系操作
947
-
948
  - **create_relation**:在现有实体之间创建新关系
949
  - **edit_relation**:更新现有关系的属性
950
 
951
  这些操作在图数据库和向量数据库组件之间保持数据一致性,确保您的知识图谱保持连贯。
952
 
953
  ## Token统计功能
 
954
  <details>
955
  <summary> <b>概述和使用</b> </summary>
956
 
@@ -1048,77 +1085,6 @@ rag.export_data("complete_data.csv", include_vector_data=True)
1048
  * 关系数据(实体之间的连接)
1049
  * 来自向量数据库的关系信息
1050
 
1051
- ## 实体合并
1052
-
1053
- <details>
1054
- <summary> <b>合并实体及其关系</b> </summary>
1055
-
1056
- LightRAG现在支持将多个实体合并为单个实体,自动处理所有关系:
1057
-
1058
- ```python
1059
- # 基本实体合并
1060
- rag.merge_entities(
1061
- source_entities=["人工智能", "AI", "机器智能"],
1062
- target_entity="AI技术"
1063
- )
1064
- ```
1065
-
1066
- 使用自定义合并策略:
1067
-
1068
- ```python
1069
- # 为不同字段定义自定义合并策略
1070
- rag.merge_entities(
1071
- source_entities=["约翰·史密斯", "史密斯博士", "J·史密斯"],
1072
- target_entity="约翰·史密斯",
1073
- merge_strategy={
1074
- "description": "concatenate", # 组合所有描述
1075
- "entity_type": "keep_first", # 保留第一个实体的类型
1076
- "source_id": "join_unique" # 组合所有唯一的源ID
1077
- }
1078
- )
1079
- ```
1080
-
1081
- 使用自定义目标实体数据:
1082
-
1083
- ```python
1084
- # 为合并后的实体指定确切值
1085
- rag.merge_entities(
1086
- source_entities=["纽约", "NYC", "大苹果"],
1087
- target_entity="纽约市",
1088
- target_entity_data={
1089
- "entity_type": "LOCATION",
1090
- "description": "纽约市是美国人口最多的城市。",
1091
- }
1092
- )
1093
- ```
1094
-
1095
- 结合两种方法的高级用法:
1096
-
1097
- ```python
1098
- # 使用策略和自定义数据合并公司实体
1099
- rag.merge_entities(
1100
- source_entities=["微软公司", "Microsoft Corporation", "MSFT"],
1101
- target_entity="微软",
1102
- merge_strategy={
1103
- "description": "concatenate", # 组合所有描述
1104
- "source_id": "join_unique" # 组合源ID
1105
- },
1106
- target_entity_data={
1107
- "entity_type": "ORGANIZATION",
1108
- }
1109
- )
1110
- ```
1111
-
1112
- 合并实体时:
1113
-
1114
- * 所有来自源实体的关系都会重定向到目标实体
1115
- * 重复的关系会被智能合并
1116
- * 防止自我关系(循环)
1117
- * 合并后删除源实体
1118
- * 保留关系权重和属性
1119
-
1120
- </details>
1121
-
1122
  ## 缓存
1123
 
1124
  <details>
 
260
  If provided, this will be used instead of the global model function.
261
  This allows using different models for different query modes.
262
  """
263
+
264
+ user_prompt: str | None = None
265
+ """User-provided prompt for the query.
266
+ If provided, this will be used instead of the default value from the prompt template.
267
+ """
268
  ```
269
 
270
  > top_k的默认值可以通过环境变量TOP_K更改。
 
532
  )
533
  ```
534
 
535
+ ### 自定义用户提示词
536
 
537
+ 自定义用户提示词不影响查询内容,仅仅用于向LLM指示如何处理查询结果。以下是使用方法:
538
 
539
  ```python
540
  # 创建查询参数
541
  query_param = QueryParam(
542
+ mode = "hybrid", # 或其他模式:"local"、"global"、"hybrid"、"mix"和"naive"
543
+ user_prompt = "Please create the diagram using the Mermaid syntax"
544
  )
545
 
546
+ # 查询和处理
547
  response_default = rag.query(
548
+ "Please draw a character relationship diagram for Scrooge",
549
  param=query_param
550
  )
551
  print(response_default)
552
  ```
553
 
554
  ### 插入
 
834
  })
835
  ```
836
 
837
+ 所有操作都有同步和异步版本。异步版本带有前缀"a"(例如,`acreate_entity`,`aedit_relation`)。
838
+
839
  </details>
840
 
841
+ <details>
842
+ <summary> <b>插入自定义知识</b> </summary>
843
 
844
+ ```python
845
+ custom_kg = {
846
+ "chunks": [
847
+ {
848
+ "content": "Alice和Bob正在合作进行量子计算研究。",
849
+ "source_id": "doc-1"
850
+ }
851
+ ],
852
+ "entities": [
853
+ {
854
+ "entity_name": "Alice",
855
+ "entity_type": "person",
856
+ "description": "Alice是一位专门研究量子物理的研究员。",
857
+ "source_id": "doc-1"
858
+ },
859
+ {
860
+ "entity_name": "Bob",
861
+ "entity_type": "person",
862
+ "description": "Bob是一位数学家。",
863
+ "source_id": "doc-1"
864
+ },
865
+ {
866
+ "entity_name": "量子计算",
867
+ "entity_type": "technology",
868
+ "description": "量子计算利用量子力学现象进行计算。",
869
+ "source_id": "doc-1"
870
+ }
871
+ ],
872
+ "relationships": [
873
+ {
874
+ "src_id": "Alice",
875
+ "tgt_id": "Bob",
876
+ "description": "Alice和Bob是研究伙伴。",
877
+ "keywords": "合作 研究",
878
+ "weight": 1.0,
879
+ "source_id": "doc-1"
880
+ },
881
+ {
882
+ "src_id": "Alice",
883
+ "tgt_id": "量子计算",
884
+ "description": "Alice进行量子计算研究。",
885
+ "keywords": "研究 专业",
886
+ "weight": 1.0,
887
+ "source_id": "doc-1"
888
+ },
889
+ {
890
+ "src_id": "Bob",
891
+ "tgt_id": "量子计算",
892
+ "description": "Bob研究量子计算。",
893
+ "keywords": "研究 应用",
894
+ "weight": 1.0,
895
+ "source_id": "doc-1"
896
+ }
897
+ ]
898
+ }
899
+
900
+ rag.insert_custom_kg(custom_kg)
901
+ ```
902
+
903
+ </details>
904
+
905
+ <details>
906
+ <summary> <b>其它实体与关系操作</b> </summary>
907
 
908
  - **create_entity**:创建具有指定属性的新实体
909
  - **edit_entity**:更新现有实体的属性或重命名它
910
 
 
 
911
  - **create_relation**:在现有实体之间创建新关系
912
  - **edit_relation**:更新现有关系的属性
913
 
914
  这些操作在图数据库和向量数据库组件之间保持数据一致性,确保您的知识图谱保持连贯。
915
 
916
+ </details>
917
+
918
+ ## 实体合并
919
+
920
+ <details>
921
+ <summary> <b>合并实体及其关系</b> </summary>
922
+
923
+ LightRAG现在支持将多个实体合并为单个实体,自动处理所有关系:
924
+
925
+ ```python
926
+ # 基本实体合并
927
+ rag.merge_entities(
928
+ source_entities=["人工智能", "AI", "机器智能"],
929
+ target_entity="AI技术"
930
+ )
931
+ ```
932
+
933
+ 使用自定义合并策略:
934
+
935
+ ```python
936
+ # 为不同字段定义自定义合并策略
937
+ rag.merge_entities(
938
+ source_entities=["约翰·史密斯", "史密斯博士", "J·史密斯"],
939
+ target_entity="约翰·史密斯",
940
+ merge_strategy={
941
+ "description": "concatenate", # 组合所有描述
942
+ "entity_type": "keep_first", # 保留第一个实体的类型
943
+ "source_id": "join_unique" # 组合所有唯一的源ID
944
+ }
945
+ )
946
+ ```
947
+
948
+ 使用自定义目标实体数据:
949
+
950
+ ```python
951
+ # 为合并后的实体指定确切值
952
+ rag.merge_entities(
953
+ source_entities=["纽约", "NYC", "大苹果"],
954
+ target_entity="纽约市",
955
+ target_entity_data={
956
+ "entity_type": "LOCATION",
957
+ "description": "纽约市是美国人口最多的城市。",
958
+ }
959
+ )
960
+ ```
961
+
962
+ 结合两种方法的高级用法:
963
+
964
+ ```python
965
+ # 使用策略和自定义数据合并公司实体
966
+ rag.merge_entities(
967
+ source_entities=["微软公司", "Microsoft Corporation", "MSFT"],
968
+ target_entity="微软",
969
+ merge_strategy={
970
+ "description": "concatenate", # 组合所有描述
971
+ "source_id": "join_unique" # 组合源ID
972
+ },
973
+ target_entity_data={
974
+ "entity_type": "ORGANIZATION",
975
+ }
976
+ )
977
+ ```
978
+
979
+ 合并实体时:
980
+
981
+ * 所有来自源实体的关系都会重定向到目标实体
982
+ * 重复的关系会被智能合并
983
+ * 防止自我关系(循环)
984
+ * 合并后删除源实体
985
+ * 保留关系权重和属性
986
+
987
+ </details>
988
+
989
  ## Token统计功能
990
+
991
  <details>
992
  <summary> <b>概述和使用</b> </summary>
993
 
 
1085
  * 关系数据(实体之间的连接)
1086
  * 来自向量数据库的关系信息
1087
 
1088
  ## 缓存
1089
 
1090
  <details>
README.md CHANGED
@@ -274,12 +274,6 @@ class QueryParam:
274
  max_token_for_local_context: int = int(os.getenv("MAX_TOKEN_ENTITY_DESC", "4000"))
275
  """Maximum number of tokens allocated for entity descriptions in local retrieval."""
276
 
277
- hl_keywords: list[str] = field(default_factory=list)
278
- """List of high-level keywords to prioritize in retrieval."""
279
-
280
- ll_keywords: list[str] = field(default_factory=list)
281
- """List of low-level keywords to refine retrieval focus."""
282
-
283
  conversation_history: list[dict[str, str]] = field(default_factory=list)
284
  """Stores past conversation history to maintain context.
285
  Format: [{"role": "user/assistant", "content": "message"}].
@@ -296,6 +290,11 @@ class QueryParam:
296
  If provided, this will be used instead of the global model function.
297
  This allows using different models for different query modes.
298
  """
 
 
 
 
 
299
  ```
300
 
301
  > The default value of top_k can be changed via the environment variable TOP_K.
@@ -571,76 +570,26 @@ response = rag.query(
571
 
572
  </details>
573
 
574
- ### Custom Prompt Support
575
-
576
- LightRAG now supports custom prompts for fine-tuned control over the system's behavior. Here's how to use it:
577
 
578
- <details>
579
- <summary> <b> Usage Example </b></summary>
580
 
581
  ```python
582
  # Create query parameters
583
  query_param = QueryParam(
584
- mode="hybrid", # or other mode: "local", "global", "hybrid", "mix" and "naive"
 
585
  )
586
 
587
- # Example 1: Using the default system prompt
588
  response_default = rag.query(
589
- "What are the primary benefits of renewable energy?",
590
  param=query_param
591
  )
592
  print(response_default)
593
-
594
- # Example 2: Using a custom prompt
595
- custom_prompt = """
596
- You are an expert assistant in environmental science. Provide detailed and structured answers with examples.
597
- ---Conversation History---
598
- {history}
599
-
600
- ---Knowledge Base---
601
- {context_data}
602
-
603
- ---Response Rules---
604
-
605
- - Target format and length: {response_type}
606
- """
607
- response_custom = rag.query(
608
- "What are the primary benefits of renewable energy?",
609
- param=query_param,
610
- system_prompt=custom_prompt # Pass the custom prompt
611
- )
612
- print(response_custom)
613
  ```
614
 
615
- </details>
616
-
617
- ### Separate Keyword Extraction
618
-
619
- We've introduced a new function `query_with_separate_keyword_extraction` to enhance the keyword extraction capabilities. This function separates the keyword extraction process from the user's prompt, focusing solely on the query to improve the relevance of extracted keywords.
620
-
621
- **How It Works?**
622
-
623
- The function operates by dividing the input into two parts:
624
-
625
- - `User Query`
626
- - `Prompt`
627
-
628
- It then performs keyword extraction exclusively on the `user query`. This separation ensures that the extraction process is focused and relevant, unaffected by any additional language in the `prompt`. It also allows the `prompt` to serve purely for response formatting, maintaining the intent and clarity of the user's original question.
629
 
630
- <details>
631
- <summary> <b> Usage Example </b></summary>
632
-
633
- This `example` shows how to tailor the function for educational content, focusing on detailed explanations for older students.
634
-
635
- ```python
636
- rag.query_with_separate_keyword_extraction(
637
- query="Explain the law of gravity",
638
- prompt="Provide a detailed explanation suitable for high school students studying physics.",
639
- param=QueryParam(mode="hybrid")
640
- )
641
- ```
642
-
643
- </details>
644
 
645
  ### Insert
646
 
@@ -725,70 +674,6 @@ rag.insert(text_content.decode('utf-8'))
725
 
726
  </details>
727
 
728
- <details>
729
- <summary> <b> Insert Custom KG </b></summary>
730
-
731
- ```python
732
- custom_kg = {
733
- "chunks": [
734
- {
735
- "content": "Alice and Bob are collaborating on quantum computing research.",
736
- "source_id": "doc-1"
737
- }
738
- ],
739
- "entities": [
740
- {
741
- "entity_name": "Alice",
742
- "entity_type": "person",
743
- "description": "Alice is a researcher specializing in quantum physics.",
744
- "source_id": "doc-1"
745
- },
746
- {
747
- "entity_name": "Bob",
748
- "entity_type": "person",
749
- "description": "Bob is a mathematician.",
750
- "source_id": "doc-1"
751
- },
752
- {
753
- "entity_name": "Quantum Computing",
754
- "entity_type": "technology",
755
- "description": "Quantum computing utilizes quantum mechanical phenomena for computation.",
756
- "source_id": "doc-1"
757
- }
758
- ],
759
- "relationships": [
760
- {
761
- "src_id": "Alice",
762
- "tgt_id": "Bob",
763
- "description": "Alice and Bob are research partners.",
764
- "keywords": "collaboration research",
765
- "weight": 1.0,
766
- "source_id": "doc-1"
767
- },
768
- {
769
- "src_id": "Alice",
770
- "tgt_id": "Quantum Computing",
771
- "description": "Alice conducts research on quantum computing.",
772
- "keywords": "research expertise",
773
- "weight": 1.0,
774
- "source_id": "doc-1"
775
- },
776
- {
777
- "src_id": "Bob",
778
- "tgt_id": "Quantum Computing",
779
- "description": "Bob researches quantum computing.",
780
- "keywords": "research application",
781
- "weight": 1.0,
782
- "source_id": "doc-1"
783
- }
784
- ]
785
- }
786
-
787
- rag.insert_custom_kg(custom_kg)
788
- ```
789
-
790
- </details>
791
-
792
  <details>
793
  <summary><b>Citation Functionality</b></summary>
794
 
@@ -992,12 +877,78 @@ updated_relation = rag.edit_relation("Google", "Google Mail", {
992
 
993
  All operations are available in both synchronous and asynchronous versions. The asynchronous versions have the prefix "a" (e.g., `acreate_entity`, `aedit_relation`).
994
 
995
- #### Entity Operations
996
 
997
  - **create_entity**: Creates a new entity with specified attributes
998
  - **edit_entity**: Updates an existing entity's attributes or renames it
999
 
1000
- #### Relation Operations
1001
 
1002
  - **create_relation**: Creates a new relation between existing entities
1003
  - **edit_relation**: Updates an existing relation's attributes
@@ -1006,6 +957,77 @@ These operations maintain data consistency across both the graph database and ve
1006
 
1007
  </details>
1008
 
1009
  ## Token Usage Tracking
1010
 
1011
  <details>
@@ -1112,78 +1134,6 @@ All exports include:
1112
  * Relation data (connections between entities)
1113
  * Relationship information from vector database
1114
 
1115
-
1116
- ## Entity Merging
1117
-
1118
- <details>
1119
- <summary> <b>Merge Entities and Their Relationships</b> </summary>
1120
-
1121
- LightRAG now supports merging multiple entities into a single entity, automatically handling all relationships:
1122
-
1123
- ```python
1124
- # Basic entity merging
1125
- rag.merge_entities(
1126
- source_entities=["Artificial Intelligence", "AI", "Machine Intelligence"],
1127
- target_entity="AI Technology"
1128
- )
1129
- ```
1130
-
1131
- With custom merge strategy:
1132
-
1133
- ```python
1134
- # Define custom merge strategy for different fields
1135
- rag.merge_entities(
1136
- source_entities=["John Smith", "Dr. Smith", "J. Smith"],
1137
- target_entity="John Smith",
1138
- merge_strategy={
1139
- "description": "concatenate", # Combine all descriptions
1140
- "entity_type": "keep_first", # Keep the entity type from the first entity
1141
- "source_id": "join_unique" # Combine all unique source IDs
1142
- }
1143
- )
1144
- ```
1145
-
1146
- With custom target entity data:
1147
-
1148
- ```python
1149
- # Specify exact values for the merged entity
1150
- rag.merge_entities(
1151
- source_entities=["New York", "NYC", "Big Apple"],
1152
- target_entity="New York City",
1153
- target_entity_data={
1154
- "entity_type": "LOCATION",
1155
- "description": "New York City is the most populous city in the United States.",
1156
- }
1157
- )
1158
- ```
1159
-
1160
- Advanced usage combining both approaches:
1161
-
1162
- ```python
1163
- # Merge company entities with both strategy and custom data
1164
- rag.merge_entities(
1165
- source_entities=["Microsoft Corp", "Microsoft Corporation", "MSFT"],
1166
- target_entity="Microsoft",
1167
- merge_strategy={
1168
- "description": "concatenate", # Combine all descriptions
1169
- "source_id": "join_unique" # Combine source IDs
1170
- },
1171
- target_entity_data={
1172
- "entity_type": "ORGANIZATION",
1173
- }
1174
- )
1175
- ```
1176
-
1177
- When merging entities:
1178
-
1179
- * All relationships from source entities are redirected to the target entity
1180
- * Duplicate relationships are intelligently merged
1181
- * Self-relationships (loops) are prevented
1182
- * Source entities are removed after merging
1183
- * Relationship weights and attributes are preserved
1184
-
1185
- </details>
1186
-
1187
  ## Cache
1188
 
1189
  <details>
 
274
  max_token_for_local_context: int = int(os.getenv("MAX_TOKEN_ENTITY_DESC", "4000"))
275
  """Maximum number of tokens allocated for entity descriptions in local retrieval."""
276
 
 
 
 
 
 
 
277
  conversation_history: list[dict[str, str]] = field(default_factory=list)
278
  """Stores past conversation history to maintain context.
279
  Format: [{"role": "user/assistant", "content": "message"}].
 
290
  If provided, this will be used instead of the global model function.
291
  This allows using different models for different query modes.
292
  """
293
+
294
+ user_prompt: str | None = None
295
+ """User-provided prompt for the query.
296
+ If provided, this will be used instead of the default value from the prompt template.
297
+ """
298
  ```
299
 
300
  > The default value of top_k can be changed via the environment variable TOP_K.
 
570
 
571
  </details>
572
 
573
+ ### Custom User Prompt Support
 
 
574
 
575
+ Custom user prompts do not affect the query content; they only instruct the LLM on how to handle the query results. Here's how to use them:
 
576
 
577
  ```python
578
  # Create query parameters
579
  query_param = QueryParam(
580
+ mode = "hybrid", # or another mode: "local", "global", "hybrid", "mix", or "naive"
581
+ user_prompt = "Please create the diagram using the Mermaid syntax"
582
  )
583
 
584
+ # Query and process
585
  response_default = rag.query(
586
+ "Please draw a character relationship diagram for Scrooge",
587
  param=query_param
588
  )
589
  print(response_default)
590
  ```
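As a rough illustration of the separation described above, the sketch below keeps the query text and the formatting instruction in separate slots of a prompt. `SketchQueryParam`, `build_llm_prompt`, `TEMPLATE`, and `DEFAULT_USER_PROMPT` are hypothetical stand-ins invented for this example and are not LightRAG's classes or prompt template; only the `mode` and `user_prompt` field names come from the README.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchQueryParam:
    """Hypothetical stand-in for QueryParam; only the two fields used here."""

    mode: str = "hybrid"
    user_prompt: Optional[str] = None


# Assumed template and default; LightRAG's real prompt template may differ.
TEMPLATE = "Context:\n{context}\n\nInstructions: {user_prompt}\n\nQuestion: {query}"
DEFAULT_USER_PROMPT = "n/a"


def build_llm_prompt(query: str, context: str, param: SketchQueryParam) -> str:
    """Fill the template; user_prompt shapes the answer and leaves the query untouched."""
    instructions = param.user_prompt or DEFAULT_USER_PROMPT
    return TEMPLATE.format(context=context, user_prompt=instructions, query=query)


query = "Please draw a character relationship diagram for Scrooge"
param = SketchQueryParam(user_prompt="Please create the diagram using the Mermaid syntax")

# The query text is passed through unchanged; the instruction fills a separate slot.
prompt = build_llm_prompt(query, context="(retrieved context)", param=param)
print(prompt)
```

Leaving `user_prompt` as `None` falls back to the assumed default, which mirrors the docstring's "used instead of the default value" wording.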
591
 
592
 
593
 
594
  ### Insert
595
 
 
674
 
675
  </details>
676
 
677
  <details>
678
  <summary><b>Citation Functionality</b></summary>
679
 
 
877
 
878
  All operations are available in both synchronous and asynchronous versions. The asynchronous versions have the prefix "a" (e.g., `acreate_entity`, `aedit_relation`).
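
The naming convention can be pictured with a toy class. `ToyGraph` is a hypothetical stand-in that mimics only the sync/async pairing; it is not LightRAG's implementation, and wrapping the coroutine with `asyncio.run` is just one common way to offer a synchronous entry point.

```python
import asyncio


class ToyGraph:
    """Hypothetical stand-in that mimics only the sync/async naming pattern."""

    def __init__(self) -> None:
        self.entities = {}

    async def acreate_entity(self, name: str, data: dict) -> dict:
        """Async version: carries the "a" prefix."""
        if name in self.entities:
            raise ValueError(f"entity {name!r} already exists")
        self.entities[name] = dict(data)
        return {"entity_name": name, **self.entities[name]}

    def create_entity(self, name: str, data: dict) -> dict:
        """Sync version: runs the async version to completion."""
        return asyncio.run(self.acreate_entity(name, data))


graph = ToyGraph()

# Synchronous call from ordinary code.
created = graph.create_entity("Google", {"entity_type": "company"})


async def main() -> dict:
    # Inside a running event loop, await the "a"-prefixed method directly.
    return await graph.acreate_entity("Gmail", {"entity_type": "product"})


created_async = asyncio.run(main())
print(created, created_async)
```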
879
 
880
+ </details>
881
+
882
+ <details>
883
+ <summary> <b> Insert Custom KG </b></summary>
884
+
885
+ ```python
886
+ custom_kg = {
887
+ "chunks": [
888
+ {
889
+ "content": "Alice and Bob are collaborating on quantum computing research.",
890
+ "source_id": "doc-1"
891
+ }
892
+ ],
893
+ "entities": [
894
+ {
895
+ "entity_name": "Alice",
896
+ "entity_type": "person",
897
+ "description": "Alice is a researcher specializing in quantum physics.",
898
+ "source_id": "doc-1"
899
+ },
900
+ {
901
+ "entity_name": "Bob",
902
+ "entity_type": "person",
903
+ "description": "Bob is a mathematician.",
904
+ "source_id": "doc-1"
905
+ },
906
+ {
907
+ "entity_name": "Quantum Computing",
908
+ "entity_type": "technology",
909
+ "description": "Quantum computing utilizes quantum mechanical phenomena for computation.",
910
+ "source_id": "doc-1"
911
+ }
912
+ ],
913
+ "relationships": [
914
+ {
915
+ "src_id": "Alice",
916
+ "tgt_id": "Bob",
917
+ "description": "Alice and Bob are research partners.",
918
+ "keywords": "collaboration research",
919
+ "weight": 1.0,
920
+ "source_id": "doc-1"
921
+ },
922
+ {
923
+ "src_id": "Alice",
924
+ "tgt_id": "Quantum Computing",
925
+ "description": "Alice conducts research on quantum computing.",
926
+ "keywords": "research expertise",
927
+ "weight": 1.0,
928
+ "source_id": "doc-1"
929
+ },
930
+ {
931
+ "src_id": "Bob",
932
+ "tgt_id": "Quantum Computing",
933
+ "description": "Bob researches quantum computing.",
934
+ "keywords": "research application",
935
+ "weight": 1.0,
936
+ "source_id": "doc-1"
937
+ }
938
+ ]
939
+ }
940
+
941
+ rag.insert_custom_kg(custom_kg)
942
+ ```
943
+
944
+ </details>
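
Before passing a dictionary like `custom_kg` to `insert_custom_kg`, it can help to check that it is internally consistent. The helper below is a hypothetical sketch, not part of LightRAG. It assumes the three top-level keys and field names shown in the example, and it further assumes (this is not stated in the README) that every relationship endpoint should be a declared entity and every `source_id` should match a chunk.

```python
def check_custom_kg(kg: dict) -> list:
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    chunk_ids = {chunk["source_id"] for chunk in kg.get("chunks", [])}
    entity_names = {entity["entity_name"] for entity in kg.get("entities", [])}

    for entity in kg.get("entities", []):
        if entity["source_id"] not in chunk_ids:
            problems.append(f"entity {entity['entity_name']!r}: unknown source_id")

    for rel in kg.get("relationships", []):
        for key in ("src_id", "tgt_id"):
            if rel[key] not in entity_names:
                problems.append(f"relationship endpoint {rel[key]!r} is not a declared entity")
        if rel["source_id"] not in chunk_ids:
            problems.append(f"relationship {rel['src_id']!r} -> {rel['tgt_id']!r}: unknown source_id")
    return problems


small_kg = {
    "chunks": [{"content": "Alice and Bob are research partners.", "source_id": "doc-1"}],
    "entities": [
        {"entity_name": "Alice", "entity_type": "person", "description": "A researcher.", "source_id": "doc-1"},
        {"entity_name": "Bob", "entity_type": "person", "description": "A mathematician.", "source_id": "doc-1"},
    ],
    "relationships": [
        {"src_id": "Alice", "tgt_id": "Bob", "description": "Research partners.",
         "keywords": "collaboration research", "weight": 1.0, "source_id": "doc-1"},
    ],
}

# Same data, but the relationship points at an entity that was never declared.
broken_kg = {
    **small_kg,
    "relationships": [{"src_id": "Alice", "tgt_id": "Quantum Computing", "source_id": "doc-1"}],
}

print(check_custom_kg(small_kg))
print(check_custom_kg(broken_kg))
```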
945
+
946
+ <details>
947
+ <summary> <b>Other Entity and Relation Operations</b></summary>
948
 
949
  - **create_entity**: Creates a new entity with specified attributes
950
  - **edit_entity**: Updates an existing entity's attributes or renames it
951
 
 
952
 
953
  - **create_relation**: Creates a new relation between existing entities
954
  - **edit_relation**: Updates an existing relation's attributes
 
957
 
958
  </details>
959
 
960
+ ## Entity Merging
961
+
962
+ <details>
963
+ <summary> <b>Merge Entities and Their Relationships</b> </summary>
964
+
965
+ LightRAG now supports merging multiple entities into a single entity, automatically handling all relationships:
966
+
967
+ ```python
968
+ # Basic entity merging
969
+ rag.merge_entities(
970
+ source_entities=["Artificial Intelligence", "AI", "Machine Intelligence"],
971
+ target_entity="AI Technology"
972
+ )
973
+ ```
974
+
975
+ With custom merge strategy:
976
+
977
+ ```python
978
+ # Define custom merge strategy for different fields
979
+ rag.merge_entities(
980
+ source_entities=["John Smith", "Dr. Smith", "J. Smith"],
981
+ target_entity="John Smith",
982
+ merge_strategy={
983
+ "description": "concatenate", # Combine all descriptions
984
+ "entity_type": "keep_first", # Keep the entity type from the first entity
985
+ "source_id": "join_unique" # Combine all unique source IDs
986
+ }
987
+ )
988
+ ```
989
+
990
+ With custom target entity data:
991
+
992
+ ```python
993
+ # Specify exact values for the merged entity
994
+ rag.merge_entities(
995
+ source_entities=["New York", "NYC", "Big Apple"],
996
+ target_entity="New York City",
997
+ target_entity_data={
998
+ "entity_type": "LOCATION",
999
+ "description": "New York City is the most populous city in the United States.",
1000
+ }
1001
+ )
1002
+ ```
1003
+
1004
+ Advanced usage combining both approaches:
1005
+
1006
+ ```python
1007
+ # Merge company entities with both strategy and custom data
1008
+ rag.merge_entities(
1009
+ source_entities=["Microsoft Corp", "Microsoft Corporation", "MSFT"],
1010
+ target_entity="Microsoft",
1011
+ merge_strategy={
1012
+ "description": "concatenate", # Combine all descriptions
1013
+ "source_id": "join_unique" # Combine source IDs
1014
+ },
1015
+ target_entity_data={
1016
+ "entity_type": "ORGANIZATION",
1017
+ }
1018
+ )
1019
+ ```
1020
+
1021
+ When merging entities:
1022
+
1023
+ * All relationships from source entities are redirected to the target entity
1024
+ * Duplicate relationships are intelligently merged
1025
+ * Self-relationships (loops) are prevented
1026
+ * Source entities are removed after merging
1027
+ * Relationship weights and attributes are preserved
1028
+
1029
+ </details>
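
One way to read the strategy names used above is as per-field reducers over the source entities. The sketch below is an interpretation under stated assumptions, not LightRAG's implementation: `merge_fields`, the `" | "` separator for `concatenate`, and the `<SEP>` separator for `join_unique` are all hypothetical, as is the rule that explicit `target_entity_data` values override strategy results.

```python
from typing import Optional


def concatenate(values: list) -> str:
    """Join all non-empty values (separator is an assumption)."""
    return " | ".join(v for v in values if v)


def keep_first(values: list) -> str:
    """Return the first non-empty value, or an empty string."""
    return next((v for v in values if v), "")


def join_unique(values: list, sep: str = "<SEP>") -> str:
    """Split each value on sep and keep every part once, in first-seen order."""
    seen = []
    for value in values:
        for part in value.split(sep):
            if part and part not in seen:
                seen.append(part)
    return sep.join(seen)


STRATEGIES = {"concatenate": concatenate, "keep_first": keep_first, "join_unique": join_unique}


def merge_fields(entities: list, merge_strategy: dict,
                 target_entity_data: Optional[dict] = None) -> dict:
    """Reduce each listed field with its strategy, then apply explicit overrides."""
    merged = {}
    for field, strategy in merge_strategy.items():
        values = [entity[field] for entity in entities if field in entity]
        merged[field] = STRATEGIES[strategy](values)
    merged.update(target_entity_data or {})
    return merged


sources = [
    {"description": "Microsoft Corp is a software company.", "entity_type": "COMPANY", "source_id": "doc-1"},
    {"description": "MSFT is a stock ticker.", "entity_type": "TICKER", "source_id": "doc-1<SEP>doc-2"},
]
merged = merge_fields(
    sources,
    merge_strategy={"description": "concatenate", "source_id": "join_unique"},
    target_entity_data={"entity_type": "ORGANIZATION"},
)
print(merged)
```

This covers only the entity fields; redirecting relationships, merging duplicates, and preventing self-loops are separate steps that the sketch does not attempt.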
1030
+
1031
  ## Token Usage Tracking
1032
 
1033
  <details>
 
1134
  * Relation data (connections between entities)
1135
  * Relationship information from vector database
1136
 
1137
  ## Cache
1138
 
1139
  <details>