LarFii committed
Commit fe5faf4 · 1 Parent(s): 0792fda

update README.md
Files changed (1)
  README.md +45 -2

README.md CHANGED
@@ -20,8 +20,8 @@ This repository hosts the code of LightRAG. The structure of this code is based
 </div>
 
 ## 🎉 News
-- [x] [2024.10.16]🎯🎯📢📢LightRAG now supports Ollama models!
-- [x] [2024.10.15]🎯🎯📢📢LightRAG now supports Hugging Face models!
+- [x] [2024.10.16]🎯🎯📢📢LightRAG now supports [Ollama models](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#using-ollama-models)!
+- [x] [2024.10.15]🎯🎯📢📢LightRAG now supports [Hugging Face models](https://github.com/HKUDS/LightRAG?tab=readme-ov-file#using-hugging-face-models)!
 
 ## Install
 
@@ -75,6 +75,42 @@ print(rag.query("What are the top themes in this story?", param=QueryParam(mode=
 # Perform hybrid search
 print(rag.query("What are the top themes in this story?", param=QueryParam(mode="hybrid")))
 ```
+
+### Open AI-like APIs
+LightRAG also support Open AI-like chat/embeddings APIs:
+```python
+async def llm_model_func(
+    prompt, system_prompt=None, history_messages=[], **kwargs
+) -> str:
+    return await openai_complete_if_cache(
+        "solar-mini",
+        prompt,
+        system_prompt=system_prompt,
+        history_messages=history_messages,
+        api_key=os.getenv("UPSTAGE_API_KEY"),
+        base_url="https://api.upstage.ai/v1/solar",
+        **kwargs
+    )
+
+async def embedding_func(texts: list[str]) -> np.ndarray:
+    return await openai_embedding(
+        texts,
+        model="solar-embedding-1-large-query",
+        api_key=os.getenv("UPSTAGE_API_KEY"),
+        base_url="https://api.upstage.ai/v1/solar"
+    )
+
+rag = LightRAG(
+    working_dir=WORKING_DIR,
+    llm_model_func=llm_model_func,
+    embedding_func=EmbeddingFunc(
+        embedding_dim=4096,
+        max_token_size=8192,
+        func=embedding_func
+    )
+)
+```
+
 ### Using Hugging Face Models
 If you want to use Hugging Face models, you only need to set LightRAG as follows:
 ```python
@@ -98,6 +134,7 @@ rag = LightRAG(
     ),
 )
 ```
+
 ### Using Ollama Models
 If you want to use Ollama models, you only need to set LightRAG as follows:
 ```python
@@ -119,11 +156,13 @@ rag = LightRAG(
     ),
 )
 ```
+
 ### Batch Insert
 ```python
 # Batch Insert: Insert multiple texts at once
 rag.insert(["TEXT1", "TEXT2",...])
 ```
+
 ### Incremental Insert
 
 ```python
@@ -207,6 +246,7 @@ Output your evaluation in the following JSON format:
     }}
 }}
 ```
+
 ### Overall Performance Table
 | | **Agriculture** | | **CS** | | **Legal** | | **Mix** | |
 |----------------------|-------------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|-----------------------|
@@ -233,6 +273,7 @@ Output your evaluation in the following JSON format:
 
 ## Reproduce
 All the code can be found in the `./reproduce` directory.
+
 ### Step-0 Extract Unique Contexts
 First, we need to extract unique contexts in the datasets.
 ```python
@@ -286,6 +327,7 @@ def extract_unique_contexts(input_directory, output_directory):
     print("All files have been processed.")
 
 ```
+
 ### Step-1 Insert Contexts
 For the extracted contexts, we insert them into the LightRAG system.
 
@@ -307,6 +349,7 @@ def insert_text(rag, file_path):
     if retries == max_retries:
         print("Insertion failed after exceeding the maximum number of retries")
 ```
+
 ### Step-2 Generate Queries
 
 We extract tokens from both the first half and the second half of each context in the dataset, then combine them as the dataset description to generate queries.
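The "Open AI-like APIs" section this commit adds works because any backend that speaks the OpenAI chat/embeddings wire format can be targeted simply by pointing `base_url` at it. As a rough, self-contained sketch of the request body that `llm_model_func` ultimately hands to such an endpoint (assuming the standard OpenAI chat-completions message schema; `build_chat_request` is an illustrative helper, not part of LightRAG):

```python
import json

def build_chat_request(prompt, system_prompt=None, history_messages=None,
                       model="solar-mini"):
    """Assemble an OpenAI-style chat-completions body: optional system
    message first, then any prior history, then the user prompt."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.extend(history_messages or [])
    messages.append({"role": "user", "content": prompt})
    return {"model": model, "messages": messages}

# The JSON below is what an OpenAI-compatible server (Upstage Solar in
# the diff above) would receive at <base_url>/chat/completions.
body = build_chat_request("What are the top themes in this story?",
                          system_prompt="You are a concise assistant.")
print(json.dumps(body, indent=2))
```

Any provider exposing this schema can be swapped in by changing only `model`, `base_url`, and the API key, which is all the diffed `llm_model_func`/`embedding_func` pair does.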