Andrii Lazarchuk committed
Commit 0cea469 · 1 Parent(s): a2724fe

Update README with more details

Files changed (1):
  1. README.md +40 -6
README.md CHANGED
@@ -163,7 +163,10 @@ rag = LightRAG(
 <details>
 <summary> Using Ollama Models </summary>
 
-* If you want to use Ollama models, you only need to set LightRAG as follows:
+### Overview
+If you want to use Ollama models, pull the model you plan to use along with an embedding model, for example `nomic-embed-text`.
+
+Then set up LightRAG as follows:
 
 ```python
 from lightrag.llm import ollama_model_complete, ollama_embedding
@@ -185,28 +188,59 @@ rag = LightRAG(
 )
 ```
 
-* Increasing the `num_ctx` parameter:
+### Increasing context size
+LightRAG requires a context window of at least 32k tokens, while Ollama models default to 8k. You can increase it in one of two ways:
+
+#### Increasing the `num_ctx` parameter in the Modelfile
 
 1. Pull the model:
-```python
+```bash
 ollama pull qwen2
 ```
 
 2. Display the model file:
-```python
+```bash
 ollama show --modelfile qwen2 > Modelfile
 ```
 
 3. Edit the Modelfile by adding the following line:
-```python
+```bash
 PARAMETER num_ctx 32768
 ```
 
 4. Create the modified model:
-```python
+```bash
 ollama create -f Modelfile qwen2m
 ```
 
+#### Setting `num_ctx` via the Ollama API
+You can use the `llm_model_kwargs` parameter to configure Ollama:
+
+```python
+rag = LightRAG(
+    working_dir=WORKING_DIR,
+    llm_model_func=ollama_model_complete,  # Use Ollama model for text generation
+    llm_model_name='your_model_name',  # Your model name
+    llm_model_kwargs={"options": {"num_ctx": 32768}},
+    # Use Ollama embedding function
+    embedding_func=EmbeddingFunc(
+        embedding_dim=768,
+        max_token_size=8192,
+        func=lambda texts: ollama_embedding(
+            texts,
+            embed_model="nomic-embed-text"
+        )
+    ),
+)
+```
+#### Fully functional example
+
+There is a fully functional example, `examples/lightrag_ollama_demo.py`, that uses the `gemma2:2b` model, runs only 4 requests in parallel, and sets the context size to 32k.
+
+#### Low-RAM GPUs
+
+To run this experiment on a low-RAM GPU, select a small model and tune the context window (increasing the context increases memory consumption). For example, running this Ollama example on a repurposed mining GPU with 6 GB of RAM required setting the context size to 26k while using `gemma2:2b`. It was able to find 197 entities and 19 relations on `book.txt`.
+
 </details>
 
 ### Query Param
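
For context on the diff above: the `llm_model_kwargs={"options": {"num_ctx": 32768}}` setting is ultimately forwarded as the `options` field of Ollama's generate request. A minimal sketch of what that request body looks like (the helper function and prompt are illustrative, not part of LightRAG or Ollama):

```python
import json


def build_generate_payload(model: str, prompt: str, num_ctx: int) -> dict:
    """Illustrative helper: assemble a JSON body for Ollama's
    /api/generate endpoint, passing num_ctx under "options" --
    the same shape LightRAG's llm_model_kwargs forwards."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_ctx": num_ctx},  # overrides the model's 8k default
    }


payload = build_generate_payload("qwen2m", "What is LightRAG?", 32768)
print(json.dumps(payload, indent=2))
```

This only shows the payload shape; a real call would POST it to a running Ollama server.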