yangdx committed
Commit · 9371b8e
1 Parent(s): 9fab518
Add document scan API notes in API README.md
lightrag/api/README.md +15 -4
lightrag/api/README.md
CHANGED
@@ -17,6 +17,7 @@ git clone https://github.com/HKUDS/lightrag.git
|
|
17 |
# Change to the repository directory
|
18 |
cd lightrag
|
19 |
|
|
|
20 |
# Install in editable mode with API support
|
21 |
pip install -e ".[api]"
|
22 |
```
|
@@ -309,6 +310,16 @@ curl -X POST "http://localhost:9621/documents/batch" \
|
|
309 |
-F "files=@/path/to/doc2.txt"
|
310 |
```
|
311 |
|
312 |
### Ollama Emulation Endpoints
|
313 |
|
314 |
#### GET /api/version
|
@@ -391,15 +402,15 @@ You can test the API endpoints using the provided curl commands or through the S
|
|
391 |
2. Start the RAG server
|
392 |
3. Upload some documents using the document management endpoints
|
393 |
4. Query the system using the query endpoints
|
|
|
394 |
|
395 |
### Important Features
|
396 |
|
397 |
#### Automatic Document Vectorization
|
398 |
When starting any of the servers with the `--input-dir` parameter, the system will automatically:
|
399 |
-
1.
|
400 |
-
2.
|
401 |
-
3.
|
402 |
-
4. Make all content immediately available for RAG queries
|
403 |
|
404 |
This intelligent caching mechanism:
|
405 |
- Prevents unnecessary re-vectorization of existing documents
|
|
|
17 |
# Change to the repository directory
|
18 |
cd lightrag
|
19 |
|
20 |
+
# Create a Python virtual environment if necessary
|
21 |
# Install in editable mode with API support
|
22 |
pip install -e ".[api]"
|
23 |
```
|
|
|
310 |
-F "files=@/path/to/doc2.txt"
|
311 |
```
|
312 |
|
313 |
+
#### POST /documents/scan
|
314 |
+
|
315 |
+
Trigger a scan for new files in the input directory.
|
316 |
+
|
317 |
+
```bash
|
318 |
+
curl -X POST "http://localhost:9621/documents/scan" --max-time 1800
|
319 |
+
```
|
320 |
+
|
321 |
+
> Adjust `--max-time` according to the estimated indexing time for all new files.
|
322 |
+
|
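The same scan request could also be issued from Python. The following is a minimal sketch using only the standard library; the host and port come from the curl example above, while the helper names `build_scan_request` and `trigger_scan` are hypothetical and not part of LightRAG. Note that `urllib`'s `timeout` applies to each blocking socket operation, so it is only a rough analogue of curl's `--max-time`, which caps the whole transfer.

```python
import urllib.request

# Host and port are taken from the curl example; adjust for your deployment.
BASE_URL = "http://localhost:9621"


def build_scan_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build an empty-bodied POST request for /documents/scan (hypothetical helper)."""
    url = base_url.rstrip("/") + "/documents/scan"
    return urllib.request.Request(url, data=b"", method="POST")


def trigger_scan(base_url: str = BASE_URL, timeout_s: float = 1800.0) -> int:
    """Send the scan request and return the HTTP status code.

    timeout_s is a per-socket-operation timeout, not a total time limit,
    so it only approximates curl's --max-time.
    """
    request = build_scan_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return response.status
```

Calling `trigger_scan()` against a running server would send the request; the response body format is not described in this change, so the sketch returns only the status code.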
323 |
### Ollama Emulation Endpoints
|
324 |
|
325 |
#### GET /api/version
|
|
|
402 |
2. Start the RAG server
|
403 |
3. Upload some documents using the document management endpoints
|
404 |
4. Query the system using the query endpoints
|
405 |
+
5. Trigger a document scan if new files are added to the inputs directory
|
406 |
|
407 |
### Important Features
|
408 |
|
409 |
#### Automatic Document Vectorization
|
410 |
When starting any of the servers with the `--input-dir` parameter, the system will automatically:
|
411 |
+
1. Check for existing vectorized content in the database
|
412 |
+
2. Only vectorize new documents that aren't already in the database
|
413 |
+
3. Make all content immediately available for RAG queries
|
|
|
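The "only vectorize new documents" step above can be pictured as a simple filter over the input directory. This is an illustrative sketch only: the change does not show how the server actually identifies already-indexed documents, so matching by file name and the function `select_new_documents` are assumptions made for the example.

```python
from pathlib import Path


def select_new_documents(input_dir: Path, indexed_names: set[str]) -> list[Path]:
    """Return files in input_dir whose names are not in indexed_names.

    Matching by file name is an assumption for illustration; the real
    server may track documents differently (for example by content hash).
    """
    candidates = sorted(p for p in input_dir.iterdir() if p.is_file())
    return [p for p in candidates if p.name not in indexed_names]
```

Given a directory containing `a.txt` and `b.txt` and an index that already holds `a.txt`, the function would return only the path to `b.txt`, which is the set of files that still needs vectorizing.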
414 |
|
415 |
This intelligent caching mechanism:
|
416 |
- Prevents unnecessary re-vectorization of existing documents
|