yangdx committed · Commit 9371b8e · 1 Parent(s): 9fab518

Add document scan API notes in API README.md

  1. lightrag/api/README.md +15 -4
lightrag/api/README.md CHANGED
@@ -17,6 +17,7 @@ git clone https://github.com/HKUDS/lightrag.git
  # Change to the repository directory
  cd lightrag

  # Install in editable mode with API support
  pip install -e ".[api]"
  ```
@@ -309,6 +310,16 @@ curl -X POST "http://localhost:9621/documents/batch" \
  -F "files=@/path/to/doc2.txt"
  ```

  ### Ollama Emulation Endpoints

  #### GET /api/version
@@ -391,15 +402,15 @@ You can test the API endpoints using the provided curl commands or through the S
  2. Start the RAG server
  3. Upload some documents using the document management endpoints
  4. Query the system using the query endpoints

  ### Important Features

  #### Automatic Document Vectorization
  When starting any of the servers with the `--input-dir` parameter, the system will automatically:
- 1. Scan the specified directory for documents
- 2. Check for existing vectorized content in the database
- 3. Only vectorize new documents that aren't already in the database
- 4. Make all content immediately available for RAG queries

  This intelligent caching mechanism:
  - Prevents unnecessary re-vectorization of existing documents
 
  # Change to the repository directory
  cd lightrag

+ # Create a Python virtual environment if necessary
  # Install in editable mode with API support
  pip install -e ".[api]"
  ```
 
  -F "files=@/path/to/doc2.txt"
  ```

+ #### POST /documents/scan
+
+ Trigger a scan for new files in the input directory.
+
+ ```bash
+ curl -X POST "http://localhost:9621/documents/scan" --max-time 1800
+ ```
+
+ > Adjust `--max-time` according to the estimated indexing time for all new files.
+
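For callers that prefer Python over curl, the same request can be sketched with the standard library. This is a minimal sketch, not part of the commit: the helper names `build_scan_request` and `trigger_scan` are hypothetical, the host and port are copied from the curl example, and the response body is not inspected because the README does not describe it. Note that the `timeout` of `urllib.request.urlopen` applies to each blocking socket operation, so it only approximates curl's `--max-time`, which bounds the whole transfer.

```python
import urllib.request

# Host and port are taken from the curl example; adjust for your deployment.
DEFAULT_BASE_URL = "http://localhost:9621"


def build_scan_request(base_url: str = DEFAULT_BASE_URL) -> urllib.request.Request:
    """Build a POST request for /documents/scan with an empty body (hypothetical helper)."""
    url = base_url.rstrip("/") + "/documents/scan"
    return urllib.request.Request(url, data=b"", method="POST")


def trigger_scan(base_url: str = DEFAULT_BASE_URL, timeout_s: float = 1800.0) -> int:
    """Send the scan request and return the HTTP status code.

    timeout_s is a socket-level timeout, so it is only a rough stand-in
    for curl's --max-time; size it to the expected indexing time.
    """
    request = build_scan_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return response.status
```

With the server running, `trigger_scan()` blocks until the server responds or the timeout is hit, so a long timeout is appropriate when many new files are waiting to be indexed.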
  ### Ollama Emulation Endpoints

  #### GET /api/version
 
  2. Start the RAG server
  3. Upload some documents using the document management endpoints
  4. Query the system using the query endpoints
+ 5. Trigger a document scan if new files are put into the input directory

  ### Important Features

  #### Automatic Document Vectorization
  When starting any of the servers with the `--input-dir` parameter, the system will automatically:
+ 1. Check for existing vectorized content in the database
+ 2. Only vectorize new documents that aren't already in the database
+ 3. Make all content immediately available for RAG queries

  This intelligent caching mechanism:
  - Prevents unnecessary re-vectorization of existing documents