zrguo committed
Commit 3538442 · unverified · Parents: 41b651b dfe4914

Merge pull request #593 from ParisNeo/main


Major Updates: Docker Support, Enhanced Configuration, and Documentation Restructuring #554

.env.example ADDED
@@ -0,0 +1,60 @@
+ # Server Configuration
+ HOST=0.0.0.0
+ PORT=9621
+
+ # Directory Configuration
+ WORKING_DIR=/app/data/rag_storage
+ INPUT_DIR=/app/data/inputs
+
+ # LLM Configuration (use a valid host; for services running on the Docker host, use host.docker.internal)
+ # Ollama example
+ LLM_BINDING=ollama
+ LLM_BINDING_HOST=http://host.docker.internal:11434
+ LLM_MODEL=mistral-nemo:latest
+
+ # Lollms example (uncomment to use instead; a later assignment would override the Ollama one above)
+ # LLM_BINDING=lollms
+ # LLM_BINDING_HOST=http://host.docker.internal:9600
+ # LLM_MODEL=mistral-nemo:latest
+
+
+ # Embedding Configuration (use a valid host; for services running on the Docker host, use host.docker.internal)
+ # Ollama example
+ EMBEDDING_BINDING=ollama
+ EMBEDDING_BINDING_HOST=http://host.docker.internal:11434
+ EMBEDDING_MODEL=bge-m3:latest
+
+ # Lollms example (uncomment to use instead)
+ # EMBEDDING_BINDING=lollms
+ # EMBEDDING_BINDING_HOST=http://host.docker.internal:9600
+ # EMBEDDING_MODEL=bge-m3:latest
+
+ # RAG Configuration
+ MAX_ASYNC=4
+ MAX_TOKENS=32768
+ EMBEDDING_DIM=1024
+ MAX_EMBED_TOKENS=8192
+
+ # Security (leave empty to disable authentication)
+ LIGHTRAG_API_KEY=your-secure-api-key-here
+
+ # Logging
+ LOG_LEVEL=INFO
+
+ # Optional SSL Configuration
+ #SSL=true
+ #SSL_CERTFILE=/path/to/cert.pem
+ #SSL_KEYFILE=/path/to/key.pem
+
+ # Optional Timeout
+ #TIMEOUT=30
+
+
+ # Optional for Azure
+ # AZURE_OPENAI_API_VERSION=2024-08-01-preview
+ # AZURE_OPENAI_DEPLOYMENT=gpt-4o
+ # AZURE_OPENAI_API_KEY=myapikey
+ # AZURE_OPENAI_ENDPOINT=https://myendpoint.openai.azure.com
+
+ # AZURE_EMBEDDING_DEPLOYMENT=text-embedding-3-large
+ # AZURE_EMBEDDING_API_VERSION=2023-05-15
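A minimal sketch of how this template is meant to be consumed; the `docker-compose config` check is just one way to verify the resolved values and is not part of this commit:

```bash
# Copy the template, edit it, then render the effective compose configuration
cp .env.example .env
docker-compose config   # prints the resolved config, including values pulled from .env
```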
Dockerfile ADDED
@@ -0,0 +1,41 @@
+ # Build stage
+ FROM python:3.11-slim AS builder
+
+ WORKDIR /app
+
+ # Install build dependencies
+ RUN apt-get update && apt-get install -y --no-install-recommends \
+     build-essential \
+     && rm -rf /var/lib/apt/lists/*
+
+ # Copy only requirements files first to leverage Docker cache
+ COPY requirements.txt .
+ COPY lightrag/api/requirements.txt ./lightrag/api/
+
+ # Install dependencies
+ RUN pip install --user --no-cache-dir -r requirements.txt
+ RUN pip install --user --no-cache-dir -r lightrag/api/requirements.txt
+
+ # Final stage
+ FROM python:3.11-slim
+
+ WORKDIR /app
+
+ # Copy only necessary files from builder
+ COPY --from=builder /root/.local /root/.local
+ COPY ./lightrag ./lightrag
+ COPY setup.py .
+ COPY .env .
+
+ RUN pip install .
+ # Make sure scripts in .local are usable
+ ENV PATH=/root/.local/bin:$PATH
+
+ # Create necessary directories
+ RUN mkdir -p /app/data/rag_storage /app/data/inputs
+
+ # Expose the default port
+ EXPOSE 9621
+
+ # Set entrypoint
+ ENTRYPOINT ["python", "-m", "lightrag.api.lightrag_server"]
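For readers who want to try the image without compose, a minimal sketch of a standalone build and run, assuming the port and paths from the Dockerfile above (the tag `lightrag:local` is illustrative):

```bash
# Build the image from the repository root
docker build -t lightrag:local .

# Run it, publishing the API port and mounting the data directories on the host
docker run --rm -p 9621:9621 --env-file .env \
    -v "$(pwd)/data/rag_storage:/app/data/rag_storage" \
    -v "$(pwd)/data/inputs:/app/data/inputs" \
    lightrag:local
```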
README.md CHANGED
@@ -921,342 +921,10 @@ def extract_queries(file_path):
 ```
 </details>
 
- ## Install with API Support
-
- LightRAG provides optional API support through FastAPI servers that add RAG capabilities to existing LLM services. You can install LightRAG with API support in two ways:
-
- ### 1. Installation from PyPI
-
- ```bash
- pip install "lightrag-hku[api]"
- ```
-
- ### 2. Installation from Source (Development)
-
- ```bash
- # Clone the repository
- git clone https://github.com/HKUDS/lightrag.git
-
- # Change to the repository directory
- cd lightrag
-
- # Install in editable mode with API support
- pip install -e ".[api]"
- ```
-
- ### Prerequisites
-
- Before running any of the servers, ensure you have the corresponding backend service running for both llm and embedding.
- The new api allows you to mix different bindings for llm/embeddings.
- For example, you have the possibility to use ollama for the embedding and openai for the llm.
-
- #### For LoLLMs Server
- - LoLLMs must be running and accessible
- - Default connection: http://localhost:9600
- - Configure using --llm-binding-host and/or --embedding-binding-host if running on a different host/port
-
- #### For Ollama Server
- - Ollama must be running and accessible
- - Default connection: http://localhost:11434
- - Configure using --ollama-host if running on a different host/port
-
- #### For OpenAI Server
- - Requires valid OpenAI API credentials set in environment variables
- - OPENAI_API_KEY must be set
-
- #### For Azure OpenAI Server
- Azure OpenAI API can be created using the following commands in Azure CLI (you need to install Azure CLI first from [https://docs.microsoft.com/en-us/cli/azure/install-azure-cli](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli)):
- ```bash
- # Change the resource group name, location and OpenAI resource name as needed
- RESOURCE_GROUP_NAME=LightRAG
- LOCATION=swedencentral
- RESOURCE_NAME=LightRAG-OpenAI
-
- az login
- az group create --name $RESOURCE_GROUP_NAME --location $LOCATION
- az cognitiveservices account create --name $RESOURCE_NAME --resource-group $RESOURCE_GROUP_NAME --kind OpenAI --sku S0 --location swedencentral
- az cognitiveservices account deployment create --resource-group $RESOURCE_GROUP_NAME --model-format OpenAI --name $RESOURCE_NAME --deployment-name gpt-4o --model-name gpt-4o --model-version "2024-08-06" --sku-capacity 100 --sku-name "Standard"
- az cognitiveservices account deployment create --resource-group $RESOURCE_GROUP_NAME --model-format OpenAI --name $RESOURCE_NAME --deployment-name text-embedding-3-large --model-name text-embedding-3-large --model-version "1" --sku-capacity 80 --sku-name "Standard"
- az cognitiveservices account show --name $RESOURCE_NAME --resource-group $RESOURCE_GROUP_NAME --query "properties.endpoint"
- az cognitiveservices account keys list --name $RESOURCE_NAME -g $RESOURCE_GROUP_NAME
-
- ```
- The output of the last command will give you the endpoint and the key for the OpenAI API. You can use these values to set the environment variables in the `.env` file.
-
-
-
- ### Configuration Options
-
- Each server has its own specific configuration options:
-
- #### LightRag Server Options
-
- | Parameter | Default | Description |
- |-----------|---------|-------------|
- | --host | 0.0.0.0 | Server host |
- | --port | 9621 | Server port |
- | --llm-binding | ollama | LLM binding to be used. Supported: lollms, ollama, openai |
- | --llm-binding-host | (dynamic) | LLM server host URL. Defaults based on binding: http://localhost:11434 (ollama), http://localhost:9600 (lollms), https://api.openai.com/v1 (openai) |
- | --llm-model | mistral-nemo:latest | LLM model name |
- | --embedding-binding | ollama | Embedding binding to be used. Supported: lollms, ollama, openai |
- | --embedding-binding-host | (dynamic) | Embedding server host URL. Defaults based on binding: http://localhost:11434 (ollama), http://localhost:9600 (lollms), https://api.openai.com/v1 (openai) |
- | --embedding-model | bge-m3:latest | Embedding model name |
- | --working-dir | ./rag_storage | Working directory for RAG storage |
- | --input-dir | ./inputs | Directory containing input documents |
- | --max-async | 4 | Maximum async operations |
- | --max-tokens | 32768 | Maximum token size |
- | --embedding-dim | 1024 | Embedding dimensions |
- | --max-embed-tokens | 8192 | Maximum embedding token size |
- | --timeout | None | Timeout in seconds (useful when using slow AI). Use None for infinite timeout |
- | --log-level | INFO | Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL) |
- | --key | None | API key for authentication. Protects lightrag server against unauthorized access |
- | --ssl | False | Enable HTTPS |
- | --ssl-certfile | None | Path to SSL certificate file (required if --ssl is enabled) |
- | --ssl-keyfile | None | Path to SSL private key file (required if --ssl is enabled) |
-
-
-
- For protecting the server using an authentication key, you can also use an environment variable named `LIGHTRAG_API_KEY`.
- ### Example Usage
-
- #### Running a Lightrag server with ollama default local server as llm and embedding backends
-
- Ollama is the default backend for both llm and embedding, so by default you can run lightrag-server with no parameters and the default ones will be used. Make sure ollama is installed and is running and default models are already installed on ollama.
-
- ```bash
- # Run lightrag with ollama, mistral-nemo:latest for llm, and bge-m3:latest for embedding
- lightrag-server
-
- # Using specific models (ensure they are installed in your ollama instance)
- lightrag-server --llm-model adrienbrault/nous-hermes2theta-llama3-8b:f16 --embedding-model nomic-embed-text --embedding-dim 1024
-
- # Using an authentication key
- lightrag-server --key my-key
-
- # Using lollms for llm and ollama for embedding
- lightrag-server --llm-binding lollms
- ```
-
- #### Running a Lightrag server with lollms default local server as llm and embedding backends
-
- ```bash
- # Run lightrag with lollms, mistral-nemo:latest for llm, and bge-m3:latest for embedding, use lollms for both llm and embedding
- lightrag-server --llm-binding lollms --embedding-binding lollms
-
- # Using specific models (ensure they are installed in your ollama instance)
- lightrag-server --llm-binding lollms --llm-model adrienbrault/nous-hermes2theta-llama3-8b:f16 --embedding-binding lollms --embedding-model nomic-embed-text --embedding-dim 1024
-
- # Using an authentication key
- lightrag-server --key my-key
-
- # Using lollms for llm and openai for embedding
- lightrag-server --llm-binding lollms --embedding-binding openai --embedding-model text-embedding-3-small
- ```
-
-
- #### Running a Lightrag server with openai server as llm and embedding backends
-
- ```bash
- # Run lightrag with lollms, GPT-4o-mini for llm, and text-embedding-3-small for embedding, use openai for both llm and embedding
- lightrag-server --llm-binding openai --llm-model GPT-4o-mini --embedding-binding openai --embedding-model text-embedding-3-small
-
- # Using an authentication key
- lightrag-server --llm-binding openai --llm-model GPT-4o-mini --embedding-binding openai --embedding-model text-embedding-3-small --key my-key
-
- # Using lollms for llm and openai for embedding
- lightrag-server --llm-binding lollms --embedding-binding openai --embedding-model text-embedding-3-small
- ```
-
- #### Running a Lightrag server with azure openai server as llm and embedding backends
-
- ```bash
- # Run lightrag with lollms, GPT-4o-mini for llm, and text-embedding-3-small for embedding, use openai for both llm and embedding
- lightrag-server --llm-binding azure_openai --llm-model GPT-4o-mini --embedding-binding openai --embedding-model text-embedding-3-small
-
- # Using an authentication key
- lightrag-server --llm-binding azure_openai --llm-model GPT-4o-mini --embedding-binding azure_openai --embedding-model text-embedding-3-small --key my-key
-
- # Using lollms for llm and azure_openai for embedding
- lightrag-server --llm-binding lollms --embedding-binding azure_openai --embedding-model text-embedding-3-small
- ```
-
- **Important Notes:**
- - For LoLLMs: Make sure the specified models are installed in your LoLLMs instance
- - For Ollama: Make sure the specified models are installed in your Ollama instance
- - For OpenAI: Ensure you have set up your OPENAI_API_KEY environment variable
- - For Azure OpenAI: Build and configure your server as stated in the Prequisites section
-
- For help on any server, use the --help flag:
- ```bash
- lightrag-server --help
- ```
-
- Note: If you don't need the API functionality, you can install the base package without API support using:
- ```bash
- pip install lightrag-hku
- ```
-
- ## API Endpoints
-
- All servers (LoLLMs, Ollama, OpenAI and Azure OpenAI) provide the same REST API endpoints for RAG functionality.
-
- ### Query Endpoints
-
- #### POST /query
- Query the RAG system with options for different search modes.
-
- ```bash
- curl -X POST "http://localhost:9621/query" \
-     -H "Content-Type: application/json" \
-     -d '{"query": "Your question here", "mode": "hybrid", ""}'
- ```
-
- #### POST /query/stream
- Stream responses from the RAG system.
-
- ```bash
- curl -X POST "http://localhost:9621/query/stream" \
-     -H "Content-Type: application/json" \
-     -d '{"query": "Your question here", "mode": "hybrid"}'
- ```
-
- ### Document Management Endpoints
-
- #### POST /documents/text
- Insert text directly into the RAG system.
-
- ```bash
- curl -X POST "http://localhost:9621/documents/text" \
-     -H "Content-Type: application/json" \
-     -d '{"text": "Your text content here", "description": "Optional description"}'
- ```
-
- #### POST /documents/file
- Upload a single file to the RAG system.
-
- ```bash
- curl -X POST "http://localhost:9621/documents/file" \
-     -F "file=@/path/to/your/document.txt" \
-     -F "description=Optional description"
- ```
-
- #### POST /documents/batch
- Upload multiple files at once.
-
- ```bash
- curl -X POST "http://localhost:9621/documents/batch" \
-     -F "files=@/path/to/doc1.txt" \
-     -F "files=@/path/to/doc2.txt"
- ```
-
- #### DELETE /documents
- Clear all documents from the RAG system.
-
- ```bash
- curl -X DELETE "http://localhost:9621/documents"
- ```
-
- ### Utility Endpoints
-
- #### GET /health
- Check server health and configuration.
-
- ```bash
- curl "http://localhost:9621/health"
- ```
-
- ## Development
- Contribute to the project: [Guide](contributor-readme.MD)
-
- ### Running in Development Mode
-
- For LoLLMs:
- ```bash
- uvicorn lollms_lightrag_server:app --reload --port 9621
- ```
-
- For Ollama:
- ```bash
- uvicorn ollama_lightrag_server:app --reload --port 9621
- ```
-
- For OpenAI:
- ```bash
- uvicorn openai_lightrag_server:app --reload --port 9621
- ```
- For Azure OpenAI:
- ```bash
- uvicorn azure_openai_lightrag_server:app --reload --port 9621
- ```
- ### API Documentation
-
- When any server is running, visit:
- - Swagger UI: http://localhost:9621/docs
- - ReDoc: http://localhost:9621/redoc
-
- ### Testing API Endpoints
-
- You can test the API endpoints using the provided curl commands or through the Swagger UI interface. Make sure to:
- 1. Start the appropriate backend service (LoLLMs, Ollama, or OpenAI)
- 2. Start the RAG server
- 3. Upload some documents using the document management endpoints
- 4. Query the system using the query endpoints
-
- ### Important Features
-
- #### Automatic Document Vectorization
- When starting any of the servers with the `--input-dir` parameter, the system will automatically:
- 1. Scan the specified directory for documents
- 2. Check for existing vectorized content in the database
- 3. Only vectorize new documents that aren't already in the database
- 4. Make all content immediately available for RAG queries
-
- This intelligent caching mechanism:
- - Prevents unnecessary re-vectorization of existing documents
- - Reduces startup time for subsequent runs
- - Preserves system resources
- - Maintains consistency across restarts
-
- ### Example Usage
-
- #### LoLLMs RAG Server
-
- ```bash
- # Start server with automatic document vectorization
- # Only new documents will be vectorized, existing ones will be loaded from cache
- lollms-lightrag-server --input-dir ./my_documents --port 8080
- ```
-
- #### Ollama RAG Server
-
- ```bash
- # Start server with automatic document vectorization
- # Previously vectorized documents will be loaded from the database
- ollama-lightrag-server --input-dir ./my_documents --port 8080
- ```
-
- #### OpenAI RAG Server
-
- ```bash
- # Start server with automatic document vectorization
- # Existing documents are retrieved from cache, only new ones are processed
- openai-lightrag-server --input-dir ./my_documents --port 9624
- ```
-
- #### Azure OpenAI RAG Server
-
- ```bash
- # Start server with automatic document vectorization
- # Existing documents are retrieved from cache, only new ones are processed
- azure-openai-lightrag-server --input-dir ./my_documents --port 9624
- ```
-
- **Important Notes:**
- - The `--input-dir` parameter enables automatic document processing at startup
- - Documents already in the database are not re-vectorized
- - Only new documents in the input directory will be processed
- - This optimization significantly reduces startup time for subsequent runs
- - The working directory (`--working-dir`) stores the vectorized documents database
+ ## API
+ LightRAG can be installed with API support to serve a FastAPI interface for uploading data, indexing it, running RAG operations, and rescanning the input folder.
+
+ The documentation can be found [here](https://github.com/ParisNeo/LightRAG/blob/main/docs/LightRagAPI.md)
 
 ## Star History
 
docker-compose.yml ADDED
@@ -0,0 +1,22 @@
+ version: '3.8'
+
+ services:
+   lightrag:
+     build: .
+     ports:
+       - "${PORT:-9621}:9621"
+     volumes:
+       - ./data/rag_storage:/app/data/rag_storage
+       - ./data/inputs:/app/data/inputs
+     env_file:
+       - .env
+     environment:
+       - TZ=UTC
+     restart: unless-stopped
+     networks:
+       - lightrag_net
+     extra_hosts:
+       - "host.docker.internal:host-gateway"
+
+ networks:
+   lightrag_net:
+     driver: bridge
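A minimal sketch of bringing the stack up and smoke-testing it, using the `/health` endpoint documented in this PR (assumes the default port above):

```bash
docker-compose up -d                # build and start the service in the background
docker-compose logs -f lightrag     # watch startup output; Ctrl-C stops following
curl http://localhost:9621/health   # quick check that the API is serving
```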
docs/DockerDeployment.md ADDED
@@ -0,0 +1,176 @@
+ # LightRAG
+
+ A lightweight Knowledge Graph Retrieval-Augmented Generation system with multiple LLM backend support.
+
+ ## 🚀 Installation
+
+ ### Prerequisites
+ - Python 3.10+
+ - Git
+ - Docker (optional, for Docker deployment)
+
+ ### Native Installation
+
+ 1. Clone the repository:
+ ```bash
+ # Linux/macOS
+ git clone https://github.com/ParisNeo/LightRAG.git
+ cd LightRAG
+ ```
+ ```powershell
+ # Windows PowerShell
+ git clone https://github.com/ParisNeo/LightRAG.git
+ cd LightRAG
+ ```
+
+ 2. Configure your environment:
+ ```bash
+ # Linux/macOS
+ cp .env.example .env
+ # Edit .env with your preferred configuration
+ ```
+ ```powershell
+ # Windows PowerShell
+ Copy-Item .env.example .env
+ # Edit .env with your preferred configuration
+ ```
+
+ 3. Create and activate a virtual environment:
+ ```bash
+ # Linux/macOS
+ python -m venv venv
+ source venv/bin/activate
+ ```
+ ```powershell
+ # Windows PowerShell
+ python -m venv venv
+ .\venv\Scripts\Activate
+ ```
+
+ 4. Install dependencies:
+ ```bash
+ # Both platforms
+ pip install -r requirements.txt
+ ```
+
+ ## 🐳 Docker Deployment
+
+ The Docker instructions work the same on all platforms with Docker Desktop installed.
+
+ 1. Build and start the container:
+ ```bash
+ docker-compose up -d
+ ```
+
+ ### Configuration Options
+
+ LightRAG can be configured using environment variables in the `.env` file:
+
+ #### Server Configuration
+ - `HOST`: Server host (default: 0.0.0.0)
+ - `PORT`: Server port (default: 9621)
+
+ #### LLM Configuration
+ - `LLM_BINDING`: LLM backend to use (lollms/ollama/openai)
+ - `LLM_BINDING_HOST`: LLM server host URL
+ - `LLM_MODEL`: Model name to use
+
+ #### Embedding Configuration
+ - `EMBEDDING_BINDING`: Embedding backend (lollms/ollama/openai)
+ - `EMBEDDING_BINDING_HOST`: Embedding server host URL
+ - `EMBEDDING_MODEL`: Embedding model name
+
+ #### RAG Configuration
+ - `MAX_ASYNC`: Maximum async operations
+ - `MAX_TOKENS`: Maximum token size
+ - `EMBEDDING_DIM`: Embedding dimensions
+ - `MAX_EMBED_TOKENS`: Maximum embedding token size
+
+ #### Security
+ - `LIGHTRAG_API_KEY`: API key for authentication
+
+ ### Data Storage Paths
+
+ The system uses the following paths for data storage:
+ ```
+ data/
+ ├── rag_storage/    # RAG data persistence
+ └── inputs/         # Input documents
+ ```
+
+ ### Example Deployments
+
+ 1. Using with Ollama:
+ ```env
+ LLM_BINDING=ollama
+ LLM_BINDING_HOST=http://host.docker.internal:11434
+ LLM_MODEL=mistral
+ EMBEDDING_BINDING=ollama
+ EMBEDDING_BINDING_HOST=http://host.docker.internal:11434
+ EMBEDDING_MODEL=bge-m3
+ ```
+
+ Note that a container cannot reach host services via `localhost`: use `host.docker.internal` instead, which the docker-compose file maps via `extra_hosts` so the container can reach services running on the Docker host (see the connectivity check below).
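One hedged way to confirm the container can actually reach a host-side Ollama; this check is illustrative and not part of the commit. It uses Python's standard library because the slim image does not ship curl, and `/api/tags` is a standard Ollama endpoint:

```bash
# Run a one-off request from inside the lightrag container
docker-compose exec lightrag python -c \
    "import urllib.request; print(urllib.request.urlopen('http://host.docker.internal:11434/api/tags').read()[:200])"
```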
+
+ 2. Using with OpenAI:
+ ```env
+ LLM_BINDING=openai
+ LLM_MODEL=gpt-3.5-turbo
+ EMBEDDING_BINDING=openai
+ EMBEDDING_MODEL=text-embedding-ada-002
+ OPENAI_API_KEY=your-api-key
+ ```
+
+ ### API Usage
+
+ Once deployed, you can interact with the API at `http://localhost:9621`.
+
+ Example query using PowerShell:
+ ```powershell
+ $headers = @{
+     "X-API-Key" = "your-api-key"
+     "Content-Type" = "application/json"
+ }
+ $body = @{
+     query = "your question here"
+ } | ConvertTo-Json
+
+ Invoke-RestMethod -Uri "http://localhost:9621/query" -Method Post -Headers $headers -Body $body
+ ```
+
+ Example query using curl:
+ ```bash
+ curl -X POST "http://localhost:9621/query" \
+     -H "X-API-Key: your-api-key" \
+     -H "Content-Type: application/json" \
+     -d '{"query": "your question here"}'
+ ```
+
+ ## 🔒 Security
+
+ Remember to:
+ 1. Set a strong API key in production
+ 2. Use SSL in production environments
+ 3. Configure proper network security
+
+ ## 📦 Updates
+
+ To update the Docker container (the image is built from local source, so pull the code first):
+ ```bash
+ git pull
+ docker-compose up -d --build
+ ```
+
+ To update a native installation:
+ ```bash
+ # Linux/macOS
+ git pull
+ source venv/bin/activate
+ pip install -r requirements.txt
+ ```
+ ```powershell
+ # Windows PowerShell
+ git pull
+ .\venv\Scripts\Activate
+ pip install -r requirements.txt
+ ```
docs/LightRagAPI.md ADDED
@@ -0,0 +1,361 @@
+ ## Install with API Support
+
+ LightRAG provides optional API support through FastAPI servers that add RAG capabilities to existing LLM services. You can install LightRAG with API support in two ways:
+
+ ### 1. Installation from PyPI
+
+ ```bash
+ pip install "lightrag-hku[api]"
+ ```
+
+ ### 2. Installation from Source (Development)
+
+ ```bash
+ # Clone the repository
+ git clone https://github.com/HKUDS/lightrag.git
+
+ # Change to the repository directory
+ cd lightrag
+
+ # Install in editable mode with API support
+ pip install -e ".[api]"
+ ```
+
+ ### Prerequisites
+
+ Before running any of the servers, ensure the corresponding backend services are running for both the LLM and the embedding model.
+ The new API lets you mix bindings for the LLM and embeddings; for example, you can use Ollama for embedding and OpenAI for the LLM.
+
+ #### For LoLLMs Server
+ - LoLLMs must be running and accessible
+ - Default connection: http://localhost:9600
+ - Configure using --llm-binding-host and/or --embedding-binding-host if running on a different host/port
+
+ #### For Ollama Server
+ - Ollama must be running and accessible
+ - Default connection: http://localhost:11434
+ - Configure using --ollama-host if running on a different host/port
+
+ #### For OpenAI Server
+ - Requires valid OpenAI API credentials set in environment variables
+ - OPENAI_API_KEY must be set
+
+ #### For Azure OpenAI Server
+ An Azure OpenAI resource can be created with the following Azure CLI commands (install the Azure CLI first from [https://docs.microsoft.com/en-us/cli/azure/install-azure-cli](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli)):
+ ```bash
+ # Change the resource group name, location and OpenAI resource name as needed
+ RESOURCE_GROUP_NAME=LightRAG
+ LOCATION=swedencentral
+ RESOURCE_NAME=LightRAG-OpenAI
+
+ az login
+ az group create --name $RESOURCE_GROUP_NAME --location $LOCATION
+ az cognitiveservices account create --name $RESOURCE_NAME --resource-group $RESOURCE_GROUP_NAME --kind OpenAI --sku S0 --location $LOCATION
+ az cognitiveservices account deployment create --resource-group $RESOURCE_GROUP_NAME --model-format OpenAI --name $RESOURCE_NAME --deployment-name gpt-4o --model-name gpt-4o --model-version "2024-08-06" --sku-capacity 100 --sku-name "Standard"
+ az cognitiveservices account deployment create --resource-group $RESOURCE_GROUP_NAME --model-format OpenAI --name $RESOURCE_NAME --deployment-name text-embedding-3-large --model-name text-embedding-3-large --model-version "1" --sku-capacity 80 --sku-name "Standard"
+ az cognitiveservices account show --name $RESOURCE_NAME --resource-group $RESOURCE_GROUP_NAME --query "properties.endpoint"
+ az cognitiveservices account keys list --name $RESOURCE_NAME -g $RESOURCE_GROUP_NAME
+ ```
+ The last two commands output the endpoint and the key for the Azure OpenAI API. You can use these values to set the environment variables in the `.env` file.
+
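A sketch of wiring that CLI output straight into `.env`. It assumes the shell variables from the block above are still set; `key1` and `-o tsv` are standard Azure CLI output options, but verify the variable names against your `.env.example`:

```bash
# Append the endpoint and primary key to .env using the commands above
echo "AZURE_OPENAI_ENDPOINT=$(az cognitiveservices account show \
    --name $RESOURCE_NAME --resource-group $RESOURCE_GROUP_NAME \
    --query "properties.endpoint" -o tsv)" >> .env
echo "AZURE_OPENAI_API_KEY=$(az cognitiveservices account keys list \
    --name $RESOURCE_NAME -g $RESOURCE_GROUP_NAME \
    --query "key1" -o tsv)" >> .env
```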
+
+ ## Configuration
+
+ LightRAG can be configured using either command-line arguments or environment variables. When both are provided, command-line arguments take precedence over environment variables.
+
+ ### Environment Variables
+
+ You can configure LightRAG using environment variables by creating a `.env` file in your project root directory. Here's a complete example of the available environment variables:
+
+ ```env
+ # Server Configuration
+ HOST=0.0.0.0
+ PORT=9621
+
+ # Directory Configuration
+ WORKING_DIR=/app/data/rag_storage
+ INPUT_DIR=/app/data/inputs
+
+ # LLM Configuration
+ LLM_BINDING=ollama
+ LLM_BINDING_HOST=http://localhost:11434
+ LLM_MODEL=mistral-nemo:latest
+
+ # Embedding Configuration
+ EMBEDDING_BINDING=ollama
+ EMBEDDING_BINDING_HOST=http://localhost:11434
+ EMBEDDING_MODEL=bge-m3:latest
+
+ # RAG Configuration
+ MAX_ASYNC=4
+ MAX_TOKENS=32768
+ EMBEDDING_DIM=1024
+ MAX_EMBED_TOKENS=8192
+
+ # Security (leave empty to disable authentication)
+ LIGHTRAG_API_KEY=
+
+ # Logging
+ LOG_LEVEL=INFO
+
+ # Optional SSL Configuration
+ #SSL=true
+ #SSL_CERTFILE=/path/to/cert.pem
+ #SSL_KEYFILE=/path/to/key.pem
+
+ # Optional Timeout
+ #TIMEOUT=30
+ ```
+
+ ### Configuration Priority
+
+ Configuration values are loaded in the following order (highest priority first):
+ 1. Command-line arguments
+ 2. Environment variables
+ 3. Default values
+
+ For example:
+ ```bash
+ # This command-line argument will override both the environment variable and the default value
+ lightrag-server --port 8080
+
+ # The environment variable will override the default value but not a command-line argument
+ PORT=7000 lightrag-server
+ ```
+
+ #### LightRag Server Options
+
+ | Parameter | Default | Description |
+ |-----------|---------|-------------|
+ | --host | 0.0.0.0 | Server host |
+ | --port | 9621 | Server port |
+ | --llm-binding | ollama | LLM binding to be used. Supported: lollms, ollama, openai |
+ | --llm-binding-host | (dynamic) | LLM server host URL. Defaults based on binding: http://localhost:11434 (ollama), http://localhost:9600 (lollms), https://api.openai.com/v1 (openai) |
+ | --llm-model | mistral-nemo:latest | LLM model name |
+ | --embedding-binding | ollama | Embedding binding to be used. Supported: lollms, ollama, openai |
+ | --embedding-binding-host | (dynamic) | Embedding server host URL. Defaults based on binding: http://localhost:11434 (ollama), http://localhost:9600 (lollms), https://api.openai.com/v1 (openai) |
+ | --embedding-model | bge-m3:latest | Embedding model name |
+ | --working-dir | ./rag_storage | Working directory for RAG storage |
+ | --input-dir | ./inputs | Directory containing input documents |
+ | --max-async | 4 | Maximum async operations |
+ | --max-tokens | 32768 | Maximum token size |
+ | --embedding-dim | 1024 | Embedding dimensions |
+ | --max-embed-tokens | 8192 | Maximum embedding token size |
+ | --timeout | None | Timeout in seconds (useful when using slow AI). Use None for infinite timeout |
+ | --log-level | INFO | Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL) |
+ | --key | None | API key for authentication. Protects lightrag server against unauthorized access |
+ | --ssl | False | Enable HTTPS |
+ | --ssl-certfile | None | Path to SSL certificate file (required if --ssl is enabled) |
+ | --ssl-keyfile | None | Path to SSL private key file (required if --ssl is enabled) |
+
+ To protect the server with an authentication key, you can also use the environment variable `LIGHTRAG_API_KEY`.
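A quick sketch of the two equivalent ways to set the key, per the table above (the key value is a placeholder):

```bash
# Via CLI flag
lightrag-server --key my-secure-key

# Via environment variable (e.g. exported from .env)
LIGHTRAG_API_KEY=my-secure-key lightrag-server
```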
+
+ ### Example Usage
+
+ #### Running a Lightrag server with ollama default local server as llm and embedding backends
+
+ Ollama is the default backend for both the LLM and embeddings, so you can run lightrag-server with no parameters and the defaults will be used. Make sure Ollama is installed and running, and that the default models have already been pulled into your Ollama instance.
+
+ ```bash
+ # Run lightrag with ollama, mistral-nemo:latest for llm, and bge-m3:latest for embedding
+ lightrag-server
+
+ # Using specific models (ensure they are installed in your ollama instance)
+ lightrag-server --llm-model adrienbrault/nous-hermes2theta-llama3-8b:f16 --embedding-model nomic-embed-text --embedding-dim 1024
+
+ # Using an authentication key
+ lightrag-server --key my-key
+
+ # Using lollms for llm and ollama for embedding
+ lightrag-server --llm-binding lollms
+ ```
+
+ #### Running a Lightrag server with lollms default local server as llm and embedding backends
+
+ ```bash
+ # Run lightrag with lollms, mistral-nemo:latest for llm, and bge-m3:latest for embedding; use lollms for both llm and embedding
+ lightrag-server --llm-binding lollms --embedding-binding lollms
+
+ # Using specific models (ensure they are installed in your lollms instance)
+ lightrag-server --llm-binding lollms --llm-model adrienbrault/nous-hermes2theta-llama3-8b:f16 --embedding-binding lollms --embedding-model nomic-embed-text --embedding-dim 1024
+
+ # Using an authentication key
+ lightrag-server --key my-key
+
+ # Using lollms for llm and openai for embedding
+ lightrag-server --llm-binding lollms --embedding-binding openai --embedding-model text-embedding-3-small
+ ```
+
+
+ #### Running a Lightrag server with openai server as llm and embedding backends
+
+ ```bash
+ # Run lightrag with openai, gpt-4o-mini for llm, and text-embedding-3-small for embedding; use openai for both llm and embedding
+ lightrag-server --llm-binding openai --llm-model gpt-4o-mini --embedding-binding openai --embedding-model text-embedding-3-small
+
+ # Using an authentication key
+ lightrag-server --llm-binding openai --llm-model gpt-4o-mini --embedding-binding openai --embedding-model text-embedding-3-small --key my-key
+
+ # Using lollms for llm and openai for embedding
+ lightrag-server --llm-binding lollms --embedding-binding openai --embedding-model text-embedding-3-small
+ ```
+
+ #### Running a Lightrag server with azure openai server as llm and embedding backends
+
+ ```bash
+ # Run lightrag with azure_openai, gpt-4o-mini for llm, and text-embedding-3-small for embedding; use azure_openai for both llm and embedding
+ lightrag-server --llm-binding azure_openai --llm-model gpt-4o-mini --embedding-binding azure_openai --embedding-model text-embedding-3-small
+
+ # Using an authentication key
+ lightrag-server --llm-binding azure_openai --llm-model gpt-4o-mini --embedding-binding azure_openai --embedding-model text-embedding-3-small --key my-key
+
+ # Using lollms for llm and azure_openai for embedding
+ lightrag-server --llm-binding lollms --embedding-binding azure_openai --embedding-model text-embedding-3-small
+ ```
+
+ **Important Notes:**
+ - For LoLLMs: Make sure the specified models are installed in your LoLLMs instance
+ - For Ollama: Make sure the specified models are installed in your Ollama instance
+ - For OpenAI: Ensure you have set up your OPENAI_API_KEY environment variable
+ - For Azure OpenAI: Build and configure your server as stated in the Prerequisites section
+
+ For help on any server, use the --help flag:
+ ```bash
+ lightrag-server --help
+ ```
+
+ Note: If you don't need the API functionality, you can install the base package without API support using:
+ ```bash
+ pip install lightrag-hku
+ ```
+
+ ## API Endpoints
+
+ All servers (LoLLMs, Ollama, OpenAI and Azure OpenAI) provide the same REST API endpoints for RAG functionality.
+
+ ### Query Endpoints
+
+ #### POST /query
+ Query the RAG system with options for different search modes.
+
+ ```bash
+ curl -X POST "http://localhost:9621/query" \
+     -H "Content-Type: application/json" \
+     -d '{"query": "Your question here", "mode": "hybrid"}'
+ ```
+
+ #### POST /query/stream
+ Stream responses from the RAG system.
+
+ ```bash
+ curl -X POST "http://localhost:9621/query/stream" \
+     -H "Content-Type: application/json" \
+     -d '{"query": "Your question here", "mode": "hybrid"}'
+ ```
+
+ ### Document Management Endpoints
+
+ #### POST /documents/text
+ Insert text directly into the RAG system.
+
+ ```bash
+ curl -X POST "http://localhost:9621/documents/text" \
+     -H "Content-Type: application/json" \
+     -d '{"text": "Your text content here", "description": "Optional description"}'
+ ```
+
+ #### POST /documents/file
+ Upload a single file to the RAG system.
+
+ ```bash
+ curl -X POST "http://localhost:9621/documents/file" \
+     -F "file=@/path/to/your/document.txt" \
+     -F "description=Optional description"
+ ```
+
+ #### POST /documents/batch
+ Upload multiple files at once.
+
+ ```bash
+ curl -X POST "http://localhost:9621/documents/batch" \
+     -F "files=@/path/to/doc1.txt" \
+     -F "files=@/path/to/doc2.txt"
+ ```
+
+ #### DELETE /documents
+ Clear all documents from the RAG system.
+
+ ```bash
+ curl -X DELETE "http://localhost:9621/documents"
+ ```
+
+ ### Utility Endpoints
+
+ #### GET /health
+ Check server health and configuration.
+
+ ```bash
+ curl "http://localhost:9621/health"
+ ```
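Putting the endpoints together, a hedged end-to-end smoke test (the file path and question are placeholders; add an `X-API-Key` header, as in the Docker guide, if the server was started with a key):

```bash
# 1. Index a document
curl -X POST "http://localhost:9621/documents/file" \
    -F "file=@./inputs/sample.txt"

# 2. Ask a question against the indexed content
curl -X POST "http://localhost:9621/query" \
    -H "Content-Type: application/json" \
    -d '{"query": "What is sample.txt about?", "mode": "hybrid"}'
```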
+
+ ## Development
+ Contribute to the project: [Guide](contributor-readme.MD)
+
+ ### Running in Development Mode
+
+ For LoLLMs:
+ ```bash
+ uvicorn lollms_lightrag_server:app --reload --port 9621
+ ```
+
+ For Ollama:
+ ```bash
+ uvicorn ollama_lightrag_server:app --reload --port 9621
+ ```
+
+ For OpenAI:
+ ```bash
+ uvicorn openai_lightrag_server:app --reload --port 9621
+ ```
+
+ For Azure OpenAI:
+ ```bash
+ uvicorn azure_openai_lightrag_server:app --reload --port 9621
+ ```
+
+ ### API Documentation
+
+ When any server is running, visit:
+ - Swagger UI: http://localhost:9621/docs
+ - ReDoc: http://localhost:9621/redoc
+
+ ### Testing API Endpoints
+
+ You can test the API endpoints using the provided curl commands or through the Swagger UI interface. Make sure to:
+ 1. Start the appropriate backend service (LoLLMs, Ollama, or OpenAI)
+ 2. Start the RAG server
+ 3. Upload some documents using the document management endpoints
+ 4. Query the system using the query endpoints
+
+ ### Important Features
+
+ #### Automatic Document Vectorization
+ When starting any of the servers with the `--input-dir` parameter, the system will automatically:
+ 1. Scan the specified directory for documents
+ 2. Check for existing vectorized content in the database
+ 3. Only vectorize new documents that aren't already in the database
+ 4. Make all content immediately available for RAG queries
+
+ This intelligent caching mechanism:
+ - Prevents unnecessary re-vectorization of existing documents
+ - Reduces startup time for subsequent runs
+ - Preserves system resources
+ - Maintains consistency across restarts
+
+ **Important Notes:**
+ - The `--input-dir` parameter enables automatic document processing at startup (see the sketch below)
+ - Documents already in the database are not re-vectorized
+ - Only new documents in the input directory will be processed
+ - This optimization significantly reduces startup time for subsequent runs
+ - The working directory (`--working-dir`) stores the vectorized documents database
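The caching behavior described above is easiest to see by starting the server twice against the same directory; a sketch (the directory name is illustrative):

```bash
lightrag-server --input-dir ./my_documents   # first run: scans and vectorizes every document
lightrag-server --input-dir ./my_documents   # later runs: only files not yet in the database are processed
```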
lightrag/api/README.md ADDED
@@ -0,0 +1,361 @@
(identical in content to docs/LightRagAPI.md above)
lightrag/api/__init__.py ADDED
@@ -0,0 +1 @@
 
 
1
+ __api_version__ = "1.0.0"
lightrag/api/lightrag_server.py CHANGED
@@ -7,23 +7,27 @@ from lightrag.llm import lollms_model_complete, lollms_embed
7
  from lightrag.llm import ollama_model_complete, ollama_embed
8
  from lightrag.llm import openai_complete_if_cache, openai_embedding
9
  from lightrag.llm import azure_openai_complete_if_cache, azure_openai_embedding
 
10
 
11
  from lightrag.utils import EmbeddingFunc
12
- from typing import Optional, List, Union
13
  from enum import Enum
14
  from pathlib import Path
15
  import shutil
16
  import aiofiles
17
- from ascii_colors import trace_exception
18
  import os
19
 
20
  from fastapi import Depends, Security
21
  from fastapi.security import APIKeyHeader
22
  from fastapi.middleware.cors import CORSMiddleware
 
23
 
24
  from starlette.status import HTTP_403_FORBIDDEN
25
  import pipmaster as pm
26
 
 
 
27
 
28
  def get_default_host(binding_type: str) -> str:
29
  default_hosts = {
@@ -37,73 +41,256 @@ def get_default_host(binding_type: str) -> str:
37
  ) # fallback to ollama if unknown
38
 
39
 
40
- def parse_args():

41
  parser = argparse.ArgumentParser(
42
  description="LightRAG FastAPI Server with separate working and input directories"
43
  )
44
 
45
- # Start by the bindings
46
  parser.add_argument(
47
  "--llm-binding",
48
- default="ollama",
49
- help="LLM binding to be used. Supported: lollms, ollama, openai (default: ollama)",
50
  )
51
  parser.add_argument(
52
  "--embedding-binding",
53
- default="ollama",
54
- help="Embedding binding to be used. Supported: lollms, ollama, openai (default: ollama)",
55
  )
56
 
57
- # Parse just these arguments first
58
  temp_args, _ = parser.parse_known_args()
59
 
60
- # Add remaining arguments with dynamic defaults for hosts
61
  # Server configuration
62
  parser.add_argument(
63
- "--host", default="0.0.0.0", help="Server host (default: 0.0.0.0)"
 
 
64
  )
65
  parser.add_argument(
66
- "--port", type=int, default=9621, help="Server port (default: 9621)"
 
 
 
67
  )
68
 
69
  # Directory configuration
70
  parser.add_argument(
71
  "--working-dir",
72
- default="./rag_storage",
73
- help="Working directory for RAG storage (default: ./rag_storage)",
74
  )
75
  parser.add_argument(
76
  "--input-dir",
77
- default="./inputs",
78
- help="Directory containing input documents (default: ./inputs)",
79
  )
80
 
81
  # LLM Model configuration
82
- default_llm_host = get_default_host(temp_args.llm_binding)
 
 
83
  parser.add_argument(
84
  "--llm-binding-host",
85
  default=default_llm_host,
86
- help=f"llm server host URL (default: {default_llm_host})",
87
  )
88
 
89
  parser.add_argument(
90
  "--llm-model",
91
- default="mistral-nemo:latest",
92
- help="LLM model name (default: mistral-nemo:latest)",
93
  )
94
 
95
  # Embedding model configuration
96
- default_embedding_host = get_default_host(temp_args.embedding_binding)
 
 
97
  parser.add_argument(
98
  "--embedding-binding-host",
99
  default=default_embedding_host,
100
- help=f"embedding server host URL (default: {default_embedding_host})",
101
  )
102
 
103
  parser.add_argument(
104
  "--embedding-model",
105
- default="bge-m3:latest",
106
- help="Embedding model name (default: bge-m3:latest)",
107
  )
108
 
109
  def timeout_type(value):
@@ -113,63 +300,74 @@ def parse_args():
113
 
114
  parser.add_argument(
115
  "--timeout",
116
- default=None,
117
  type=timeout_type,
118
  help="Timeout in seconds (useful when using slow AI). Use None for infinite timeout",
119
  )
 
120
  # RAG configuration
121
  parser.add_argument(
122
- "--max-async", type=int, default=4, help="Maximum async operations (default: 4)"
 
 
 
123
  )
124
  parser.add_argument(
125
  "--max-tokens",
126
  type=int,
127
- default=32768,
128
- help="Maximum token size (default: 32768)",
129
  )
130
  parser.add_argument(
131
  "--embedding-dim",
132
  type=int,
133
- default=1024,
134
- help="Embedding dimensions (default: 1024)",
135
  )
136
  parser.add_argument(
137
  "--max-embed-tokens",
138
  type=int,
139
- default=8192,
140
- help="Maximum embedding token size (default: 8192)",
141
  )
142
 
143
  # Logging configuration
144
  parser.add_argument(
145
  "--log-level",
146
- default="INFO",
147
  choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"],
148
- help="Logging level (default: INFO)",
149
  )
150
 
151
  parser.add_argument(
152
  "--key",
153
  type=str,
 
154
  help="API key for authentication. This protects lightrag server against unauthorized access",
155
- default=None,
156
  )
157
 
158
  # Optional https parameters
159
  parser.add_argument(
160
- "--ssl", action="store_true", help="Enable HTTPS (default: False)"
 
 
 
161
  )
162
  parser.add_argument(
163
  "--ssl-certfile",
164
- default=None,
165
  help="Path to SSL certificate file (required if --ssl is enabled)",
166
  )
167
  parser.add_argument(
168
  "--ssl-keyfile",
169
- default=None,
170
  help="Path to SSL private key file (required if --ssl is enabled)",
171
  )
172
- return parser.parse_args()
 
 
 
 
173
 
174
 
175
  class DocumentManager:
@@ -435,9 +633,10 @@ def create_app(args):
435
  else:
436
  logging.warning(f"No content extracted from file: {file_path}")
437
 
438
- @app.on_event("startup")
439
- async def startup_event():
440
- """Index all files in input directory during startup"""
 
441
  try:
442
  new_files = doc_manager.scan_directory()
443
  for file_path in new_files:
@@ -448,7 +647,6 @@ def create_app(args):
448
  logging.error(f"Error indexing file {file_path}: {str(e)}")
449
 
450
  logging.info(f"Indexed {len(new_files)} documents from {args.input_dir}")
451
-
452
  except Exception as e:
453
  logging.error(f"Error during startup indexing: {str(e)}")
454
 
@@ -521,6 +719,7 @@ def create_app(args):
521
  else:
522
  return QueryResponse(response=response)
523
  except Exception as e:
 
524
  raise HTTPException(status_code=500, detail=str(e))
525
 
526
  @app.post("/query/stream", dependencies=[Depends(optional_api_key)])
 
7
  from lightrag.llm import ollama_model_complete, ollama_embed
8
  from lightrag.llm import openai_complete_if_cache, openai_embedding
9
  from lightrag.llm import azure_openai_complete_if_cache, azure_openai_embedding
10
+ from lightrag.api import __api_version__
11
 
12
  from lightrag.utils import EmbeddingFunc
13
+ from typing import Optional, List, Union, Any
14
  from enum import Enum
15
  from pathlib import Path
16
  import shutil
17
  import aiofiles
18
+ from ascii_colors import trace_exception, ASCIIColors
19
  import os
20
 
21
  from fastapi import Depends, Security
22
  from fastapi.security import APIKeyHeader
23
  from fastapi.middleware.cors import CORSMiddleware
24
+ from contextlib import asynccontextmanager
25
 
26
  from starlette.status import HTTP_403_FORBIDDEN
27
  import pipmaster as pm
28
 
29
+ from dotenv import load_dotenv
30
+
31
 
32
  def get_default_host(binding_type: str) -> str:
33
  default_hosts = {
 
41
  ) # fallback to ollama if unknown
42
 
43
 
44
+ def get_env_value(env_key: str, default: Any, value_type: type = str) -> Any:
45
+ """
46
+ Get value from environment variable with type conversion
47
+
48
+ Args:
49
+ env_key (str): Environment variable key
50
+ default (Any): Default value if env variable is not set
51
+ value_type (type): Type to convert the value to
52
+
53
+ Returns:
54
+ Any: Converted value from environment or default
55
+ """
56
+ value = os.getenv(env_key)
57
+ if value is None:
58
+ return default
59
+
60
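+ # value_type is a type object (e.g. bool), so compare it with "is";
+ # env values arrive as strings, so map common truthy spellings to True.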
+ if value_type is bool:
61
+ return value.lower() in ("true", "1", "yes")
62
+ try:
63
+ return value_type(value)
64
+ except ValueError:
65
+ return default
66
+
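+ # Example (assumed usage): with MAX_ASYNC=8 exported,
+ # get_env_value("MAX_ASYNC", 4, int) returns 8; with it unset, it returns 4.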
67
+
68
+ def display_splash_screen(args: argparse.Namespace) -> None:
69
+ """
70
+ Display a colorful splash screen showing LightRAG server configuration
71
+
72
+ Args:
73
+ args: Parsed command line arguments
74
+ """
75
+ # Banner
76
+ ASCIIColors.cyan(f"""
77
+ ╔══════════════════════════════════════════════════════════════╗
78
+ ║ 🚀 LightRAG Server v{__api_version__} ║
79
+ ║ Fast, Lightweight RAG Server Implementation ║
80
+ ╚══════════════════════════════════════════════════════════════╝
81
+ """)
82
+
83
+ # Server Configuration
84
+ ASCIIColors.magenta("\n📡 Server Configuration:")
85
+ ASCIIColors.white(" ├─ Host: ", end="")
86
+ ASCIIColors.yellow(f"{args.host}")
87
+ ASCIIColors.white(" ├─ Port: ", end="")
88
+ ASCIIColors.yellow(f"{args.port}")
89
+ ASCIIColors.white(" ├─ SSL Enabled: ", end="")
90
+ ASCIIColors.yellow(f"{args.ssl}")
91
+ if args.ssl:
92
+ ASCIIColors.white(" ├─ SSL Cert: ", end="")
93
+ ASCIIColors.yellow(f"{args.ssl_certfile}")
94
+ ASCIIColors.white(" └─ SSL Key: ", end="")
95
+ ASCIIColors.yellow(f"{args.ssl_keyfile}")
96
+
97
+ # Directory Configuration
98
+ ASCIIColors.magenta("\n📂 Directory Configuration:")
99
+ ASCIIColors.white(" ├─ Working Directory: ", end="")
100
+ ASCIIColors.yellow(f"{args.working_dir}")
101
+ ASCIIColors.white(" └─ Input Directory: ", end="")
102
+ ASCIIColors.yellow(f"{args.input_dir}")
103
+
104
+ # LLM Configuration
105
+ ASCIIColors.magenta("\n🤖 LLM Configuration:")
106
+ ASCIIColors.white(" ├─ Binding: ", end="")
107
+ ASCIIColors.yellow(f"{args.llm_binding}")
108
+ ASCIIColors.white(" ├─ Host: ", end="")
109
+ ASCIIColors.yellow(f"{args.llm_binding_host}")
110
+ ASCIIColors.white(" └─ Model: ", end="")
111
+ ASCIIColors.yellow(f"{args.llm_model}")
112
+
113
+ # Embedding Configuration
114
+ ASCIIColors.magenta("\n📊 Embedding Configuration:")
115
+ ASCIIColors.white(" ├─ Binding: ", end="")
116
+ ASCIIColors.yellow(f"{args.embedding_binding}")
117
+ ASCIIColors.white(" ├─ Host: ", end="")
118
+ ASCIIColors.yellow(f"{args.embedding_binding_host}")
119
+ ASCIIColors.white(" ├─ Model: ", end="")
120
+ ASCIIColors.yellow(f"{args.embedding_model}")
121
+ ASCIIColors.white(" └─ Dimensions: ", end="")
122
+ ASCIIColors.yellow(f"{args.embedding_dim}")
123
+
124
+ # RAG Configuration
125
+ ASCIIColors.magenta("\n⚙️ RAG Configuration:")
126
+ ASCIIColors.white(" ├─ Max Async Operations: ", end="")
127
+ ASCIIColors.yellow(f"{args.max_async}")
128
+ ASCIIColors.white(" ├─ Max Tokens: ", end="")
129
+ ASCIIColors.yellow(f"{args.max_tokens}")
130
+ ASCIIColors.white(" └─ Max Embed Tokens: ", end="")
131
+ ASCIIColors.yellow(f"{args.max_embed_tokens}")
132
+
133
+ # System Configuration
134
+ ASCIIColors.magenta("\n🛠️ System Configuration:")
135
+ ASCIIColors.white(" ├─ Log Level: ", end="")
136
+ ASCIIColors.yellow(f"{args.log_level}")
137
+ ASCIIColors.white(" ├─ Timeout: ", end="")
138
+ ASCIIColors.yellow(f"{args.timeout if args.timeout else 'None (infinite)'}")
139
+ ASCIIColors.white(" └─ API Key: ", end="")
140
+ ASCIIColors.yellow("Set" if args.key else "Not Set")
141
+
142
+ # Server Status
143
+ ASCIIColors.green("\n✨ Server starting up...\n")
144
+
145
+ # Server Access Information
146
+ protocol = "https" if args.ssl else "http"
147
+ if args.host == "0.0.0.0":
148
+ ASCIIColors.magenta("\n🌐 Server Access Information:")
149
+ ASCIIColors.white(" ├─ Local Access: ", end="")
150
+ ASCIIColors.yellow(f"{protocol}://localhost:{args.port}")
151
+ ASCIIColors.white(" ├─ Remote Access: ", end="")
152
+ ASCIIColors.yellow(f"{protocol}://<your-ip-address>:{args.port}")
153
+ ASCIIColors.white(" ├─ API Documentation (local): ", end="")
154
+ ASCIIColors.yellow(f"{protocol}://localhost:{args.port}/docs")
155
+ ASCIIColors.white(" └─ Alternative Documentation (local): ", end="")
156
+ ASCIIColors.yellow(f"{protocol}://localhost:{args.port}/redoc")
157
+
158
+ ASCIIColors.yellow("\n📝 Note:")
159
+ ASCIIColors.white(""" Since the server is running on 0.0.0.0:
160
+ - Use 'localhost' or '127.0.0.1' for local access
161
+ - Use your machine's IP address for remote access
162
+ - To find your IP address:
163
+ • Windows: Run 'ipconfig' in terminal
164
+ • Linux/Mac: Run 'ifconfig' or 'ip addr' in terminal
165
+ """)
166
+ else:
167
+ base_url = f"{protocol}://{args.host}:{args.port}"
168
+ ASCIIColors.magenta("\n🌐 Server Access Information:")
169
+ ASCIIColors.white(" ├─ Base URL: ", end="")
170
+ ASCIIColors.yellow(f"{base_url}")
171
+ ASCIIColors.white(" ├─ API Documentation: ", end="")
172
+ ASCIIColors.yellow(f"{base_url}/docs")
173
+ ASCIIColors.white(" └─ Alternative Documentation: ", end="")
174
+ ASCIIColors.yellow(f"{base_url}/redoc")
175
+
176
+ # Usage Examples
177
+ ASCIIColors.magenta("\n📚 Quick Start Guide:")
178
+ ASCIIColors.cyan("""
179
+ 1. Access the Swagger UI:
180
+ Open your browser and navigate to the API documentation URL above
181
+
182
+ 2. API Authentication:""")
183
+ if args.key:
184
+ ASCIIColors.cyan(""" Add the following header to your requests:
185
+ X-API-Key: <your-api-key>
186
+ """)
187
+ else:
188
+ ASCIIColors.cyan(" No authentication required\n")
189
+
190
+ ASCIIColors.cyan(""" 3. Basic Operations:
191
+ - POST /upload_document: Upload new documents to RAG
192
+ - POST /query: Query your document collection
193
+ - GET /collections: List available collections
194
+
195
+ 4. Monitor the server:
196
+ - Check server logs for detailed operation information
197
+ - Use healthcheck endpoint: GET /health
198
+ """)
199
+
200
+ # Security Notice
201
+ if args.key:
202
+ ASCIIColors.yellow("\n⚠️ Security Notice:")
203
+ ASCIIColors.white(""" API Key authentication is enabled.
204
+ Make sure to include the X-API-Key header in all your requests.
205
+ """)
206
+
207
+ ASCIIColors.green("Server is ready to accept connections! 🚀\n")
208
+
209
+
210
+ def parse_args() -> argparse.Namespace:
211
+ """
212
+ Parse command line arguments with environment variable fallback
213
+
214
+ Returns:
215
+ argparse.Namespace: Parsed arguments
216
+ """
217
+ # Load environment variables from .env file
218
+ load_dotenv()
219
+
220
  parser = argparse.ArgumentParser(
221
  description="LightRAG FastAPI Server with separate working and input directories"
222
  )
223
 
224
+ # Bindings (with env var support)
225
  parser.add_argument(
226
  "--llm-binding",
227
+ default=get_env_value("LLM_BINDING", "ollama"),
228
+ help="LLM binding to be used. Supported: lollms, ollama, openai (default: from env or ollama)",
229
  )
230
  parser.add_argument(
231
  "--embedding-binding",
232
+ default=get_env_value("EMBEDDING_BINDING", "ollama"),
233
+ help="Embedding binding to be used. Supported: lollms, ollama, openai (default: from env or ollama)",
234
  )
235
 
236
+ # Parse temporary args for host defaults
237
  temp_args, _ = parser.parse_known_args()
238
 
 
239
  # Server configuration
240
  parser.add_argument(
241
+ "--host",
242
+ default=get_env_value("HOST", "0.0.0.0"),
243
+ help="Server host (default: from env or 0.0.0.0)",
244
  )
245
  parser.add_argument(
246
+ "--port",
247
+ type=int,
248
+ default=get_env_value("PORT", 9621, int),
249
+ help="Server port (default: from env or 9621)",
250
  )
251
 
252
  # Directory configuration
253
  parser.add_argument(
254
  "--working-dir",
255
+ default=get_env_value("WORKING_DIR", "./rag_storage"),
256
+ help="Working directory for RAG storage (default: from env or ./rag_storage)",
257
  )
258
  parser.add_argument(
259
  "--input-dir",
260
+ default=get_env_value("INPUT_DIR", "./inputs"),
261
+ help="Directory containing input documents (default: from env or ./inputs)",
262
  )
263
 
264
  # LLM Model configuration
265
+ default_llm_host = get_env_value(
266
+ "LLM_BINDING_HOST", get_default_host(temp_args.llm_binding)
267
+ )
268
  parser.add_argument(
269
  "--llm-binding-host",
270
  default=default_llm_host,
271
+ help=f"llm server host URL (default: from env or {default_llm_host})",
272
  )
273
 
274
  parser.add_argument(
275
  "--llm-model",
276
+ default=get_env_value("LLM_MODEL", "mistral-nemo:latest"),
277
+ help="LLM model name (default: from env or mistral-nemo:latest)",
278
  )
279
 
280
  # Embedding model configuration
281
+ default_embedding_host = get_env_value(
282
+ "EMBEDDING_BINDING_HOST", get_default_host(temp_args.embedding_binding)
283
+ )
284
  parser.add_argument(
285
  "--embedding-binding-host",
286
  default=default_embedding_host,
287
+ help=f"embedding server host URL (default: from env or {default_embedding_host})",
288
  )
289
 
290
  parser.add_argument(
291
  "--embedding-model",
292
+ default=get_env_value("EMBEDDING_MODEL", "bge-m3:latest"),
293
+ help="Embedding model name (default: from env or bge-m3:latest)",
294
  )
295
 
296
  def timeout_type(value):
 
300
 
301
  parser.add_argument(
302
  "--timeout",
303
+ default=get_env_value("TIMEOUT", None, timeout_type),
304
  type=timeout_type,
305
  help="Timeout in seconds (useful when using slow AI). Use None for infinite timeout",
306
  )
307
+
308
  # RAG configuration
309
  parser.add_argument(
310
+ "--max-async",
311
+ type=int,
312
+ default=get_env_value("MAX_ASYNC", 4, int),
313
+ help="Maximum async operations (default: from env or 4)",
314
  )
315
  parser.add_argument(
316
  "--max-tokens",
317
  type=int,
318
+ default=get_env_value("MAX_TOKENS", 32768, int),
319
+ help="Maximum token size (default: from env or 32768)",
320
  )
321
  parser.add_argument(
322
  "--embedding-dim",
323
  type=int,
324
+ default=get_env_value("EMBEDDING_DIM", 1024, int),
325
+ help="Embedding dimensions (default: from env or 1024)",
326
  )
327
  parser.add_argument(
328
  "--max-embed-tokens",
329
  type=int,
330
+ default=get_env_value("MAX_EMBED_TOKENS", 8192, int),
331
+ help="Maximum embedding token size (default: from env or 8192)",
332
  )
333
 
334
  # Logging configuration
335
  parser.add_argument(
336
  "--log-level",
337
+ default=get_env_value("LOG_LEVEL", "INFO"),
338
  choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"],
339
+ help="Logging level (default: from env or INFO)",
340
  )
341
 
342
  parser.add_argument(
343
  "--key",
344
  type=str,
345
+ default=get_env_value("LIGHTRAG_API_KEY", None),
346
  help="API key for authentication. This protects lightrag server against unauthorized access",
 
347
  )
348
 
349
  # Optional https parameters
350
  parser.add_argument(
351
+ "--ssl",
352
+ action="store_true",
353
+ default=get_env_value("SSL", False, bool),
354
+ help="Enable HTTPS (default: from env or False)",
355
  )
356
  parser.add_argument(
357
  "--ssl-certfile",
358
+ default=get_env_value("SSL_CERTFILE", None),
359
  help="Path to SSL certificate file (required if --ssl is enabled)",
360
  )
361
  parser.add_argument(
362
  "--ssl-keyfile",
363
+ default=get_env_value("SSL_KEYFILE", None),
364
  help="Path to SSL private key file (required if --ssl is enabled)",
365
  )
366
+
367
+ args = parser.parse_args()
368
+ display_splash_screen(args)
369
+
370
+ return args
371
 
372
 
373
  class DocumentManager:
 
633
  else:
634
  logging.warning(f"No content extracted from file: {file_path}")
635
 
636
+ @asynccontextmanager
637
+ async def lifespan(app: FastAPI):
638
+ """Lifespan context manager for startup and shutdown events"""
639
+ # Startup logic
640
  try:
641
  new_files = doc_manager.scan_directory()
642
  for file_path in new_files:
 
647
  logging.error(f"Error indexing file {file_path}: {str(e)}")
648
 
649
  logging.info(f"Indexed {len(new_files)} documents from {args.input_dir}")
 
650
  except Exception as e:
651
  logging.error(f"Error during startup indexing: {str(e)}")
652
 
 
719
  else:
720
  return QueryResponse(response=response)
721
  except Exception as e:
722
+ trace_exception(e)
723
  raise HTTPException(status_code=500, detail=str(e))
724
 
725
  @app.post("/query/stream", dependencies=[Depends(optional_api_key)])