ParisNeo committed
Commit 2efbee2 · 1 Parent(s): 41b651b

Moving extended api documentation to new doc folder

Files changed (3):
  1. README.md +0 -335
  2. docs/LightRagAPI.md +302 -0
  3. lightrag/api/README.md +302 -0
README.md CHANGED
@@ -921,342 +921,7 @@ def extract_queries(file_path):
  ```
  </details>

- ## Install with API Support
-
- LightRAG provides optional API support through FastAPI servers that add RAG capabilities to existing LLM services. You can install LightRAG with API support in two ways:
-
- ### 1. Installation from PyPI
-
- ```bash
- pip install "lightrag-hku[api]"
- ```
-
- ### 2. Installation from Source (Development)
-
- ```bash
- # Clone the repository
- git clone https://github.com/HKUDS/lightrag.git
-
- # Change to the repository directory
- cd lightrag
-
- # Install in editable mode with API support
- pip install -e ".[api]"
- ```
-
- ### Prerequisites
-
- Before running any of the servers, ensure the corresponding backend service is running for both the LLM and the embeddings.
- The API lets you mix different bindings for the LLM and the embeddings; for example, you can use Ollama for the embeddings and OpenAI for the LLM.
-
- #### For LoLLMs Server
- - LoLLMs must be running and accessible
- - Default connection: http://localhost:9600
- - Configure using --llm-binding-host and/or --embedding-binding-host if running on a different host/port
-
- #### For Ollama Server
- - Ollama must be running and accessible
- - Default connection: http://localhost:11434
- - Configure using --llm-binding-host and/or --embedding-binding-host if running on a different host/port
-
- #### For OpenAI Server
- - Requires valid OpenAI API credentials set in environment variables
- - OPENAI_API_KEY must be set
-
- #### For Azure OpenAI Server
- An Azure OpenAI resource can be created with the following Azure CLI commands (install the Azure CLI first from [https://docs.microsoft.com/en-us/cli/azure/install-azure-cli](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli)):
- ```bash
- # Change the resource group name, location and OpenAI resource name as needed
- RESOURCE_GROUP_NAME=LightRAG
- LOCATION=swedencentral
- RESOURCE_NAME=LightRAG-OpenAI
-
- az login
- az group create --name $RESOURCE_GROUP_NAME --location $LOCATION
- az cognitiveservices account create --name $RESOURCE_NAME --resource-group $RESOURCE_GROUP_NAME --kind OpenAI --sku S0 --location $LOCATION
- az cognitiveservices account deployment create --resource-group $RESOURCE_GROUP_NAME --model-format OpenAI --name $RESOURCE_NAME --deployment-name gpt-4o --model-name gpt-4o --model-version "2024-08-06" --sku-capacity 100 --sku-name "Standard"
- az cognitiveservices account deployment create --resource-group $RESOURCE_GROUP_NAME --model-format OpenAI --name $RESOURCE_NAME --deployment-name text-embedding-3-large --model-name text-embedding-3-large --model-version "1" --sku-capacity 80 --sku-name "Standard"
- az cognitiveservices account show --name $RESOURCE_NAME --resource-group $RESOURCE_GROUP_NAME --query "properties.endpoint"
- az cognitiveservices account keys list --name $RESOURCE_NAME -g $RESOURCE_GROUP_NAME
- ```
- The output of the last two commands gives you the endpoint and the key for the Azure OpenAI API. Use these values to set the environment variables in the `.env` file.
-
-
- ### Configuration Options
-
- Each server has its own specific configuration options:
-
- #### LightRAG Server Options
-
- | Parameter | Default | Description |
- |-----------|---------|-------------|
- | --host | 0.0.0.0 | Server host |
- | --port | 9621 | Server port |
- | --llm-binding | ollama | LLM binding to be used. Supported: lollms, ollama, openai, azure_openai |
- | --llm-binding-host | (dynamic) | LLM server host URL. Defaults based on binding: http://localhost:11434 (ollama), http://localhost:9600 (lollms), https://api.openai.com/v1 (openai) |
- | --llm-model | mistral-nemo:latest | LLM model name |
- | --embedding-binding | ollama | Embedding binding to be used. Supported: lollms, ollama, openai, azure_openai |
- | --embedding-binding-host | (dynamic) | Embedding server host URL. Defaults based on binding: http://localhost:11434 (ollama), http://localhost:9600 (lollms), https://api.openai.com/v1 (openai) |
- | --embedding-model | bge-m3:latest | Embedding model name |
- | --working-dir | ./rag_storage | Working directory for RAG storage |
- | --input-dir | ./inputs | Directory containing input documents |
- | --max-async | 4 | Maximum number of concurrent async operations |
- | --max-tokens | 32768 | Maximum token count |
- | --embedding-dim | 1024 | Embedding dimensions |
- | --max-embed-tokens | 8192 | Maximum embedding token count |
- | --timeout | None | Timeout in seconds (useful with slow backends). Use None for no timeout |
- | --log-level | INFO | Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL) |
- | --key | None | API key for authentication. Protects the LightRAG server against unauthorized access |
- | --ssl | False | Enable HTTPS |
- | --ssl-certfile | None | Path to SSL certificate file (required if --ssl is enabled) |
- | --ssl-keyfile | None | Path to SSL private key file (required if --ssl is enabled) |
-
- To protect the server with an authentication key, you can also set the `LIGHTRAG_API_KEY` environment variable instead of passing --key.
-
- ### Example Usage
-
- #### Running a LightRAG server with the default local Ollama server as LLM and embedding backend
-
- Ollama is the default backend for both the LLM and the embeddings, so you can run lightrag-server with no parameters and the defaults will be used. Make sure Ollama is installed and running, and that the default models are already pulled.
-
- ```bash
- # Run lightrag with ollama, mistral-nemo:latest for llm, and bge-m3:latest for embedding
- lightrag-server
-
- # Using specific models (ensure they are installed in your ollama instance)
- lightrag-server --llm-model adrienbrault/nous-hermes2theta-llama3-8b:f16 --embedding-model nomic-embed-text --embedding-dim 1024
-
- # Using an authentication key
- lightrag-server --key my-key
-
- # Using lollms for llm and ollama for embedding
- lightrag-server --llm-binding lollms
- ```
-
- #### Running a LightRAG server with the default local LoLLMs server as LLM and embedding backend
-
- ```bash
- # Run lightrag with lollms, mistral-nemo:latest for llm, and bge-m3:latest for embedding
- lightrag-server --llm-binding lollms --embedding-binding lollms
-
- # Using specific models (ensure they are installed in your lollms instance)
- lightrag-server --llm-binding lollms --llm-model adrienbrault/nous-hermes2theta-llama3-8b:f16 --embedding-binding lollms --embedding-model nomic-embed-text --embedding-dim 1024
-
- # Using an authentication key
- lightrag-server --llm-binding lollms --embedding-binding lollms --key my-key
-
- # Using lollms for llm and openai for embedding
- lightrag-server --llm-binding lollms --embedding-binding openai --embedding-model text-embedding-3-small
- ```
-
- #### Running a LightRAG server with OpenAI as LLM and embedding backend
-
- ```bash
- # Run lightrag with openai, gpt-4o-mini for llm, and text-embedding-3-small for embedding
- lightrag-server --llm-binding openai --llm-model gpt-4o-mini --embedding-binding openai --embedding-model text-embedding-3-small
-
- # Using an authentication key
- lightrag-server --llm-binding openai --llm-model gpt-4o-mini --embedding-binding openai --embedding-model text-embedding-3-small --key my-key
-
- # Using lollms for llm and openai for embedding
- lightrag-server --llm-binding lollms --embedding-binding openai --embedding-model text-embedding-3-small
- ```
-
- #### Running a LightRAG server with Azure OpenAI as LLM and embedding backend
-
- ```bash
- # Run lightrag with azure_openai, gpt-4o-mini for llm, and text-embedding-3-small for embedding
- lightrag-server --llm-binding azure_openai --llm-model gpt-4o-mini --embedding-binding azure_openai --embedding-model text-embedding-3-small
-
- # Using an authentication key
- lightrag-server --llm-binding azure_openai --llm-model gpt-4o-mini --embedding-binding azure_openai --embedding-model text-embedding-3-small --key my-key
-
- # Using lollms for llm and azure_openai for embedding
- lightrag-server --llm-binding lollms --embedding-binding azure_openai --embedding-model text-embedding-3-small
- ```
-
- **Important Notes:**
- - For LoLLMs: Make sure the specified models are installed in your LoLLMs instance
- - For Ollama: Make sure the specified models are installed in your Ollama instance
- - For OpenAI: Ensure you have set the OPENAI_API_KEY environment variable
- - For Azure OpenAI: Build and configure your server as stated in the Prerequisites section
-
- For help on any server, use the --help flag:
- ```bash
- lightrag-server --help
- ```
-
- Note: If you don't need the API functionality, you can install the base package without API support:
- ```bash
- pip install lightrag-hku
- ```
-
- ## API Endpoints
-
- All servers (LoLLMs, Ollama, OpenAI and Azure OpenAI) provide the same REST API endpoints for RAG functionality.
-
- ### Query Endpoints
-
- #### POST /query
- Query the RAG system with options for different search modes.
-
- ```bash
- curl -X POST "http://localhost:9621/query" \
-     -H "Content-Type: application/json" \
-     -d '{"query": "Your question here", "mode": "hybrid"}'
- ```
-
- #### POST /query/stream
- Stream responses from the RAG system.
-
- ```bash
- curl -X POST "http://localhost:9621/query/stream" \
-     -H "Content-Type: application/json" \
-     -d '{"query": "Your question here", "mode": "hybrid"}'
- ```
-
- ### Document Management Endpoints
-
- #### POST /documents/text
- Insert text directly into the RAG system.
-
- ```bash
- curl -X POST "http://localhost:9621/documents/text" \
-     -H "Content-Type: application/json" \
-     -d '{"text": "Your text content here", "description": "Optional description"}'
- ```
-
- #### POST /documents/file
- Upload a single file to the RAG system.
-
- ```bash
- curl -X POST "http://localhost:9621/documents/file" \
-     -F "file=@/path/to/your/document.txt" \
-     -F "description=Optional description"
- ```
-
- #### POST /documents/batch
- Upload multiple files at once.
-
- ```bash
- curl -X POST "http://localhost:9621/documents/batch" \
-     -F "files=@/path/to/doc1.txt" \
-     -F "files=@/path/to/doc2.txt"
- ```
-
- #### DELETE /documents
- Clear all documents from the RAG system.
-
- ```bash
- curl -X DELETE "http://localhost:9621/documents"
- ```
-
- ### Utility Endpoints
-
- #### GET /health
- Check server health and configuration.
-
- ```bash
- curl "http://localhost:9621/health"
- ```
-
- ## Development
-
- Contribute to the project: [Guide](contributor-readme.MD)
-
- ### Running in Development Mode
-
- For LoLLMs:
- ```bash
- uvicorn lollms_lightrag_server:app --reload --port 9621
- ```
-
- For Ollama:
- ```bash
- uvicorn ollama_lightrag_server:app --reload --port 9621
- ```
-
- For OpenAI:
- ```bash
- uvicorn openai_lightrag_server:app --reload --port 9621
- ```
-
- For Azure OpenAI:
- ```bash
- uvicorn azure_openai_lightrag_server:app --reload --port 9621
- ```
-
- ### API Documentation
-
- When any server is running, visit:
- - Swagger UI: http://localhost:9621/docs
- - ReDoc: http://localhost:9621/redoc
-
- ### Testing API Endpoints
-
- You can test the API endpoints using the provided curl commands or through the Swagger UI. Make sure to:
- 1. Start the appropriate backend service (LoLLMs, Ollama, or OpenAI)
- 2. Start the RAG server
- 3. Upload some documents using the document management endpoints
- 4. Query the system using the query endpoints
-
- ### Important Features
-
- #### Automatic Document Vectorization
- When starting any of the servers with the `--input-dir` parameter, the system will automatically:
- 1. Scan the specified directory for documents
- 2. Check for existing vectorized content in the database
- 3. Only vectorize new documents that aren't already in the database
- 4. Make all content immediately available for RAG queries
-
- This intelligent caching mechanism:
- - Prevents unnecessary re-vectorization of existing documents
- - Reduces startup time for subsequent runs
- - Preserves system resources
- - Maintains consistency across restarts
-
- ### Example Usage
-
- #### LoLLMs RAG Server
-
- ```bash
- # Start server with automatic document vectorization
- # Only new documents will be vectorized; existing ones are loaded from cache
- lollms-lightrag-server --input-dir ./my_documents --port 8080
- ```
-
- #### Ollama RAG Server
-
- ```bash
- # Start server with automatic document vectorization
- # Previously vectorized documents are loaded from the database
- ollama-lightrag-server --input-dir ./my_documents --port 8080
- ```
-
- #### OpenAI RAG Server
-
- ```bash
- # Start server with automatic document vectorization
- # Existing documents are retrieved from cache; only new ones are processed
- openai-lightrag-server --input-dir ./my_documents --port 9624
- ```
-
- #### Azure OpenAI RAG Server
-
- ```bash
- # Start server with automatic document vectorization
- # Existing documents are retrieved from cache; only new ones are processed
- azure-openai-lightrag-server --input-dir ./my_documents --port 9624
- ```
-
- **Important Notes:**
- - The `--input-dir` parameter enables automatic document processing at startup
- - Documents already in the database are not re-vectorized
- - Only new documents in the input directory will be processed
- - This optimization significantly reduces startup time for subsequent runs
- - The working directory (`--working-dir`) stores the vectorized documents database

  ## Star History

 
docs/LightRagAPI.md ADDED
@@ -0,0 +1,302 @@
+ ## Install with API Support
+
+ LightRAG provides optional API support through FastAPI servers that add RAG capabilities to existing LLM services. You can install LightRAG with API support in two ways:
+
+ ### 1. Installation from PyPI
+
+ ```bash
+ pip install "lightrag-hku[api]"
+ ```
+
+ ### 2. Installation from Source (Development)
+
+ ```bash
+ # Clone the repository
+ git clone https://github.com/HKUDS/lightrag.git
+
+ # Change to the repository directory
+ cd lightrag
+
+ # Install in editable mode with API support
+ pip install -e ".[api]"
+ ```
+
+ ### Prerequisites
+
+ Before running any of the servers, ensure the corresponding backend service is running for both the LLM and the embeddings.
+ The API lets you mix different bindings for the LLM and the embeddings; for example, you can use Ollama for the embeddings and OpenAI for the LLM.
+
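+ As a minimal sketch of such a mixed setup (the flag names come from the options table below; the model names are placeholders you would swap for your own):
+
+ ```bash
+ # Ollama serves the embeddings locally; OpenAI serves the LLM
+ export OPENAI_API_KEY=sk-...
+ lightrag-server --llm-binding openai --llm-model gpt-4o-mini --embedding-binding ollama --embedding-model bge-m3:latest
+ ```
+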
+ #### For LoLLMs Server
+ - LoLLMs must be running and accessible
+ - Default connection: http://localhost:9600
+ - Configure using --llm-binding-host and/or --embedding-binding-host if running on a different host/port
+
+ #### For Ollama Server
+ - Ollama must be running and accessible
+ - Default connection: http://localhost:11434
+ - Configure using --llm-binding-host and/or --embedding-binding-host if running on a different host/port
+
+ #### For OpenAI Server
+ - Requires valid OpenAI API credentials set in environment variables
+ - OPENAI_API_KEY must be set
+
+ #### For Azure OpenAI Server
+ An Azure OpenAI resource can be created with the following Azure CLI commands (install the Azure CLI first from [https://docs.microsoft.com/en-us/cli/azure/install-azure-cli](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli)):
+ ```bash
+ # Change the resource group name, location and OpenAI resource name as needed
+ RESOURCE_GROUP_NAME=LightRAG
+ LOCATION=swedencentral
+ RESOURCE_NAME=LightRAG-OpenAI
+
+ az login
+ az group create --name $RESOURCE_GROUP_NAME --location $LOCATION
+ az cognitiveservices account create --name $RESOURCE_NAME --resource-group $RESOURCE_GROUP_NAME --kind OpenAI --sku S0 --location $LOCATION
+ az cognitiveservices account deployment create --resource-group $RESOURCE_GROUP_NAME --model-format OpenAI --name $RESOURCE_NAME --deployment-name gpt-4o --model-name gpt-4o --model-version "2024-08-06" --sku-capacity 100 --sku-name "Standard"
+ az cognitiveservices account deployment create --resource-group $RESOURCE_GROUP_NAME --model-format OpenAI --name $RESOURCE_NAME --deployment-name text-embedding-3-large --model-name text-embedding-3-large --model-version "1" --sku-capacity 80 --sku-name "Standard"
+ az cognitiveservices account show --name $RESOURCE_NAME --resource-group $RESOURCE_GROUP_NAME --query "properties.endpoint"
+ az cognitiveservices account keys list --name $RESOURCE_NAME -g $RESOURCE_GROUP_NAME
+ ```
+ The output of the last two commands gives you the endpoint and the key for the Azure OpenAI API. Use these values to set the environment variables in the `.env` file.
+
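+ As a sketch, the resulting `.env` entries might look like the following (the variable names below are assumptions; check the project's `.env` example for the authoritative names):
+
+ ```bash
+ # Assumed variable names -- verify against the project's .env template
+ AZURE_OPENAI_API_VERSION=2024-08-01-preview
+ AZURE_OPENAI_DEPLOYMENT=gpt-4o
+ AZURE_OPENAI_API_KEY=<key from the commands above>
+ AZURE_OPENAI_ENDPOINT=<endpoint from the commands above>
+ ```
+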
+ ### Configuration Options
+
+ Each server has its own specific configuration options:
+
+ #### LightRAG Server Options
+
+ | Parameter | Default | Description |
+ |-----------|---------|-------------|
+ | --host | 0.0.0.0 | Server host |
+ | --port | 9621 | Server port |
+ | --llm-binding | ollama | LLM binding to be used. Supported: lollms, ollama, openai, azure_openai |
+ | --llm-binding-host | (dynamic) | LLM server host URL. Defaults based on binding: http://localhost:11434 (ollama), http://localhost:9600 (lollms), https://api.openai.com/v1 (openai) |
+ | --llm-model | mistral-nemo:latest | LLM model name |
+ | --embedding-binding | ollama | Embedding binding to be used. Supported: lollms, ollama, openai, azure_openai |
+ | --embedding-binding-host | (dynamic) | Embedding server host URL. Defaults based on binding: http://localhost:11434 (ollama), http://localhost:9600 (lollms), https://api.openai.com/v1 (openai) |
+ | --embedding-model | bge-m3:latest | Embedding model name |
+ | --working-dir | ./rag_storage | Working directory for RAG storage |
+ | --input-dir | ./inputs | Directory containing input documents |
+ | --max-async | 4 | Maximum number of concurrent async operations |
+ | --max-tokens | 32768 | Maximum token count |
+ | --embedding-dim | 1024 | Embedding dimensions |
+ | --max-embed-tokens | 8192 | Maximum embedding token count |
+ | --timeout | None | Timeout in seconds (useful with slow backends). Use None for no timeout |
+ | --log-level | INFO | Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL) |
+ | --key | None | API key for authentication. Protects the LightRAG server against unauthorized access |
+ | --ssl | False | Enable HTTPS |
+ | --ssl-certfile | None | Path to SSL certificate file (required if --ssl is enabled) |
+ | --ssl-keyfile | None | Path to SSL private key file (required if --ssl is enabled) |
+
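+ For example, a sketch of enabling HTTPS (the certificate paths are placeholders; any certificate/key pair accepted by uvicorn works):
+
+ ```bash
+ # Serve the API over HTTPS with a local certificate/key pair
+ lightrag-server --ssl --ssl-certfile ./certs/cert.pem --ssl-keyfile ./certs/key.pem
+ ```
+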
+ To protect the server with an authentication key, you can also set the `LIGHTRAG_API_KEY` environment variable instead of passing --key.
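+
+ For example (a sketch; `my-key` is a placeholder):
+
+ ```bash
+ # Equivalent to passing --key my-key on the command line
+ export LIGHTRAG_API_KEY=my-key
+ lightrag-server
+ ```
+
+ Requests must then present the key; see the server's /docs page for the exact header scheme it expects.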
+
+ ### Example Usage
+
+ #### Running a LightRAG server with the default local Ollama server as LLM and embedding backend
+
+ Ollama is the default backend for both the LLM and the embeddings, so you can run lightrag-server with no parameters and the defaults will be used. Make sure Ollama is installed and running, and that the default models are already pulled.
+
+ ```bash
+ # Run lightrag with ollama, mistral-nemo:latest for llm, and bge-m3:latest for embedding
+ lightrag-server
+
+ # Using specific models (ensure they are installed in your ollama instance)
+ lightrag-server --llm-model adrienbrault/nous-hermes2theta-llama3-8b:f16 --embedding-model nomic-embed-text --embedding-dim 1024
+
+ # Using an authentication key
+ lightrag-server --key my-key
+
+ # Using lollms for llm and ollama for embedding
+ lightrag-server --llm-binding lollms
+ ```
+
+ #### Running a LightRAG server with the default local LoLLMs server as LLM and embedding backend
+
+ ```bash
+ # Run lightrag with lollms, mistral-nemo:latest for llm, and bge-m3:latest for embedding
+ lightrag-server --llm-binding lollms --embedding-binding lollms
+
+ # Using specific models (ensure they are installed in your lollms instance)
+ lightrag-server --llm-binding lollms --llm-model adrienbrault/nous-hermes2theta-llama3-8b:f16 --embedding-binding lollms --embedding-model nomic-embed-text --embedding-dim 1024
+
+ # Using an authentication key
+ lightrag-server --llm-binding lollms --embedding-binding lollms --key my-key
+
+ # Using lollms for llm and openai for embedding
+ lightrag-server --llm-binding lollms --embedding-binding openai --embedding-model text-embedding-3-small
+ ```
+
+ #### Running a LightRAG server with OpenAI as LLM and embedding backend
+
+ ```bash
+ # Run lightrag with openai, gpt-4o-mini for llm, and text-embedding-3-small for embedding
+ lightrag-server --llm-binding openai --llm-model gpt-4o-mini --embedding-binding openai --embedding-model text-embedding-3-small
+
+ # Using an authentication key
+ lightrag-server --llm-binding openai --llm-model gpt-4o-mini --embedding-binding openai --embedding-model text-embedding-3-small --key my-key
+
+ # Using lollms for llm and openai for embedding
+ lightrag-server --llm-binding lollms --embedding-binding openai --embedding-model text-embedding-3-small
+ ```
+
+ #### Running a LightRAG server with Azure OpenAI as LLM and embedding backend
+
+ ```bash
+ # Run lightrag with azure_openai, gpt-4o-mini for llm, and text-embedding-3-small for embedding
+ lightrag-server --llm-binding azure_openai --llm-model gpt-4o-mini --embedding-binding azure_openai --embedding-model text-embedding-3-small
+
+ # Using an authentication key
+ lightrag-server --llm-binding azure_openai --llm-model gpt-4o-mini --embedding-binding azure_openai --embedding-model text-embedding-3-small --key my-key
+
+ # Using lollms for llm and azure_openai for embedding
+ lightrag-server --llm-binding lollms --embedding-binding azure_openai --embedding-model text-embedding-3-small
+ ```
+
+ **Important Notes:**
+ - For LoLLMs: Make sure the specified models are installed in your LoLLMs instance
+ - For Ollama: Make sure the specified models are installed in your Ollama instance
+ - For OpenAI: Ensure you have set the OPENAI_API_KEY environment variable
+ - For Azure OpenAI: Build and configure your server as stated in the Prerequisites section
+
+ For help on any server, use the --help flag:
+ ```bash
+ lightrag-server --help
+ ```
+
+ Note: If you don't need the API functionality, you can install the base package without API support:
+ ```bash
+ pip install lightrag-hku
+ ```
+
+ ## API Endpoints
+
+ All servers (LoLLMs, Ollama, OpenAI and Azure OpenAI) provide the same REST API endpoints for RAG functionality.
+
+ ### Query Endpoints
+
+ #### POST /query
+ Query the RAG system with options for different search modes.
+
+ ```bash
+ curl -X POST "http://localhost:9621/query" \
+     -H "Content-Type: application/json" \
+     -d '{"query": "Your question here", "mode": "hybrid"}'
+ ```
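+
+ The `mode` field selects the retrieval strategy; the query modes documented elsewhere in the README are `naive`, `local`, `global`, and `hybrid`. For example:
+
+ ```bash
+ # Same endpoint with the local search mode
+ curl -X POST "http://localhost:9621/query" \
+     -H "Content-Type: application/json" \
+     -d '{"query": "Your question here", "mode": "local"}'
+ ```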
+
+ #### POST /query/stream
+ Stream responses from the RAG system.
+
+ ```bash
+ curl -X POST "http://localhost:9621/query/stream" \
+     -H "Content-Type: application/json" \
+     -d '{"query": "Your question here", "mode": "hybrid"}'
+ ```
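+
+ When testing streaming from a terminal, curl's `-N`/`--no-buffer` flag prints chunks as they arrive instead of buffering the whole response:
+
+ ```bash
+ # -N disables output buffering so streamed chunks appear immediately
+ curl -N -X POST "http://localhost:9621/query/stream" \
+     -H "Content-Type: application/json" \
+     -d '{"query": "Your question here", "mode": "hybrid"}'
+ ```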
+
+ ### Document Management Endpoints
+
+ #### POST /documents/text
+ Insert text directly into the RAG system.
+
+ ```bash
+ curl -X POST "http://localhost:9621/documents/text" \
+     -H "Content-Type: application/json" \
+     -d '{"text": "Your text content here", "description": "Optional description"}'
+ ```
+
+ #### POST /documents/file
+ Upload a single file to the RAG system.
+
+ ```bash
+ curl -X POST "http://localhost:9621/documents/file" \
+     -F "file=@/path/to/your/document.txt" \
+     -F "description=Optional description"
+ ```
+
+ #### POST /documents/batch
+ Upload multiple files at once.
+
+ ```bash
+ curl -X POST "http://localhost:9621/documents/batch" \
+     -F "files=@/path/to/doc1.txt" \
+     -F "files=@/path/to/doc2.txt"
+ ```
+
+ #### DELETE /documents
+ Clear all documents from the RAG system.
+
+ ```bash
+ curl -X DELETE "http://localhost:9621/documents"
+ ```
+
+ ### Utility Endpoints
+
+ #### GET /health
+ Check server health and configuration.
+
+ ```bash
+ curl "http://localhost:9621/health"
+ ```
+
+ ## Development
+
+ Contribute to the project: [Guide](contributor-readme.MD)
+
+ ### Running in Development Mode
+
+ For LoLLMs:
+ ```bash
+ uvicorn lollms_lightrag_server:app --reload --port 9621
+ ```
+
+ For Ollama:
+ ```bash
+ uvicorn ollama_lightrag_server:app --reload --port 9621
+ ```
+
+ For OpenAI:
+ ```bash
+ uvicorn openai_lightrag_server:app --reload --port 9621
+ ```
+
+ For Azure OpenAI:
+ ```bash
+ uvicorn azure_openai_lightrag_server:app --reload --port 9621
+ ```
+
+ ### API Documentation
+
+ When any server is running, visit:
+ - Swagger UI: http://localhost:9621/docs
+ - ReDoc: http://localhost:9621/redoc
+
+ ### Testing API Endpoints
+
+ You can test the API endpoints using the provided curl commands or through the Swagger UI. Make sure to:
+ 1. Start the appropriate backend service (LoLLMs, Ollama, or OpenAI)
+ 2. Start the RAG server
+ 3. Upload some documents using the document management endpoints
+ 4. Query the system using the query endpoints (an end-to-end sketch follows this list)
+
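+ A minimal end-to-end sketch built from the commands above (the file path and the question are placeholders; assumes the default Ollama backend with its default models):
+
+ ```bash
+ # 1. Start the RAG server in the background (backend service already running)
+ lightrag-server &
+
+ # 2. Ingest a document
+ curl -X POST "http://localhost:9621/documents/file" \
+     -F "file=@./my_documents/sample.txt"
+
+ # 3. Query it
+ curl -X POST "http://localhost:9621/query" \
+     -H "Content-Type: application/json" \
+     -d '{"query": "Summarize the sample document", "mode": "hybrid"}'
+ ```
+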
+ ### Important Features
+
+ #### Automatic Document Vectorization
+ When starting any of the servers with the `--input-dir` parameter, the system will automatically (see the example after this list):
+ 1. Scan the specified directory for documents
+ 2. Check for existing vectorized content in the database
+ 3. Only vectorize new documents that aren't already in the database
+ 4. Make all content immediately available for RAG queries
+
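+ For example (a sketch; `./my_documents` is a placeholder directory):
+
+ ```bash
+ # Scan ./my_documents at startup; only files not already in the database are vectorized
+ lightrag-server --input-dir ./my_documents
+ ```
+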
+ This intelligent caching mechanism:
+ - Prevents unnecessary re-vectorization of existing documents
+ - Reduces startup time for subsequent runs
+ - Preserves system resources
+ - Maintains consistency across restarts
+
+ **Important Notes:**
+ - The `--input-dir` parameter enables automatic document processing at startup
+ - Documents already in the database are not re-vectorized
+ - Only new documents in the input directory will be processed
+ - This optimization significantly reduces startup time for subsequent runs
+ - The working directory (`--working-dir`) stores the vectorized documents database
lightrag/api/README.md ADDED
@@ -0,0 +1,302 @@
+ ## Install with API Support
+
+ LightRAG provides optional API support through FastAPI servers that add RAG capabilities to existing LLM services. You can install LightRAG with API support in two ways:
+
+ ### 1. Installation from PyPI
+
+ ```bash
+ pip install "lightrag-hku[api]"
+ ```
+
+ ### 2. Installation from Source (Development)
+
+ ```bash
+ # Clone the repository
+ git clone https://github.com/HKUDS/lightrag.git
+
+ # Change to the repository directory
+ cd lightrag
+
+ # Install in editable mode with API support
+ pip install -e ".[api]"
+ ```
+
+ ### Prerequisites
+
+ Before running any of the servers, ensure the corresponding backend service is running for both the LLM and the embeddings.
+ The API lets you mix different bindings for the LLM and the embeddings; for example, you can use Ollama for the embeddings and OpenAI for the LLM.
+
+ #### For LoLLMs Server
+ - LoLLMs must be running and accessible
+ - Default connection: http://localhost:9600
+ - Configure using --llm-binding-host and/or --embedding-binding-host if running on a different host/port
+
+ #### For Ollama Server
+ - Ollama must be running and accessible
+ - Default connection: http://localhost:11434
+ - Configure using --llm-binding-host and/or --embedding-binding-host if running on a different host/port
+
+ #### For OpenAI Server
+ - Requires valid OpenAI API credentials set in environment variables
+ - OPENAI_API_KEY must be set
+
+ #### For Azure OpenAI Server
+ An Azure OpenAI resource can be created with the following Azure CLI commands (install the Azure CLI first from [https://docs.microsoft.com/en-us/cli/azure/install-azure-cli](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli)):
+ ```bash
+ # Change the resource group name, location and OpenAI resource name as needed
+ RESOURCE_GROUP_NAME=LightRAG
+ LOCATION=swedencentral
+ RESOURCE_NAME=LightRAG-OpenAI
+
+ az login
+ az group create --name $RESOURCE_GROUP_NAME --location $LOCATION
+ az cognitiveservices account create --name $RESOURCE_NAME --resource-group $RESOURCE_GROUP_NAME --kind OpenAI --sku S0 --location $LOCATION
+ az cognitiveservices account deployment create --resource-group $RESOURCE_GROUP_NAME --model-format OpenAI --name $RESOURCE_NAME --deployment-name gpt-4o --model-name gpt-4o --model-version "2024-08-06" --sku-capacity 100 --sku-name "Standard"
+ az cognitiveservices account deployment create --resource-group $RESOURCE_GROUP_NAME --model-format OpenAI --name $RESOURCE_NAME --deployment-name text-embedding-3-large --model-name text-embedding-3-large --model-version "1" --sku-capacity 80 --sku-name "Standard"
+ az cognitiveservices account show --name $RESOURCE_NAME --resource-group $RESOURCE_GROUP_NAME --query "properties.endpoint"
+ az cognitiveservices account keys list --name $RESOURCE_NAME -g $RESOURCE_GROUP_NAME
+ ```
+ The output of the last two commands gives you the endpoint and the key for the Azure OpenAI API. Use these values to set the environment variables in the `.env` file.
+
+ ### Configuration Options
+
+ Each server has its own specific configuration options:
+
+ #### LightRAG Server Options
+
+ | Parameter | Default | Description |
+ |-----------|---------|-------------|
+ | --host | 0.0.0.0 | Server host |
+ | --port | 9621 | Server port |
+ | --llm-binding | ollama | LLM binding to be used. Supported: lollms, ollama, openai, azure_openai |
+ | --llm-binding-host | (dynamic) | LLM server host URL. Defaults based on binding: http://localhost:11434 (ollama), http://localhost:9600 (lollms), https://api.openai.com/v1 (openai) |
+ | --llm-model | mistral-nemo:latest | LLM model name |
+ | --embedding-binding | ollama | Embedding binding to be used. Supported: lollms, ollama, openai, azure_openai |
+ | --embedding-binding-host | (dynamic) | Embedding server host URL. Defaults based on binding: http://localhost:11434 (ollama), http://localhost:9600 (lollms), https://api.openai.com/v1 (openai) |
+ | --embedding-model | bge-m3:latest | Embedding model name |
+ | --working-dir | ./rag_storage | Working directory for RAG storage |
+ | --input-dir | ./inputs | Directory containing input documents |
+ | --max-async | 4 | Maximum number of concurrent async operations |
+ | --max-tokens | 32768 | Maximum token count |
+ | --embedding-dim | 1024 | Embedding dimensions |
+ | --max-embed-tokens | 8192 | Maximum embedding token count |
+ | --timeout | None | Timeout in seconds (useful with slow backends). Use None for no timeout |
+ | --log-level | INFO | Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL) |
+ | --key | None | API key for authentication. Protects the LightRAG server against unauthorized access |
+ | --ssl | False | Enable HTTPS |
+ | --ssl-certfile | None | Path to SSL certificate file (required if --ssl is enabled) |
+ | --ssl-keyfile | None | Path to SSL private key file (required if --ssl is enabled) |
+
+ To protect the server with an authentication key, you can also set the `LIGHTRAG_API_KEY` environment variable instead of passing --key.
+
+ ### Example Usage
+
+ #### Running a LightRAG server with the default local Ollama server as LLM and embedding backend
+
+ Ollama is the default backend for both the LLM and the embeddings, so you can run lightrag-server with no parameters and the defaults will be used. Make sure Ollama is installed and running, and that the default models are already pulled.
+
+ ```bash
+ # Run lightrag with ollama, mistral-nemo:latest for llm, and bge-m3:latest for embedding
+ lightrag-server
+
+ # Using specific models (ensure they are installed in your ollama instance)
+ lightrag-server --llm-model adrienbrault/nous-hermes2theta-llama3-8b:f16 --embedding-model nomic-embed-text --embedding-dim 1024
+
+ # Using an authentication key
+ lightrag-server --key my-key
+
+ # Using lollms for llm and ollama for embedding
+ lightrag-server --llm-binding lollms
+ ```
+
+ #### Running a LightRAG server with the default local LoLLMs server as LLM and embedding backend
+
+ ```bash
+ # Run lightrag with lollms, mistral-nemo:latest for llm, and bge-m3:latest for embedding
+ lightrag-server --llm-binding lollms --embedding-binding lollms
+
+ # Using specific models (ensure they are installed in your lollms instance)
+ lightrag-server --llm-binding lollms --llm-model adrienbrault/nous-hermes2theta-llama3-8b:f16 --embedding-binding lollms --embedding-model nomic-embed-text --embedding-dim 1024
+
+ # Using an authentication key
+ lightrag-server --llm-binding lollms --embedding-binding lollms --key my-key
+
+ # Using lollms for llm and openai for embedding
+ lightrag-server --llm-binding lollms --embedding-binding openai --embedding-model text-embedding-3-small
+ ```
+
+ #### Running a LightRAG server with OpenAI as LLM and embedding backend
+
+ ```bash
+ # Run lightrag with openai, gpt-4o-mini for llm, and text-embedding-3-small for embedding
+ lightrag-server --llm-binding openai --llm-model gpt-4o-mini --embedding-binding openai --embedding-model text-embedding-3-small
+
+ # Using an authentication key
+ lightrag-server --llm-binding openai --llm-model gpt-4o-mini --embedding-binding openai --embedding-model text-embedding-3-small --key my-key
+
+ # Using lollms for llm and openai for embedding
+ lightrag-server --llm-binding lollms --embedding-binding openai --embedding-model text-embedding-3-small
+ ```
+
+ #### Running a LightRAG server with Azure OpenAI as LLM and embedding backend
+
+ ```bash
+ # Run lightrag with azure_openai, gpt-4o-mini for llm, and text-embedding-3-small for embedding
+ lightrag-server --llm-binding azure_openai --llm-model gpt-4o-mini --embedding-binding azure_openai --embedding-model text-embedding-3-small
+
+ # Using an authentication key
+ lightrag-server --llm-binding azure_openai --llm-model gpt-4o-mini --embedding-binding azure_openai --embedding-model text-embedding-3-small --key my-key
+
+ # Using lollms for llm and azure_openai for embedding
+ lightrag-server --llm-binding lollms --embedding-binding azure_openai --embedding-model text-embedding-3-small
+ ```
+
+ **Important Notes:**
+ - For LoLLMs: Make sure the specified models are installed in your LoLLMs instance
+ - For Ollama: Make sure the specified models are installed in your Ollama instance
+ - For OpenAI: Ensure you have set the OPENAI_API_KEY environment variable
+ - For Azure OpenAI: Build and configure your server as stated in the Prerequisites section
+
+ For help on any server, use the --help flag:
+ ```bash
+ lightrag-server --help
+ ```
+
+ Note: If you don't need the API functionality, you can install the base package without API support:
+ ```bash
+ pip install lightrag-hku
+ ```
+
+ ## API Endpoints
+
+ All servers (LoLLMs, Ollama, OpenAI and Azure OpenAI) provide the same REST API endpoints for RAG functionality.
+
+ ### Query Endpoints
+
+ #### POST /query
+ Query the RAG system with options for different search modes.
+
+ ```bash
+ curl -X POST "http://localhost:9621/query" \
+     -H "Content-Type: application/json" \
+     -d '{"query": "Your question here", "mode": "hybrid"}'
+ ```
+
+ #### POST /query/stream
+ Stream responses from the RAG system.
+
+ ```bash
+ curl -X POST "http://localhost:9621/query/stream" \
+     -H "Content-Type: application/json" \
+     -d '{"query": "Your question here", "mode": "hybrid"}'
+ ```
+
+ ### Document Management Endpoints
+
+ #### POST /documents/text
+ Insert text directly into the RAG system.
+
+ ```bash
+ curl -X POST "http://localhost:9621/documents/text" \
+     -H "Content-Type: application/json" \
+     -d '{"text": "Your text content here", "description": "Optional description"}'
+ ```
+
+ #### POST /documents/file
+ Upload a single file to the RAG system.
+
+ ```bash
+ curl -X POST "http://localhost:9621/documents/file" \
+     -F "file=@/path/to/your/document.txt" \
+     -F "description=Optional description"
+ ```
+
+ #### POST /documents/batch
+ Upload multiple files at once.
+
+ ```bash
+ curl -X POST "http://localhost:9621/documents/batch" \
+     -F "files=@/path/to/doc1.txt" \
+     -F "files=@/path/to/doc2.txt"
+ ```
+
+ #### DELETE /documents
+ Clear all documents from the RAG system.
+
+ ```bash
+ curl -X DELETE "http://localhost:9621/documents"
+ ```
+
+ ### Utility Endpoints
+
+ #### GET /health
+ Check server health and configuration.
+
+ ```bash
+ curl "http://localhost:9621/health"
+ ```
+
+ ## Development
+
+ Contribute to the project: [Guide](contributor-readme.MD)
+
+ ### Running in Development Mode
+
+ For LoLLMs:
+ ```bash
+ uvicorn lollms_lightrag_server:app --reload --port 9621
+ ```
+
+ For Ollama:
+ ```bash
+ uvicorn ollama_lightrag_server:app --reload --port 9621
+ ```
+
+ For OpenAI:
+ ```bash
+ uvicorn openai_lightrag_server:app --reload --port 9621
+ ```
+
+ For Azure OpenAI:
+ ```bash
+ uvicorn azure_openai_lightrag_server:app --reload --port 9621
+ ```
+
+ ### API Documentation
+
+ When any server is running, visit:
+ - Swagger UI: http://localhost:9621/docs
+ - ReDoc: http://localhost:9621/redoc
+
+ ### Testing API Endpoints
+
+ You can test the API endpoints using the provided curl commands or through the Swagger UI. Make sure to:
+ 1. Start the appropriate backend service (LoLLMs, Ollama, or OpenAI)
+ 2. Start the RAG server
+ 3. Upload some documents using the document management endpoints
+ 4. Query the system using the query endpoints
+
+ ### Important Features
+
+ #### Automatic Document Vectorization
+ When starting any of the servers with the `--input-dir` parameter, the system will automatically:
+ 1. Scan the specified directory for documents
+ 2. Check for existing vectorized content in the database
+ 3. Only vectorize new documents that aren't already in the database
+ 4. Make all content immediately available for RAG queries
+
+ This intelligent caching mechanism:
+ - Prevents unnecessary re-vectorization of existing documents
+ - Reduces startup time for subsequent runs
+ - Preserves system resources
+ - Maintains consistency across restarts
+
+ **Important Notes:**
+ - The `--input-dir` parameter enables automatic document processing at startup
+ - Documents already in the database are not re-vectorized
+ - Only new documents in the input directory will be processed
+ - This optimization significantly reduces startup time for subsequent runs
+ - The working directory (`--working-dir`) stores the vectorized documents database