Update README.md

print(response)
```

### ZhiLight

You can easily start a service using [ZhiLight](https://github.com/zhihu/ZhiLight):

```bash
docker run -it --net=host --gpus='"device=0"' -v /path/to/model:/mnt/models --entrypoint="" ghcr.io/zhihu/zhilight/zhilight:0.4.21-cu124 python -m zhilight.server.openai.entrypoints.api_server --model-path /mnt/models --port 8000 --enable-reasoning --reasoning-parser deepseek-r1 --served-model-name Zhi-Create-Qwen3-32B

# The prompt below asks (in Chinese): "Write an article introducing West Lake
# vinegar fish in the voice of Lu Xun"
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Zhi-Create-Qwen3-32B",
    "prompt": "请你以鲁迅的口吻,写一篇介绍西湖醋鱼的文章",
    "max_tokens": 4096,
    "temperature": 0.6,
    "top_p": 0.95
  }'
```
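Because the server exposes an OpenAI-compatible `/v1/completions` endpoint, the same request can be issued from Python. The sketch below is an illustration, not part of the original README; it assumes the server above is already running on `localhost:8000` and uses only the standard library:

```python
import json
import urllib.request

# ZhiLight's OpenAI-compatible completions endpoint (assumed running locally)
API_URL = "http://localhost:8000/v1/completions"


def build_request(prompt: str) -> urllib.request.Request:
    """Build the same JSON completion request that the curl example sends."""
    payload = {
        "model": "Zhi-Create-Qwen3-32B",
        "prompt": prompt,
        "max_tokens": 4096,
        "temperature": 0.6,
        "top_p": 0.95,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def complete(prompt: str) -> str:
    """Send the request and return the text of the first completion choice."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]


# Example (with the server running):
#   print(complete("请你以鲁迅的口吻,写一篇介绍西湖醋鱼的文章"))
```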

### vLLM

For instance, you can easily start a service using [vLLM](https://github.com/vllm-project/vllm)