[success] launched on 2x R9700
#1 opened by djdeniro
Inference speed:
```
INFO 09-16 08:11:04 [loggers.py:123] Engine 000: Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 44.0 tokens/s, Running: 2 reqs, Waiting: 0 reqs, GPU KV cache usage: 2.2%, Prefix cache hit rate: 15.7%
INFO 09-16 08:00:44 [loggers.py:123] Engine 000: Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 20.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 9.1%, Prefix cache hit rate: 0.0%
```
This model was launched using the docker image: rocm/vllm-dev:nightly_main_20250914
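For anyone wanting to reproduce this, a minimal launch sketch using that nightly image might look like the following. The model name and served port are placeholders (the issue does not state which model was run); the device and IPC flags are the usual ones for ROCm containers, and `--tensor-parallel-size 2` assumes both R9700 GPUs are used:

```shell
# Sketch only: model name and port are hypothetical, not from the issue.
docker run -it --rm \
  --device=/dev/kfd --device=/dev/dri \   # expose ROCm kernel driver and GPUs
  --group-add video \                     # typical group needed for GPU access
  --ipc=host \                            # shared memory for multi-GPU workers
  -p 8000:8000 \
  rocm/vllm-dev:nightly_main_20250914 \
  vllm serve <model-name> \
    --tensor-parallel-size 2              # split the model across both R9700s
```

Exact flags can vary by host setup; check the image's documentation if GPU devices are not visible inside the container.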
Congratulations! 🎉