How to accelerate inference speed
#22, opened by tobywang
Are there any frameworks that can accelerate this model's inference?
Hello, does vLLM work for you? I tried vLLM, but the generation quality degraded: the model simply outputs repetitive words.
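To put a number on the "repetitive words" symptom when comparing backends, a small helper like the one below might be useful. It is only a rough sketch: the function name `repeated_ngram_fraction` and the choice of word trigrams are my own assumptions, not part of vLLM or of this model's code. It reports what fraction of word n-grams in a generated string repeat an earlier n-gram.

```python
from collections import Counter


def repeated_ngram_fraction(text: str, n: int = 3) -> float:
    """Return the fraction of word n-grams that duplicate an earlier n-gram.

    0.0 means no n-gram occurs twice; values near 1.0 suggest the output
    is looping on the same phrase. (Hypothetical helper for illustration.)
    """
    words = text.split()
    # Sliding window of n consecutive words; empty if the text is too short.
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    # Every occurrence beyond the first counts as a repeat.
    repeated = sum(c - 1 for c in counts.values())
    return repeated / len(ngrams)
```

Running the same prompt through vLLM and through the reference implementation with identical sampling settings, then comparing the two scores, should show whether the degradation comes from the backend or from the sampling parameters.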