ubergarm committed on
Commit 62cc0bc · 1 Parent(s): b3d51f2

Add Token Probability Deviation Percentiles benchmark

README.md CHANGED
@@ -122,6 +122,8 @@ In first tests with `llama-sweep-bench` I'm getting over 1600 tok/sec PP and 105
122  
123    ![Benchmarks showing these peak 1600 tok/sec PP and 105 tok/sec TG fully offloaded on 3090TI FE 24GB VRAM](images/benchmarks-01.png "Benchmarks showing these peak 1600 tok/sec PP and 105 tok/sec TG fully offloaded on 3090TI FE 24GB VRAM")
124  
125  + ![Benchmarks showing Token Probability Deviation Percentiles](images/qwen3-30b-fig-09.png "Benchmarks showing Token Probability Deviation Percentiles")
126  +
127    ## References
128    * [ik_llama.cpp](https://github.com/ikawrakow/ik_llama.cpp/)
129    * [ik_llama.cpp Getting Started Guide](https://github.com/ikawrakow/ik_llama.cpp/discussions/258)
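The "Token Probability Deviation Percentiles" metric in the new figure can be sketched as follows. This is a hypothetical illustration, not the repository's actual benchmark script: given the probability each model (baseline vs. quantized) assigned to the same reference token at each position, take the absolute per-token deviation and report it at a few percentiles. The function name, percentile choices, and toy numbers are all assumptions for illustration.

```python
def deviation_percentiles(p_baseline, p_quantized, percentiles=(50, 90, 95, 99)):
    """Summarize |p_baseline - p_quantized| per token at the given percentiles.

    p_baseline, p_quantized: equal-length sequences of per-token probabilities
    that each model assigned to the same reference token.
    Returns {percentile: deviation at that percentile} using nearest-rank.
    """
    deviations = sorted(abs(a - b) for a, b in zip(p_baseline, p_quantized))
    n = len(deviations)
    out = {}
    for q in percentiles:
        # Nearest-rank percentile over the sorted deviations.
        idx = min(n - 1, max(0, round(q / 100 * (n - 1))))
        out[q] = deviations[idx]
    return out

# Toy example with made-up per-token probabilities:
base = [0.91, 0.85, 0.40, 0.77, 0.98]
quant = [0.90, 0.80, 0.42, 0.70, 0.97]
print(deviation_percentiles(base, quant))
```

A small median (p50) deviation with a modest tail (p99) suggests the quantized model tracks the baseline's token probabilities closely; a heavy tail flags tokens where quantization shifts the distribution noticeably.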
images/benchmarks-01.png ADDED

Git LFS Details

  • SHA256: dabbd41c26c413d250223d1bc60a7b16e7b0207d4493de2c270cd57abd724956
  • Pointer size: 131 Bytes
  • Size of remote file: 314 kB
images/qwen3-30b-fig-09.png ADDED

Git LFS Details

  • SHA256: 51c520dd2a1e807c06724048813010189d389c459350f1b7e862883cf00e2dde
  • Pointer size: 131 Bytes
  • Size of remote file: 229 kB