ubergarm committed on
Commit 62cc0bc · 1 Parent(s): b3d51f2

Add Token Probability Deviation Percentiles benchmark

README.md CHANGED
@@ -122,6 +122,8 @@ In first tests with `llama-sweep-bench` I'm getting over 1600 tok/sec PP and 105
122  
123    ![Benchmarks showing these peak 1600 tok/sec PP and 105 tok/sec TG fully offloaded on 3090TI FE 24GB VRAM](images/benchmarks-01.png "Benchmarks showing these peak 1600 tok/sec PP and 105 tok/sec TG fully offloaded on 3090TI FE 24GB VRAM")
124  
125  + ![Benchmarks showing Token Probability Deviation Percentiles](images/qwen3-30b-fig-09.png "Benchmarks showing Token Probability Deviation Percentiles")
126  +
127    ## References
128    * [ik_llama.cpp](https://github.com/ikawrakow/ik_llama.cpp/)
129    * [ik_llama.cpp Getting Started Guide](https://github.com/ikawrakow/ik_llama.cpp/discussions/258)
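The "Token Probability Deviation Percentiles" metric in the new figure can be sketched as follows. This is a hypothetical illustration, not the repository's actual benchmark script: given the probability each model (baseline vs. quantized) assigned to the same reference token at each position, take the absolute per-token deviation and report it at a few percentiles. The function name, percentile choices, and toy numbers are all assumptions for illustration.

```python
def deviation_percentiles(p_baseline, p_quantized, percentiles=(50, 90, 95, 99)):
    """Summarize |p_baseline - p_quantized| per token at the given percentiles.

    p_baseline, p_quantized: equal-length sequences of per-token probabilities
    that each model assigned to the same reference token.
    Returns {percentile: deviation at that percentile} using nearest-rank.
    """
    deviations = sorted(abs(a - b) for a, b in zip(p_baseline, p_quantized))
    n = len(deviations)
    out = {}
    for q in percentiles:
        # Nearest-rank percentile over the sorted deviations.
        idx = min(n - 1, max(0, round(q / 100 * (n - 1))))
        out[q] = deviations[idx]
    return out

# Toy example with made-up per-token probabilities:
base = [0.91, 0.85, 0.40, 0.77, 0.98]
quant = [0.90, 0.80, 0.42, 0.70, 0.97]
print(deviation_percentiles(base, quant))
```

A small median (p50) deviation with a modest tail (p99) suggests the quantized model tracks the baseline's token probabilities closely; a heavy tail flags tokens where quantization shifts the distribution noticeably.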
images/benchmarks-01.png ADDED

Git LFS Details

  • SHA256: dabbd41c26c413d250223d1bc60a7b16e7b0207d4493de2c270cd57abd724956
  • Pointer size: 131 Bytes
  • Size of remote file: 314 kB
images/qwen3-30b-fig-09.png ADDED

Git LFS Details

  • SHA256: 51c520dd2a1e807c06724048813010189d389c459350f1b7e862883cf00e2dde
  • Pointer size: 131 Bytes
  • Size of remote file: 229 kB