Commit History
docs : replace typo "]" with ")" in README (#3179) 5e8b0f0 unverified
Alpaim committed on
whisper : remove redundant assignments (#3178) ec40497 unverified
whisper : update CMakeLists.txt to handle deprecated GPU warnings (#3163) 2ee9c36 unverified
Jugal Haresh Sheth committed on
ruby : add GGML_SYCL_DNN option to ruby bindings (#3172) 94d5ce3 unverified
talk-llama : sync llama.cpp 44ee199
sync : ggml b16623d
CANN: support MoE model MUL_MAT_ID (llama/13042) f013e2d
Chenguang Li committed on
cmake: use the current build config for vulkan-shaders-gen (llama/13595) 7681e32
Gilad S. committed on
vulkan: move common FA code to flash_attn_base.comp (llama/13556) ad8b504
vulkan: use scalar FA rather than coopmat2 when N==1 (llama/13554) 97d9aa6
metal : add FA-vec kernel for head size 64 (llama/13583) 36a3b4e
sycl : fixed compilation warnings (llama/13582) 5037d84
Łukasz Ślusarczyk committed on
gguf : use ggml log system (llama/13571) a2211c9
Diego Devesa committed on
sycl: simplify bin_bcast_kernel (llama/13383) c39b646
Atharva Dubey committed on
sycl: reordered Q4_K MMVQ (llama/13109) 6ca3a47
Svetlozar Georgiev committed on
sycl: use oneDNN for matrices multiplication (llama/12972) 2008e08
Łukasz Ślusarczyk committed on
arm64: optimize q6_k_q8_k kernel with i8mm (llama/13519) 03048ea
Yibo Cai committed on
CUDA: fix crash on large batch size for quant. MoE (llama/13537) df90a14
CUDA: faster Deepseek FA, add Turing support (llama/13435) ace16dc
cmake: simplify vulkan shader test logic (llama/13263) f8fd66d
bandoti committed on
vulkan: KHR_coopmat flash attention (llama/13506) 4d1bd4f
vulkan: workaround FA compile failures on macos (llama/13517) 06833bc
metal : use FA-vec kernel up to batch size 20 (llama/13496) e925f17
metal : optimize multi-sequence FA vec kernel (llama/13493) d2f915d
ggml-cpu: Update KleidiAI to v1.6 and fix include directives (llama/13509) 7463545
Dan Johansson committed on
mnist: fix segmentation fault (ggml/1227) 341f451
ggml : fix apple OS check in ggml_print_backtrace (ggml/1229) 5c0b540
Diego Devesa committed on
ggml : Fix missing backtrace on Linux (ggml/1228) 82ee857
Daniel Tang committed on
examples : add vad-speech-segments to win warns [no ci] (#3170) 90d9ecb unverified
vad : return early if no vad segments are detected (#3158) a28f11e unverified
vad : store VAD context in whisper_state (#3156) 821d05f unverified
whisper : add build_*/ to .gitignore [no ci] (#3157) 1374002 unverified
examples : add --print-confidence option to cli (#3150) 2d83266 unverified
vad : add download-vad-model scripts (#3149) a40b758 unverified
server : add --flash-attn usage output (#3152) 8e966a8 unverified
talk-llama : sync llama.cpp 05d6d9c
whisper : update to ggml-backend changes (#0) b12517c
sync : ggml 60acbd5
ggml : add mrope kernel for metal (llama/13457) 27b32e6
metal : optimize MoE for large batches (llama/13388) d51c0d3
opencl: remove unnecessary assert for `add` (llama/13257) a245fbf
lhez committed on
llama/ggml: add LLM training support (llama/10544) 8d3b3c1
ggml-cpu: Integrate fp32=bf16xbf16 SME KleidiAI kernel (llama/13053) 0612f1f
Dan Johansson and Charles Xu committed on
CUDA: fix misaligned synchronization in FA (llama/13469) 40840d0
enable dpcpp nightly builds with libraries (llama/13406) c9c1196
Atharva Dubey committed on
CUDA: fix crash with partial offloading of MoE (llama/13439) 26820f6
Add `--no-op-offload` to improve `-ot` pp perf in MoE models like llama4 400B (llama/13386) 418769d
David Huang committed on