ggml-cpu: x86 feature detection is specific to x86 (llama/13811) d86ba47 Christian Kastner committed on May 27, 2025
ggml : allow CUDA graphs when using pipeline parallelism (llama/13814) b85e3c0 Diego Devesa committed on May 27, 2025
SYCL: Add non contiguous support in RMS_NORM and NORM kernels (llama/13611) 5de15cd qnixsynapse committed on May 26, 2025
vulkan: mark IM2COL as supporting non-contig (llama/13783) 09c03ad jeffbolznv committed on May 26, 2025
CANN: Add the basic supports of Flash Attention kernel (llama/13627) 112c144 Bizhao Shi committed on May 26, 2025
SYCL: revert "sycl: simplify bin_bcast_kernel (ggml/13383)" (llama/13752) 8c2a700 qnixsynapse committed on May 25, 2025
ggml-cpu : set openmp wait time if not set (llama/13758) 276d920 Diego Devesa committed on May 24, 2025
ggml : add ggml_gelu_erf() CUDA kernel (llama/13719) b154325 ngxson committed on May 24, 2025
CUDA: fix race condition in FA vector kernels (llama/13742) 38a702a JohannesGaessler committed on May 24, 2025
CANN: Support MUL_MAT_ID for q8_0 and q4_0 (llama/13705) 6a9f9dc Chenguang Li committed on May 23, 2025
vulkan: support CPY from any type to itself (llama/13695) f5f766b jeffbolznv committed on May 23, 2025
vulkan: Disable coopmat/coopmat2/bfloat extensions if glslc doesn't support it (llama/13696) 69679f5 jeffbolznv committed on May 23, 2025
sycl : Remove waits from function calls (llama/13702) b9bf6b6 Nicolò Scipione committed on May 22, 2025
SYCL: Avoid using with SYCL-Graph for unsupported nodes (llama/13587) 7eb0e6e Ewan Crawford committed on May 22, 2025
opencl: Add support for multiple devices (llama/12622) b6cddb5 Henry Linjamäki committed on May 21, 2025
musa: Upgrade MUSA SDK version to rc4.0.1 and use mudnn::Unary::IDENTITY op to accelerate D2D memory copy (llama/13647) 9506ebb yeahdongcn, JohannesGaessler committed on May 21, 2025
CUDA: skip fully masked-out KV in FA vec kernel (llama/13584) e1f825c JohannesGaessler committed on May 20, 2025
sycl: disable reorder for sycl mulmat (llama/13536) e023dc2 Svetlozar Georgiev committed on May 20, 2025
metal : fix typo in FA kernel comments (llama/13651) 4c32ada ggerganov committed on May 20, 2025
sycl : Overcoming workaround for mmap() allocation on Windows (llama/13482) bf74ede Nicolò Scipione committed on May 20, 2025
Vulkan: Add f32 accumulator support to quantized mul mat to fix GLM4 32B incoherence (llama/13607) dfa38af OccamRazor committed on May 19, 2025
docs : convert README_sycl.md to utf8 format [no ci] (#3191) 2384106 danbev committed on May 27, 2025
node : enable no_prints to suppress all output (#3189) 1b2bc05 danbev committed on May 27, 2025
talk-llama : fix for swedish umlauts + expose model inference settings in talk-llama.cpp (#3187) 1473e33 matteng1, ggerganov committed on May 26, 2025
docs : fix VAD section heading levels (#3186) a7bcfbf KitaitiMakoto committed on May 23, 2025
ci : use dynamic libopenblas.dll for window-blas (#3177) bafccd1 danbev committed on May 23, 2025
docs : add VAD model download instructions [no ci] (#3180) e789f73 danbev committed on May 22, 2025
whisper : update CMakeLists.txt to handle deprecated gpu Warnings (#3163) 2ee9c36 Jugal Haresh Sheth Jugal Sheth committed on May 20, 2025
ruby : add GGML_SYCL_DNN option to ruby bindings (#3172) 94d5ce3 danbev committed on May 19, 2025
cmake: use the current build config for vulkan-shaders-gen (llama/13595) 7681e32 Gilad S. committed on May 17, 2025
vulkan: move common FA code to flash_attn_base.comp (llama/13556) ad8b504 jeffbolznv committed on May 17, 2025
vulkan: use scalar FA rather than coopmat2 when N==1 (llama/13554) 97d9aa6 jeffbolznv committed on May 17, 2025
metal : add FA-vec kernel for head size 64 (llama/13583) 36a3b4e ggerganov committed on May 16, 2025