cmake: Fix ggml backend dependencies and installation (llama/11818) c6c2a2c Vladimir Vuksanovic committed on Feb 27, 2025
vulkan: fix assertion when qy_needs_dequant (llama/12068) 271c7e4 jeffbolznv committed on Feb 25, 2025
cuda/vulkan: specify fp32-only support for some operations in supports_op (ggml/1129) f959b90 cmdr2 committed on Feb 28, 2025
cuda/cpu: Increase support for fp16 unary operations (ggml/1125) 67e8c32 cmdr2 committed on Feb 28, 2025
cmake: install ggml-cpp.h as a public header file (ggml/1126) 3d4f29c Petter Reinholdtsen committed on Feb 26, 2025
common : more general m_audio_len update logic (#2855) 4674264 Ivy233 committed on Mar 7, 2025
common : fix audio loading by miniaudio (#2862) 494fb84 Dmitry Atamanov committed on Mar 4, 2025
whisper : support GGML_BACKEND_DL (#2843) 2e6437e Diego Devesa ggerganov committed on Feb 27, 2025
examples : use miniaudio for direct decoding of flac, mp3, ogg and wav (#2759) 7a280a4 Dmitry Atamanov committed on Feb 27, 2025
stream : stop on ^C when no audio is received (#2822) 45399ad Petter Reinholdtsen committed on Feb 27, 2025
Support pure float16 add/sub/mul/div operations in the CUDA (and CPU) backend (ggml/1121) 2b94a24 cmdr2 committed on Feb 25, 2025
metal : copy kernels for quant to F32/F16 conversions (llama/12017) 6c8e7ec Garf ggerganov committed on Feb 25, 2025
opencl: fix for small models (llama/11950) 4532dc6 lhez Shawn Gu Skyler Szot committed on Feb 24, 2025
Optimize mul_mat for Q4_0 on Intel GPU (llama/12035) 14fd317 Neo Zhang Jianyu arthw committed on Feb 24, 2025
ggml-cpu: Support s390x SIMD Instruction Set (llama/12019) 4aa54ec Aaron Teo Jinyang He junchao-zhao committed on Feb 22, 2025
CUDA: add option to compile without FlashAttention (llama/12025) fbc5f16 JohannesGaessler committed on Feb 22, 2025
CUDA: optimize FA for GQA + large batches (llama/12014) 6662d54 JohannesGaessler committed on Feb 22, 2025
cuda: Add Q5_1, Q5_0, Q4_1 and Q4_0 to F32 conversion support (llama/12000) 6cb8158 Garf committed on Feb 22, 2025
CUDA: correct the lowest Maxwell supported by CUDA 12 (llama/11984) 6641178 PureJourney JohannesGaessler committed on Feb 21, 2025
MUSA: support ARM64 and enable dp4a etc. (llama/11843) ab96dac Bodhi Hu committed on Feb 21, 2025
ggml-cpu: Add CPU backend support for KleidiAI library (llama/11390) 9de6d81 Charles Xu committed on Feb 20, 2025
ggml: aarch64: implement SVE kernels for q3_K_q8_K vector dot (llama/11917) 1a1acd2 Prashant Vithule ggerganov committed on Feb 20, 2025
CUDA: use async data loading for FlashAttention (llama/11894) 5b9980d JohannesGaessler Diego Devesa committed on Feb 17, 2025
vulkan: implement several ops relevant for ggml_opt (llama/11769) 3c2171d Rémy O committed on Feb 17, 2025
vulkan: support multi/vision rope, and noncontiguous rope (llama/11902) 1c7a669 jeffbolznv committed on Feb 16, 2025
metal : fix the crash caused by the lack of residency set support on Intel Macs (llama/11904) afbd891 Hale Chan committed on Feb 16, 2025
vulkan: initial support for IQ1_S and IQ1_M quantizations (llama/11528) 0d2e888 Rémy O committed on Feb 15, 2025
cuda : add ampere to the list of default architectures (llama/11870) 1d19dec Diego Devesa committed on Feb 14, 2025
ggml: optimize some vec dot functions for LoongArch ASX (llama/11842) e3acbfc Jinyang He committed on Feb 14, 2025
llamafile: use member variable instead of constant for iq4nlt (llama/11780) 0cb2d04 jmorganca committed on Feb 13, 2025
ggml-cpu : add chunking support to mul_mat_id (llama/11666) e59d9a7 Diego Devesa committed on Feb 13, 2025
ggml : x2 speed for WASM by optimizing SIMD (llama/11453) 464a186 Xuan-Son Nguyen camel-cdr committed on Feb 12, 2025
HIP: Remove GCN from list of devices that avoid MMQ (llama/11831) 78aed55 uvos committed on Feb 12, 2025
HIP: Switch to std::vector in rocblas version check (llama/11820) e144c94 uvos committed on Feb 12, 2025