Commit History

vulkan: matmul dequantization improvements (llama/12015)
ffdf466

Eve committed

vulkan: improve im2col (llama/11826)
f6cff0a

Daniele committed

cmake: Fix ggml backend dependencies and installation (llama/11818)
c6c2a2c

Vladimir Vuksanovic committed

vulkan: fix assertion when qy_needs_dequant (llama/12068)
271c7e4

jeffbolznv committed

ggml-cpu: Fix build with sve (llama/12059)
4be146e

mollysama committed

cuda: unary ops as float + de-duplicate (ggml/1130)
4bec2e4

cmdr2 committed

cuda/vulkan: specify fp32-only support for some operations in supports_op (ggml/1129)
f959b90

cmdr2 committed

cuda/cpu: Increase support for fp16 unary operations (ggml/1125)
67e8c32

cmdr2 committed

Told cmake to install ggml-cpp.h as a public header file. (ggml/1126)
3d4f29c

Petter Reinholdtsen committed

common : more general m_audio_len update logic (#2855)
4674264

Ivy233 committed

go : improve model download (#2756)
168712d

Ryan Johnson committed

common : fix audio loading by miniaudio (#2862)
494fb84

Dmitry Atamanov committed

fix: missing include common-whisper (#2858)
2271d56

Lin Xiaodong committed

ruby : follow audio library change (#2851)
b94e7d3

KitaitiMakoto committed

whisper : support GGML_BACKEND_DL (#2843)
2e6437e

Diego Devesa and ggerganov committed

common : separate whisper sources (#2846)
0447b9d

ggerganov committed

common : fix build min/max (#2845)
07533a2

ggerganov committed

examples : use miniaudio for direct decoding flac, mp3, ogg and wav (#2759)
7a280a4

Dmitry Atamanov committed

stream : stop on ^C when no audio is received (#2822)
45399ad

Petter Reinholdtsen committed

sync : ggml
7926873

ggerganov committed

Support pure float16 add/sub/mul/div operations in the CUDA (and CPU) backend (ggml/1121)
2b94a24

cmdr2 committed

metal : copy kernels for quant to F32/F16 conversions (llama/12017)
6c8e7ec

Garf and ggerganov committed

opencl: fix for small models (llama/11950)
4532dc6

lhez, Shawn Gu, and Skyler Szot committed

Optimize mul_mat for Q4_0 on Intel GPU (llama/12035)
14fd317

Neo Zhang Jianyu and arthw committed

SYCL: Fix GGML_SYCL_DEBUG macro (llama/11995)
310a36c

qnixsynapse committed

ggml-cpu: Support s390x SIMD Instruction Set (llama/12019)
4aa54ec

Aaron Teo, Jinyang He, and junchao-zhao committed

CUDA: app option to compile without FlashAttention (llama/12025)
fbc5f16

JohannesGaessler committed

CUDA: optimize FA for GQA + large batches (llama/12014)
6662d54

JohannesGaessler committed

cuda: Add Q5_1, Q5_0, Q4_1 and Q4_0 to F32 conversion support. (llama/12000)
6cb8158

Garf committed

CUDA: correct the lowest Maxwell supported by CUDA 12 (llama/11984)
6641178

PureJourney and JohannesGaessler committed

MUSA: support ARM64 and enable dp4a .etc (llama/11843)
ab96dac

Bodhi Hu committed

ggml-cpu: Add CPU backend support for KleidiAI library (llama/11390)
9de6d81

Charles Xu committed

ggml: aarch64: implement SVE kernels for q3_K_q8_K vector dot (llama/11917)
1a1acd2

Prashant Vithule and ggerganov committed

CUDA: use async data loading for FlashAttention (llama/11894)
5b9980d

JohannesGaessler and Diego Devesa committed

vulkan: implement several ops relevant for ggml_opt (llama/11769)
3c2171d

Rémy O committed

vulkan: support multi/vision rope, and noncontiguous rope (llama/11902)
1c7a669

jeffbolznv committed

metal : fix the crash caused by the lack of residency set support on Intel Macs. (llama/11904)
afbd891

Hale Chan committed

metal : optimize dequant q6_K kernel (llama/11892)
376cbe6

Adrian Kretz committed

repo : update links to new url (llama/11886)
9705bb5

ggerganov committed

vulkan: initial support for IQ1_S and IQ1_M quantizations (llama/11528)
0d2e888

Rémy O committed

opencl: Fix rope and softmax (llama/11833)
bf3b6f8

lhez committed

cuda : add ampere to the list of default architectures (llama/11870)
1d19dec

Diego Devesa committed

ggml: optimize some vec dot functions for LoongArch ASX (llama/11842)
e3acbfc

Jinyang He committed

vulkan: linux builds + small subgroup size fixes (llama/11767)
e3f0e78

Eve committed

llamafile: use member variable instead of constant for iq4nlt (llama/11780)
0cb2d04

jmorganca committed

musa: bump MUSA SDK version to rc3.1.1 (llama/11822)
ff2d3eb

R0CKSTAR committed

ggml-cpu : add chunking support to mul_mat_id (llama/11666)
e59d9a7

Diego Devesa committed

ggml : x2 speed for WASM by optimizing SIMD (llama/11453)
464a186

Xuan-Son Nguyen and camel-cdr committed

HIP: Remove GCN from list of devices that avoid MMQ (llama/11831)
78aed55

uvos committed

HIP: Switch to std::vector in rocblas version check (llama/11820)
e144c94

uvos committed