Commit History

ggml-cpu: x86 feature detection is specific to x86 (llama/13811)
d86ba47

Christian Kastner committed on

ggml : allow CUDA graphs when using pipeline parallelism (llama/13814)
b85e3c0

Diego Devesa committed on

cuda : avoid cuGetErrorString (llama/13791)
cdf95d3

ggerganov HF Staff committed on

SYCL: Add non contiguous support in RMS_NORM and NORM kernels (llama/13611)
5de15cd

qnixsynapse committed on

sycl: Add more debug prints (llama/13640)
4da3fb6

Romain Biessy committed on

vulkan: mark IM2COL as supporting non-contig (llama/13783)
09c03ad

jeffbolznv committed on

CANN: Add the basic supports of Flash Attention kernel (llama/13627)
112c144

Bizhao Shi committed on

SYCL: revert "sycl: simplify bin_bcast_kernel (ggml/13383)" (llama/13752)
8c2a700

qnixsynapse committed on

ggml-cpu : set openmp wait time if not set (llama/13758)
276d920

Diego Devesa committed on

ggml : add ggml_gelu_erf() CUDA kernel (llama/13719)
b154325

ngxson HF Staff committed on

CUDA: fix race condition in FA vector kernels (llama/13742)
38a702a

JohannesGaessler committed on

CANN: Support MUL_MAT_ID for q8_0 and q4_0 (llama/13705)
6a9f9dc

Chenguang Li committed on

ggml : fix the order of ggml_unary_op (llama/13718)
bdae2b3

ngxson HF Staff committed on

vulkan: support CPY from any type to itself (llama/13695)
f5f766b

jeffbolznv committed on

vulkan: Disable coopmat/coopmat2/bfloat extensions if glslc doesn't support it (llama/13696)
69679f5

jeffbolznv committed on

use LOG_WARN to replace `std::cerr` (llama/13657)
6975ec2

Judd committed on

sycl : Remove waits from function calls (llama/13702)
b9bf6b6

Nicolò Scipione committed on

SYCL: Avoid using with SYCL-Graph for unsupported nodes (llama/13587)
7eb0e6e

Ewan Crawford committed on

opencl: Add support for multiple devices (llama/12622)
b6cddb5

Henry Linjamäki committed on

opencl: fix couple crashes (llama/12795)
2eea73d

Henry Linjamäki committed on

ggml : add ggml_gelu_erf() (llama/13667)
6c9cd9a

ngxson HF Staff committed on

musa: Upgrade MUSA SDK version to rc4.0.1 and use mudnn::Unary::IDENTITY op to accelerate D2D memory copy (llama/13647)
9506ebb

yeahdongcn, JohannesGaessler committed on

vulkan: fix warnings (llama/13626)
8602d10

Eve committed on

CUDA: skip fully masked-out KV in FA vec kernel (llama/13584)
e1f825c

JohannesGaessler committed on

sycl: disable reorder for sycl mulmat (llama/13536)
e023dc2

Svetlozar Georgiev committed on

metal : fix typo in FA kernel comments (llama/13651)
4c32ada

ggerganov HF Staff committed on

sycl : Overcoming workaround for mmap() allocation on Windows (llama/13482)
bf74ede

Nicolò Scipione committed on

Vulkan: Add f32 accumulator support to quantized mul mat to fix GLM4 32B incoherence (llama/13607)
dfa38af

OccamRazor committed on

sync : ggml
3b09d20

ggerganov HF Staff committed on

docs : convert README_sycl.md to utf8 format [no ci] (#3191)
2384106

danbev committed on

node : enable no_prints to suppress all output (#3189)
1b2bc05

danbev committed on

talk-llama : fix for swedish umlauts + expose model inference settings in talk-llama.cpp (#3187)
1473e33

matteng1, ggerganov HF Staff committed on

docs : fix VAD section heading levels (#3186)
a7bcfbf

KitaitiMakoto committed on

ci : use dynamic libopenblas.dll for window-blas (#3177)
bafccd1

danbev committed on

server : Add k6 Load Testing Script (#3175)
9a681c7

sachaarbonel committed on

docs : add VAD model download instructions [no ci] (#3180)
e789f73

danbev committed on

docs : replace typo "]"with ")" in README (#3179)
5e8b0f0

Alpaim committed on

whisper : remove redundant assignments (#3178)
ec40497

danbev committed on

whisper : update CMakeLists.txt to handle deprecated gpu Warnings (#3163)
2ee9c36

Jugal Haresh Sheth Jugal Sheth committed on

ruby : add GGML_SYCL_DNN option to ruby bindings (#3172)
94d5ce3

danbev committed on

talk-llama : sync llama.cpp
44ee199

ggerganov HF Staff committed on

sync : ggml
b16623d

ggerganov HF Staff committed on

CANN: Support MOE Model MUL_MAT_ID (llama/13042)
f013e2d

Chenguang Li committed on

cmake: use the current build config for vulkan-shaders-gen (llama/13595)
7681e32

Gilad S. committed on

vulkan: move common FA code to flash_attn_base.comp (llama/13556)
ad8b504

jeffbolznv committed on

vulkan: use scalar FA rather than coopmat2 when N==1 (llama/13554)
97d9aa6

jeffbolznv committed on

metal : add FA-vec kernel for head size 64 (llama/13583)
36a3b4e

ggerganov HF Staff committed on

sycl : fixed compilation warnings (llama/13582)
5037d84

Łukasz Ślusarczyk committed on

gguf : use ggml log system (llama/13571)
a2211c9

Diego Devesa committed on

sycl: simplify bin_bcast_kernel (llama/13383)
c39b646

Atharva Dubey committed on