Commit History
rpc : better caching of the base buffer pointer (llama/11331) 81a6cae
metal : fix out-of-bounds write (llama/11314) 1101050
vulkan: fix coopmat2 validation failures (llama/11284) f2cc7e9
SYCL: Introducing memory host pool (llama/11251) aedb0b3
Nicolò Scipione committed on
cmake : add sanitizer flags for llama.cpp (llama/11279) 3547979
vulkan: fix coopmat2 flash attention for non-contiguous inputs (llama/11281) e0e73fa
rpc : early register backend devices (llama/11262) 4134077
vulkan: support copy from f32 to q4_0/q4_1/q5_0/q5_1/q8_0/iq4_nl (llama/11166) 3bb9e77
vulkan: optimize coopmat2 q4_k/q5_k dequant functions. (llama/11206) ee122d3
vulkan: optimize coopmat2 q2_k dequant function (llama/11130) d49a569
CUDA: backwards pass for misc. ops, add tests (llama/11257) 2fbcec1
ggml: aarch64: implement SVE kernels for q4_K_q8_K vector dot (llama/11227) bf3dc93
vulkan: scale caching for k quants + misc fixes (llama/11081) 03ab36f
Eve committed on
fix: ggml: fix vulkan-shaders-gen build (llama/10448) ad8f031
RoPE: fix back, CUDA support for back + noncont. (llama/11240) 131a21e
SYCL: Add gated linear attention kernel (llama/11175) fdb1fe5
ggml : add option to not print stack on abort (ggml/1081) 9b2706e
William Tambellini, Diego Devesa committed on
ggml-cpu : fix ggml_graph_compute_thread did not terminate on abort. (ggml/1065) 8e57313
issixx issi committed on
ci : dummy commit to trigger CI 600a548 unverified
ruby : Make context accept initial parameters, API to retrieve a segment and more (#2749) 7cb9a0e unverified
whisper.objc : fix build and CI 9cbd99a unverified
Corey Earwood committed on
talk-llama : sync llama.cpp 16d40d7
sync : ggml d50f71a
GGUF: C++ refactor, backend support, misc fixes (skip) (llama/11030) 92311a3
ggml : add opencl backend (skip) (llama/10693) 226358f
lhez, Skyler Szot, Shangqing Gu, Alexander Angus, Hongqiang Wang, Max Krasnyansky committed on
cuda : CUDA Graph Compute Function Refactor (precursor for performance improvements) (llama/11042) 25882f6
Andreas Kieslinger, slaren committed on
ggml : do not define GGML_USE_CUDA when building with GGML_BACKEND_DL (llama/11211) 79f750d
Vulkan: Fix float16 use on devices without float16 support + fix subgroup_size_control validation error (llama/11161) 5ad3f1d
SYCL: Refactor ggml_sycl_compute_forward (llama/11121) fa23a38
fix: add missing msg in static_assert (llama/11143) 8c60d6a
llamafile : ppc64le MMA INT8 implementation (llama/10912) 6f18eed
amritahs-ibm committed on
Disable GL_KHR_cooperative_matrix Vulkan extension if not available. (llama/11117) 623b74d
fix: Vulkan shader gen binary path when Cross-compiling (llama/11096) 966a7bb
ag2s20150909 committed on
GGUF: C++ refactor, backend support, misc fixes (llama/11030) 21c5b64
ggml-backend : only offload from host buffers (fix) (llama/11124) 9ac3c7e
Diego Devesa committed on
ggml-backend : only offload from host buffers (llama/11120) 1ca87a8
Diego Devesa committed on
rpc : code cleanup (llama/11107) a0fb22d
SYCL: Use get_multi_ptr instead of deprecated get_pointer in wkv6 (llama/11087) 4ed93cc
CUDA: add BF16 support (llama/11093) 961ef57
Vulkan: Add device-specific blacklist for coopmat for the AMD proprietary driver (llama/11074) 4d90c3d
Support for models with non-512-aligned tensors over RPC. (llama/11047) 895a3a2
fix: Vulkan shader gen binary path (llama/11037) 7008fb8
Gilad S. committed on