Commit History
android : fix builds (#0) 4043835
sync : ggml a890a8c
files : remove old sources (part 2) c1c9908
sync : ggml 43cbdf7
files : remove old sources e4ae8c6
talk-llama : sync llama.cpp 5ef1601
sync : ggml 6ac9e73
metal : use less stack memory in FA kernel (llama/14088) 014afb6
ggml-cpu : split arch-specific implementations (llama/13892) 8c833e9
cuda : fix device sync on buffer clear (llama/14033) 8f2e8d6
Diego Devesa committed on
CANN: Simplify the environment variable setting (#13104) f1535d7
sycl: Add reorder to Q6_K mmvq implementation (llama/13885) 56f0e48
Nicolò Scipione committed on
cuda : fix buffer type check with integrated GPUs (llama/14069) 747ad97
Diego Devesa committed on
SYCL: Implement few same quantized type copy kernels (llama/13739) 4c88a27
vulkan: Enable VK_KHR_cooperative_matrix extension for Intel Xe2 GPUs (llama/14001) e5107fe
llama : allow using mmap without PrefetchVirtualMemory, apply GGML_WIN_VER to llama.cpp sources (llama/14013) f0a0ac8
Diego Devesa committed on
vulkan: automatically deduce size of push constants (llama/13936) 00a9e2f
ggml-vulkan: adds support for op CONV_TRANSPOSE_1D (llama/13813) 32985b0
releases : use dl backend for linux release, remove arm64 linux release (llama/13996) 9896625
Diego Devesa committed on
CUDA: fix FTZ in FA for Gemma 3 (llama/13991) 40fc316
vulkan: fix warnings in perf logger querypool code (llama/13937) 11bac96
opencl: add `backend_synchronize` (llama/13939) a9ce9a8
lhez committed on
OpenCL: Add concat, tsembd, upscale, tanh, pad and repeat (llama/13840) 5ff8785
rmatif committed on
metal : use F32 accumulators in FA kernels (llama/13975) b86860f
cmake : Handle mixed-case 'Power' strings in POWER CPU detection (llama/13966) bc1415b
sycl: quantize and reorder the input to q8_1 when reorder is enabled (llama/13826) c4e62cd
Atharva Dubey and Alberto Cabrera Pérez committed on
gguf: fix failure on version == 0 (llama/13956) 73547ad
ggml: check if non-native endian model is being loaded (llama/13943) a2e9ccb
Add in-build ggml::ggml ALIAS library (ggml/1260) faef029
Kai Pastor committed on
ruby : output format (#3237) 63cab25 unverified
ci : build and publish main-intel image (#3231) 2c4b2dd unverified
藍+85CD committed on
docker : add main-intel dockerfile (#3229) 23d5a5c unverified
藍+85CD committed on
ruby : Add parallel transcription support (#3222) acad667 unverified
ci : add mirror for ports.ubuntu.com (ARM packages) (#3221) 17ba7f5 unverified
bindings.java : apply whisperParams in fullTranscribeWithTime instead of ignoring them (#3201) 18fb7d6 unverified
Joas Dev committed on
musa: correct MUSA SDK rc4.0.1 download URL (#3217) 90efe84 unverified
R0CKSTAR committed on
ci : use mirrors.kernel.org for Ubuntu packages (#3220) 62dd144 unverified
node : add language detection support (#3190) 9994342 unverified
talk-llama : sync llama.cpp 58220b6
sync : ggml 337f4d9
threading: support for GGML_SCHED_PRIO_LOW, update thread info on Windows to avoid throttling (llama/12995) d5d55f2
Max Krasnyansky and Diego Devesa committed on
CUDA: add a prop in ggml_cuda_device_infor for distinguish iGPU or dGPU in cuda (#13856) (llama/13895) a75e157
CUDA: fix typo in FlashAttention code (llama/13926) 6fb9674
sched : avoid changing cur_copy when a graph is already allocated (llama/13922) 1c0a5c0
Diego Devesa committed on
cuda : prevent using split buffers with 3d/4d matrices (llama/13919) 6b6155b
Diego Devesa committed on
SYCL: Add mrope kernel (llama/13755) e4b1812
cmake: Guard GGML_CPU_ALL_VARIANTS by architecture (llama/13890) a434936
Christian Kastner committed on
arm64: optimize q4_k_q8_k kernel with i8mm (llama/13886) 026ea5b
Yibo Cai committed on
cmake: Factor out CPU architecture detection (llama/13883) b436dcc
Christian Kastner committed on