Commit History
whisper : support GGML_BACKEND_DL (#2843)
2e6437e
common : separate whisper sources (#2846)
0447b9d
common : fix build min/max (#2845)
07533a2
examples : use miniaudio for direct decoding flac, mp3, ogg and wav (#2759)
7a280a4
Dmitry Atamanov
stream : stop on ^C when no audio is received (#2822)
45399ad
Petter Reinholdtsen
sync : ggml
7926873
Support pure float16 add/sub/mul/div operations in the CUDA (and CPU) backend (ggml/1121)
2b94a24
cmdr2
opencl: fix for small models (llama/11950)
4532dc6
lhez
Shawn Gu
Skyler Szot
Optimize mul_mat for Q4_0 on Intel GPU (llama/12035)
14fd317
Neo Zhang Jianyu
arthw
SYCL: Fix GGML_SYCL_DEBUG macro (llama/11995)
310a36c
ggml-cpu: Support s390x SIMD Instruction Set (llama/12019)
4aa54ec
Aaron Teo
Jinyang He
junchao-zhao
CUDA: app option to compile without FlashAttention (llama/12025)
fbc5f16
CUDA: optimize FA for GQA + large batches (llama/12014)
6662d54
cuda: Add Q5_1, Q5_0, Q4_1 and Q4_0 to F32 conversion support. (llama/12000)
6cb8158
CUDA: correct the lowest Maxwell supported by CUDA 12 (llama/11984)
6641178
MUSA: support ARM64 and enable dp4a .etc (llama/11843)
ab96dac
Bodhi Hu
ggml-cpu: Add CPU backend support for KleidiAI library (llama/11390)
9de6d81
Charles Xu
ggml: aarch64: implement SVE kernels for q3_K_q8_K vector dot (llama/11917)
1a1acd2
CUDA: use async data loading for FlashAttention (llama/11894)
5b9980d
vulkan: implement several ops relevant for ggml_opt (llama/11769)
3c2171d
Rémy O
vulkan: support multi/vision rope, and noncontiguous rope (llama/11902)
1c7a669
metal : fix the crash caused by the lack of residency set support on Intel Macs. (llama/11904)
afbd891
Hale Chan
metal : optimize dequant q6_K kernel (llama/11892)
376cbe6
Adrian Kretz
repo : update links to new url (llama/11886)
9705bb5
vulkan: initial support for IQ1_S and IQ1_M quantizations (llama/11528)
0d2e888
Rémy O
opencl: Fix rope and softmax (llama/11833)
bf3b6f8
lhez
cuda : add ampere to the list of default architectures (llama/11870)
1d19dec
Diego Devesa
ggml: optimize some vec dot functions for LoongArch ASX (llama/11842)
e3acbfc
Jinyang He
vulkan: linux builds + small subgroup size fixes (llama/11767)
e3f0e78
Eve
llamafile: use member variable instead of constant for iq4nlt (llama/11780)
0cb2d04
musa: bump MUSA SDK version to rc3.1.1 (llama/11822)
ff2d3eb
R0CKSTAR
ggml-cpu : add chunking support to mul_mat_id (llama/11666)
e59d9a7
Diego Devesa
ggml : x2 speed for WASM by optimizing SIMD (llama/11453)
464a186
Xuan-Son Nguyen
camel-cdr
HIP: Remove GCN from list of devices that avoid MMQ (llama/11831)
78aed55
uvos
HIP: Switch to std::vector in rocblas version check (llama/11820)
e144c94
uvos
cleanup: fix compile warnings associated with gnu_printf (llama/11811)
ef6a968
bandoti
ggml : fix multi-threaded clamp_f32 (llama/11824)
1b1d6a8
Richard
ggml-cpu: Fix duplicate MATMUL_INT8 (llama/11817)
05b9e78
CUDA: fix CUDART_VERSION checks (llama/11821)
04f123a
Fix #11802: Compile bug - RegQueryValueExA changed to RegQueryValueEx (llama/11803)
86969ac
Sheldon Robinson
CUDA: use arch list for compatibility check (llama/11775)
b88e163
fix: typos in documentation files (llama/11791)
5c6d350
Maxim Evtush
vulkan: Make Vulkan optional at runtime (ggml/11493). (llama/11494)
762f497
vulkan: add environment variable GGML_VK_PREFER_HOST_MEMORY to avoid VRAM allocation (llama/11592)
f9fd130
Wagner Bruna
vulkan: account for lookup tables when checking shared memory size (llama/11502)
758970f
ggml: Fix data race in ggml threadpool (llama/11736)
5554d5f
Karol Kontny