add AIBOM · #18 opened 5 months ago by RiccardoDav
ds-v2-chat · #17 opened 6 months ago by Elon7111
NAN issue using FP16 to load the model · #15 opened about 1 year ago by joeltseng
ImportError: This modeling file requires the following packages that were not found in your environment: flash_attn. Run `pip install flash_attn` · 👍 3 · #14 opened over 1 year ago by kang1
How much memory is needed if you make the 128k context length · 1 · #13 opened over 1 year ago by ggbondcxk
Implement MLA inference optimizations to DeepseekV2Attention · 🤗 🔥 7 · #12 opened over 1 year ago by sy-chen
Can you provide a sample code for training with DeepSpeed ZeRO3? · 2 · #10 opened over 1 year ago by SupercarryNg
Ollama support · 👍 1 · 1 · #9 opened over 1 year ago by Dao3
MoE offloading strategy? · 2 · #8 opened over 1 year ago by Minami-su
Update README.md · #7 opened over 1 year ago by VanishingPsychopath
kv cache · 👀 2 · 3 · #6 opened over 1 year ago by FrankWu
function/tool calling support · 8 · #5 opened over 1 year ago by kaijietti
fail to run the example · 8 · #4 opened over 1 year ago by Leymore
GPTQ plz · 10 · #3 opened over 1 year ago by Parkerlambert123
vllm support · 7 · #2 opened over 1 year ago by Sihangli
llama.cpp support · 👍 ➕ 18 · 5 · #1 opened over 1 year ago by cpumaxx