DeepSeek Training Support

#34 opened 6 days ago by
SuperXr

Questions on MoE Hash Routing

#22 opened about 1 month ago by
mattduerrmeier

120B model?

👍 3
2
#21 opened about 1 month ago by
jacek2024

Is it 158B or 284B params?

6
#17 opened about 1 month ago by
celsowm

Add chat template

🔥 4
6
#16 opened about 1 month ago by
Rocketknight1

bsarpel

#14 opened about 1 month ago by
bsarpel

It's finally here~

#7 opened about 1 month ago by
zhubao315

Quantization-related

1
#6 opened about 1 month ago by
Paulzhou

Stand up, attention!

🔥 7
3
#5 opened about 1 month ago by
lizhooh

Front row, front row! Group photo

1
#4 opened about 1 month ago by
hakupro

Sick

#3 opened about 1 month ago by
Green-eyedDevil

Support inference with sglang or vllm?

3
#2 opened about 1 month ago by
howtain