This is a 9B model whose architecture is deepseek-v3, trained from scratch using 350B+ tokens from fully open-source, English-only datasets. It is designed for development and debugging purposes within the open-source community.
- Downloads last month
- 14,933