Any plans on 30B-A3B model?
#1
by xxx777xxxASD - opened
This would be amazing, Qwen3-32B too
Imagine if they distill the Qwen3-235B-A22B
We dream
Llama 4 Scout?
Mistral 3.1 24B 2503?
Qwen still hasn't released the base models for the 32B and 235B, so I don't know if they can do post-training on top of the instruct versions.
Maybe a distilled 32B or 70B would give good quality.