Sparsh
spsbosch
AI & ML interests
None yet
Recent Activity
new activity
about 2 months ago
deepseek-ai/DeepSeek-R1-0528-Qwen3-8B: Can you please release how you post-train qwen3 on deepseek?
upvoted a paper
3 months ago
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
new activity
3 months ago
nvidia/Nemotron-H-8B-Base-8K: RL/ Instruct Models wen ?
Organizations
None yet