brucewan666

https://github.com/SUSTechBruce

SUSTechBruce

AI & ML interests

None yet

Recent Activity

upvoted a paper 10 days ago

SRPO: Enhancing Multimodal LLM Reasoning via Reflection-Aware Reinforcement Learning

commented on a paper about 2 months ago

Reinforcement Pre-Training

published a model 5 months ago

brucewan666/DeepSeek-R1-Distill-Qwen-1.5B-GRPO

Organizations

None yet

models 2

brucewan666/DeepSeek-R1-Distill-Qwen-1.5B-GRPO

brucewan666/Qwen2.5-1.5B-Open-R1-Distill

datasets 0

None public yet