Ayush Sharma
Ayush173
AI & ML interests
Machine learning, alignment research
Recent Activity
published a model about 2 months ago
Ayush173/Qwen2.5-VL-3B-Instruct-trl-mpo-rlaif-v
upvoted an article 8 months ago
Illustrating Reinforcement Learning from Human Feedback (RLHF)
updated a model 8 months ago
Ayush173/SmolLM2-FT-MyDataset