arxiv:2310.00149
Jiarui Feng PRO
WFRaain
AI & ML interests
None yet
Recent Activity
updated a dataset 21 days ago: WFRaain/TAG_datasets
upvoted a paper 28 days ago: EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning
upvoted a paper 2 months ago: Self-Rewarding Vision-Language Model via Reasoning Decomposition
Organizations
None yet