Test organization

non-profit

Recent Activity

KaituoFeng

authored 4 papers 11 days ago

Critique-GRPO: Advancing LLM Reasoning with Natural Language and Numerical Feedback

Paper • 2506.03106 • Published Jun 3 • 6

Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing

Paper • 2506.09965 • Published Jun 11 • 3

SpaceVista: All-Scale Visual Spatial Reasoning from mm to km

Paper • 2510.09606 • Published Oct 10 • 17

OneThinker: All-in-one Reasoning Model for Image and Video

Paper • 2512.03043 • Published 13 days ago • 30

KaituoFeng

authored a paper 15 days ago

Architecture Decoupling Is Not All You Need For Unified Multimodal Model

Paper • 2511.22663 • Published 18 days ago • 28

KaituoFeng

authored 2 papers 7 months ago

MME-Reasoning: A Comprehensive Benchmark for Logical Reasoning in MLLMs

Paper • 2505.21327 • Published May 27 • 83

SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward

Paper • 2505.17018 • Published May 22 • 15

KaituoFeng

updated a dataset 8 months ago

Testorganize/Evaluation-fkt

Viewer • Updated Apr 19 • 10.3k

kxgong

authored a paper 9 months ago

Video-R1: Reinforcing Video Reasoning in MLLMs

Paper • 2503.21776 • Published Mar 27 • 79

KaituoFeng

authored a paper 9 months ago

Video-R1: Reinforcing Video Reasoning in MLLMs

Paper • 2503.21776 • Published Mar 27 • 79

BreakLee

authored 2 papers 9 months ago

VidEgoThink: Assessing Egocentric Video Understanding Capabilities for Embodied AI

Paper • 2410.11623 • Published Oct 15, 2024 • 49

Video-R1: Reinforcing Video Reasoning in MLLMs

Paper • 2503.21776 • Published Mar 27 • 79

KaituoFeng

published a dataset 9 months ago

Testorganize/Evaluation-fkt

Viewer • Updated Apr 19 • 10.3k

KaituoFeng

updated a dataset 9 months ago

Testorganize/Video-fkt

Viewer • Updated Mar 10 • 61.2k • 5

Xidong

authored a paper 12 months ago

HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs

Paper • 2412.18925 • Published Dec 25, 2024 • 104

kxgong

authored a paper about 1 year ago

AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information?

Paper • 2412.02611 • Published Dec 3, 2024 • 26

KaituoFeng

authored a paper about 1 year ago

AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information?

Paper • 2412.02611 • Published Dec 3, 2024 • 26

BreakLee

authored a paper about 1 year ago

AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information?

Paper • 2412.02611 • Published Dec 3, 2024 • 26

Xidong

authored a paper about 1 year ago

Roadmap towards Superhuman Speech Understanding using Large Language Models

Paper • 2410.13268 • Published Oct 17, 2024 • 34

Xidong

authored a paper over 1 year ago

Huatuo-26M, a Large-scale Chinese Medical QA Dataset

Paper • 2305.01526 • Published May 2, 2023