Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems Paper • 2504.01990 • Published Mar 31, 2025 • 301
MME-Reasoning: A Comprehensive Benchmark for Logical Reasoning in MLLMs Paper • 2505.21327 • Published May 27, 2025 • 83
Flow-GRPO: Training Flow Matching Models via Online RL Paper • 2505.05470 • Published May 8, 2025 • 86
MMaDA: Multimodal Large Diffusion Language Models Paper • 2505.15809 • Published May 21, 2025 • 97
BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset Paper • 2505.09568 • Published May 14, 2025 • 98
Fine-Tune Whisper For Multilingual ASR with 🤗 Transformers Article • Published Nov 3, 2022 • 340
Masked Audio Generation using a Single Non-Autoregressive Transformer Paper • 2401.04577 • Published Jan 9, 2024 • 44