DeSTA2.5-Audio: Toward General-Purpose Large Audio Language Model with Self-Generated Cross-Modal Alignment Paper • 2507.02768 • Published 28 days ago • 3
STITCH: Simultaneous Thinking and Talking with Chunked Reasoning for Spoken Language Models Paper • 2507.15375 • Published 10 days ago • 25
Audio-Aware Large Language Models as Judges for Speaking Styles Paper • 2506.05984 • Published Jun 6 • 15
Can Large Language Models Be an Alternative to Human Evaluations? Paper • 2305.01937 • Published May 3, 2023 • 2
A Closer Look into Automatic Evaluation Using Large Language Models Paper • 2310.05657 • Published Oct 9, 2023
REBORN: Reinforcement-Learned Boundary Segmentation with Iterative Training for Unsupervised ASR Paper • 2402.03988 • Published Feb 6, 2024
Advancing Large Language Models to Capture Varied Speaking Styles and Respond Properly in Spoken Conversations Paper • 2402.12786 • Published Feb 20, 2024