AudioRAG is becoming real! Just built a demo with ColQwen-Omni that does semantic search on raw audio, no transcription needed.
Drop in a podcast, ask your question, and it finds the exact chunks where it happens. You can also get a written answer.
What’s exciting: it skips transcription, making it faster and better at capturing emotion, ambient sound, and tone, surfacing results text search would miss.
AudioRAG is becoming real! Just built a demo with ColQwen-Omni that does semantic search on raw audio, no transcription needed.
Drop in a podcast, ask your question, and it finds the exact chunks where it happens. You can also get a written answer.
What’s exciting: it skips transcription, making it faster and better at capturing emotion, ambient sound, and tone, surfacing results text search would miss.