CASA: Cross-Attention via Self-Attention for Efficient Vision-Language Fusion
Paper • 2512.19535