Align and Attend: Multimodal Summarization with Dual Contrastive Losses Paper • 2303.07284 • Published Mar 13, 2023
MultiSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos Paper • 2306.04216 • Published Jun 7, 2023
MHMS: Multimodal Hierarchical Multimedia Summarization Paper • 2204.03734 • Published Apr 7, 2022 • 1
Entity6K: A Large Open-Domain Evaluation Dataset for Real-World Entity Recognition Paper • 2403.12339 • Published Mar 19, 2024
Semantics-Consistent Cross-domain Summarization via Optimal Transport Alignment Paper • 2210.04722 • Published Oct 10, 2022 • 1
LiveSeg: Unsupervised Multimodal Temporal Segmentation of Long Livestream Videos Paper • 2210.05840 • Published Oct 12, 2022 • 1
Transfer Knowledge from Natural Language to Electrocardiography: Can We Detect Cardiovascular Disease Through Language Models? Paper • 2301.09017 • Published Jan 21, 2023
Can Brain Signals Reveal Inner Alignment with Human Languages? Paper • 2208.06348 • Published Aug 10, 2022
Automated Cardiovascular Record Retrieval by Multimodal Learning between Electrocardiogram and Clinical Report Paper • 2304.06286 • Published Apr 13, 2023
Evaluating Durability: Benchmark Insights into Multimodal Watermarking Paper • 2406.03728 • Published Jun 6, 2024
MoDoMoDo: Multi-Domain Data Mixtures for Multimodal LLM Reinforcement Learning Paper • 2505.24871 • Published May 30, 2025 • 22
MCPEval: Automatic MCP-based Deep Evaluation for AI Agent Models Paper • 2507.12806 • Published Jul 2025 • 16
Interpolation for Robust Learning: Data Augmentation on Geodesics Paper • 2302.02092 • Published Feb 4, 2023 • 1
Embodied Executable Policy Learning with Language-based Scene Summarization Paper • 2306.05696 • Published Jun 9, 2023 • 3