Multi-Modal Understanding
Efficient LLaMA-3.2-Vision by Trimming Cross-attended Visual Features Paper • 2504.00557 • Published Apr 1 • 15
SERs
speechbrain/emotion-recognition-wav2vec2-IEMOCAP Audio Classification • Updated Jul 23, 2024 • 703k • 165
CAiRE/SER-wav2vec2-large-xlsr-53-eng-zho-all-age Audio Classification • Updated Jun 27, 2023 • 131 • 4