arxiv:2503.22668
Sindhu Hegde
sindhuhegde
AI & ML interests
Computer Vision, Multimodal Learning: Vision + Speech/Language, Deep Learning, Machine Learning
Recent Activity
reacted to DmitryRyumin's post about 23 hours ago
New Research Alert - ICCV 2025 (Oral)!
Title: Understanding Co-speech Gestures in-the-wild
Description: JEGAL is a tri-modal model that learns from gestures, speech and text simultaneously, enabling devices to interpret co-speech gestures in the wild.
Authors: @sindhuhegde, K R Prajwal, Taein Kwon, and Andrew Zisserman
Conference: ICCV, 19 – 23 Oct, 2025 | Honolulu, Hawai'i, USA
Paper: https://huggingface.co/papers/2503.22668
Web Page: https://www.robots.ox.ac.uk/~vgg/research/jegal
Repository: https://github.com/Sindhu-Hegde/jegal
Video: https://www.youtube.com/watch?v=TYFOLKfM-rM
ICCV-2023-25-Papers: https://github.com/DmitryRyumin/ICCV-2023-25-Papers
Added to the Human Modeling Section: https://github.com/DmitryRyumin/ICCV-2023-25-Papers/blob/main/sections/2025/main/human-modeling.md
More Papers: more cutting-edge research presented at other conferences is available in https://huggingface.co/spaces/DmitryRyumin/NewEraAI-Papers, curated by @DmitryRyumin
Keywords: #CoSpeechGestures #GestureUnderstanding #TriModalRepresentation #MultimodalLearning #AI #ICCV2025 #ResearchHighlight
new activity 2 months ago
sindhuhegde/avs-spot: Update task categories to `video-text-to-text`
updated a dataset 2 months ago
sindhuhegde/avs-spot
Organizations
None yet