AdapTac-Dex: Adaptive Visuo-Tactile Fusion with Predictive Force Attention for Dexterous Manipulation
This repository contains the official checkpoint for AdapTac-Dex, a model presented in the paper Adaptive Visuo-Tactile Fusion with Predictive Force Attention for Dexterous Manipulation.
AdapTac-Dex introduces a novel approach to robotic dexterous manipulation that adaptively exploits multi-sensory data. It proposes a force-guided attention fusion module that dynamically adjusts the weights of visual and tactile features. This is further reinforced by a self-supervised future force prediction auxiliary task, which helps mitigate data imbalance and encourages proper attention adjustment at different manipulation stages.
- Paper: Adaptive Visuo-Tactile Fusion with Predictive Force Attention for Dexterous Manipulation (https://arxiv.org/abs/2505.13982)
- Project Page: https://adaptac-dex.github.io/
- Code / GitHub Repository: https://github.com/tianhaowuhz/3dtacdex
- Video Demo: https://www.youtube.com/watch?v=Aq34cDWNBE8
Abstract
Effectively utilizing multi-sensory data is important for robots to generalize across diverse tasks. However, the heterogeneous nature of these modalities makes fusion challenging. Existing methods propose strategies to obtain comprehensively fused features but often ignore the fact that each modality requires different levels of attention at different manipulation stages. To address this, we propose a force-guided attention fusion module that adaptively adjusts the weights of visual and tactile features without human labeling. We also introduce a self-supervised future force prediction auxiliary task to reinforce the tactile modality, mitigate data imbalance, and encourage proper adjustment. Our method achieves an average success rate of 93% across three fine-grained, contact-rich tasks in real-world experiments. Further analysis shows that our policy appropriately adjusts attention to each modality at different manipulation stages.
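To make the idea concrete, below is a minimal PyTorch sketch of force-guided attention fusion with a future force prediction head. The layer sizes, the small MLP that maps force readings to per-modality attention weights, and the auxiliary head are illustrative assumptions, not the authors' exact architecture; refer to the GitHub repository for the real implementation.

```python
import torch
import torch.nn as nn

class ForceGuidedFusion(nn.Module):
    """Sketch: force readings gate the attention over visual vs. tactile
    features, and a small head predicts the future force from the fused
    feature (the self-supervised auxiliary task). Dimensions are illustrative."""

    def __init__(self, feat_dim: int = 256, force_dim: int = 3):
        super().__init__()
        # Map the current force reading to two attention logits
        # (one for the visual stream, one for the tactile stream).
        self.force_to_attn = nn.Sequential(
            nn.Linear(force_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 2),
        )
        # Auxiliary head: regress the force at the next timestep.
        self.future_force_head = nn.Linear(feat_dim, force_dim)

    def forward(self, visual_feat, tactile_feat, force):
        # visual_feat, tactile_feat: (B, feat_dim); force: (B, force_dim)
        attn = torch.softmax(self.force_to_attn(force), dim=-1)        # (B, 2)
        fused = attn[:, :1] * visual_feat + attn[:, 1:] * tactile_feat  # (B, feat_dim)
        pred_future_force = self.future_force_head(fused)               # (B, force_dim)
        return fused, pred_future_force, attn

# Training would combine the policy loss on `fused` with an auxiliary
# regression loss such as F.mse_loss(pred_future_force, next_force).
```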
Usage
For detailed instructions on setting up the environment, running the code, and utilizing pre-trained checkpoints, please refer to the official GitHub repository.
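As a rough sketch only, the checkpoint hosted here can be downloaded with `huggingface_hub` and inspected with PyTorch. The repository id and file name below are placeholders (check this page's file listing), and the policy class that actually consumes the weights is provided by the GitHub repository.

```python
import torch
from huggingface_hub import hf_hub_download

# Placeholder repo id / filename -- replace with the values shown in this
# repository's file listing.
ckpt_path = hf_hub_download(repo_id="<org>/<adaptac-dex-checkpoint>",
                            filename="adaptac_dex.ckpt")

# Load the raw checkpoint; building and running the policy requires the
# code from the official GitHub repository.
state_dict = torch.load(ckpt_path, map_location="cpu")
print(list(state_dict.keys())[:10])
```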
Citation
If you find this work useful, please consider citing our paper:
@article{li2025adaptive,
  title={Adaptive Visuo-Tactile Fusion with Predictive Force Attention for Dexterous Manipulation},
  author={Li, Jinzhou and Wu, Tianhao and Zhang, Jiyao and Chen, Zeyuan and Jin, Haotian and Wu, Mingdong and Shen, Yujun and Yang, Yaodong and Dong, Hao},
  journal={arXiv preprint arXiv:2505.13982},
  year={2025}
}