# DeepGesture: A conversational gesture synthesis system based on emotions and semantics
DeepGesture is a diffusion-based gesture synthesis framework for generating expressive co-speech gestures conditioned on multimodal signals: text, speech, emotion, and seed motion. Built upon the DiffuseStyleGesture model, DeepGesture introduces architectural enhancements that improve semantic alignment and emotional expressiveness in the generated gestures. Specifically, it integrates text transcriptions (embedded with FastText) as semantic conditioning and implements emotion-guided classifier-free diffusion to support controllable gesture generation across affective states. The system supports interpolation between emotional states and generalizes to out-of-distribution speech, marking a step toward fully multimodal, emotionally aware digital humans.
- Paper: DeepGesture: A conversational gesture synthesis system based on emotions and semantics
- Project Page: https://deepgesture.github.io
- Code: https://github.com/DeepGesture/DeepGesture
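The emotion-guided classifier-free diffusion mentioned above evaluates the denoiser both with and without the condition and blends the two noise predictions, while emotion interpolation linearly mixes two emotion embeddings. Below is a minimal sketch of both ideas; the `denoise` function, its signature, and the embedding blend are illustrative assumptions, not the repository's actual API:

```python
import numpy as np

def classifier_free_guidance(denoise, x_t, t, cond, guidance_scale=2.5):
    """Blend unconditional and conditional noise predictions.

    `denoise(x_t, t, cond)` is a stand-in for the diffusion denoiser;
    passing cond=None plays the role of the null (condition-dropped) input.
    These names are illustrative, not DeepGesture's actual interface.
    """
    eps_uncond = denoise(x_t, t, None)   # prediction with the condition masked out
    eps_cond = denoise(x_t, t, cond)     # prediction with emotion/text conditioning
    # Classifier-free guidance: push the estimate toward the conditional direction.
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

def interpolate_emotions(emo_a, emo_b, alpha=0.5):
    """Linearly blend two emotion embeddings to synthesize in-between styles."""
    emo_a, emo_b = np.asarray(emo_a), np.asarray(emo_b)
    return alpha * emo_a + (1.0 - alpha) * emo_b
```

A `guidance_scale` of 1.0 recovers plain conditional sampling; larger values trade sample diversity for stronger adherence to the emotion condition.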
## Usage
To get started with the DeepGesture model, follow the instructions below for setting up your environment and running the core script.
### Environment Setup
Anaconda is recommended for managing the Python environment and dependencies.
```bash
conda create -n DeepGesture python=3.12 numpy scipy matplotlib
conda activate DeepGesture
conda install -c anaconda scikit-learn
conda install pytorch torchvision torchaudio cudatoolkit -c pytorch
```
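Before running anything, it can help to verify the install. A quick sanity check (not part of the repository's scripts):

```python
# Confirms PyTorch and scikit-learn import cleanly and reports CUDA availability.
import torch
import sklearn  # noqa: F401  (verifies the scikit-learn install)

print(torch.__version__)
print("CUDA available:", torch.cuda.is_available())  # expect True on a GPU machine
```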
### Running the Model
After installing the requirements, you can run the model by executing the `Network.py` script:

```bash
python Network.py
```
The model expects input data as a matrix in which each row is one data sample of joint-velocity features. Each sample has shape k × d, where k is the number of joints (e.g., 26) and d = 3 is the XYZ velocity of each joint, transformed into the local root space of the character.
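For concreteness, here is a minimal sketch of assembling such an input matrix with NumPy; the joint count and the random frames are placeholders matching the shapes described above, not the repository's data pipeline:

```python
import numpy as np

K_JOINTS = 26   # number of skeleton joints (example value from above)
D_CHANNELS = 3  # XYZ velocity per joint, in the character's local root space

def to_sample(joint_velocities):
    """Flatten a (K_JOINTS, 3) velocity frame into one row of length K_JOINTS * 3."""
    v = np.asarray(joint_velocities, dtype=np.float32)
    assert v.shape == (K_JOINTS, D_CHANNELS)
    return v.reshape(-1)

# A batch of N frames becomes an (N, K_JOINTS * D_CHANNELS) matrix.
frames = [np.random.randn(K_JOINTS, D_CHANNELS) for _ in range(100)]
data = np.stack([to_sample(f) for f in frames])
print(data.shape)  # (100, 78)
```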
For more in-depth details on data preparation, specific input/output formats, and integration with external tools like Unity for visualization, please refer to the official GitHub repository.