DeepGesture: A conversational gesture synthesis system based on emotions and semantics

DeepGesture is a diffusion-based gesture synthesis framework that generates expressive co-speech gestures conditioned on multimodal signals: text, speech, emotion, and seed motion. Built on the DiffuseStyleGesture model, DeepGesture adds architectural enhancements that improve semantic alignment and emotional expressiveness in the generated gestures. Specifically, it integrates fast text transcriptions as semantic conditioning and implements emotion-guided classifier-free diffusion to support controllable gesture generation across affective states. The system supports interpolation between emotional states and generalizes to out-of-distribution speech, a step toward fully multimodal, emotionally aware digital humans.
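
To make the emotion-guided, classifier-free part of the description concrete, here is a minimal NumPy sketch of the generic classifier-free guidance rule together with a linear blend of two emotion embeddings. It is an illustration only: the function names, the embedding size, and the use of linear interpolation are assumptions, not code or details taken from the DeepGesture repository.

```python
import numpy as np


def interpolate_emotion(emb_a, emb_b, alpha):
    """Linearly blend two emotion embeddings (hypothetical helper).

    alpha = 0 returns emb_a, alpha = 1 returns emb_b.
    """
    return (1.0 - alpha) * emb_a + alpha * emb_b


def guided_noise(eps_uncond, eps_cond, guidance_scale):
    """Generic classifier-free guidance combination.

    eps_uncond: noise predicted with the condition dropped
    eps_cond:   noise predicted with the condition present
    A scale of 0 gives the unconditional prediction, 1 the conditional one,
    and values above 1 push further toward the condition.
    """
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)


# Toy usage with random stand-ins for embeddings and network outputs.
rng = np.random.default_rng(0)
happy, sad = rng.normal(size=8), rng.normal(size=8)   # assumed embedding size
mixed = interpolate_emotion(happy, sad, alpha=0.5)

eps_u, eps_c = rng.normal(size=(4, 78)), rng.normal(size=(4, 78))
eps = guided_noise(eps_u, eps_c, guidance_scale=2.0)
```

In a real sampler the two noise predictions would come from the same denoising network evaluated with and without the emotion condition at each diffusion step.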

Usage

To get started with the DeepGesture model, follow the instructions below for setting up your environment and running the core script.

Environment Setup

Anaconda is recommended for managing the Python environment and dependencies.

conda create -n DeepGesture python=3.12 numpy scipy matplotlib
conda activate DeepGesture
conda install -c anaconda scikit-learn
conda install pytorch torchvision torchaudio cudatoolkit -c pytorch
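
After the environment is created, a quick check along the following lines can confirm that the packages listed above are importable. This is a small sketch, not part of the project; the helper name `missing_packages` is made up for illustration, and `sklearn` is the import name of the scikit-learn package.

```python
import importlib.util

# Import names of the packages installed by the conda commands above.
REQUIRED = ["numpy", "scipy", "matplotlib", "sklearn",
            "torch", "torchvision", "torchaudio"]


def missing_packages(names=REQUIRED):
    """Return the subset of `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        import torch
        print("All packages found. CUDA available:", torch.cuda.is_available())
```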

Running the Model

After installing the requirements, you can run the model by executing the Network.py script:

python Network.py

The model expects input data as a matrix in which each row is one data sample of joint velocity features. Each sample holds k*d values, where k is the number of joints (e.g., 26) and d = 3 is the number of XYZ velocity components, expressed in the local root space of the character. With k = 26, each row therefore has 26*3 = 78 values.
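
As a rough illustration of that layout, the sketch below builds such a matrix from world-space joint positions. Everything beyond the k*d row shape is an assumption made for the example: the function name, the finite-difference velocity, the `fps` parameter, and the convention that each root rotation matrix maps root-local vectors to world space. The project's actual preprocessing may differ.

```python
import numpy as np

NUM_JOINTS = 26  # k, the example value from the text
DIMS = 3         # d, XYZ velocity components


def joint_velocities_to_rows(positions, root_rotations, fps=60.0):
    """Build a (T-1, k*d) matrix of root-local joint velocities.

    positions:      (T, k, 3) joint positions in world space
    root_rotations: (T, 3, 3) rotation matrices, local root -> world (assumed)
    fps:            frame rate used to scale the finite differences (assumed)
    """
    # Finite-difference velocity between consecutive frames, in world space.
    world_vel = (positions[1:] - positions[:-1]) * fps        # (T-1, k, 3)
    # Rotate into the root frame: v_local = R^T v_world.
    local_vel = np.einsum("tji,tkj->tki", root_rotations[:-1], world_vel)
    # Flatten joints and axes so each row is one sample of k*d values.
    return local_vel.reshape(local_vel.shape[0], -1)


# Toy usage: 5 frames, all joints moving one unit per frame along world X.
frames = 5
positions = np.zeros((frames, NUM_JOINTS, DIMS))
positions[:, :, 0] = np.arange(frames)[:, None]
rotations = np.tile(np.eye(3), (frames, 1, 1))
rows = joint_velocities_to_rows(positions, rotations, fps=1.0)
print(rows.shape)
```

Flattening joint-major (x, y, z of joint 0, then joint 1, and so on) is one possible ordering; check the repository for the ordering the network actually expects.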

For more in-depth details on data preparation, specific input/output formats, and integration with external tools like Unity for visualization, please refer to the official GitHub repository.
