MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model

Abstract

This work introduces MotionLCM, extending controllable motion generation to a real-time level. Existing methods for spatial-temporal control in text-conditioned motion generation suffer from significant runtime inefficiency. To address this issue, we first propose the motion latent consistency model (MotionLCM) for motion generation, building on the motion latent diffusion model. By adopting one-step (or few-step) inference, we further improve the runtime efficiency of the motion latent diffusion model for motion generation. To ensure effective controllability, we incorporate a motion ControlNet within the latent space of MotionLCM and enable explicit control signals (i.e., initial motions) in the vanilla motion space to further provide supervision for the training process. By employing these techniques, our approach can generate human motions with text and control signals in real-time. Experimental results demonstrate the remarkable generation and controlling capabilities of MotionLCM while maintaining real-time runtime efficiency.
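The one-step (or few-step) inference mentioned above can be illustrated with the generic multistep consistency sampling loop. This is a minimal sketch under stated assumptions, not MotionLCM's implementation: `toy_consistency_fn` is a hypothetical analytic stand-in for a trained network (exact only for a data distribution concentrated at a single point `mu`), the noise levels `EPS` and `T_MAX` are assumed values, and text conditioning, the motion VAE, and the motion ControlNet are all omitted.

```python
import math
import random

EPS = 0.002    # smallest noise level (assumed value)
T_MAX = 80.0   # largest noise level (assumed value)


def toy_consistency_fn(x, t, mu):
    """Hypothetical stand-in for a trained consistency model.

    For data concentrated at the single point `mu`, this maps a noisy
    point at noise level t back to the start of its trajectory, and it
    is the identity at t == EPS (up to floating-point rounding).
    """
    scale = EPS / t
    return [m + scale * (xi - m) for xi, m in zip(x, mu)]


def consistency_sample(f, dim, timesteps=(), rng=None):
    """One-step sampling if `timesteps` is empty, few-step otherwise."""
    rng = rng or random.Random(0)
    # Start from pure noise at the largest noise level.
    x = [T_MAX * rng.gauss(0.0, 1.0) for _ in range(dim)]
    x0 = f(x, T_MAX)  # one model call already yields a sample
    # Optional refinement: re-noise to a lower level, then denoise again.
    for t in timesteps:  # noise levels in decreasing order
        sigma = math.sqrt(t * t - EPS * EPS)
        x = [xi + sigma * rng.gauss(0.0, 1.0) for xi in x0]
        x0 = f(x, t)
    return x0


mu = [0.5, -1.0, 2.0]  # hypothetical "clean latent"
f = lambda x, t: toy_consistency_fn(x, t, mu)

one_step = consistency_sample(f, dim=3)
few_step = consistency_sample(f, dim=3, timesteps=(20.0, 5.0, 1.0))
```

Each sample costs one model call plus one per extra timestep, which is why one-step inference is the cheapest setting; in a latent model the result would still need to be decoded back to the motion space.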

BibTeX

@inproceedings{motionlcm,
  title={MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model},
  author={Dai, Wenxun and Chen, Ling-Hao and Wang, Jingbo and Liu, Jinpeng and Dai, Bo and Tang, Yansong},
  booktitle={ECCV},
  pages={390--408},
  year={2025}
}

@article{motionlcm-v2,
  title={Real-time Controllable Motion Generation via Latent Consistency Model},
  author={Dai, Wenxun and Chen, Ling-Hao and Huo, Yufei and Wang, Jingbo and Liu, Jinpeng and Dai, Bo and Tang, Yansong}
}

Video

Text-to-Motion (1-step, ~30ms/sample)

Motion Control (1-step, ~34ms/sample, Dense signals on pelvis)

Motion Control (1-step, ~34ms/sample, Sparse signals on pelvis)
