Infinite Motion: Extended Motion Generation via Long Text Instructions

Mengtian Li1,2,4, Chengshuo Zhai1, Shengxiang Yao1, Zhifeng Xie1,2*, Keyu Chen3*, Yu-Gang Jiang4
*Corresponding author
1Shanghai University, 2Shanghai Engineering Research Center of Motion Picture Special Effects, 3Tavus Inc., 4Fudan University

Abstract

In motion generation, creating long-duration, high-quality motion sequences remains a significant challenge. This paper presents "Infinite Motion", a novel approach that uses long text instructions to drive extended motion generation, bridging the gap between short- and long-duration motion synthesis. Our core insight is the strategic extension and reassembly of existing high-quality text-motion datasets, which has led to a new benchmark dataset that facilitates the training of models for extended motion sequences. A key innovation of our model is its ability to accept text of arbitrary length as input, enabling the generation of motion sequences tailored to specific narratives or scenarios. Furthermore, we incorporate a timestamp design for text, which allows precise editing of local segments within the generated sequences and gives fine-grained control over motion synthesis. We further demonstrate the versatility and practical utility of "Infinite Motion" through three specific applications: natural language interactive editing, motion sequence editing within long sequences, and splicing of independent motion sequences. Each application highlights the adaptability of our approach and broadens the range of possibilities for research and development in motion generation. Extensive experiments show that our model outperforms existing methods in generating long motion sequences.

Method

Infinite Motion Pipeline: Our model consists of two stages. In Stage I, the model runs a diffusion process in the latent space, generating multiple short motion segments simultaneously. In Stage II, the timestamp stitcher concatenates these short segments to form one continuous motion sequence of arbitrary length.
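
The Stage II idea of joining short segments into one long sequence can be illustrated with a minimal sketch. It is not the paper's timestamp stitcher: the function name `stitch_segments`, the `overlap` parameter, and the linear cross-fade are assumptions chosen for illustration, and each segment is assumed to be an array of shape (frames, features) that has already been decoded from the latent space and ordered by its timestamp.

```python
import numpy as np

def stitch_segments(segments, overlap=4):
    """Join motion segments in order, cross-fading at each boundary.

    Hypothetical helper: each segment is an array of shape
    (frames, features). The last `overlap` frames of the running
    sequence are blended linearly with the first `overlap` frames
    of the next segment, so the boundary has no hard jump.
    """
    if not segments:
        raise ValueError("need at least one segment")
    out = np.asarray(segments[0], dtype=float)
    for seg in segments[1:]:
        seg = np.asarray(seg, dtype=float)
        # Never blend more frames than either side actually has.
        k = min(overlap, len(out), len(seg))
        if k == 0:
            out = np.concatenate([out, seg], axis=0)
            continue
        # Blend weights strictly between 0 and 1, one per blended frame.
        w = np.linspace(0.0, 1.0, k + 2)[1:-1, None]
        blended = (1.0 - w) * out[-k:] + w * seg[:k]
        out = np.concatenate([out[:-k], blended, seg[k:]], axis=0)
    return out

# Two toy 10-frame segments with 3 features each.
first = np.zeros((10, 3))
second = np.ones((10, 3))
stitched = stitch_segments([first, second], overlap=4)
print(stitched.shape)
```

Because each boundary shares `overlap` frames, the stitched length is the sum of the segment lengths minus `overlap` per boundary. A learned stitcher would replace the fixed linear weights; the cross-fade here only shows where such a component sits in the pipeline.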

HumanML3D-Extend dataset

Demos

Edit

BibTeX

@inproceedings{Li2024InfiniteME,
  title={Infinite Motion: Extended Motion Generation via Long Text Instructions},
  author={Mengtian Li and Chengshuo Zhai and Shengxiang Yao and Zhifeng Xie and Keyu Chen and Yu-Gang Jiang},
  year={2024},
  url={https://api.semanticscholar.org/CorpusID:271097653}
}