The results below show the 4-frame animations obtained when a single index $s_i \in \{0, 1, 2, \ldots, 511\}$ (in this case $i = 2$ and $i = 5$) is decoded using the VQ-VAE's decoder. The VQ-VAE quantizes the dataset into 4-frame animation segments that can be concatenated to obtain a longer and more coherent animation.
Taking a step forward with left leg | Taking a step forward with right leg |
---|---|
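To make the index-to-frames mapping concrete, here is a minimal sketch of decoding codebook indices into 4-frame segments. It is not the project's decoder: the codebook, the linear `decoder_weights`, and the sizes `CODE_DIM` and `FEATURE_DIM` are hypothetical stand-ins with random values. Only the codebook size (512) and the 4 frames per index come from the text above.

```python
import numpy as np

NUM_CODES = 512         # codebook size, matching s_i in {0, ..., 511}
FRAMES_PER_INDEX = 4    # each index decodes to 4 frames of animation
CODE_DIM = 8            # assumed latent width (illustrative only)
FEATURE_DIM = 6         # assumed per-frame feature width (illustrative only)

rng = np.random.default_rng(0)
# Hypothetical stand-ins for the trained codebook and decoder.
codebook = rng.normal(size=(NUM_CODES, CODE_DIM))
decoder_weights = rng.normal(size=(CODE_DIM, FRAMES_PER_INDEX * FEATURE_DIM))


def decode_indices(indices):
    """Map a sequence of codebook indices to (4 * len(indices), FEATURE_DIM) frames."""
    indices = np.asarray(indices, dtype=np.int64)
    if indices.ndim != 1 or indices.size == 0:
        raise ValueError("expected a non-empty 1-D sequence of indices")
    if indices.min() < 0 or indices.max() >= NUM_CODES:
        raise ValueError("indices must lie in [0, 511]")
    codes = codebook[indices]          # look up one latent vector per index
    frames = codes @ decoder_weights   # toy decoder: one linear map per index
    # Unfold each index's output into 4 consecutive frames.
    return frames.reshape(indices.size * FRAMES_PER_INDEX, FEATURE_DIM)


single = decode_indices([2])      # one index gives a 4-frame segment
clip = decode_indices([2, 5])     # two indices give an 8-frame clip
```

Because this toy decoder treats every index independently, decoding `[2, 5]` is exactly the two 4-frame segments placed back to back. A decoder with temporal context (for example, a convolutional one) would not generally have that property.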
The following results showcase the capabilities of the T2M-GPT model. Note that the transformer in T2M-GPT performs poorly on longer text prompts: it fails to produce the index sequence in the correct order, which leads to incorrect or incomplete animations. The last figure shows how the results of two distinct text prompts can simply be concatenated, i.e., the decoder outputs (produced from the indices given by the transformer) are joined end to end. This concatenated motion is then processed so that the global angular velocities are accumulated to recover the global orientation, the global linear velocities are accumulated to recover the global location in the X-Z plane, and the joint locations are computed using the kinematic chain.
Action 1 | Action 2 | Combination of Actions using Transformer | Concatenation of Actions |
---|---|---|---|
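The post-processing of a concatenated motion can be sketched as follows. This is an illustration under assumptions, not the project's code: it assumes each frame stores a yaw rate about the Y axis plus a root-local linear velocity in the X-Z plane, and that orientation and position are obtained by cumulative summation. The function `integrate_root`, the column layout, and the rotation sign convention are all hypothetical, and the kinematic-chain step for the joints is omitted.

```python
import numpy as np


def integrate_root(root_vel):
    """Recover yaw and X-Z position from per-frame root velocities.

    root_vel: (T, 3) array with assumed columns
              [yaw rate (rad/frame), local x velocity, local z velocity].
    Returns (yaw, pos) with shapes (T,) and (T, 2).
    """
    root_vel = np.asarray(root_vel, dtype=float)
    num_frames = len(root_vel)

    # Orientation at frame t is the sum of the yaw rates of all earlier frames.
    yaw = np.zeros(num_frames)
    yaw[1:] = np.cumsum(root_vel[:-1, 0])

    # Rotate the local velocity about the Y axis into the world frame
    # (sign convention is an assumption of this sketch).
    cos, sin = np.cos(yaw), np.sin(yaw)
    vx, vz = root_vel[:, 1], root_vel[:, 2]
    world_vx = cos * vx + sin * vz
    world_vz = -sin * vx + cos * vz

    # Position at frame t is the sum of the world velocities of earlier frames.
    pos = np.zeros((num_frames, 2))
    pos[1:, 0] = np.cumsum(world_vx[:-1])
    pos[1:, 1] = np.cumsum(world_vz[:-1])
    return yaw, pos


# Two toy clips: walk along +Z and turn on the last frame, then keep walking.
clip_a = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 1.0], [np.pi / 2, 0.0, 1.0]])
clip_b = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 1.0], [0.0, 0.0, 1.0]])

# Concatenate in velocity space first, then integrate once over the whole clip.
combined = np.concatenate([clip_a, clip_b], axis=0)
yaw, pos = integrate_root(combined)
```

Integrating after concatenation lets the second clip continue from the orientation and location where the first one ended. Integrating each clip separately would restart the second clip at the origin facing the initial direction.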
Motions generated from long text prompts using T2M-GPT are shown below.
Example 1 | Concatenated Example 1 | Example 2 | Concatenated Example 2 |
---|---|---|---|
Randomly sampled index sequence from label 1 (run) | Randomly sampled index sequence from label 2 (special) |
---|---|
Decoding one index at a time | Decoding 15 indices at a time | Decoding all the indices at once |
---|---|---|
Path of human walking forward | A human walking forward |
---|---|
Path of human walking backward | A human walking backward |
---|---|
Path of human walking left | A human walking left |
---|---|
Path of human walking right | A human walking right |
---|---|
Path of human walking forward and then right | A human walking forward and then right |
---|---|
A few more examples of the path fit are shown below.
Path of Example 1 | Gif of Example 1 |
---|---|
Path of Example 2 | Gif of Example 2 |
---|---|
Path of Example 3 | Gif of Example 3 |
---|---|