Animation retargeting involves applying a sparse motion description (e.g., 2D/3D keypoint sequences) to a given character mesh to produce a semantically plausible and temporally coherent full-body motion, bringing the character to life. Given its practical relevance, it remains a highly desired tool in any digital character workflow. An ideal data-driven solution should work without templates, without access to corrective keyframes, and still generalize to novel characters and unseen motions. Existing approaches come with a mix of restrictions: they require annotated training data, assume access to template-based shape priors or artist-designed deformation rigs, generalize poorly to unseen motions and/or shapes, or exhibit motion jitter. We propose Self-supervised Motion Fields (SMF), a self-supervised framework that can be robustly trained with sparse motion representations, without requiring dataset-specific annotations, templates, or rigs. At the heart of our method are Kinetic Codes, a novel autoencoder-based sparse motion encoding that exposes a semantically rich latent space, simplifying large-scale training. Our architecture comprises dedicated spatial and temporal gradient predictors, which are trained end-to-end. The resultant network, regularized by the Kinetic Codes' latent space, generalizes well across shapes and motions. We evaluate our method on unseen motions sampled from AMASS, D4D, Mixamo, and raw monocular video, transferring animation to various characters with varying shapes and topology. We report a new SoTA on the AMASS dataset in the context of generalization to unseen motion. (Source code will be released.)
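To make the components named above concrete, here is a minimal sketch in PyTorch. All module names, layer sizes, and wiring are our own assumptions for illustration, not the released SMF implementation: a keypoint-sequence autoencoder whose bottleneck plays the role of the Kinetic Code, and a deformation-gradient predictor conditioned on that code (the actual method trains dedicated spatial and temporal gradient predictors end-to-end).

# Illustrative sketch only: names, dimensions, and wiring are assumptions,
# not the released SMF implementation.
import torch
import torch.nn as nn

class KineticCodeAE(nn.Module):
    # Autoencoder over sparse keypoint sequences; the bottleneck acts as the Kinetic Code.
    def __init__(self, n_keypoints=22, dim=3, seq_len=16, code_dim=128):
        super().__init__()
        in_dim = n_keypoints * dim * seq_len
        self.encoder = nn.Sequential(nn.Linear(in_dim, 512), nn.ReLU(), nn.Linear(512, code_dim))
        self.decoder = nn.Sequential(nn.Linear(code_dim, 512), nn.ReLU(), nn.Linear(512, in_dim))

    def forward(self, keypoints):                       # keypoints: (B, seq_len, n_keypoints, dim)
        code = self.encoder(keypoints.flatten(1))       # Kinetic Code latent
        recon = self.decoder(code).view_as(keypoints)   # reconstructed keypoint sequence
        return code, recon

class GradientPredictor(nn.Module):
    # Predicts a per-vertex 3x3 deformation gradient conditioned on the Kinetic Code;
    # in SMF, separate spatial and temporal predictors of this flavor are used.
    def __init__(self, code_dim=128, feat_dim=64):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(code_dim + feat_dim, 256), nn.ReLU(), nn.Linear(256, 9))

    def forward(self, code, vert_feats):                # vert_feats: (B, V, feat_dim)
        B, V, _ = vert_feats.shape
        c = code.unsqueeze(1).expand(B, V, -1)          # broadcast the code to every vertex
        return self.mlp(torch.cat([c, vert_feats], dim=-1)).view(B, V, 3, 3)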
We present results on motions sampled from Mixamo: we extract 3D keypoints and transfer them to diverse unseen shapes. We interpolate the keypoint locations from Mixamo to roughly align with the AMASS keypoint locations. Although this is a noisy approximation, SMF's Kinetic Codes enable smooth transfer, further demonstrating the versatility of our approach, since aligning keypoints via interpolation is simple and can be automated (a rough sketch of such an alignment follows this caption).
All Mixamo motions are completely unseen (ours and all baselines except Skeleton-free are trained only on AMASS). Note that Skeleton-free requires the complete mesh as input and is additionally trained on Mixamo data.
Compared to ours, other methods fail to transfer the motion faithfully and/or result in numerous artifacts.
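As referenced above, keypoint alignment by interpolation can be sketched as below. The joint indices and blend weights are placeholders for illustration, not the mapping actually used; a real mapping would be authored once per source skeleton.

# Rough sketch of keypoint alignment by interpolation; the joint indices and
# weights below are placeholders, not the mapping used in the paper.
import numpy as np

# Each target (AMASS-style) keypoint is a convex combination of two source
# (Mixamo-style) keypoints: (src_a, src_b, weight on src_a).
JOINT_BLEND = [
    (0, 1, 0.5),   # e.g., approximate a pelvis keypoint midway between the hips (placeholder)
    (2, 3, 0.7),   # ... one entry per target keypoint
]

def align_keypoints(src_kps):
    # src_kps: (T, J_src, 3) source keypoints -> (T, J_tgt, 3) roughly aligned keypoints.
    out = np.empty((src_kps.shape[0], len(JOINT_BLEND), 3), dtype=src_kps.dtype)
    for j, (a, b, w) in enumerate(JOINT_BLEND):
        out[:, j] = w * src_kps[:, a] + (1.0 - w) * src_kps[:, b]
    return out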
We compare our method on a wide variety of unseen motion transfers from AMASS to meshes with significantly different shapes and topology. All results show transferred (unseen) motion applied to an unseen shape without rigging. We observe strong generalization, including to non-humanoid meshes such as a reptile (shown running).
Notice the incorrect poses and artifacts in the baseline methods. Skeleton-Free Pose Transfer suffers from artifacts and unnatural deformations, visible in the arms, the pelvis, and the stretching of the neck (please zoom in). Errors are highlighted in red. Note: Skeleton-Free Pose Transfer requires the complete source mesh as input, whereas our sparse-keypoint setup is easier to author but more challenging to transfer from.
Click the button below to switch between different motions.
Motion transfer to in-the-wild characters gathered from Mixamo, Sketchfab, etc.
Our Kinetic Codes offer a smooth latent space for realistic motion interpolation. We mix different motion categories and transfer the result to different characters. In contrast, interpolating mesh vertices (ground-truth source) in Euclidean space leads to flattening of the hands in Knees + Shake Arms, as the hands move in different directions. In One Leg Jump + Punching, the punching is preserved with our approach, compared to the downward punching motion obtained in Euclidean space.
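The contrast between the two mixing strategies can be sketched as follows, reusing the hypothetical KineticCodeAE from the earlier sketch; linear blending of Kinetic Codes versus naive per-vertex blending illustrates the idea, not the exact interpolation used on this page.

# Sketch only: blending in the (hypothetical) Kinetic Code latent space vs. Euclidean space.
import torch

def mix_in_latent(model, kps_a, kps_b, alpha=0.5):
    # Blend two keypoint motions in the Kinetic Code latent space, then decode.
    code_a, _ = model(kps_a)
    code_b, _ = model(kps_b)
    mixed = (1.0 - alpha) * code_a + alpha * code_b      # linear blend of Kinetic Codes
    return model.decoder(mixed).view_as(kps_a)

def mix_in_euclidean(verts_a, verts_b, alpha=0.5):
    # Naive per-vertex blend in Euclidean space; opposing motions can cancel and flatten.
    return (1.0 - alpha) * verts_a + alpha * verts_b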
We transfer motion from 2D monocular capture to an unseen in-the-wild 3D mesh without any rigging or template. This is a very challenging scenario, which we address using our 2D motion representation. While the transferred motion is not perfect and contains artifacts, ours is the first method capable of performing this transfer without requiring a template shape.
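One plausible shape of this pipeline is sketched below; estimate_2d_keypoints and smf_transfer are stand-ins (any off-the-shelf 2D pose estimator and the trained transfer model, respectively), not prescribed dependencies.

# Plausible pipeline sketch; 'estimate_2d_keypoints' and 'smf_transfer' are stand-ins.
import numpy as np

def video_to_2d_motion(frames, estimate_2d_keypoints):
    # frames: list of HxWx3 images -> (T, J, 2) array of per-frame 2D keypoints.
    return np.stack([estimate_2d_keypoints(f) for f in frames], axis=0)

def transfer_from_video(frames, target_mesh, estimate_2d_keypoints, smf_transfer):
    motion_2d = video_to_2d_motion(frames, estimate_2d_keypoints)   # 2D motion representation
    return smf_transfer(motion_2d, target_mesh)                     # deformed target mesh per frame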
@misc{muralikrishnan2025smftemplatefreerigfreeanimation,
  title={SMF: Template-free and Rig-free Animation Transfer using Kinetic Codes},
  author={Sanjeev Muralikrishnan and Niladri Shekhar Dutt and Niloy J. Mitra},
  year={2025},
  eprint={2504.04831},
  archivePrefix={arXiv},
  primaryClass={cs.GR},
  url={https://arxiv.org/abs/2504.04831},
}