Implementation of the IK Rig proposed by A. Bareznyak and why it should be a good way of encoding poses for ML
You might have seen this presentation before. In it, Alexander Bareznyak, of Ubisoft, explains what he calls the IK Rig and the benefits that encoding motion capture animation as a set of IK chains has over good old FK chains.
I’ve just released an open source implementation of the IK Rig, or at least its most essential components. You can find the repo here. In this post I’ll go over the most important parts of the code and why I find the IK Rig appealing for ML.
What you’ll learn
- What the IK Rig is
- How to implement the IK Rig in Maya
- Why describing poses as IK data may be useful for ML applications
The IK Rig
As described in the original presentation, the IK Rig is a system that enables real-time character retargeting and motion editing. It is made up of several parts, but its crucial component is the conversion of regular FK animation into the parameters that control IK chains for all parts of the human body.
So what parameters are these?
- Chain root position
- Chain up-vector (or pole-vector)
- Effector position
- Effector orientation
With those parameters, you can control any IK chain of any size and number of joints. Alexander doesn’t mention it explicitly in the video, but since the whole point of the system is to be able to change the character’s proportions, I assume these values are normalized by the character’s limb sizes.
So, in a nutshell, this most essential part of the IK Rig consists of calculating those four parameters for six joint chains in the human body (spine, neck, arms, and legs), and then normalizing them by size.
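To make the shape of that data concrete, here is a minimal sketch of how the four parameters of one chain could be held and flattened into a list of scalars. It is plain Python, independent of Maya. The class name, the chain names, the value order, and the quaternion storage (x, y, z, w) are assumptions made for illustration, not something taken from the presentation or the repo.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]
Quat = Tuple[float, float, float, float]  # assumed (x, y, z, w) order

# Hypothetical names for the six chains: spine, neck, two arms, two legs.
CHAIN_NAMES = ("spine", "neck", "arm_l", "arm_r", "leg_l", "leg_r")


@dataclass
class ChainParams:
    """The four IK parameters of one chain, already normalized."""
    root: Vec3          # chain root position
    up_vector: Vec3     # chain up-vector (pole-vector)
    effector: Vec3      # effector position, relative to the chain root
    effector_rot: Quat  # effector orientation

    def flatten(self) -> List[float]:
        # Assumed order: root, effector, up-vector, orientation (13 values).
        return [*self.root, *self.effector, *self.up_vector, *self.effector_rot]


def flatten_pose(chains: Dict[str, ChainParams]) -> List[float]:
    """Concatenate all chains, in a fixed order, into one list of scalars."""
    values: List[float] = []
    for name in CHAIN_NAMES:
        values.extend(chains[name].flatten())
    return values
```

With this layout each chain takes 13 values, so a full pose of six chains is a vector of 78 scalars.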
Implementing the IK Rig in Maya
You can deploy the IK Rig in Maya using only vanilla nodes in a Cult of Rig Season 0 fashion. It is true; I have done it! I did it to a couple of limbs to prototype the implementation.
But I found it a lot more practical to implement this as Python DG Nodes. The implementation is composed of two such nodes, one (ikrig_encode) for encoding FK values into an array of normalized IK values, and the other (ikrig_decode) for decoding that array of values into positions and orientations for controlling the IK chains of a character of any size and proportions. If you have never implemented a Python DG node, I recommend you read this post first.
Encoding FK data into normalized IK data
There are a lot of input and output attributes to set up in each node. The ikrig_encode node takes the world matrices of most joints in the FK rig, plus a hips world matrix at rest pose, the hips’ height from the floor, and the size of each of the character’s limbs.
The first thing we compute in the ikrig_encode is the global position and orientation of the character projected on a 2d plane:
    # Global xfo with default hips height and 2d orientation
    # (1) get xfo from rest pose
    mat_hips_delta = mat_hips_rest.inverse() * mat_hips
    # (2) use xfo to transform direction vector
    direction = om.MVector([.0, .0, 1.]) * mat_hips_delta
    # (3) constrain direction to xz plane
    direction.y = 0
    # (4) get all vectors
    zaxis = direction.normal()
    yaxis = om.MVector([.0, 1., .0])
    xaxis = (yaxis ^ zaxis).normal()
(1) We get the hips’ offset orientation relative to the rest pose and use that to (2) transform a vector pointing in the Z+ direction. (3) We flatten out any movement in the Y axis caused by that transformation and normalize the vector to unit length. (4) Since we know our Z and Y, and since the X axis should be perpendicular to those two, all we need to do is take their cross product.
With the unit vectors of X, Y, and Z we can compose our global matrix (g_mat). We also translate the g_mat to the same X, Z position as the hips and scale it by the hips’ height. We’ll use the g_mat to localize the world matrices of all joints, making our data invariant to global position and orientation (good stuff I talk more about later). But why scale it by the hips’ height? I did this to make the root positions of the IK chains invariant to the scale of the character. I chose the hips’ height arbitrarily; alternatively, you could choose the whole character’s height, for example. Bear in mind that two characters of the same size may have different proportions, and thus their steps will be of different lengths; two characters with the same hips’ height should have steps of about the same length 😉.
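The idea of the global frame can be sketched without Maya, using plain tuples instead of MVector and MMatrix. The function names below are hypothetical, and the sketch assumes the frame sits on the floor (Y = 0) directly under the hips, which is how I read "translate to the same X, Z position". Localizing a point is then the inverse of that frame: subtract the origin, project onto the three axes, and divide by the hips’ height.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(v):
    length = math.sqrt(dot(v, v))
    return (v[0] / length, v[1] / length, v[2] / length)


def global_frame(hips_forward, hips_pos, hips_height):
    """Build the flattened global frame from the hips' forward direction."""
    # Drop the Y component so the frame only turns around the up axis.
    zaxis = normalize((hips_forward[0], 0.0, hips_forward[2]))
    yaxis = (0.0, 1.0, 0.0)
    xaxis = normalize(cross(yaxis, zaxis))
    # Assumption: the frame origin is on the floor, under the hips.
    origin = (hips_pos[0], 0.0, hips_pos[2])
    return xaxis, yaxis, zaxis, origin, hips_height


def localize(point, frame):
    """Express a world-space point in the global frame, in hips-height units."""
    xaxis, yaxis, zaxis, origin, scale = frame
    d = sub(point, origin)
    return (dot(d, xaxis) / scale, dot(d, yaxis) / scale, dot(d, zaxis) / scale)


frame = global_frame(hips_forward=(1.0, 0.3, 0.0),
                     hips_pos=(2.0, 0.9, 5.0),
                     hips_height=0.9)
hips_local = localize((2.0, 0.9, 5.0), frame)
```

In this frame the hips always sit one unit above the origin, wherever the character stands and however tall it is, which is the invariance the scaling is meant to buy.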
Now, for the actual encoding I’ve created a function to apply the same operation for each one of the six chains:
    def FK2encoded(g_mat, root_jnt, dir_jnt, eff_jnt, chain_length):
        # (1) Get normalized IK root position
        l_root_jnt = root_jnt * g_mat.inverse()
        ik_root = MMat2Trans(l_root_jnt)

        # (2) Get normalized effector position
        vec_root = MMat2Trans(root_jnt)
        vec_eff = MMat2Trans(eff_jnt)
        l_vec_eff = vec_eff - vec_root
        ik_eff = l_vec_eff * g_mat.homogenize().inverse()
        ik_eff /= chain_length

        # (3) Get IK direction (up-vector)
        vec_dir = MMat2Trans(dir_jnt)
        l_vec_dir = vec_root - vec_dir
        ik_upv = (l_vec_dir ^ l_vec_eff) ^ l_vec_eff
        # (3.1) localize direction to g_mat orientation
        ik_upv *= g_mat.inverse()
        ik_upv = ik_upv.normal()

        # (4) Localize eff rotation
        ik_eff_rot = om.MQuaternion()
        mat = eff_jnt * g_mat.inverse()
        ik_eff_rot.setValue(mat.homogenize())

        # ik_eff_rot is returned too, since the decode step reads a quaternion
        return ik_root, ik_eff, ik_upv, ik_eff_rot
We start (1) by localizing the root position to our g_mat [MMat2Trans only gets the translation vector from any given MMatrix]. (2) Then we get the delta from the effector to the root, localize it to the g_mat orientation, and normalize it by the chain_length. (3) We get the cross product between the vector pointing to the effector and the one pointing to the middle joint (the knee, in a leg); then we get the cross product between that resulting vector and the effector vector, which results in the actual chain up-vector (see image below). And that is that; (4) the orientation of the effector is just whatever world space orientation it has, localized to our g_mat.
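The double cross product in step (3) is easier to trust with numbers. Below is a small sketch in plain Python (tuples instead of MVector; the function name is hypothetical) for a leg whose knee is bent forward along Z+. The result is perpendicular to the root-to-effector axis and points to the side the knee bends toward.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(v):
    length = math.sqrt(dot(v, v))
    if length == 0.0:
        # A fully straight chain has no defined up-vector.
        return (0.0, 0.0, 0.0)
    return (v[0] / length, v[1] / length, v[2] / length)


def chain_up_vector(root, mid, eff):
    """Up-vector of a chain from its root, middle joint and effector positions."""
    vec_eff = sub(eff, root)  # root -> effector
    vec_dir = sub(root, mid)  # middle joint -> root, as in FK2encoded
    return normalize(cross(cross(vec_dir, vec_eff), vec_eff))


hip = (0.0, 1.0, 0.0)
knee = (0.0, 0.5, 0.2)   # bent forward, toward Z+
ankle = (0.0, 0.0, 0.0)
upv = chain_up_vector(hip, knee, ankle)
```

Here `upv` comes out along Z+, the direction the knee is bent toward. When the chain is perfectly straight the first cross product is zero, so this sketch returns a zero vector rather than dividing by zero; how to handle that case in production is a design decision this example does not settle.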
Decoding normalized IK data into actual positions and orientations
The inputs for the decode node consist of one long stream of scalar values describing all IK parameters, the hips’ height, and the length of the chains. The outputs are the root positions, up-vectors, effector positions and orientations, and the character’s global position and orientation on a 2d plane.
The parameters for every chain go into the encoded2IK function:
    def encoded2IK(encoded_pose_array, char_scale, chain_scale):
        # Index layout assumed here: root (0-2), effector (3-5),
        # up-vector (6-8), quaternion x, y, z, w (9-12)
        ik_chain_root = om.MVector(encoded_pose_array[0] * char_scale,
                                   encoded_pose_array[1] * char_scale,
                                   encoded_pose_array[2] * char_scale)
        ik_chain_eff = om.MVector(encoded_pose_array[3] * chain_scale,
                                  encoded_pose_array[4] * chain_scale,
                                  encoded_pose_array[5] * chain_scale)
        ik_chain_upv = om.MVector(encoded_pose_array[6],
                                  encoded_pose_array[7],
                                  encoded_pose_array[8])
        quat = om.MQuaternion(encoded_pose_array[9],
                              encoded_pose_array[10],
                              encoded_pose_array[11],
                              encoded_pose_array[12])
        ik_chain_eff_rot = quat.asEulerRotation()
        return ik_chain_root, ik_chain_eff, ik_chain_upv, ik_chain_eff_rot
The chain root’s position is multiplied by the character scale (hips’ height), since that is the length it has been normalized by. The effector’s position is multiplied by the chain scale, for the same reason. The up-vector and rotation need no scaling. It is that simple. Note that I then output those values to locators and parent them up in Maya, but you could add up their positions if you wanted to output world space values.
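As a Maya-free illustration of the decode step, the sketch below splits a long stream of scalars into per-chain slices and applies the two scales. The 13-value layout (root, effector, up-vector, quaternion) and the function names are assumptions made for this example.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed layout per chain: 3 root + 3 effector + 3 up-vector + 4 quaternion.
VALUES_PER_CHAIN = 13


def split_chains(stream: Sequence[float]) -> List[List[float]]:
    """Cut the long stream of encoded values into one slice per chain."""
    if len(stream) % VALUES_PER_CHAIN != 0:
        raise ValueError("stream length must be a multiple of 13")
    return [list(stream[i:i + VALUES_PER_CHAIN])
            for i in range(0, len(stream), VALUES_PER_CHAIN)]


def decode_chain(values: Sequence[float],
                 char_scale: float,
                 chain_scale: float) -> Dict[str, Tuple[float, ...]]:
    """Scale one chain's normalized values back to a given character."""
    return {
        # Root was normalized by the hips' height, so multiply it back.
        "root": tuple(v * char_scale for v in values[0:3]),
        # Effector was normalized by the chain length.
        "effector": tuple(v * chain_scale for v in values[3:6]),
        # Direction and orientation carry no scale.
        "up_vector": tuple(values[6:9]),
        "effector_rot": tuple(values[9:13]),
    }


encoded_leg = [0.1, 1.0, 0.0,        # root
               0.0, -0.5, 0.25,      # effector
               0.0, 0.0, 1.0,        # up-vector
               0.0, 0.0, 0.0, 1.0]   # quaternion (x, y, z, w)
small = decode_chain(encoded_leg, char_scale=1.0, chain_scale=0.8)
large = decode_chain(encoded_leg, char_scale=2.0, chain_scale=1.6)
```

The same encoded leg produces an effector offset twice as long on the larger character, while the up-vector and orientation are passed through untouched.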
The IK Rig in Machine Learning
Encoding kinematic poses for machine learning can be done in many ways, some good, others bad, as I outline in this video. You don’t want, for example, to store poses as a set of local joint positions, because a change in the character’s proportions would look like a change in pose to your model. Thus, most people use orientations to describe this type of data. Orientations are invariant to the character’s size and proportions.
But so is the IK Rig. Not only that, but the IK Rig is also invariant to the number of bones in a rig and to the way the orientations of each kinematic component were initially set up.
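A toy example of the size invariance, in plain Python: the same planar leg pose on two characters that differ only by a uniform scale. The joint positions differ, but the effector offset divided by the chain length is identical. The function names and the numbers are made up for illustration, and the example covers uniform scale only, not differing proportions.

```python
import math


def leg_pose(thigh, shin, hip_angle, knee_angle):
    """Planar two-bone leg hanging from the origin (Y up, Z forward).

    Returns the knee and ankle positions for the given joint angles.
    """
    knee = (0.0, -thigh * math.cos(hip_angle), thigh * math.sin(hip_angle))
    shin_angle = hip_angle - knee_angle
    ankle = (0.0,
             knee[1] - shin * math.cos(shin_angle),
             knee[2] + shin * math.sin(shin_angle))
    return knee, ankle


def encode_effector(root, eff, chain_length):
    """Effector offset from the chain root, in units of chain length."""
    return tuple((e - r) / chain_length for e, r in zip(eff, root))


hip = (0.0, 0.0, 0.0)
hip_angle, knee_angle = math.radians(30.0), math.radians(40.0)

# Same pose on a small and on a large character (1.5x uniform scale).
_, ankle_small = leg_pose(0.4, 0.4, hip_angle, knee_angle)
_, ankle_large = leg_pose(0.6, 0.6, hip_angle, knee_angle)

enc_small = encode_effector(hip, ankle_small, chain_length=0.8)
enc_large = encode_effector(hip, ankle_large, chain_length=1.2)
```

A model trained on `ankle_small` and `ankle_large` directly would see two different poses; trained on `enc_small` and `enc_large` it sees one.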
Finally, if you are creating a generative model, the benefit is twofold. First, you avoid the accumulation of errors: a model errs a bit in every output, and since in FK the orientation of the extremities depends hierarchically on all other orientations, it is common to see smoothing and instability in the skeleton’s extremities. Second, you start with an output that is simpler to retarget to rigs with any number of bones and any limb proportions.
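The accumulation of errors along an FK hierarchy can be shown with a toy planar chain. In the sketch below (plain Python, made-up numbers) every joint angle is off by the same two degrees, and the position error grows the further down the chain you look.

```python
import math


def fk_positions(angles, bone_length=1.0):
    """Planar FK: each angle is relative to the parent bone."""
    x = y = total = 0.0
    positions = []
    for angle in angles:
        total += angle  # child orientation inherits every parent rotation
        x += bone_length * math.cos(total)
        y += bone_length * math.sin(total)
        positions.append((x, y))
    return positions


true_angles = [0.0, 0.0, 0.0, 0.0]
error = math.radians(2.0)  # the same small error on every joint
noisy_angles = [a + error for a in true_angles]

truth = fk_positions(true_angles)
noisy = fk_positions(noisy_angles)

# Distance between the true and the noisy position of each joint, root to tip.
drift = [math.hypot(n[0] - t[0], n[1] - t[1]) for n, t in zip(noisy, truth)]
```

The last entry of `drift` is several times larger than the first, even though every joint has the same angular error. An error made directly on an effector position, as in the IK encoding, does not get amplified by the joints above it.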
Inverse Kinematics has been around for a long time. Still, most of our motion capture data is stored in FK. Storing motion data purely as IK parameters seems to have many benefits for the animation pipeline, as described by Alexander Bareznyak, and potentially additional benefits for machine learning applications.
Check out the IK Rig repo and see what you think. How does this tool fit into your pipeline?