Indexing Large Human-Motion Databases. VLDB 2004: pp 780-791

Eamonn Keogh, Themistoklis Palpanas, Victor B. Zordan, Dimitrios Gunopulos, Marc Cardle

This page presents a set of example animations comparing scale-invariant matching with Dynamic Time Warping (DTW) matching. Invariance to both shrinking and stretching is demonstrated.

Given a query motion capture sequence, our system finds the closest matches in our motion capture database using either DTW or scale-invariant matching. Matching is carried out on the upper body[1] only; the reader should therefore ignore the leg motions. The upper-body motion is defined by a set of pre-defined bone lengths and a set of time-varying joint rotations. A joint rotation is defined by three Euler angle curves: rotation around the X axis, the Y axis and the Z axis.

The distance between two character poses (i.e. upper-body configurations) is defined as the weighted Euclidean distance over all the joint orientations describing the upper body. A weighted distance is used because some joints have more visual impact than others. For example, the shoulder rotation has more visual impact than the hand rotation.
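As a rough illustration of the pose distance described above, here is a minimal Python sketch. The joint names, the weight values and the dictionary layout are assumptions made for this example only; the actual weights used by the system are not given on this page. The sketch also ignores Euler angle wrap-around.

```python
import math

# Hypothetical joint weights: the real values are not stated on this page.
JOINT_WEIGHTS = {"shoulder": 1.0, "elbow": 0.6, "hand": 0.2}


def pose_distance(pose_a, pose_b, weights=JOINT_WEIGHTS):
    """Weighted Euclidean distance between two upper-body poses.

    Each pose maps a joint name to its (x, y, z) Euler angles.
    Angle wrap-around (e.g. 359 vs. 1 degree) is not handled here.
    """
    total = 0.0
    for joint, weight in weights.items():
        ax, ay, az = pose_a[joint]
        bx, by, bz = pose_b[joint]
        # Squared angular difference of this joint, scaled by its weight.
        total += weight * ((ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2)
    return math.sqrt(total)
```

With these illustrative weights, the same angular change costs more at the shoulder than at the hand, which mirrors the visual-impact argument above.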

Below are two extracts from an example animation:

The top left plot shows the pose of the character for the query motion at the current time instant.

The top right plot shows the pose of the character for the current match motion at the same time instant. The matches are displayed from best to worst (i.e. Match1 = best match, and Match3 = worst of the best matches). The best matches for DTW are displayed first, followed by the best matches for scale-invariant matching.

The bottom left plot shows the current query and match poses overlapped for easier assessment.

The bottom right plot shows the evolution of the upper-body barycentre. The 3D barycentre is defined as the weighted sum of the joint positions for all joints in the upper body. It gives a good intuition of the overall movement. The barycentres for the query (in blue) and the current match (in red) are plotted alongside each other for comparison. A 3D plot of the XYZ position of the barycentre is displayed first, followed by three 1D plots, one per dimension. The green markers indicate the barycentre positions of the query and the current match along the overall barycentric trajectory at the current animation frame.
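The barycentre computation described above could be sketched as follows. This is an illustrative Python sketch, not the code used to produce the plots: the function names and data layout are hypothetical, and dividing by the total weight (so that the weights effectively sum to 1) is an assumption, since the page only says "weighted sum".

```python
def barycentre(joint_positions, weights):
    """Weighted barycentre of 3D joint positions for one animation frame.

    joint_positions maps a joint name to its (x, y, z) position;
    weights maps the same joint names to non-negative weights.
    Assumption: weights are normalised by their total.
    """
    total_weight = sum(weights[joint] for joint in joint_positions)
    return tuple(
        sum(weights[joint] * pos[axis] for joint, pos in joint_positions.items())
        / total_weight
        for axis in range(3)
    )


def barycentre_trajectory(frames, weights):
    """Barycentre of every frame, i.e. the curve shown in the 3D plot."""
    return [barycentre(frame, weights) for frame in frames]
```

Plotting each coordinate of the returned trajectory against the frame index would give the three 1D plots mentioned above.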

Note that in all pose plots the character's global rotation and translation (defined at the hips) are cancelled out to improve clarity, because keeping them would make visual comparisons difficult. Consequently, some motions might appear unnatural; this is no longer the case once the global parameters are reapplied.

Click on the links below to view the example videos. The videos require the free DivX video codec to decompress and display correctly.

Example 1: Arm extension (please recall leg motions should be ignored)

Example 2: Sitting (please recall leg motions should be ignored)

Example 3: Standing up (please recall leg motions should be ignored)

Notice that the top matches under scaling are spatially closer to the query motion. The matches found with DTW account for small fluctuations along the time axis. However, as our examples demonstrate, DTW cannot capture better matches that are spatially closer but slightly longer or shorter in time. In the top right plot, shorter matches have fewer animation frames than the query, and longer matches have more.
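To make the two kinds of matching concrete, here is a simplified Python sketch on 1D series. It is not the implementation used for the videos: the DTW shown is the unconstrained textbook form (practical systems usually restrict the warping path with a window, and the settings used here are not stated on this page), the scale-invariant distance uses simple nearest-neighbour resampling of a candidate prefix, and all function names and parameters are hypothetical.

```python
import math


def dtw_distance(query, candidate):
    """Textbook DTW with squared point costs and no warping window."""
    n, m = len(query), len(candidate)
    inf = float("inf")
    # cost[i][j]: best cumulative cost aligning query[:i] with candidate[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (query[i - 1] - candidate[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])


def resample(series, length):
    """Stretch or shrink a series to `length` points (nearest neighbour)."""
    return [series[(j * len(series)) // length] for j in range(length)]


def uniform_scaling_distance(query, candidate, min_len, max_len):
    """Best Euclidean distance over rescaled prefixes of the candidate.

    Every prefix length p in [min_len, max_len] is rescaled to the query
    length and compared point by point. Returns (distance, best p).
    """
    best = (float("inf"), None)
    for p in range(max(1, min_len), min(max_len, len(candidate)) + 1):
        scaled = resample(candidate[:p], len(query))
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(query, scaled)))
        best = min(best, (dist, p))
    return best
```

For example, with the query [0, 1, 2, 3] and the candidate [0, 0, 1, 1, 2, 2, 3, 3], calling uniform_scaling_distance(query, candidate, 4, 8) selects the full candidate (p = 8), which is the query played at half speed. For real motions each point would be a full pose, compared with a pose distance rather than a scalar difference.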

All the test datasets used in this paper are also available here. Note that the exact queries and candidate sequences are available, so our experiments can be reproduced exactly.

[1] the head, neck, shoulders, elbows, back, hands