A Body Part Embedding Model with Datasets for Measuring 2D Human Motion Similarity

Jonghyuk Park†   Sukhyun Cho†   Dongwoo Kim‡   Oleksandr Bailo‡   Heewoong Park†   Sanghoon Hong‡   Jonghun Park†

Seoul National University†   Kakao Brain‡

in IEEE Access

Paper | Code | BibTeX

SARA Dataset

NTU RGB+D 120 Similarity Annotations

Abstract

Human motion similarity is practiced in many fields, including action recognition, anomaly detection, and human performance evaluation. While many computer vision tasks have benefited from deep learning, measuring motion similarity has attracted less attention, particularly due to the lack of large datasets. To address this problem, we introduce two datasets: a synthetic motion dataset for model training and a dataset containing human annotations of real-world video clip pairs for motion similarity evaluation. Furthermore, in order to compute the motion similarity from these datasets, we propose a deep learning model that produces motion embeddings suitable for measuring the similarity between different motions of each human body part. The network is trained with the proposed motion variation loss to robustly distinguish even subtly different motions. The proposed approach outperforms the other baselines considered in terms of correlations between motion similarity predictions and human annotations while being suitable for real-time action analysis. Both datasets and codes will be released to the public.

Method

First, for each body part, we separate the motion attribute from the skeleton and view attributes.




Then, similarity is calculated using only the motion embeddings.
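
As a rough illustration of comparing motions part by part, here is a minimal sketch that scores two sets of per-body-part motion embeddings. It assumes cosine similarity as the comparison function and an unweighted mean as the overall score; the function names, the dictionary layout, and the body-part keys are hypothetical and are not taken from the released code.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen for this sketch: a zero vector matches nothing.
        return 0.0
    return dot / (norm_a * norm_b)


def motion_similarity(emb_a, emb_b):
    """Compare two motions given {body_part: motion_embedding} dicts.

    Returns the per-part similarities and their unweighted mean.
    Both the dict layout and the averaging are assumptions of this sketch.
    """
    per_part = {
        part: cosine_similarity(emb_a[part], emb_b[part]) for part in emb_a
    }
    overall = sum(per_part.values()) / len(per_part)
    return per_part, overall


# Toy embeddings with hypothetical body-part names.
emb_a = {"left_arm": [1.0, 0.0], "torso": [0.0, 2.0]}
emb_b = {"left_arm": [1.0, 0.0], "torso": [2.0, 0.0]}
per_part, overall = motion_similarity(emb_a, emb_b)
```

Keeping the per-part scores alongside the overall value makes it possible to see which body part differs between two motions, rather than only how much they differ in total.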



See paper for more details.

Datasets

We created two datasets for learning and evaluating motion similarity.

  1. For model training, we generated synthetic motion sequences using Adobe Mixamo. (dataset link)

  2. For model evaluation, we collected human similarity annotations for pairs of motion videos from NTU RGB+D 120. (dataset link)
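
The evaluation compares predicted similarities with the human annotations by correlation. The sketch below shows one way such a comparison could be set up, assuming Spearman rank correlation as the measure; the page does not state which correlation coefficient is used, so this choice, the function names, and the toy scores are all illustrative assumptions.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        # Correlation is undefined for constant input; this sketch returns 0.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def spearman(predictions, annotations):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(predictions), average_ranks(annotations))


# Toy example: model scores and human ratings for four video pairs.
predicted = [0.10, 0.40, 0.35, 0.80]
human = [1, 3, 2, 4]
rho = spearman(predicted, human)
```

A rank-based measure only asks whether the model orders video pairs the way annotators do, so it does not require predictions and annotations to share a scale.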

Application

Here are examples of measuring the similarity between two dances!

Acknowledgments

  • This work was supported by Kakao and Kakao Brain corporations.
  • Model implementation code borrows heavily from 2D-Motion-Retargeting.
  • Portions of the research used the NTU RGB+D 120 Action Recognition Dataset made available by the ROSE Lab at the Nanyang Technological University, Singapore.