DeepNAG: Deep Non-Adversarial Gesture Generation

Abstract

Synthetic data generation to improve classification performance (data augmentation) is a well-studied problem. Recently, generative adversarial networks (GAN) have shown superior image data augmentation performance, but their suitability in gesture synthesis has received inadequate attention. Further, GANs require the simultaneous training of a generator and a discriminator network, which can be prohibitively expensive. We tackle both issues in this work. We first discuss a novel, device-agnostic GAN model for gesture synthesis called DeepGAN. Thereafter, we formulate DeepNAG by introducing a new differentiable loss function based on dynamic time warping and the average Hausdorff distance, which allows us to train DeepGAN’s generator without requiring a discriminator. Through evaluations, we compare the utility of DeepGAN and DeepNAG against two alternative techniques for training five recognizers using data augmentation over six datasets. We further investigate the perceived quality of synthesized samples via an Amazon Mechanical Turk user study based on the HYPE benchmark. We find that DeepNAG outperforms DeepGAN in accuracy, training time (up to 17x faster), and realism, thereby opening the door to a new line of research in generator network design and training for gesture synthesis.

Our paper won the IUI 2021 Honorable Mention Award!


Paper, Citation and Code

DeepNAG

The PDF version of our paper is available here (original arXiv draft). Our source code is available on GitHub. If you find this work helpful, please cite the following publications:

@misc{maghoumi2020deepnag,
    title={{DeepNAG: Deep Non-Adversarial Gesture Generation}}, 
    author={Mehran Maghoumi and Eugene M. Taranta II and Joseph J. LaViola Jr},
    year={2020},
    eprint={2011.09149},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

@phdthesis{maghoumi2020dissertation,
    title={{Deep Recurrent Networks for Gesture Recognition and Synthesis}},
    author={Mehran Maghoumi},
    year={2020},
    school={University of Central Florida},
    address={Orlando, Florida}
}

Soft DTW for PyTorch in CUDA

As part of this project, we implemented the soft dynamic time warping (sDTW) algorithm in PyTorch with CUDA. Our CUDA implementation is up to 100x faster than a comparable CPU implementation and is available on GitHub.
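To make the computation concrete, the sketch below shows the sDTW recursion written in plain PyTorch with a naive double loop. It is a reference illustration only; the function and variable names are ours for this example, and it is not the batched CUDA kernel in the repository.

```python
import torch

def soft_min(values, gamma):
    # Differentiable soft minimum: -gamma * log(sum(exp(-v / gamma))).
    return -gamma * torch.logsumexp(-torch.stack(values) / gamma, dim=0)

def soft_dtw(x, y, gamma=0.1):
    """Naive reference soft-DTW between sequences x (n, d) and y (m, d)."""
    n, m = x.shape[0], y.shape[0]
    # Pairwise squared Euclidean costs between all frames of x and y.
    dist = ((x.unsqueeze(1) - y.unsqueeze(0)) ** 2).sum(dim=-1)
    inf = x.new_tensor(float("inf"))
    zero = x.new_tensor(0.0)
    # r[i][j] holds the soft alignment cost of x[:i] against y[:j].
    r = [[zero if (i == 0 and j == 0) else inf for j in range(m + 1)]
         for i in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            r[i][j] = dist[i - 1, j - 1] + soft_min(
                [r[i - 1][j - 1], r[i - 1][j], r[i][j - 1]], gamma)
    return r[n][m]

# Example: the result is an ordinary differentiable scalar.
x = torch.randn(16, 2, requires_grad=True)
y = torch.randn(20, 2)
loss = soft_dtw(x, y)
loss.backward()
```

Because every step is an ordinary differentiable torch op, the returned scalar can be backpropagated through and used directly as a training loss; the CUDA implementation computes the same quantity, just far faster and in batches.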

JitGRU: GRU with PyTorch’s TorchScript

Using PyTorch’s JIT framework, we implemented GRU units that support second-order derivatives. Our implementation is available on GitHub.
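As a simplified illustration of the idea, the sketch below shows a GRU cell written purely with differentiable torch ops and compiled with torch.jit.script; because no fused kernel is involved, autograd can compute second-order derivatives (double backward) through it. The class and variable names here are illustrative only, not our actual JitGRU code.

```python
import torch
from torch import nn, Tensor

class ScriptGRUCell(nn.Module):
    """A GRU cell built from plain differentiable ops, so double backward works."""

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.weight_ih = nn.Parameter(torch.randn(3 * hidden_size, input_size) * 0.1)
        self.weight_hh = nn.Parameter(torch.randn(3 * hidden_size, hidden_size) * 0.1)
        self.bias_ih = nn.Parameter(torch.zeros(3 * hidden_size))
        self.bias_hh = nn.Parameter(torch.zeros(3 * hidden_size))

    def forward(self, x: Tensor, h: Tensor) -> Tensor:
        gi = torch.mm(x, self.weight_ih.t()) + self.bias_ih
        gh = torch.mm(h, self.weight_hh.t()) + self.bias_hh
        i_r, i_z, i_n = gi.chunk(3, dim=1)
        h_r, h_z, h_n = gh.chunk(3, dim=1)
        r = torch.sigmoid(i_r + h_r)      # reset gate
        z = torch.sigmoid(i_z + h_z)      # update gate
        n = torch.tanh(i_n + r * h_n)     # candidate state
        return (1.0 - z) * n + z * h

# Compile with TorchScript and verify that second-order derivatives exist.
cell = torch.jit.script(ScriptGRUCell(4, 8))
x = torch.randn(2, 4)
h = torch.zeros(2, 8, requires_grad=True)
out = cell(x, h).sum()
(grad_h,) = torch.autograd.grad(out, h, create_graph=True)
grad_h.sum().backward()   # double backward succeeds
```

Second-order derivatives of this kind are what gradient-penalty-style terms in some GAN objectives require during training.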

Demo Video

Here’s a demo video showing some synthetic gestures generated using DeepNAG: