Human Activity Recognition (HAR) systems have been extensively studied b...
To properly assist humans in their needs, human activity recognition (HA...
The video-based dialog task is a challenging multimodal learning task that h...
We introduce a Transformer-based 6D Object Pose Estimation framework
Vid...