Synthesizing physically plausible human motions in 3D scenes is a challe...
Panoptic Scene Graph (PSG) generation aims to generate scene graph
Video-text retrieval is a class of cross-modal representation learning
While contrastive learning greatly advances the representation of senten...
While large-scale pre-training has achieved great success in bridgi...
Given the increase in the use of personal data for training Deep Neural
Automatic speaker verification (ASV) is the technology to determine the
Feature attributions are a popular tool for explaining the behavior of D...
XDeep is an open-source Python package developed to interpret deep model...
Recently, more and more attention has been drawn to the internal mecha...
We introduce a new model-agnostic explanation technique which explains t...
Head pose estimation, which computes the intrinsic Euler angles (yaw, pi...