Current methods for Knowledge-Based Question Answering (KBQA) usually re...
In this paper, we propose a novel framework, Tracking-free Relightable A...
Cross-speaker style transfer in speech synthesis aims at transferring a ...
Cross-speaker style transfer in speech synthesis aims at transferring a ...
Image super-resolution (SR) is a technique to recover lost high-frequenc...
Video understanding is an important task in short video business platfor...
Conversion of Chinese Grapheme-to-Phoneme (G2P) plays an important role ...
Online encyclopedias, such as Wikipedia, have been well-developed and re...
Video language pre-training methods have mainly adopted sparse sampling ...
Text-driven image manipulation has developed since the vision-language mo...
Contrastive learning has been extensively studied in sentence embedding ...
Most existing methods in vision-language retrieval match two modalities ...
Dense passage retrieval aims to retrieve the relevant passages of a quer...
A general numerical method using sum of squares programming is proposed ...
Deepfake faces not only violate the privacy of personal identity, but al...
The base learners and labeled samples (shots) in an ensemble few-shot cl...
Unpaired image-to-image translation aims to translate an image from a sour...
Before entering the neural network, a token is generally converted to th...
Contrastive learning has been proven suitable for learning sentence embe...
As one of the challenging problems in video search, Person-Action Instan...
Recently, face super-resolution (FSR) methods either feed whole face ima...
Contrastive learning has been attracting much attention for learning uns...
Contrastive learning has been gradually applied to learn high-quality un...
The task of multi-label image classification is to recognize all the obj...
Since Transformer has found widespread use in NLP, the potential of Tran...
Most recent video super-resolution (SR) methods either adopt an iterativ...
Video-Text Retrieval has been a hot research topic with the explosion of...
Detecting facial forgery images and videos is an increasingly important ...
The existing face recognition datasets usually lack occlusion samples, w...
We propose a novel task, Multi-Document Driven Dialogue (MD3), in which ...
Most existing approaches to disfluency detection heavily rely on human-a...
Recently, the conversational recommender system (CRS) has become an emerging...
Recently, significant progress has been made in sequential recommendatio...
In this paper we examine the problem of inverse rendering of real face i...
To effectively prevent the spread of the COVID-19 virus, almost eve...
Aspect-based sentiment analysis (ABSA) aims to predict fine-grained sent...
Knowledge graphs capture interlinked information between entities and th...
Knowledge graphs capture structured information and relations between a ...
Knowledge graphs capture structured information and relations between a ...
As an unsupervised dimensionality reduction method, principal component ...