Out-of-distribution (OOD) detection aims to detect "unknown" data whose ...
Stable Diffusion (SD) customization approaches enable users to personali...
Car-hailing services have become a prominent data source for urban traff...
We propose SE-Bridge, a novel method for speech enhancement (SE). After ...
The fashion image retrieval task aims to search for relevant clothing items of a...
Compared to query-based black-box attacks, transfer-based black-box atta...
To let the state-of-the-art end-to-end ASR model enjoy data efficiency, ...
The diffusion model, a new generative model that is very popular in imag...
SpecAugment is a very effective data augmentation method for both HMM an...
This paper analyzes the power imbalance issue in power-domain NOMA (PD-N...
Intermediate layer output (ILO) regularization by means of multitask tra...
Internal Language Model Estimation (ILME) based language model (LM) fusi...
Speech emotion recognition (SER) is an essential part of human-computer ...
Sound event detection (SED) is an interesting but challenging task due t...
In this paper, we present a high-order energy-preserving scheme for solv...
This paper analyzes the power imbalance factor on the uplink of a 2-user...
Low-resource speech recognition has long suffered from insufficien...
The performance of current Scene Graph Generation models is severely ham...
In Uyghur speech, consonant and vowel reduction are often encountered, e...
Consonant and vowel reduction are often encountered in Uyghur speech, wh...
Feature selection techniques are essential for high-dimensional data ana...
Requirements driven search-based testing (also known as falsification) h...
With the popularity of 3D sensors in self-driving and other robotics app...
Analyzing the structure of proteins is a key part of understanding their...
Non-linear (large) time warping is a challenging source of nuisance in t...
Malicious application of deepfakes (i.e., technologies that can generate targ...
Nowadays, general object detectors like YOLO and Faster R-CNN as well as...
Man-in-The-Middle (MiTM) attacks present numerous threats to a smart gri...
Cyberattacks can cause a severe impact on power systems unless detected ...
Automatic speech recognition (ASR) for under-represented named-entity (U...
We report our NTU-AISG Text-to-speech (TTS) entry systems for the Blizza...
Humans can perform multi-task recognition from speech. For instance, huma...
Many graph embedding approaches have been proposed for knowledge graph c...
Cross-community collaboration can exploit the expertise and knowledge o...
Accent conversion (AC) transforms a non-native speaker's accent into a n...
In this paper, we present a series of complementary approaches to improv...
Conventional deformable registration methods aim at solving a specifical...
Event ticket price prediction is important to marketing strategy for any...
In this paper, we show that every (2^{n-1}+1)-vertex induced subgraph of t...
Video action recognition, as a critical problem towards video understand...
The advent of isogeometric analysis has prompted a need for methods to g...
We propose an Encoder-Classifier framework to model the Mandarin tones u...