
-
Creation and Evaluation of a Pre-tertiary Artificial Intelligence (AI) Curriculum
Contributions: The Chinese University of Hong Kong (CUHK)-Jockey Club AI...
-
Unstructured Knowledge Access in Task-oriented Dialog Modeling using Language Inference, Knowledge Retrieval and Knowledge-Integrative Response Generation
Dialog systems enriched with external knowledge can handle user queries ...
-
Unsupervised Cross-Lingual Speech Emotion Recognition Using Domain Adversarial Neural Network
By using deep learning approaches, Speech Emotion Recognition (SER) on ...
-
Syntactic representation learning for neural network based TTS with syntactic parse tree traversal
Syntactic structure of a sentence text is correlated with the prosodic s...
-
Non-Autoregressive Transformer ASR with CTC-Enhanced Decoder Input
Non-autoregressive (NAR) transformer models have achieved significantly ...
-
Replay and Synthetic Speech Detection with Res2net Architecture
Existing approaches for replay and synthetic speech detection still lack...
-
Any-to-Many Voice Conversion with Location-Relative Sequence-to-Sequence Modeling
This paper proposes an any-to-many location-relative, sequence-to-sequen...
-
Neural Architecture Search for Speech Recognition
Deep neural networks (DNNs) based automatic speech recognition (ASR) sys...
-
Speaker Independent and Multilingual/Mixlingual Speech-Driven Talking Head Generation Using Phonetic Posteriorgrams
Generating 3D speech-driven talking head has received more and more atte...
-
Investigating Robustness of Adversarial Samples Detection for Automatic Speaker Verification
Recently adversarial attacks on automatic speaker verification (ASV) sys...
-
Bayesian x-vector: Bayesian Neural Network based x-vector System for Speaker Verification
Speaker verification systems usually suffer from the mismatch problem be...
-
Defense against adversarial attacks on spoofing countermeasures of ASV
Various forefront countermeasure methods for automatic speaker verificat...
-
Deep segmental phonetic posterior-grams based discovery of non-categories in L2 English speech
Second language (L2) speech is often labeled with the native, phone cate...
-
Audio-visual Recognition of Overlapped speech for the LRS2 dataset
Automatic recognition of overlapped speech remains a highly challenging ...
-
Adversarial Attacks on GMM i-vector based Speaker Verification Systems
This work investigates the vulnerability of Gaussian Mixture Model (GMM...
-
Speech-XLNet: Unsupervised Acoustic Model Pretraining For Self-Attention Networks
Self-attention network (SAN) can benefit significantly from the bi-direc...
-
Adversarial Attacks on Spoofing Countermeasures of automatic speaker verification
High-performance spoofing countermeasure systems for automatic speaker v...
-
Semi-Supervised Graph Classification: A Hierarchical Graph Perspective
Node classification and graph classification are two graph learning prob...
-
Study on Feature Subspace of Archetypal Emotions for Speech Emotion Recognition
Feature subspace selection is an important part in speech emotion recogn...