
-
Pushing the Limits of Non-Autoregressive Speech Recognition
We combine recent advancements in end-to-end speech recognition to non-a...
-
Exploring Targeted Universal Adversarial Perturbations to End-to-end ASR Models
Although end-to-end automatic speech recognition (e2e ASR) models are wi...
-
SpeechStew: Simply Mix All Available Speech Recognition Data to Train One Large Neural Network
We present SpeechStew, a speech recognition model that is trained on a c...
-
Using Simulation to Aid the Design and Optimization of Intelligent User Interfaces for Quality Assurance Processes in Machine Learning
Many mission-critical applications of machine learning (ML) in the real-...
-
On the limits of algorithmic prediction across the globe
The impact of predictive algorithms on people's lives and livelihoods ha...
-
PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS
This paper introduces PnG BERT, a new encoder model for neural TTS. This...
-
Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling
This paper introduces Parallel Tacotron 2, a non-autoregressive neural t...
-
Residual Energy-Based Models for End-to-End Speech Recognition
End-to-end models with auto-regressive decoders have shown impressive re...
-
Learning Word-Level Confidence For Subword End-to-End ASR
We study the problem of word-level confidence estimation in subword-base...
-
Self-supervised Low Light Image Enhancement and Denoising
This paper proposes a self-supervised low light image enhancement method...
-
Reinforcement Learning of Beam Codebooks in Millimeter Wave and Terahertz MIMO Systems
Millimeter wave (mmWave) and terahertz MIMO systems rely on pre-defined ...
-
Echo State Speech Recognition
We propose automatic speech recognition (ASR) models inspired by echo st...
-
Reinforcement Learning for Beam Pattern Design in Millimeter Wave and Massive MIMO Systems
Employing large antenna arrays is a key characteristic of millimeter wav...
-
MATCH: Metadata-Aware Text Classification in A Large Hierarchy
Multi-label text classification refers to the problem of assigning each ...
-
Multi-Objective Meta Learning
Meta learning with multiple objectives can be formulated as a Multi-Obje...
-
Joint Transmit Precoding and Reflect Beamforming Design for IRS-Assisted MIMO Cognitive Radio Systems
Cognitive radio (CR) is an effective solution to improve the spectral ef...
-
Multiple Structural Priors Guided Self Attention Network for Language Understanding
Self attention networks (SANs) have been widely utilized in recent NLP s...
-
A Survey on Neural Network Interpretability
Along with the great success of deep neural networks, there is also grow...
-
Improving EEG Decoding via Clustering-based Multi-task Feature Learning
Accurate electroencephalogram (EEG) pattern decoding for specific mental...
-
Different Approaches Towards Vertical Track Irregularity Prediction – A Comparative Study
Railway systems require regular manual maintenance, a large part of whic...
-
Optical Wavelength Guided Self-Supervised Feature Learning For Galaxy Cluster Richness Estimate
Most galaxies in the nearby Universe are gravitationally bound to a clus...
-
A Better and Faster End-to-End Model for Streaming ASR
End-to-end (E2E) models have shown to outperform state-of-the-art conven...
-
Multi-Task Adversarial Attack
Deep neural networks have achieved impressive performance in various are...
-
Effective, Efficient and Robust Neural Architecture Search
Recent advances in adversarial attacks show the vulnerability of deep ne...
-
Barycode-based GJK Algorithm
In this paper, we present a more efficient GJK algorithm to solve the co...
-
Domain Concretization from Examples: Addressing Missing Domain Knowledge via Robust Planning
The assumption of complete domain knowledge is not warranted for robot p...
-
Large-scale multilingual audio visual dubbing
We describe a system for large-scale audiovisual translation and dubbing...
-
Fault Detection for Covered Conductors With High-Frequency Voltage Signals: From Local Patterns to Global Features
The detection and characterization of partial discharge (PD) are crucial...
-
Hierarchical Metadata-Aware Document Categorization under Weak Supervision
Categorizing documents into a given label hierarchy is intuitively appea...
-
Unsupervised Learning of Disentangled Speech Content and Style Representation
We present an approach for unsupervised learning of speech representatio...
-
Improving Streaming Automatic Speech Recognition With Non-Streaming Model Distillation On Unsupervised Data
Streaming end-to-end automatic speech recognition (ASR) models are widel...
-
Parallel Tacotron: Non-Autoregressive and Controllable TTS
Although neural end-to-end text-to-speech models can synthesize highly n...
-
Confidence Estimation for Attention-based Sequence-to-sequence Models for Speech Recognition
For various speech-related tasks, confidence scores from a speech recogn...
-
Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition
We employ a combination of recent developments in semi-supervised learni...
-
A Survey of Non-Volatile Main Memory Technologies: State-of-the-Arts, Practices, and Future Directions
Non-Volatile Main Memories (NVMMs) have recently emerged as promising te...
-
Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling
This paper presents Non-Attentive Tacotron based on the Tacotron 2 text-...
-
Weakly-Supervised Feature Learning via Text and Image Matching
When training deep neural networks for medical image classification, obt...
-
A Large-Scale Mixed-Methods Analysis of Live Streaming Based Remote Education Experience in Chinese Colleges During the COVID-19 Pandemic
The COVID-19 global pandemic and resulted lockdown policies have forced ...
-
Cross-Modal Alignment with Mixture Experts Neural Network for Intral-City Retail Recommendation
In this paper, we introduce Cross-modal Alignment with mixture experts N...
-
Boosting Retailer Revenue by Generated Optimized Combined Multiple Digital Marketing Campaigns
Campaign is a frequently employed instrument in lifting up the GMV (Gros...
-
Improved Trainable Calibration Method for Neural Networks on Medical Imaging Classification
Recent works have shown that deep neural networks can achieve super-huma...
-
WaveGrad: Estimating Gradients for Waveform Generation
This paper introduces WaveGrad, a conditional model for waveform generat...
-
An End-to-End Attack on Text-based CAPTCHAs Based on Cycle-Consistent Generative Adversarial Network
As a widely deployed security scheme, text-based CAPTCHAs have become mo...
-
Better Than Reference In Low Light Image Enhancement: Conditional Re-Enhancement Networks
Low light images suffer from severe noise, low brightness, low contrast,...
-
Location Information Aided Multiple Intelligent Reflecting Surface Systems
This paper proposes a novel location information aided multiple intellig...
-
Fast and Accurate Neural CRF Constituency Parsing
Estimating probability distribution is one of the core issues in the NLP...
-
MiNet: Mixed Interest Network for Cross-Domain Click-Through Rate Prediction
Click-through rate (CTR) prediction is a critical task in online adverti...
-
Multi-source Heterogeneous Domain Adaptation with Conditional Weighting Adversarial Network
Heterogeneous domain adaptation (HDA) tackles the learning of cross-doma...
-
A Study on Evaluation Standard for Automatic Crack Detection Regard the Random Fractal
A reasonable evaluation standard underlies construction of effective dee...
-
Deep Image Clustering with Category-Style Representation
Deep clustering which adopts deep neural networks to obtain optimal repr...