
Progressive Multi-Granularity Training for Non-Autoregressive Translation
Non-autoregressive translation (NAT) significantly accelerates the infer...
Rejuvenating Low-Frequency Words: Making the Most of Parallel Data in Non-Autoregressive Translation
Knowledge distillation (KD) is commonly used to construct synthetic data...
Self-Guided Curriculum Learning for Neural Machine Translation
In the field of machine learning, the well-trained model is assumed to b...
Bridging the Gap Between Clean Data Training and Real-World Inference for Spoken Language Understanding
Spoken language understanding (SLU) system usually consists of various p...
Towards Efficiently Diversifying Dialogue Generation via Embedding Augmentation
Dialogue generation models face the challenge of producing generic and r...
SLUA: A Super Lightweight Unsupervised Word Alignment Model via Cross-Lingual Contrastive Learning
Word alignment is essential for the downstream cross-lingual languag...
Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning
Encoder layer fusion (EncoderFusion) is a technique to fuse all the enco...
Understanding and Improving Lexical Choice in Non-Autoregressive Translation
Knowledge distillation (KD) is essential for training non-autoregressive...
Predicting Terrain Mechanical Properties in Sight for Planetary Rovers with Semantic Clues
Non-geometric mobility hazards such as rover slippage and sinkage posing...
Context-Aware Cross-Attention for Non-Autoregressive Translation
Non-autoregressive translation (NAT) significantly accelerates the infer...
Sample and Computationally Efficient Simulation Metamodeling in High Dimensions
Stochastic kriging has been widely employed for simulation metamodeling ...
Zero-Shot Translation Quality Estimation with Explicit Cross-Lingual Patterns
This paper describes our submission of the WMT 2020 Shared Task on Sente...
SlotRefine: A Fast Non-Autoregressive Model for Joint Intent Detection and Slot Filling
Slot filling and intent detection are two main tasks in spoken language ...
A projected gradient method for αℓ_1−βℓ_2 sparsity regularization
The non-convex α‖·‖_{ℓ_1}−β‖·‖_{ℓ_2} (α≥β≥0) regularization has attracted attent...
αℓ_1−βℓ_2 sparsity regularization for nonlinear ill-posed problems
In this paper, we consider the α‖·‖_{ℓ_1}−β‖·‖_{ℓ_2} sparsity regularization wit...
Fault Tolerant Free Gait and Footstep Planning for Hexapod Robot Based on Monte-Carlo Tree
Legged robots can pass through complex field environments by selecting g...
Overcoming the Curse of Dimensionality in Density Estimation with Mixed Sobolev GANs
We propose a novel GAN framework for non-parametric density estimation w...
Self-Attention with Cross-Lingual Position Representation
Position encoding (PE), an essential part of self-attention networks (SA...
Generalization Guarantees for Sparse Kernel Approximation with Entropic Optimal Features
Despite their success, kernel methods suffer from a massive computationa...
BdryGP: a new Gaussian process model for incorporating boundary information
Gaussian processes (GPs) are widely used as surrogate models for emulati...
Recurrent Graph Syntax Encoder for Neural Machine Translation
Syntax-incorporated machine translation models have been proven successf...
The University of Sydney's Machine Translation System for WMT19
This paper describes the University of Sydney's submission of the WMT 20...
Knowledge Gradient for Selection with Covariates: Consistency and Computation
Knowledge gradient is a design principle for developing Bayesian sequent...
Scalable Stochastic Kriging with Markovian Covariances
Stochastic kriging is a popular technique for simulation metamodeling du...
Efficient Learning of Optimal Markov Network Topology with k-Tree Modeling
The seminal work of Chow and Liu (1968) shows that approximation of a fi...
