
-
A Cross-Level Information Transmission Network for Predicting Phenotype from New Genotype: Application to Cancer Precision Medicine
An unsolved fundamental problem in biology and ecology is to predict obs...
-
GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training
Normalization plays an important role in the optimization of deep neural...
-
Taking Notes on the Fly Helps BERT Pre-training
How to make unsupervised language pre-training more efficient and less r...
-
Transferred Discrepancy: Quantifying the Difference Between Representations
Understanding what information neural networks capture is an essential p...
-
Rethinking Positional Encoding in Language Pre-training
How to explicitly encode positional information into neural networks is ...
-
Rethinking the Positional Encoding in Language Pre-training
How to explicitly encode positional information into neural networks is ...
-
MC-BERT: Efficient Language Pre-Training via a Meta Controller
Pre-trained contextual representations (e.g., BERT) have become the foun...
-
Invertible Image Rescaling
High-resolution digital images are usually downscaled to fit various dis...
-
Incorporating BERT into Neural Machine Translation
The recently proposed BERT has shown great power on a variety of natural...
-
On Layer Normalization in the Transformer Architecture
The Transformer is widely used in natural language processing tasks. To ...
-
MACER: Attack-free and Scalable Robust Training via Maximizing Certified Radius
Adversarial training is one of the most popular ways to learn robust mod...
-
Defective Convolutional Layers Learn Robust CNNs
Robustness of convolutional neural networks has recently been highlighte...
-
Fast Structured Decoding for Sequence Models
Autoregressive sequence models achieve state-of-the-art performance in d...
-
On the Anomalous Generalization of GANs
Generative models, especially Generative Adversarial Networks (GANs), ha...
-
Hint-Based Training for Non-Autoregressive Machine Translation
Due to the unparallelizable nature of the autoregressive factorization, ...
-
Multilingual Neural Machine Translation with Language Clustering
Multilingual neural machine translation (NMT), which translates multiple...
-
Representation Degeneration Problem in Training Natural Language Generation Models
We study an interesting problem in training neural network-based models ...
-
Understanding and Improving Transformer From a Multi-Particle Dynamic System Point of View
The Transformer architecture is widely used in natural language processi...
-
Adversarially Robust Generalization Just Requires More Unlabeled Data
Neural network robustness has recently been highlighted by the existence...
-
A Gram-Gauss-Newton Method Learning Overparameterized Deep Neural Networks for Regression Problems
First-order methods such as stochastic gradient descent (SGD) are curren...
-
Multilingual Neural Machine Translation with Knowledge Distillation
Multilingual machine translation, which translates multiple languages wi...
-
Non-Autoregressive Machine Translation with Auxiliary Regularization
As a new neural machine translation approach, Non-Autoregressive machine...
-
Non-Autoregressive Neural Machine Translation with Enhanced Decoder Input
Non-autoregressive translation (NAT) models, which remove the dependence...
-
Sentence-wise Smooth Regularization for Sequence to Sequence Learning
Maximum-likelihood estimation (MLE) is widely used in sequence to sequen...
-
When CTC Training Meets Acoustic Landmarks
Connectionist temporal classification (CTC) training criterion provides ...
-
Augmenting Input Method Language Model with user Location Type Information
Geo-tags from micro-blog posts have been shown to be useful in many data...
-
FRAGE: Frequency-Agnostic Word Representation
Continuous word representation (aka word embedding) is a basic building ...
-
Beyond Error Propagation in Neural Machine Translation: Characteristics of Language Also Matter
Neural machine translation usually adopts autoregressive models and suff...
-
Double Path Networks for Sequence to Sequence Learning
Encoder-decoder based Sequence to Sequence learning (S2S) has made remar...
-
Towards Binary-Valued Gates for Robust LSTM Training
Long Short-Term Memory (LSTM) is one of the most widely used recurrent s...
-
Dense Information Flow for Neural Machine Translation
Recently, neural machine translation has achieved remarkable progress by...
-
Improved ASR for Under-Resourced Languages Through Multi-Task Learning with Acoustic Landmarks
Furui first demonstrated that the identity of both consonant and vowel c...
-
Acoustic Landmarks Contain More Information About the Phone String than Other Frames for Automatic Speech Recognition with Deep Neural Network Acoustic Model
Most mainstream Automatic Speech Recognition (ASR) systems consider all ...
-
Acoustic Landmarks Contain More Information About the Phone String than Other Frames
Most mainstream Automatic Speech Recognition (ASR) systems consider all ...
-
Dual Learning for Machine Translation
While neural machine translation (NMT) is making good progress in the pa...
-
Sentence Level Recurrent Topic Model: Letting Topics Speak for Themselves
We propose Sentence Level Recurrent Topic Model (SLRTM), a new topic mod...
-
A Game-theoretic Machine Learning Approach for Revenue Maximization in Sponsored Search
Sponsored search is an important monetization channel for search engines...
-
Generalized Second Price Auction with Probabilistic Broad Match
Generalized Second Price (GSP) auctions are widely used by search engine...
-
A Theoretical Analysis of NDCG Type Ranking Measures
A central problem in ranking is to design a ranking measure for evaluati...