
-
Apollo: An Adaptive Parameter-wise Diagonal Quasi-Newton Method for Nonconvex Stochastic Optimization
In this paper, we introduce Apollo, a quasi-Newton method for nonconvex ...
read it
-
Cross-lingual Dependency Parsing with Unlabeled Auxiliary Languages
Cross-lingual transfer learning has become an important weapon to battle...
read it
-
FlowSeq: Non-Autoregressive Conditional Sequence Generation with Generative Flow
Most sequence-to-sequence (seq2seq) models are autoregressive; they gene...
read it
-
Handling Syntactic Divergence in Low-resource Machine Translation
Despite impressive empirical successes of neural machine translation (NM...
read it
-
Choosing Transfer Languages for Cross-Lingual Learning
Cross-lingual transfer, where a high-resource transfer language is used ...
read it
-
Density Matching for Bilingual Word Embedding
Recent approaches to cross-lingual word embedding have generally been ba...
read it
-
MaCow: Masked Convolutional Generative Flow
Flow-based generative models, conceptually attractive due to tractabilit...
read it
-
MAE: Mutual Posterior-Divergence Regularization for Variational AutoEncoders
Variational Autoencoder (VAE), a simple and effective deep generative mo...
read it
-
Near or Far, Wide Range Zero-Shot Cross-Lingual Dependency Parsing
Cross-lingual transfer is the major means toleverage knowledge from high...
read it
-
Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation
We introduce Texar, an open-source toolkit aiming to support the broad s...
read it
-
Stack-Pointer Networks for Dependency Parsing
We introduce a novel architecture for dependency parsing: stack-pointer ...
read it
-
Softmax Q-Distribution Estimation for Structured Prediction: A Theoretical Interpretation for RAML
Reward augmented maximum likelihood (RAML), a simple and effective learn...
read it
-
An Interpretable Knowledge Transfer Model for Knowledge Base Completion
Knowledge bases are important resources for a variety of natural languag...
read it
-
Neural Probabilistic Model for Non-projective MST Parsing
In this paper, we propose a probabilistic parsing model, which defines a...
read it
-
Dropout with Expectation-linear Regularization
Dropout, a simple and effective way to train deep neural networks, has l...
read it
-
Harnessing Deep Neural Networks with Logic Rules
Combining deep neural networks with structured logic rules is desirable ...
read it
-
Unsupervised Ranking Model for Entity Coreference Resolution
Coreference resolution is one of the first stages in deep language under...
read it
-
End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF
State-of-the-art sequence labeling systems traditionally require large a...
read it
-
Probabilistic Models for High-Order Projective Dependency Parsing
This paper presents generalized probabilistic models for high-order proj...
read it