-
DICT-MLM: Improved Multilingual Pre-Training using Bilingual Dictionaries
Pre-trained multilingual language models such as mBERT have shown immens...
-
DiPair: Fast and Accurate Distillation for Trillion-Scale Text Matching and Pair Modeling
Pre-trained models like BERT (Devlin et al., 2018) have dominated NLP / ...
-
Sampled Softmax with Random Fourier Features
The computational cost of training with softmax cross entropy loss grows...
-
Distinct Sampling on Streaming Data with Near-Duplicates
In this paper we study how to perform distinct sampling in the streaming...
-
Stochastic Negative Mining for Learning with Large Output Spaces
We consider the problem of retrieving the most relevant labels for a giv...
-
A Practical Algorithm for Distributed Clustering and Outlier Detection
We study the classic k-means/median clustering, which are fundamental pr...
-
Tight Bounds for Collaborative PAC Learning via Multiplicative Weights
We study the collaborative PAC learning problem recently proposed in Blu...

Jiecao Chen