Large language models achieve state-of-the-art performance on sequence g...
It has been found that residual networks are an Euler discretization of ...
Recently, deep models have shown tremendous improvements in neural machi...
Unsupervised Bilingual Dictionary Induction methods based on the initial...
Deep encoders have been proven to be effective in improving neural machi...
Knowledge distillation has been proven to be effective in model accelera...