Solving crossword puzzles requires diverse reasoning capabilities, acces...
Multiple studies have shown that BERT is remarkably robust to pruning, y...
Transformer-based models are now widely used in NLP, but we still do not...
BERT-based architectures currently give state-of-the-art performance on ...