Large-scale pre-training has made progress in many fields of natural lan...
With the increase in the scale of Deep Learning (DL) training workloads ...
Despite the widespread success of Transformers on NLP tasks, recent work...
Compositional generalization is a fundamental trait in humans, allowing ...
The problem of designing NLP solvers for math word problems (MWP) has se...
While recurrent models have been effective in NLP tasks, their performan...
Transformers have supplanted recurrent models in a large number of NLP t...
Transformers are being used extensively across several sequence modeling...
In this paper, we examine and analyze the challenges associated with dev...