Executing machine learning inference tasks on resource-constrained edge
...
Repeated off-chip memory accesses to DRAM drive up operating power for
d...
The memory wall bottleneck is a key challenge across many data-intensive...
Transformer-based language models such as BERT provide significant accur...
The current trend for domain-specific architectures (DSAs) has led to re...