The hybrid architecture of convolutional neural networks (CNNs) and Transfo...
The hybrid architecture of convolutional neural networks (CNNs) and
Tran...
Popular Transformer networks have been successfully applied to remote se...
We propose Conditional Adapter (CoDA), a parameter-efficient transfer
le...
Multi-vector retrieval models such as ColBERT [Khattab and Zaharia, 2020...
Many natural language processing tasks benefit from long inputs, but
pro...
Facial attractiveness prediction (FAP) aims to assess the facial
attract...
Multi-vector retrieval models improve over single-vector dual encoders o...
Recent work has improved language models remarkably by equipping them wi...
In this work, we explore whether modeling recurrence into the Transforme...
Sparsely activated Mixture-of-Experts (MoE) models allow the number of
p...
The Transformer architecture has been widely adopted as a dominant archite...
Unsupervised Domain Adaptation (UDA) can transfer knowledge from labeled...
This report describes the technical details of our submission to the
EPI...
We present a method for generating comparative summaries that highlights...
We introduce Nutri-bullets, a multi-document summarization task for
heal...
Large language models have become increasingly difficult to train becaus...
Deep learning has been widely used for medical image segmentation and a ...
The performance of autoregressive models on natural language generation ...
Selecting input features of top relevance has become a popular method fo...
In this paper we present state-of-the-art (SOTA) performance on the
Libr...
Natural language systems often rely on a single, potentially ambiguous i...
Traditional text classifiers are limited to predicting over a fixed set ...
Large language models have recently achieved state-of-the-art performanc...
Response suggestion is an important task for building human-computer
con...
Morphological reconstruction (MR) is often employed by seeded image
segm...
An edge computing environment features multiple edge servers and multipl...
We address the problem of detecting duplicate questions in forums, which...
Common recurrent neural network architectures scale poorly due to the
in...
This paper focuses on style transfer on the basis of non-parallel text. ...
The design of neural architectures for structured objects is typically g...
Prediction without justification has limited applicability. As a remedy,...
Question answering forums are rapidly growing in size with no effective
...
The success of deep learning often derives from well-chosen operational
...