Cross-domain Sequential Recommendation (CSR) which leverages user sequen...
In this paper, we propose a deep learning framework for solving
high-dim...
Training intelligent agents in Reinforcement Learning (RL) is much more
...
After discovering that Language Models (LMs) can be good in-context few-...
During the continuous evolution of one organism's ancestry, its genes
ac...
We propose to Transform Scene Graphs (TSG) into more descriptive caption...
Visual question answering (VQA) is a critical multimodal task in which a...
Offline Handwritten Mathematical Expression Recognition (HMER) has been
...
We design a novel global-local Transformer named Ada-ClustFormer
(ACF) t...
Aligning objects with words plays a critical role in Image-Language BERT...
Spiking neural networks (SNNs) have made great progress on both performa...
Automatic interpretation on smartphone-captured chest X-ray (CXR) photog...
Humans tend to decompose a sentence into different parts like sth do
sth...
Image-goal navigation is a challenging task, as it requires the agent to...
Compositional Zero-Shot Learning (CZSL) aims to recognize unseen composi...
Online exams via video conference software like Zoom have been adopted i...
To achieve accurate and robust object detection in the real-world scenar...
Segmenting unseen objects is a crucial ability for the robot since it ma...
We propose an end-to-end image compression and analysis model with
Trans...
Action quality assessment (AQA) from videos is a challenging vision task...
This paper develops a physics-informed neural network (PINN) based on th...
Pedestrian trajectory prediction is a key technology in many application...
Seismic tomography solves high-dimensional optimization problems to imag...
Sensory and emotional experiences such as pain and empathy are relevant ...
We propose an Auto-Parsing Network (APN) to discover and exploit the inp...
Although much progress has been made in visual emotion recognition,
rese...
In this paper, we propose a deep unfitted Nitsche method for computing
e...
Deep clustering successfully provides more effective features than
conve...
We present a novel attention mechanism: Causal Attention (CATT), to remo...
Modern deep learning methods have achieved great success in machine lear...
Change Captioning is a task that aims to describe the difference between...
In this paper, we propose an unfitted Nitsche's method to compute the ba...
Computed Tomography (CT) takes X-ray measurements on the subjects to
rec...
The dataset bias in vision-language tasks is becoming one of the main
pr...
The varying-mass Schrödinger equation (VMSE) has been successfully appli...
In this paper, we propose an unfitted Nitsche's method for computing wav...
It has been previously observed that training Variational Recurrent
Auto...
The clustering methods have recently absorbed even-increasing attention ...
We do not speak word by word from scratch; our brain quickly structures ...
Deep neural networks have achieved great success on the image captioning...
Aggregating extra features of novel modality brings great advantages for...
We propose Scene Graph Auto-Encoder (SGAE) that incorporates the languag...
We propose Scene Graph Auto-Encoder (SGAE) that incorporates the languag...
Due to the fact that it is prohibitively expensive to completely annotat...
We propose a weighted common subgraph (WCS) matching algorithm to find t...