There has been a recent explosion of impressive generative models that c...
The recent success of transformer models in language, such as BERT, has
...
Learning how to localize and separate individual object sounds in the au...
We consider the problem of Visual Question Answering (VQA). Given an ima...
Multi-modal learning, particularly among imaging and linguistic modaliti...
The goal of video-based person re-identification is to match two input
v...
We consider the problem of video-based person re-identification. The goa...