
-
Future-Aware Diverse Trends Framework for Recommendation
In recommender systems, modeling user-item behaviors is essential for us...
-
MGD-GAN: Text-to-Pedestrian generation through Multi-Grained Discrimination
In this paper, we investigate the problem of text-to-pedestrian synthesi...
-
Regularized Two-Branch Proposal Networks for Weakly-Supervised Moment Retrieval in Videos
Video moment retrieval aims to localize the target moment in a video ac...
-
PopMAG: Pop Music Accompaniment Generation
In pop music, accompaniments are usually played by multiple instruments ...
-
Object-Aware Multi-Branch Relation Networks for Spatio-Temporal Video Grounding
Spatio-temporal video grounding aims to retrieve the spatio-temporal tub...
-
DeVLBert: Learning Deconfounded Visio-Linguistic Representations
In this paper, we propose to investigate the problem of out-of-domain vi...
-
Poet: Product-oriented Video Captioner for E-commerce
In e-commerce, a growing number of user-generated videos are used for pr...
-
FastLR: Non-Autoregressive Lipreading Model with Integrate-and-Fire
Lipreading is an impressive technique and there has been a definite impr...
-
Task-Level Curriculum Learning for Non-Autoregressive Neural Machine Translation
Non-autoregressive translation (NAT) achieves faster inference speed but...
-
DeepSinger: Singing Voice Synthesis with Data Mined From the Web
In this paper, we develop DeepSinger, a multi-lingual multi-singer singi...
-
Comprehensive Information Integration Modeling Framework for Video Titling
In e-commerce, consumer-generated videos, which in general deliver consu...
-
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
Advanced text to speech (TTS) models such as FastSpeech can synthesize s...
-
A Study of Non-autoregressive Model for Sequence Generation
Non-autoregressive (NAR) models generate all the tokens of a sequence in...
-
A Generic Network Compression Framework for Sequential Recommender Systems
Sequential recommender systems (SRS) have become the key technology in c...
-
Grounded and Controllable Image Completion by Incorporating Lexical Semantics
In this paper, we present an approach, namely Lexical Semantic Image Com...
-
Convolutional Hierarchical Attention Network for Query-Focused Video Summarization
Previous approaches for video summarization mainly concentrate on findin...
-
Where Does It Exist: Spatio-Temporal Video Grounding for Multi-Form Sentences
In this paper, we consider a novel task, Spatio-Temporal Video Grounding...
-
Bi-Decoder Augmented Network for Neural Machine Translation
Neural Machine Translation (NMT) has become a popular technology in rece...
-
Weakly-Supervised Video Moment Retrieval via Semantic Completion Network
Video moment retrieval is to search the moment that is most relevant to ...
-
Multi-Modal Attention Network Learning for Semantic Source Code Retrieval
Code retrieval techniques and tools have been playing a key role in faci...
-
A Better Way to Attend: Attention with Trees for Video Question Answering
We propose a new attention model for video question answering. The main ...
-
Personalized Hashtag Recommendation for Micro-videos
Personalized hashtag recommendation methods aim to suggest users hashtag...
-
MacNet: Transferring Knowledge from Machine Comprehension to Sequence-to-Sequence Models
Machine Comprehension (MC) is one of the core problems in natural langua...
-
Discourse Marker Augmented Network with Reinforcement Learning for Natural Language Inference
Natural Language Inference (NLI), also known as Recognizing Textual Enta...
-
Weak Supervision Enhanced Generative Network for Question Generation
Automatic question generation according to an answer within the given pa...
-
Localizing Unseen Activities in Video via Image Query
Action localization in untrimmed videos is an important topic in the fie...
-
Open-Ended Long-Form Video Question Answering via Hierarchical Convolutional Self-Attention Networks
Open-ended video question answering aims to automatically generate the n...
-
Beyond Product Quantization: Deep Progressive Quantization for Image Retrieval
Product Quantization (PQ) has long been a mainstream for generating an e...
-
Cross-Modal Interaction Networks for Query-Based Moment Retrieval in Videos
Query-based moment retrieval aims to localize the most relevant moment i...
-
ActivityNet-QA: A Dataset for Understanding Complex Web Videos via Question Answering
Recent developments in modeling language and vision have been successful...
-
FastSpeech: Fast, Robust and Controllable Text to Speech
Neural network based end-to-end text to speech (TTS) has significantly i...
-
Almost Unsupervised Text to Speech and Automatic Speech Recognition
Text to speech (TTS) and automatic speech recognition (ASR) are two dual...
-
Multilingual Neural Machine Translation with Knowledge Distillation
Multilingual machine translation, which translates multiple languages wi...
-
Improving Automatic Source Code Summarization via Deep Reinforcement Learning
Code summarization provides a high level natural language description of...
-
Improved Dynamic Memory Network for Dialogue Act Classification with Adversarial Training
Dialogue Act (DA) classification is a challenging problem in dialogue in...
-
Dial2Desc: End-to-end Dialogue Description Generation
We first propose a new task named Dialogue Description (Dial2Desc). Unli...
-
Textually Guided Ranking Network for Attentional Image Retweet Modeling
Retweet prediction is a challenging problem in social media sites (SMS)....
-
Rethinking Diversified and Discriminative Proposal Generation for Visual Grounding
Visual grounding aims to localize an object in an image referred to by a...
-
Investigating Capsule Networks with Dynamic Routing for Text Classification
In this study, we explore capsule networks with dynamic routing for text...
-
Leveraging Long and Short-term Information in Content-aware Movie Recommendation
Movie recommendation systems provide users with ranked lists of movies b...
-
Dialogue Act Recognition via CRF-Attentive Structured Network
Dialogue Act Recognition (DAR) is a challenging problem in dialogue inte...
-
Keyword-based Query Comprehending via Multiple Optimized-Demand Augmentation
In this paper, we consider the problem of machine reading task when the ...
-
Smarnet: Teaching Machines to Read and Comprehend Like Human
Machine Comprehension (MC) is a challenging task in Natural Language Pro...
-
MEMEN: Multi-layer Embedding with Memory Networks for Machine Comprehension
Machine comprehension (MC) style question answering is a representative p...
-
Video Question Answering via Attribute-Augmented Attention Network Learning
Video Question Answering is a challenging problem in visual information ...
-
The Forgettable-Watcher Model for Video Question Answering
A number of visual question answering approaches have been proposed rece...
-
Question Retrieval for Community-based Question Answering via Heterogeneous Network Integration Learning
Community based question answering platforms have attracted substantial ...
-
User Personalized Satisfaction Prediction via Multiple Instance Deep Learning
Community based question answering services have arisen as a popular kno...