Qi Wu

research

∙ 07/23/2022

Instant Neural Representation for Interactive Volume Rendering

Neural networks have shown great potential in compressing volumetric dat...

0 Qi Wu, et al. ∙

research

∙ 05/07/2022

Attract me to Buy: Advertisement Copywriting Generation with Multimodal Multi-structured Information

Recently, online shopping has gradually become a common way of shopping ...

0 Zhipeng Zhang, et al. ∙

research

∙ 04/18/2022

BSRT: Improving Burst Super-Resolution with Swin Transformer and Flow-Guided Deformable Alignment

This work addresses the Burst Super-Resolution (BurstSR) task using a ne...

0 Ziwei Luo, et al. ∙

research

∙ 04/08/2022

Custom Sine Waves Are Enough for Imitation Learning of Bipedal Gaits with Different Styles

Not until recently, robust bipedal locomotion has been achieved through ...

0 Qi Wu, et al. ∙

research

∙ 03/22/2022

Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions

A long-term goal of AI research is to build intelligent agents that can ...

0 Jing Gu, et al. ∙

research

∙ 03/22/2022

HOP: History-and-Order Aware Pre-training for Vision-and-Language Navigation

Pre-training has been adopted in a few recent works for Vision-and-La...

0 Yanyuan Qiao, et al. ∙

research

∙ 03/17/2022

MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering

Knowledge-based visual question answering requires the ability of associ...

0 Yang Ding, et al. ∙

research

∙ 03/05/2022

Bridging the Gap Between Learning in Discrete and Continuous Environments for Vision-and-Language Navigation

Most existing works in vision-and-language navigation (VLN) focus on eit...

0 Yicong Hong, et al. ∙

research

∙ 02/15/2022

An Automated FPGA-based Framework for Rapid Prototyping of Nonbinary LDPC Codes

Nonbinary LDPC codes have shown superior performance close to the Shanno...

0 Yaoyu Tao, et al. ∙

research

∙ 12/19/2021

LocFormer: Enabling Transformers to Perform Temporal Moment Localization on Long Untrimmed Videos With a Feature Sampling Approach

We propose LocFormer, a Transformer-based model for video grounding whic...

0 Cristian Rodriguez Opazo, et al. ∙

research

∙ 12/17/2021

Unified 2D and 3D Pre-training for Medical Image classification and Segmentation

Self-supervised learning (SSL) opens up huge opportunities for better ut...

0 Yutong Xie, et al. ∙

research

∙ 12/07/2021

Adaptive Mimic: Deep Reinforcement Learning of Parameterized Bipedal Walking from Infeasible References

Not until recently, robust robot locomotion has been achieved by deep re...

0 Chong Zhang, et al. ∙

research

∙ 11/25/2021

V2C: Visual Voice Cloning

Existing Voice Cloning (VC) tasks aim to convert a paragraph text to a s...

0 Qi Chen, et al. ∙

research

∙ 11/19/2021

Medical Visual Question Answering: A Survey

Medical Visual Question Answering (VQA) is a combination of medical arti...

10 Zhihong Lin, et al. ∙

research

∙ 09/18/2021

Memory Regulation and Alignment toward Generalizer RGB-Infrared Person

The domain shift, coming from unneglectable modality gap and non-overlap...

11 Feng Chen, et al. ∙

research

∙ 08/13/2021

Data-driven advice for interpreting local and global model predictions in bioinformatics problems

Tree-based algorithms such as random forests and gradient boosted trees ...

0 Markus Loecher, et al. ∙

research

∙ 08/05/2021

Communicative Learning with Natural Gestures for Embodied Navigation Agents with Human-in-the-Scene

Human-robot collaboration is an essential research topic in artificial i...

10 Qi Wu, et al. ∙

research

∙ 07/20/2021

Data Hiding with Deep Learning: A Survey Unifying Digital Watermarking and Steganography

Data hiding is the process of embedding information into a noise-toleran...

0 Olivia Byrnes, et al. ∙

research

∙ 07/15/2021

Neighbor-view Enhanced Model for Vision and Language Navigation

Vision and Language Navigation (VLN) requires an agent to navigate to a ...

0 Dong An, et al. ∙

research

∙ 05/05/2021

Proposal-free One-stage Referring Expression via Grid-Word Cross-Attention

Referring Expression Comprehension (REC) has become one of the most impo...

0 Wei Suo, et al. ∙

research

∙ 04/30/2021

Chop Chop BERT: Visual Question Answering by Chopping VisualBERT's Heads

Vision-and-Language (VL) pre-training has shown great potential on many ...

0 Chenyu Gao, et al. ∙

research

∙ 03/30/2021

Diagnosing Vision-and-Language Navigation: What Really Matters

Vision-and-language navigation (VLN) is a multimodal task where an agent...

10 Wanrong Zhu, et al. ∙

research

∙ 03/26/2021

Non-Salient Region Object Mining for Weakly Supervised Semantic Segmentation

Semantic segmentation aims to classify every pixel of an input image. Co...

11 Yazhou Yao, et al. ∙

research

∙ 03/24/2021

Jo-SRC: A Contrastive Approach for Combating Noisy Labels

Due to the memorization effect in Deep Neural Networks (DNNs), training ...

0 Yazhou Yao, et al. ∙

research

∙ 03/22/2021

Higher-Order Orthogonal Causal Learning for Treatment Effect

Most existing studies on the double/debiased machine learning method con...

0 Yiyan Huang, et al. ∙

research

∙ 01/24/2021

Multi-intersection Traffic Optimisation: A Benchmark Dataset and a Strong Baseline

The control of traffic signals is fundamental and critical to alleviate ...

0 Hu Wang, et al. ∙

research

∙ 01/04/2021

How to Train Your Agent to Read and Write

Reading and writing research papers is one of the most privileged abilit...

0 Li Liu, et al. ∙

research

∙ 01/02/2021

Semantics for Robotic Mapping, Perception and Interaction: A Survey

For robots to navigate and interact more richly with the world around th...

0 Sourav Garg, et al. ∙

research

∙ 12/24/2020

Memory-Gated Recurrent Networks

The essence of multivariate sequential learning is all about how to extr...

8 Yaquan Zhang, et al. ∙

research

∙ 12/17/2020

The Causal Learning of Retail Delinquency

This paper focuses on the expected difference in borrower's repayment wh...

0 Yiyan Huang, et al. ∙

research

∙ 12/09/2020

Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps

Texts appearing in daily scenes that can be recognized by OCR (Optical C...

0 Qi Zhu, et al. ∙

research

∙ 12/07/2020

Confidence-aware Non-repetitive Multimodal Transformers for TextCaps

When describing an image, reading text in the visual scene is crucial to...

0 Zhaokai Wang, et al. ∙

research

∙ 12/04/2020

P3-LOAM: PPP/LiDAR Loosely Coupled SLAM with Accurate Covariance Estimation and Robust RAIM in Urban Canyon Environment

Light Detection and Ranging (LiDAR) based Simultaneous Localization and ...

0 Tao Li, et al. ∙

research

∙ 11/26/2020

Generative Learning of Heterogeneous Tail Dependence

We propose a multivariate generative model to capture the complex depend...

0 Xiangqian Sun, et al. ∙

research

∙ 11/26/2020

A Recurrent Vision-and-Language BERT for Navigation

Accuracy of many visiolinguistic tasks has benefited significantly from ...

0 Yicong Hong, et al. ∙

research

∙ 11/22/2020

Language-guided Navigation via Cross-Modal Grounding and Alternate Adversarial Learning

The emerging vision-and-language navigation (VLN) problem aims at learni...

1 Weixia Zhang, et al. ∙

research

∙ 10/19/2020

Language and Visual Entity Relationship Graph for Agent Navigation

Vision-and-Language Navigation (VLN) requires an agent to navigate in a ...

0 Yicong Hong, et al. ∙

research

∙ 10/16/2020

Parsimonious Quantile Regression of Financial Asset Tail Dynamics via Sequential Learning

We propose a parsimonious quantile regression framework to learn the dyn...

0 Xing Yan, et al. ∙

research

∙ 09/20/2020

MARS: Mixed Virtual and Real Wearable Sensors for Human Activity Recognition with Multi-Domain Deep Learning Model

Human activity recognition (HAR) using wearable Inertial Measurement Uni...

17 Ling Pei, et al. ∙

research

∙ 09/16/2020

CogTree: Cognition Tree Loss for Unbiased Scene Graph Generation

Scene graphs are semantic abstraction of images that encourage visual un...

0 Jing Yu, et al. ∙

research

∙ 09/15/2020

Attention-SLAM: A Visual Monocular SLAM Learning from Human Gaze

This paper proposes a novel simultaneous localization and mapping (SLAM)...

5 Jinquan Li, et al. ∙

research

∙ 08/06/2020

Data-driven Meta-set Based Fine-Grained Visual Classification

Constructing fine-grained image datasets typically requires domain-speci...

0 Chuanyi Zhang, et al. ∙

research

∙ 07/29/2020

Object-and-Action Aware Model for Visual Language Navigation

Vision-and-Language Navigation (VLN) is unique in that it requires turni...

2 Yuankai Qi, et al. ∙

research

∙ 07/21/2020

Soft Expert Reward Learning for Vision-and-Language Navigation

Vision-and-Language Navigation (VLN) requires an agent to find a specifi...

0 Hu Wang, et al. ∙

research

∙ 07/19/2020

Semantic Equivalent Adversarial Data Augmentation for Visual Question Answering

Visual Question Answering (VQA) has achieved great success thanks to the...

0 Ruixue Tang, et al. ∙

research

∙ 07/19/2020

Length-Controllable Image Captioning

The last decade has witnessed remarkable progress in the image captionin...

12 Chaorui Deng, et al. ∙

research

∙ 07/19/2020

Referring Expression Comprehension: A Survey of Methods and Datasets

Referring expression comprehension (REC) aims to localize a target objec...

0 Yanyuan Qiao, et al. ∙

research

∙ 07/07/2020

DAM: Deliberation, Abandon and Memory Networks for Generating Detailed and Non-repetitive Responses in Visual Dialogue

Visual Dialogue task requires an agent to be engaged in a conversation w...

0 Xiaoze Jiang, et al. ∙

research

∙ 06/16/2020

Foreground-Background Imbalance Problem in Deep Object Detectors: A Review

Recent years have witnessed the remarkable developments made by deep lea...

0 Joya Chen, et al. ∙

research

∙ 06/16/2020

Mucko: Multi-Layer Cross-Modal Knowledge Reasoning for Fact-based Visual Question Answering

Fact-based Visual Question Answering (FVQA) requires external knowledge ...

0 Zihao Zhu, et al. ∙
