Hongyuan Zhu

research

∙ 09/17/2023

Towards Debiasing Frame Length Bias in Text-Video Retrieval via Causal Intervention

Many studies focus on improving pretraining or developing new backbones ...

0 Burak Satar, et al. ∙

research

∙ 09/06/2023

Vote2Cap-DETR++: Decoupling Localization and Describing for End-to-End 3D Dense Captioning

3D dense captioning requires a model to translate its understanding of a...

0 Sijin Chen, et al. ∙

research

∙ 07/19/2023

Self-Supervised Learning for WiFi CSI-Based Human Activity Recognition: A Systematic Study

Recently, with the advancement of the Internet of Things (IoT), WiFi CSI...

0 Ke Xu, et al. ∙

research

∙ 06/07/2023

An Overview of Challenges in Egocentric Text-Video Retrieval

Text-video retrieval contains various challenges, including biases comin...

0 Burak Satar, et al. ∙

research

∙ 04/20/2023

Multi-view Vision-Prompt Fusion Network: Can 2D Pre-trained Model Boost 3D Point Cloud Data-scarce Learning?

Point cloud based 3D deep model has wide applications in many applicatio...

0 Haoyang Peng, et al. ∙

research

∙ 03/31/2023

A Closer Look at Few-Shot 3D Point Cloud Classification

In recent years, research on few-shot learning (FSL) has been fast-growi...

0 Chuangguan Ye, et al. ∙

research

∙ 03/31/2023

What Makes for Effective Few-shot Point Cloud Classification?

Due to the emergence of powerful computing resources and large-scale ann...

0 Chuangguan Ye, et al. ∙

research

∙ 01/06/2023

End-to-End 3D Dense Captioning with Vote2Cap-DETR

3D dense captioning aims to generate multiple captions localized with th...

0 Sijin Chen, et al. ∙

research

∙ 06/29/2022

Exploiting Semantic Role Contextualized Video Features for Multi-Instance Text-Video Retrieval EPIC-KITCHENS-100 Multi-Instance Retrieval Challenge 2022

In this report, we present our approach for EPIC-KITCHENS-100 Multi-Inst...

0 Burak Satar, et al. ∙

research

∙ 06/26/2022

Semantic Role Aware Correlation Transformer for Text to Video Retrieval

With the emergence of social media, voluminous video clips are uploaded ...

2 Burak Satar, et al. ∙

research

∙ 06/26/2022

RoME: Role-aware Mixture-of-Expert Transformer for Text-to-Video Retrieval

Seas of videos are uploaded daily with the popularity of social channels...

0 Burak Satar, et al. ∙

research

∙ 05/23/2022

OPQ: Compressing Deep Neural Networks with One-shot Pruning-Quantization

As Deep Neural Networks (DNNs) usually are overparameterized and have mi...

0 Peng Hu, et al. ∙

research

∙ 03/31/2022

CRAFT: Cross-Attentional Flow Transformer for Robust Optical Flow

Optical flow estimation aims to find the 2D motion field by identifying ...

4 Xiuchao Sui, et al. ∙

research

∙ 11/30/2021

Point Cloud Instance Segmentation with Semi-supervised Bounding-Box Mining

Point cloud instance segmentation has achieved huge progress with the em...

3 Yongbin Liao, et al. ∙

research

∙ 03/08/2021

A Survey of Embodied AI: From Simulators to Research Tasks

There has been an emerging paradigm shift from the era of "internet AI" ...

1 Jiafei Duan, et al. ∙

research

∙ 09/24/2019

6D Pose Estimation with Correlation Fusion

6D object pose estimation is widely applied in robotic tasks such as gra...

0 Yi Cheng, et al. ∙

research

∙ 05/21/2019

Clustering with Similarity Preserving

Graph-based clustering has shown promising performance in many tasks. A ...

12 Zhao Kang, et al. ∙

research

∙ 01/26/2019

Scene Text Synthesis for Efficient and Effective Deep Network Training

A large amount of annotated training images is critical for training acc...

0 Fangneng Zhan, et al. ∙

research

∙ 12/14/2018

Spatial Fusion GAN for Image Synthesis

Recent advances in generative adversarial networks (GANs) have shown gre...

25 Fangneng Zhan, et al. ∙

research

∙ 11/12/2018

Holistic Multi-modal Memory Network for Movie Question Answering

Answering questions according to multi-modal context is a challenging pr...

12 Anran Wang, et al. ∙

research

∙ 08/22/2018

k-meansNet: When k-means Meets Differentiable Programming

In this paper, we study how to make clustering benefit from different...

0 Xi Peng, et al. ∙

research

∙ 06/26/2017

YoTube: Searching Action Proposal via Recurrent and Static Regression Networks

In this paper, we present YoTube, a novel network fusion framework for se...

0 Hongyuan Zhu, et al. ∙

research

∙ 06/17/2017

Truly Multi-modal YouTube-8M Video Classification with Video, Audio, and Text

The YouTube-8M video classification challenge requires teams to classify...

0 Zhe Wang, et al. ∙

research

∙ 07/16/2015

Diagnosing State-Of-The-Art Object Proposal Methods

Object proposal has become a popular paradigm to replace exhaustive slid...

0 Hongyuan Zhu, et al. ∙

research

∙ 02/03/2015

Beyond Pixels: A Comprehensive Survey from Bottom-up to Semantic Image Segmentation and Cosegmentation

Image segmentation refers to the process to divide an image into nonover...

0 Hongyuan Zhu, et al. ∙
