Hang Zhao

research

∙ 09/11/2023

Robot Parkour Learning

Parkour is a grand challenge for legged locomotion that requires robots ...

0 Ziwen Zhuang, et al. ∙

research

∙ 08/24/2023

StreamMapNet: Streaming Mapping Network for Vectorized Online HD Map Construction

High-Definition (HD) maps are essential for the safety of autonomous dri...

0 Tianyuan Yuan, et al. ∙

research

∙ 08/16/2023

Radio2Text: Streaming Speech Recognition Using mmWave Radio Signals

Millimeter wave (mmWave) based speech recognition provides more possibil...

0 Running Zhao, et al. ∙

research

∙ 07/26/2023

Learning-based Control for PMSM Using Distributed Gaussian Processes with Optimal Aggregation Strategy

The growing demand for accurate control in varying and unknown environme...

0 Zhenxiao Yin, et al. ∙

research

∙ 06/29/2023

Diff-Foley: Synchronized Video-to-Audio Synthesis with Latent Diffusion Models

The Video-to-Audio (V2A) model has recently gained attention for its pra...

0 Simian Luo, et al. ∙

research

∙ 06/20/2023

BEVScope: Enhancing Self-Supervised Depth Estimation Leveraging Bird's-Eye-View in Dynamic Scenarios

Depth estimation is a cornerstone of perception in autonomous driving an...

11 Yucheng Mao, et al. ∙

research

∙ 06/06/2023

ChatDB: Augmenting LLMs with Databases as Their Symbolic Memory

Large language models (LLMs) with memory are computationally universal. ...

5 Chenxu Hu, et al. ∙

research

∙ 05/15/2023

GeoMAE: Masked Geometric Target Prediction for Self-supervised Point Cloud Pre-Training

This paper tries to address a fundamental question in point cloud self-s...

4 Xiaoyu Tian, et al. ∙

research

∙ 05/02/2023

On Uni-Modal Feature Learning in Supervised Multi-Modal Learning

We abstract the features (i.e. learned representations) of multi-modal d...

7 Chenzhuang Du, et al. ∙

research

∙ 04/27/2023

Occ3D: A Large-Scale 3D Occupancy Prediction Benchmark for Autonomous Driving

Robotic perception requires the modeling of both 3D geometry and semanti...

8 Xiaoyu Tian, et al. ∙

research

∙ 04/26/2023

What Happened 3 Seconds Ago? Inferring the Past with Thermal Imaging

Inferring past human motion from RGB images is challenging due to the in...

8 Zitian Tang, et al. ∙

research

∙ 04/17/2023

Neural Map Prior for Autonomous Driving

High-definition (HD) semantic maps are crucial for autonomous vehicles n...

7 Xuan Xiong, et al. ∙

research

∙ 03/30/2023

SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer

High-resolution images enable neural networks to learn richer visual rep...

1 Xuanyao Chen, et al. ∙

research

∙ 12/07/2022

AsyInst: Asymmetric Affinity with DepthGrad and Color for Box-Supervised Instance Segmentation

The weakly supervised instance segmentation is a challenging task. The e...

7 Siwei Yang, et al. ∙

research

∙ 12/05/2022

Learning Physically Realizable Skills for Online Packing of General 3D Shapes

We study the problem of learning online packing skills for irregular 3D ...

3 Hang Zhao, et al. ∙

research

∙ 11/21/2022

PVT++: A Simple End-to-End Latency-Aware Visual Tracking Framework

Visual object tracking is an essential capability of intelligent robots....

8 Bowen Li, et al. ∙

research

∙ 11/03/2022

P4P: Conflict-Aware Motion Prediction for Planning in Autonomous Driving

Motion prediction is crucial in enabling safe motion planning for autono...

2 Qiao Sun, et al. ∙

research

∙ 10/26/2022

InterSim: Interactive Traffic Simulation via Explicit Relation Modeling

Interactive traffic simulation is crucial to autonomous driving systems ...

0 Qiao Sun, et al. ∙

research

∙ 08/09/2022

VectorFlow: Combining Images and Vectors for Traffic Occupancy and Flow Prediction

Predicting future behaviors of road agents is a key task in autonomous d...

2 Xin Huang, et al. ∙

research

∙ 08/02/2022

ViP3D: End-to-end Visual Trajectory Prediction via 3D Agent Queries

Existing autonomous driving pipelines separate the perception module fro...

40 Junru Gu, et al. ∙

research

∙ 07/13/2022

Controllable and Lossless Non-Autoregressive End-to-End Text-to-Speech

Some recent studies have demonstrated the feasibility of single-stage ne...

0 Zhengxi Liu, et al. ∙

research

∙ 07/03/2022

Beyond Visual Field of View: Perceiving 3D Environment with Echoes and Vision

This paper focuses on perceiving and navigating 3D environments using ec...

2 Lingyu Zhu, et al. ∙

research

∙ 06/22/2022

Radio2Speech: High Quality Speech Recovery from Radio Frequency Signals

Considering the microphone is easily affected by noise and soundproof ma...

0 Running Zhao, et al. ∙

research

∙ 06/18/2022

Augmented Imagefication: A Data-driven Fault Detection Method for Aircraft Air Data Sensors

In this paper, a novel data-driven approach named Augmented Imageficatio...

3 Hang Zhao, et al. ∙

research

∙ 06/17/2022

VectorMapNet: End-to-end Vectorized HD Map Learning

Autonomous driving systems require a good understanding of surrounding e...

6 Yicheng Liu, et al. ∙

research

∙ 06/13/2022

The Modality Focusing Hypothesis: On the Blink of Multimodal Knowledge Distillation

Multimodal knowledge distillation (KD) extends traditional knowledge dis...

5 Zihui Xue, et al. ∙

research

∙ 06/10/2022

R4D: Utilizing Reference Objects for Long-Range Distance Estimation

Estimating the distance of objects is a safety-critical task for autonom...

2 Yingwei Li, et al. ∙

research

∙ 06/08/2022

Depth Estimation Matters Most: Improving Per-Object Depth Estimation for Monocular 3D Detection and Tracking

Monocular image-based 3D perception has become an active research area i...

6 Longlong Jing, et al. ∙

research

∙ 05/10/2022

Learning Visual Styles from Audio-Visual Associations

From the patter of rain to the crunch of snow, the sounds we hear often ...

2 Tingle Li, et al. ∙

research

∙ 05/06/2022

Sound2Synth: Interpreting Sound via FM Synthesizer Parameters Estimation

Synthesizer is a type of electronic musical instrument that is now widel...

8 Zui Chen, et al. ∙

research

∙ 05/02/2022

MUTR3D: A Multi-camera Tracking Framework via 3D-to-2D Queries

Accurate and consistent 3D tracking from multiple cameras is a key compo...

8 Tianyuan Zhang, et al. ∙

research

∙ 03/24/2022

Egocentric Prediction of Action Target in 3D

We are interested in anticipating as early as possible the target locati...

1 Yiming Li, et al. ∙

research

∙ 03/22/2022

Self-supervision through Random Segments with Autoregressive Coding (RandSAC)

Inspired by the success of self-supervised autoregressive representation...

1 Tianyu Hua, et al. ∙

research

∙ 03/20/2022

FUTR3D: A Unified Sensor Fusion Framework for 3D Detection

Sensor fusion is an essential topic in many perception systems, such as ...

4 Xuanyao Chen, et al. ∙

research

∙ 02/24/2022

M2I: From Factored Marginal Trajectory Prediction to Interactive Prediction

Predicting future motions of road participants is an important task for ...

0 Qiao Sun, et al. ∙

research

∙ 02/21/2022

S3T: Self-Supervised Pre-training with Swin Transformer for Music Classification

In this paper, we propose S3T, a self-supervised pre-training method wit...

0 Hang Zhao, et al. ∙

research

∙ 01/17/2022

AutoAlign: Pixel-Instance Feature Aggregation for Multi-Modal 3D Object Detection

Object detection through either RGB images or the LiDAR point clouds has...

10 Zehui Chen, et al. ∙

research

∙ 12/13/2021

Embracing Single Stride 3D Object Detector with Sparse Transformer

In LiDAR-based 3D object detection for autonomous driving, the ratio of ...

3 Lue Fan, et al. ∙

research

∙ 12/10/2021

IFR-Explore: Learning Inter-object Functional Relationships in 3D Indoor Scenes

Building embodied intelligent agents that can interact with 3D indoor en...

5 Qi Li, et al. ∙

research

∙ 12/09/2021

SimIPU: Simple 2D Image and 3D Point Cloud Unsupervised Pre-Training for Spatial-Aware Visual Representations

Pre-training has become a standard paradigm in many computer vision task...

2 Zhenyu Li, et al. ∙

research

∙ 10/15/2021

Neural Dubber: Dubbing for Videos According to Scripts

Dubbing is a post-production process of re-recording actors' dialogues, ...

2 Chenxu Hu, et al. ∙

research

∙ 10/13/2021

DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries

We introduce a framework for multi-camera 3D object detection. In contra...

9 Vitor Guizilini, et al. ∙

research

∙ 08/31/2021

Learning Practically Feasible Policies for Online 3D Bin Packing

We tackle the Online 3D Bin Packing Problem, a challenging yet practical...

3 Hang Zhao, et al. ∙

research

∙ 08/22/2021

DenseTNT: End-to-end Trajectory Prediction from Dense Goal Sets

Due to the stochasticity of human behaviors, predicting the future traje...

5 Junru Gu, et al. ∙

research

∙ 07/13/2021

HDMapNet: An Online HD Map Construction and Evaluation Framework

High-definition map (HD map) construction is a crucial problem for auton...

3 Qi Li, et al. ∙

research

∙ 06/28/2021

HDMapGen: A Hierarchical Graph Generative Model of High Definition Maps

High Definition (HD) maps are maps with precise definitions of road lane...

12 Lu Mi, et al. ∙

research

∙ 06/27/2021

DenseTNT: Waymo Open Dataset Motion Prediction Challenge 1st Place Solution

In autonomous driving, goal-based multi-trajectory prediction methods ar...

7 Junru Gu, et al. ∙

research

∙ 06/23/2021

Co-advise: Cross Inductive Bias Distillation

Transformers recently are adapted from the community of natural language...

7 Sucheng Ren, et al. ∙

research

∙ 06/21/2021

Improving Multi-Modal Learning with Uni-Modal Teachers

Learning multi-modal representations is an essential step towards real-w...

8 Chenzhuang Du, et al. ∙

research

∙ 06/08/2021

What Makes Multimodal Learning Better than Single (Provably)

The world provides us with data of multiple modalities. Intuitively, mod...

4 Yu Huang, et al. ∙

