LidarCLIP or: How I Learned to Talk to Point Clouds

12/13/2022
by Georg Hess, et al.

Research connecting text and images has recently seen several breakthroughs, with models like CLIP, DALL-E 2, and Stable Diffusion. However, the connection between text and other visual modalities, such as lidar data, has received less attention, held back by the lack of text-lidar datasets. In this work, we propose LidarCLIP, a mapping from automotive point clouds to a pre-existing CLIP embedding space. Using image-lidar pairs, we supervise a point cloud encoder with the image CLIP embeddings, effectively relating text and lidar data with the image domain as an intermediary. We show the effectiveness of LidarCLIP by demonstrating that lidar-based retrieval is generally on par with image-based retrieval, but with complementary strengths and weaknesses. By combining image and lidar features, we improve upon both single-modality methods and enable a targeted search for challenging detection scenarios under adverse sensor conditions. We also use LidarCLIP as a tool to investigate fundamental lidar capabilities through natural language. Finally, we leverage our compatibility with CLIP to explore a range of applications, such as point cloud captioning and lidar-to-image generation, without any additional training. We hope LidarCLIP can inspire future work to dive deeper into connections between text and point cloud understanding. Code and trained models are available at https://github.com/atonderski/lidarclip.
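The supervision and retrieval ideas in the abstract can be illustrated with a small sketch. Everything here is an assumption made for illustration, not the released LidarCLIP code: the random arrays stand in for real CLIP image, lidar, and text embeddings, the cosine-based loss is one plausible way to pull a lidar embedding toward its paired image embedding (the abstract does not name the loss), and the helper names `cosine_distillation_loss`, `fuse`, and `retrieve` are hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale vectors along the last axis to unit length."""
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)


def cosine_distillation_loss(lidar_emb, image_emb):
    """Mean (1 - cosine similarity) over paired lidar/image embeddings.

    Assumed loss: the image embeddings act as fixed targets for the
    point cloud encoder's outputs.
    """
    cos = np.sum(l2_normalize(lidar_emb) * l2_normalize(image_emb), axis=-1)
    return float(np.mean(1.0 - cos))


def fuse(image_emb, lidar_emb):
    """Hypothetical fusion: sum of unit-length features, renormalized."""
    return l2_normalize(l2_normalize(image_emb) + l2_normalize(lidar_emb))


def retrieve(text_emb, scene_emb, top_k=3):
    """Rank scenes by cosine similarity to a single text embedding."""
    scores = l2_normalize(scene_emb) @ l2_normalize(text_emb)
    order = np.argsort(-scores)[:top_k]
    return order, scores[order]


# Stand-in data: 5 scenes with 64-dimensional embeddings (not real CLIP output).
rng = np.random.default_rng(0)
dim = 64
image_emb = rng.normal(size=(5, dim))
# Pretend the lidar encoder has nearly matched its image targets.
lidar_emb = image_emb + 0.1 * rng.normal(size=(5, dim))
# Pretend the text query embeds exactly where scene 2's image does.
text_emb = image_emb[2]

loss = cosine_distillation_loss(lidar_emb, image_emb)
top_idx, top_scores = retrieve(text_emb, fuse(image_emb, lidar_emb))
print("distillation loss:", loss)
print("top scenes:", top_idx)
```

Because the lidar embeddings live in the same space as the image embeddings, the same `retrieve` call works on image-only, lidar-only, or fused features, which is what makes the single-modality and combined comparisons possible without text-lidar training data.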


