SUMMIT: Source-Free Adaptation of Uni-Modal Models to Multi-Modal Targets

08/23/2023
by Cody Simons, et al.

Scene understanding using multi-modal data is necessary in many applications, e.g., autonomous navigation. To achieve this in a variety of situations, existing models must be able to adapt to shifting data distributions without arduous data annotation. Current approaches assume that the source data is available during adaptation and that the source consists of paired multi-modal data. Both of these assumptions may be problematic for many applications. Source data may not be available due to privacy, security, or economic concerns. Assuming the existence of paired multi-modal data for training also entails significant data collection costs and fails to take advantage of widely available, freely distributed pre-trained uni-modal models. In this work, we relax both of these assumptions by addressing the problem of adapting a set of models trained independently on uni-modal data to a target domain consisting of unlabeled multi-modal data, without access to the original source dataset. Our proposed approach solves this problem through a switching framework which automatically chooses between two complementary methods of cross-modal pseudo-label fusion – agreement filtering and entropy weighting – based on the estimated domain gap. We demonstrate our work on the semantic segmentation problem. Experiments across seven challenging adaptation scenarios verify the efficacy of our approach, achieving results comparable to, and in some cases outperforming, methods which assume access to source data. Our method achieves an improvement in mIoU of up to 12% over competing baselines. Our code is publicly available at https://github.com/csimo005/SUMMIT.
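To make the switching framework concrete, below is a minimal Python/PyTorch sketch of how two uni-modal segmentation models' predictions might be fused into pseudo-labels, switching between agreement filtering and entropy weighting. This is an illustration written for this summary, not the authors' implementation: the function names, the use of the per-pixel agreement rate as a proxy for the estimated domain gap, the 0.5 gate, and the ignore index 255 are all assumptions made here. See the official repository at https://github.com/csimo005/SUMMIT for the actual method.

```python
import torch
import torch.nn.functional as F

def entropy(probs, eps=1e-8):
    """Per-pixel predictive entropy over classes; input (B, C, H, W) -> output (B, H, W)."""
    return -(probs * (probs + eps).log()).sum(dim=1)

def fuse_pseudo_labels(logits_a, logits_b, agreement_gate=0.5, ignore_index=255):
    """Fuse two uni-modal models' outputs into pseudo-labels (hypothetical sketch).

    logits_a, logits_b: (B, C, H, W) logits from the two modality-specific models.
    agreement_gate: assumed threshold on the cross-modal agreement rate, used here
        as a stand-in for the paper's estimated domain gap.
    """
    probs_a, probs_b = F.softmax(logits_a, dim=1), F.softmax(logits_b, dim=1)
    pred_a, pred_b = probs_a.argmax(dim=1), probs_b.argmax(dim=1)

    agree = pred_a == pred_b                 # (B, H, W) mask of cross-modal agreement
    agreement_rate = agree.float().mean()

    if agreement_rate >= agreement_gate:
        # Agreement filtering: keep only pixels where both models predict the same
        # class; disagreeing pixels are masked out with the ignore index.
        pseudo = torch.where(agree, pred_a, torch.full_like(pred_a, ignore_index))
    else:
        # Entropy weighting: weight each model's class probabilities by the other
        # model's per-pixel entropy, so the more confident model dominates.
        ent_a, ent_b = entropy(probs_a), entropy(probs_b)
        w_a = ent_b / (ent_a + ent_b + 1e-8)  # low entropy in A -> high weight for A
        fused = w_a.unsqueeze(1) * probs_a + (1.0 - w_a).unsqueeze(1) * probs_b
        pseudo = fused.argmax(dim=1)
    return pseudo
```

Under this reading, agreement filtering is the conservative choice when the domain gap appears small (agreement is cheap and precise), while entropy weighting recovers usable labels when the models disagree often and filtering would discard too many pixels.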
