Multi-modal Machine Learning for Vehicle Rating Predictions Using Image, Text, and Parametric Data

05/24/2023
by   Hanqi Su, et al.

Accurate vehicle rating prediction can facilitate designing and configuring good vehicles. This prediction allows vehicle designers and manufacturers to optimize and improve their designs in a timely manner, enhance their product performance, and effectively attract consumers. However, most of the existing data-driven methods rely on data from a single mode, e.g., text, image, or parametric data, which results in a limited and incomplete exploration of the available information. These methods lack comprehensive analyses and exploration of data from multiple modes, which may lead to inaccurate conclusions and hinder progress in this field. To overcome this limitation, we propose a multi-modal learning model for more comprehensive and accurate vehicle rating predictions. Specifically, the model simultaneously learns features from the parametric specifications, text descriptions, and images of vehicles to predict five vehicle rating scores: the total score, critics score, performance score, safety score, and interior score. We compare the multi-modal learning model to the corresponding unimodal models and find that the multi-modal model's explanatory power is 4% - 12% higher than that of the unimodal models. On this basis, we conduct sensitivity analyses using SHAP to interpret our model and provide design and optimization directions to designers and manufacturers. Our study underscores the importance of the data-driven multi-modal learning approach for vehicle design, evaluation, and optimization. We have made the code publicly available at http://decode.mit.edu/projects/vehicleratings/.
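
The abstract describes a model that learns from three modalities at once and predicts five scores. The sketch below illustrates one common way to do this, fusion by concatenating per-modality embeddings, and is not taken from the authors' released code. The class name `FusionRegressor`, the embedding size, the input dimensions, and the assumption that each modality arrives as a pre-extracted feature vector are all hypothetical; the abstract does not specify the encoders or the fusion strategy. Weights are random, so the outputs are meaningless until the model is trained.

```python
import random

# The five rating targets named in the abstract.
SCORES = ["total", "critics", "performance", "safety", "interior"]


def linear(weights, bias, x):
    """Compute W x + b, with W stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def init_layer(n_out, n_in, rng):
    """Small random weights; a real model would learn these from data."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class FusionRegressor:
    """Hypothetical fusion model: one encoder per modality, one shared head."""

    def __init__(self, dims, embed_dim=4, seed=0):
        rng = random.Random(seed)
        # One small encoder per modality (assumed: inputs are feature vectors).
        self.encoders = {name: init_layer(embed_dim, d, rng) for name, d in dims.items()}
        # The regression head maps the concatenated embedding to five scores.
        self.head = init_layer(len(SCORES), embed_dim * len(dims), rng)

    def fuse(self, inputs):
        """Encode each modality and concatenate the embeddings."""
        fused = []
        for name, (weights, bias) in self.encoders.items():
            fused.extend(relu(linear(weights, bias, inputs[name])))
        return fused

    def predict(self, inputs):
        weights, bias = self.head
        return dict(zip(SCORES, linear(weights, bias, self.fuse(inputs))))


# Usage with made-up feature vectors for a single vehicle.
model = FusionRegressor({"parametric": 3, "text": 5, "image": 6})
example = {"parametric": [0.2, 0.5, 0.1], "text": [0.0] * 5, "image": [1.0] * 6}
prediction = model.predict(example)
```

A unimodal baseline in this sketch would be the same class built with a single entry in `dims`, which is the kind of comparison the abstract reports.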

research
01/26/2021

A Case Study of Deep Learning Based Multi-Modal Methods for Predicting the Age-Suitability Rating of Movie Trailers

In this work, we explore different approaches to combine modalities for ...
research
10/25/2021

"So You Think You're Funny?": Rating the Humour Quotient in Standup Comedy

Computational Humour (CH) has attracted the interest of Natural Language...
research
09/01/2022

Zero-Shot Multi-Modal Artist-Controlled Retrieval and Exploration of 3D Object Sets

When creating 3D content, highly specialized skills are generally needed...
research
02/14/2022

BROOK Dataset: A Playground for Exploiting Data-Driven Techniques in Human-Vehicle Interactive Designs

Emerging Autonomous Vehicles (AV) breed great potentials to exploit data...
research
07/08/2023

Ariadne's Thread: Using Text Prompts to Improve Segmentation of Infected Areas from Chest X-ray images

Segmentation of the infected areas of the lung is essential for quantify...
research
05/18/2020

Building BROOK: A Multi-modal and Facial Video Database for Human-Vehicle Interaction Research

With the growing popularity of Autonomous Vehicles, more opportunities h...
research
04/26/2020

On the Limits to Multi-Modal Popularity Prediction on Instagram – A New Robust, Efficient and Explainable Baseline

The predictability of social media popularity is a topic of much scienti...