Applying Spatiotemporal Attention to Identify Distracted and Drowsy Driving with Vision Transformers

07/22/2022
by Samay Lakhani, et al.

A 20 result of increased distraction and drowsiness. Drowsy and distracted driving are the cause of 45 distracted driving, detection methods using computer vision can be designed to be low-cost, accurate, and minimally invasive. This work investigated whether the vision transformer can outperform the state-of-the-art accuracy of 3D-CNNs. Two separate transformers were trained, one for drowsiness and one for distractedness. The drowsiness model was trained on the National Tsing-Hua University Drowsy Driving Dataset (NTHU-DDD) with a Video Swin Transformer for 10 epochs on two classes, drowsy and non-drowsy, simulated over 10.5 hours. The distraction model was trained on the Driver Monitoring Dataset (DMD) with a Video Swin Transformer for 50 epochs over 9 distraction-related classes. The accuracy of the drowsiness model reached 44 test set, indicating overfitting and poor model performance. The overfitting suggests that the training data was limited and that the applied model architecture lacked quantifiable parameters to learn. The distraction model outperformed state-of-the-art models on DMD reaching 97.5 data and a strong architecture, transformers are suitable for unfit driving detection. Future research should use newer and stronger models such as TokenLearner to achieve higher accuracy and efficiency, merge existing datasets to extend detection to drunk driving and road rage and so create a comprehensive solution for preventing traffic crashes, and deploy a functioning prototype to revolutionize the automotive safety industry.


