Cross-domain Deep Feature Combination for Bird Species Classification with Audio-visual Data

11/26/2018
by Bold Naranchimeg, et al.

Over the past decade, many state-of-the-art algorithms for image classification and audio classification have achieved notable success thanks to the development of deep convolutional neural networks (CNNs). However, most of these works exploit only a single type of training data. In this paper, we present a study on classifying bird species by combining visual (image) and audio (sound) data using CNNs, an approach that has been sparsely treated so far. Specifically, we propose CNN-based multimodal learning models with three types of fusion strategies (early, middle, late) to address the problem of combining training data across domains. The advantage of our proposed method lies in the fact that we can use a CNN not only to extract features from image and audio data (spectrograms) but also to combine the features across modalities. In the experiments, we train and evaluate the network structure on the standard CUB-200-2011 data set combined with an audio data set that we collected ourselves for the corresponding species. We observe that a model that uses both types of data outperforms models trained on only one type of data. We also show that transfer learning can significantly increase classification performance.
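To make the fusion terminology concrete, here is a minimal sketch in plain Python. It is not the authors' network: the function names, the example scores, and the equal weighting are all assumptions chosen for illustration. It contrasts feature-level fusion (joining per-modality feature vectors before a shared classifier, as early and middle fusion do at different depths) with late fusion (combining the per-modality class probabilities).

```python
import math


def softmax(logits):
    """Turn raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def concat_fusion(image_features, audio_features):
    """Feature-level fusion: join both feature vectors into one vector
    that a single shared classifier would consume (hypothetical helper)."""
    return list(image_features) + list(audio_features)


def late_fusion(image_logits, audio_logits, image_weight=0.5):
    """Decision-level fusion: weighted average of the class probabilities
    produced independently by each modality. The 0.5 weight is an assumption."""
    p_image = softmax(image_logits)
    p_audio = softmax(audio_logits)
    return [
        image_weight * pi + (1.0 - image_weight) * pa
        for pi, pa in zip(p_image, p_audio)
    ]


def predict(probabilities):
    """Index of the most probable class."""
    return max(range(len(probabilities)), key=probabilities.__getitem__)


# Made-up scores for three species from two separate classifiers.
image_logits = [2.0, 1.0, 0.1]   # image model favours class 0
audio_logits = [0.5, 3.0, 0.2]   # audio model strongly favours class 1

fused = late_fusion(image_logits, audio_logits)
print("image only:", predict(softmax(image_logits)))
print("audio only:", predict(softmax(audio_logits)))
print("late fusion:", predict(fused))
print("joined features:", concat_fusion([0.3, 0.7], [0.9]))
```

In this toy case the confident audio prediction outweighs the less confident image prediction once the probabilities are averaged. In a real model the feature vectors would come from CNN layers, and the fusion point and weighting would be learned or tuned.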


research
04/30/2023

Deep Learning Based Multimodal with Two-phase Training Strategy for Daily Life Video Classification

In this paper, we present a deep learning based multimodal system for cl...
research
02/03/2021

Deep CNNs for large scale species classification

Large Scale image classification is a challenging problem within the fie...
research
08/10/2021

An empirical investigation into audio pipeline approaches for classifying bird species

This paper is an investigation into aspects of an audio classification p...
research
10/07/2016

Distributed Averaging CNN-ELM for Big Data

Increasing the scalability of machine learning to handle big volume of d...
research
04/04/2019

Biometric Fish Classification of Temperate Species Using Convolutional Neural Network with Squeeze-and-Excitation

Our understanding and ability to effectively monitor and manage coastal ...
research
05/11/2023

Deep Visual-Genetic Biometrics for Taxonomic Classification of Rare Species

Visual as well as genetic biometrics are routinely employed to identify ...
research
06/11/2021

Diseño y desarrollo de aplicación móvil para la clasificación de flora nativa chilena utilizando redes neuronales convolucionales

Introduction: Mobile apps, through artificial vision, are capable of rec...
