A Deep Representation for Invariance And Music Classification

04/01/2014
by Chiyuan Zhang, et al.

Representations in the auditory cortex might be based on mechanisms similar to those of the visual ventral stream: modules for building invariance to transformations, and multiple layers for compositionality and selectivity. In this paper we propose the use of such computational modules for extracting invariant and discriminative audio representations. Building on a theory of invariance in hierarchical architectures, we propose a novel mid-level representation for acoustical signals, using the empirical distributions of projections on a set of templates and their transformations. Under the assumption that, by construction, this dictionary of templates is composed from similar classes and samples the orbit of variance-inducing signal transformations (such as shift and scale), the resulting signature is theoretically guaranteed to be unique, invariant to transformations, and stable to deformations. Modules of projection and pooling can then constitute layers of deep networks for learning composite representations. We present the main theoretical and computational aspects of a framework for unsupervised learning of invariant audio representations, empirically evaluated on music genre classification.
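As a rough illustration of the idea of a signature built from "empirical distributions of projections on a set of templates and their transformations", the sketch below uses NumPy under several simplifying assumptions that are not taken from the paper: the only transformation is a circular shift, the templates are random vectors, and the empirical distribution is approximated by a fixed-range histogram. The names `orbit` and `signature` are hypothetical and this is not the authors' implementation.

```python
import numpy as np


def orbit(template):
    """Return all circular shifts of a 1-D template, one shift per row.

    Assumption: the transformation group is circular shift only.
    """
    n = template.shape[0]
    return np.stack([np.roll(template, k) for k in range(n)])


def signature(x, templates, n_bins=8):
    """Concatenate, per template, a histogram of projections of x onto the
    template's orbit. The histogram stands in for the empirical distribution.
    """
    x = x / np.linalg.norm(x)
    feats = []
    for t in templates:
        t = t / np.linalg.norm(t)
        # Unit-norm vectors keep every projection inside [-1, 1],
        # so the bin edges can be fixed in advance.
        proj = orbit(t) @ x
        hist, _ = np.histogram(proj, bins=n_bins, range=(-1.0, 1.0))
        feats.append(hist / proj.size)
    return np.concatenate(feats)


# Hypothetical data: random templates and a random signal.
rng = np.random.default_rng(0)
templates = rng.standard_normal((4, 64))
x = rng.standard_normal(64)

sig_a = signature(x, templates)
sig_b = signature(np.roll(x, 17), templates)  # same signal, shifted
```

Shifting the input only permutes the set of projections onto the full orbit, so the two histograms should agree up to floating-point effects at the bin edges. Pooling over the whole orbit is what gives the invariance in this toy setting; a histogram is one possible pooling choice among several.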

research
11/17/2013

Unsupervised Learning of Invariant Representations in Hierarchical Architectures

The present phase of Machine Learning is characterized by supervised lea...
research
03/01/2017

Graph-based Isometry Invariant Representation Learning

Learning transformation invariant representations of visual data is an i...
research
03/09/2022

Resource-Efficient Invariant Networks: Exponential Gains by Unrolled Optimization

Achieving invariance to nuisance transformations is a fundamental challe...
research
08/05/2015

Deep Convolutional Networks are Hierarchical Kernel Machines

In i-theory a typical layer of a hierarchical architecture consists of H...
research
07/13/2019

Learning Complex Basis Functions for Invariant Representations of Audio

Learning features from data has shown to be more successful than using h...
research
03/02/2023

Deep Neural Networks with Efficient Guaranteed Invariances

We address the problem of improving the performance and in particular th...
research
11/27/2014

Visual Representations: Defining Properties and Deep Approximations

Visual representations are defined in terms of minimal sufficient statis...
