We present an approach to learn voice-face representations from the talk...
High quality labeled datasets have allowed deep learning to achieve
impr...
Sound event detection (SED) is typically posed as a supervised learning
...
Audio tagging aims to infer descriptive labels from audio clips. Audio
t...
Motivated by the fact that characteristics of different sound classes ar...
Motivated by the fact that characteristics of different sound classes ar...
Audio scene classification, the problem of predicting class labels of au...
Deep learning has dramatically improved the performance of sounds
recogn...