Deep Convolutional Recurrent Neural Networks for Rare Sound Event Detection
Rare acoustic event detection, as evidenced by the recent IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE 2017), is a growing field of acoustic classification research. Rare audio events often possess unique spectral and temporal structures that can aid their identification. In this regard, we investigate the advantages of a hybrid of a convolutional neural network and a recurrent neural network for classifying rarely occurring sound events in audio streams. Our system feeds log-Mel spectrograms, together with their derivatives, into convolutional layers to first extract high-level, shift-invariant spectral features. Recurrent layers then learn the long-term temporal context from these high-level features. Finally, a feedforward neural network with sigmoid activations produces a sequence of probability estimates, which is used to predict the presence and onset of the rare sounds. We develop and test our system on the Detection of Rare Sound Events task of the DCASE 2017 challenge. Our results show that the proposed approach outperforms the challenge baseline, improving the F-score from 72.7% to 90.3% and reducing the detection error rate from 0.53 to 0.18.
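As a rough illustration of the final step and of the two reported metrics, the Python sketch below turns a sequence of frame-wise probabilities into a presence and onset decision and computes an F-score and an error rate from event counts. It is not the system described above: the function names, the 0.5 threshold, the 0.02 s frame hop, and the minimum run length are hypothetical, and the error rate is assumed to be (insertions + deletions) divided by the number of reference events. The challenge's exact evaluation protocol, such as the onset tolerance, is not reproduced here.

```python
from typing import Optional, Sequence, Tuple


def detect_onset(probs: Sequence[float],
                 threshold: float = 0.5,    # assumed decision threshold
                 hop_seconds: float = 0.02,  # assumed frame hop
                 min_frames: int = 3) -> Optional[float]:
    """Return the onset time in seconds of the first run of at least
    `min_frames` consecutive frames whose probability is >= threshold.
    Return None if no such run exists (event judged absent)."""
    run_start = None
    for i, p in enumerate(probs):
        if p >= threshold:
            if run_start is None:
                run_start = i  # a new candidate run begins here
            if i - run_start + 1 >= min_frames:
                return run_start * hop_seconds
        else:
            run_start = None  # run broken, start over
    return None


def f_score_and_error_rate(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Compute F-score and error rate from event counts.
    Assumption: error rate = (insertions + deletions) / reference events,
    with insertions = fp and deletions = fn."""
    n_ref = tp + fn  # number of reference (ground-truth) events
    denom = 2 * tp + fp + fn
    f_score = 2 * tp / denom if denom else 0.0
    error_rate = (fp + fn) / n_ref if n_ref else 0.0
    return f_score, error_rate


if __name__ == "__main__":
    frame_probs = [0.1, 0.2, 0.9, 0.8, 0.7, 0.1]
    print("onset (s):", detect_onset(frame_probs))
    print("F-score, ER:", f_score_and_error_rate(tp=90, fp=10, fn=10))
```

Requiring several consecutive frames above the threshold is one simple way to suppress isolated spurious peaks. A lower error rate and a higher F-score both indicate better detection, which is the direction of the improvement reported above.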