This work is a contribution to the IEEE AASP Challenge on the classification of acoustic scenes. Spectral, cepstral, energy and voicing-related audio features are extracted from the 30-second, highly variable recordings. A sliding window approach is used to compute statistical functionals of these low-level features over short segments. Support vector machines (SVMs) classify the short segments, and a majority voting scheme over the segment decisions yields a decision for the whole recording. On the official development set of the challenge, an accuracy of 73 % is achieved. A feature analysis using the t-statistic showed that Mel spectra were the most relevant features.
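The segment-and-vote pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, window and hop sizes, the choice of mean and standard deviation as functionals, the toy data, and the threshold rule standing in for the SVM are all assumptions made purely for illustration.

```python
from collections import Counter
from statistics import mean, pstdev


def segment_functionals(frames, win, hop):
    """Slide a window over per-frame feature vectors and compute the
    mean and standard deviation of each feature within every window."""
    segments = []
    for start in range(0, len(frames) - win + 1, hop):
        window = frames[start:start + win]
        columns = list(zip(*window))  # one tuple per low-level feature
        feats = [mean(c) for c in columns] + [pstdev(c) for c in columns]
        segments.append(feats)
    return segments


def majority_vote(segment_labels):
    """Return the most frequent segment-level label for a recording."""
    return Counter(segment_labels).most_common(1)[0][0]


# Toy data (made up): 8 frames with 2 low-level features each.
frames = [[2.0 * i, 1.0] for i in range(8)]
segments = segment_functionals(frames, win=4, hop=2)


def toy_classifier(feats):
    """Hypothetical stand-in for the segment-level SVM."""
    return "street" if feats[0] > 4.0 else "office"


labels = [toy_classifier(s) for s in segments]
recording_label = majority_vote(labels)
print(labels, recording_label)
```

In a real system, the stand-in classifier would be replaced by a trained SVM, and the functionals would be computed over many more low-level features than the two used here.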

Related publications

J. T. Geiger, B. Schuller, and G. Rigoll, “Recognising acoustic scenes with large-scale audio feature extraction and SVM,” 2013.
J. T. Geiger, B. Schuller, and G. Rigoll, “Large-Scale Audio Feature Extraction and SVM for Acoustic Scene Classification,” in WASPAA, 2013, p. 4.