We propose a method to effectively embed general objects, such as audio samples, into a vectorial feature space suitable for classification problems. From a practical point of view, a researcher adopting the proposed method need only provide two ingredients: an efficient compressor for those objects, and a way to combine two objects into a new one. The method is based on two main elements: the dissimilarity representation and the normalized compression distance (NCD). The dissimilarity representation is a Euclidean embedding algorithm, i.e. a procedure that maps generic objects into a vector space, and it requires the definition of a distance function between the objects; the quality of the resulting embedding depends strictly on the choice of this distance. The NCD is a distance between objects based on the concept of Kolmogorov complexity. In practice the NCD is built from two building blocks: a compression function and a method to combine two objects into a new one. We claim that, as soon as a good compressor and a meaningful way to combine two objects are available, it is possible to build an effective feature space in which classification algorithms can be accurate. As our submission to the IEEE AASP Challenge, we show a practical application of the proposed method to acoustic scene classification, where the compressor is the free and open-source Vorbis lossy audio compressor and the combination of two audio samples is their simple concatenation.
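The two ingredients can be sketched in a few lines. The following is a minimal illustration, not the submission's implementation: it assumes `bz2` as a stand-in for the Vorbis compressor used in the actual challenge entry, byte concatenation as the combination rule, and a hypothetical helper `dissimilarity_embedding` that maps each object to its vector of NCDs against a fixed set of prototype objects.

```python
import bz2


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two objects.

    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C(.) is the compressed size and xy is the combination
    of the two objects (here: plain concatenation).
    """
    cx = len(bz2.compress(x))   # stand-in compressor; the paper uses Vorbis
    cy = len(bz2.compress(y))
    cxy = len(bz2.compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)


def dissimilarity_embedding(objects, prototypes):
    """Dissimilarity representation: each object becomes the vector of
    its NCDs to a fixed prototype set, yielding a Euclidean feature space
    usable by any standard classifier."""
    return [[ncd(obj, p) for p in prototypes] for obj in objects]


# Toy usage: two "samples" embedded against two prototypes.
samples = [b"hello world " * 40, bytes(range(256)) * 3]
prototypes = [b"hello world " * 40, b"something else entirely " * 20]
features = dissimilarity_embedding(samples, prototypes)
```

Each row of `features` is a fixed-length vector, so an off-the-shelf classifier can be trained on it directly, regardless of the original objects' type.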

Related publications

E. Olivetti, “The wonders of the normalized compression dissimilarity representation,” 2013.