MeetingQMUL11May2011

AudioDB as Software Sustainability Challenge Meeting Notes

People: Chris Cannam, Christophe Rhodes, Tim Crawford, Benjamin Fields, Luis Figueira
Where: QMUL, 11 May 2011

Current state of AudioDB

  • AudioDB is some way from being ready for general use; it may be classified as research-grade code.
  • Christophe has some pending fixes which need to be committed to the new repository.

What works/state of tools and documentation

  • C API (Christophe)
    • This is an acceptable design and working correctly
  • Command-line interface (Christophe)
    • Pre-dates the current C API. Deprecated; should be redesigned
    • Although designed as an introspection tool, it is now (sometimes) used to generate real results; this means parsing output that was never intended to be parseable, using e.g. awk, which is extremely brittle
  • Indexing code using locality-sensitive hashing (LSH) (Michael Casey)
    • Implementation is hurried, provisional, unreliable
    • Code is currently omitted entirely from our repository in order to avoid conflict with potential commercial applications
    • Useful for large collections of songs (more than 1 million) but not typically necessary for small-collection use cases where an exhaustive search is practicable
  • Unit tests (Christophe)
    • Tests exist for most of the library API, and a parallel set covers the same ground through the command-line interface
  • Language bindings
    • Common Lisp (Christophe): complete, with unit tests
    • Python (Ben): incomplete, with some unit tests
    • pd (Ian Knopke, rewritten by Christophe): suspect
    • Java (Mike Jewell): probably incomplete, some unit tests using JUnit
    • ActionScript (Mike Jewell): probably incomplete
  • Tools
    • “Fake” RDF triple store that invents triples on the fly (Mike Jewell): working, with limitations
    • Cocoa app (Mike Jewell, Ben): the only example of an application that covers the complete range from low-level API to end-user interface; the interface is not very pleasant, however
    • Demonstration Web interface using PHP and command line scripts: probably broken
    • SOAP service (Michael Casey): deprecated and probably broken
  • Feature extraction
    • By far the most resource-intensive part of populating a database
    • Recommended default method is to use Sonic Annotator with the NNLS Chroma plugin
    • There's a C script (populate) to load databases from Sonic Annotator output
    • Former audioDB feature extractor (fftExtract) is deprecated and broken
  • Documentation
    • Very little:
      • Obsolete database population tutorial: http://omras2.org/audioDB/tutorial1
      • Tiny and not very helpful query tutorial: http://omras2.org/audioDB/tutorial2
      • Not especially successful hip-hop example: http://omras2.org/audioDB/tutorial3
      • Man page included with software
      • Probably the best overview of the purpose and design of audioDB currently is in “Investigating Music Collections at Different Scales with AudioDB” (Christophe Rhodes, Tim Crawford, Michael Casey, Mark d’Inverno, JNMR 2010). A number of earlier publications also refer to audioDB
      • Christophe is working on a formal specification for the database semantics, but this is of little use to end users
  • Database filesystem notes
    • Currently the database is just one big sparse file
    • Portability problems with this: for example, HFS+ (Mac default filesystem) doesn't support sparse files
    • Also maintenance problems: the database needs to be sized in advance and resized as needed, since data cannot simply be appended; resizing is certainly doable but is not very convenient
    • Christophe would like to change this so as to have one file per column, append-only
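
The one-file-per-column, append-only idea above can be sketched as follows. This is a minimal illustration only, not audioDB's actual or planned on-disk format: the class name ColumnFile, the fixed-width float64 record layout and the file names in the usage note are all assumptions made for the example.

```python
import os
import struct


class ColumnFile:
    """Append-only file of fixed-width float64 records (hypothetical layout)."""

    def __init__(self, path, width):
        self.path = path
        self.width = width                  # number of doubles per record
        self._fmt = "<%dd" % width          # little-endian, no padding
        self._size = struct.calcsize(self._fmt)

    def append(self, values):
        """Add one record at the end of the file."""
        if len(values) != self.width:
            raise ValueError("expected %d values" % self.width)
        # Mode "ab" always writes at the end, so the file grows on demand
        # and never has to be sized in advance or stored sparsely.
        with open(self.path, "ab") as f:
            f.write(struct.pack(self._fmt, *values))

    def __len__(self):
        """Number of complete records currently stored."""
        if not os.path.exists(self.path):
            return 0
        return os.path.getsize(self.path) // self._size

    def read(self, index):
        """Return record number `index` as a list of floats."""
        if not 0 <= index < len(self):
            raise IndexError(index)
        with open(self.path, "rb") as f:
            f.seek(index * self._size)
            return list(struct.unpack(self._fmt, f.read(self._size)))
```

Under these assumptions a database would be a directory holding, say, ColumnFile("features.col", 12) and ColumnFile("times.col", 1), each growing independently as tracks are added. Because nothing relies on holes in a preallocated file, the layout does not depend on filesystem support for sparse files.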

Aspirational use cases

  • Christophe: Intelligent classical music player – find examples of Bach reusing this particular theme
  • Ben: Treating playlists (rather than songs) as sequences of features where social tags constitute a feature; he published on this in “Using Song Social Tags and Topic Models to Describe and Compare Playlists” (WOMRAD 2010)
  • Christophe: Observes that other Goldsmiths researchers are interested in treating multimedia data also (e.g. streetview-like scene data)
  • Tim: Studying Wagner's use of leitmotif
  • Tim: Live querying from musical input device
  • Tim: Categorising musical samples for performance use
  • Tim: Film music analysis – is all John Williams the same?
  • Chris: Is the start of the old Red Dwarf opening credit a quote from Mahler or Bruckner, or does it just sound like one?
  • Also noted: Not just music databases: bird songs, street sounds, etc

Possible easy gains

(But for which target user? Consumer-level or API-level?)

  • Christophe: Suggests aiming at the Red Dwarf example: improve data loading for a “standard” data set, then query by example
  • Ben: Build a standard database using either the Million-Song Dataset or the subset distributed alongside it
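
For a small “standard” data set, query by example can be done by exhaustive search, with no index. The sketch below is illustrative only and does not use the audioDB API or reproduce its matching method: it assumes each track is a list of feature frames (for instance chroma vectors), uses plain Euclidean distance, and the function names are invented for the example.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length sequences of frames."""
    return math.sqrt(sum((x - y) ** 2
                         for frame_a, frame_b in zip(a, b)
                         for x, y in zip(frame_a, frame_b)))


def best_match(query, track):
    """Return (distance, start_frame) of the window of `track` closest to
    `query`, or None if the track is shorter than the query."""
    n = len(query)
    best = None
    for start in range(len(track) - n + 1):
        d = distance(query, track[start:start + n])
        if best is None or d < best[0]:
            best = (d, start)
    return best


def search(query, collection, limit=10):
    """Rank every track in `collection` (a dict of name -> frames) by its
    best-matching window; returns (distance, name, start_frame) tuples."""
    results = []
    for name, track in collection.items():
        match = best_match(query, track)
        if match is not None:
            results.append((match[0], name, match[1]))
    results.sort()
    return results[:limit]
```

Every window of every track is compared with the query, so the cost grows with the total number of frames in the collection. That is workable for a test collection but is the reason an index such as LSH becomes useful at very large scales.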

Next steps

  • Christophe: Commit pending bug fixes to new repository
  • Chris and Luis: Build the code and run the unit tests, possibly under coverage analysis
  • Chris and Luis: Test and improve the Java and Python bindings
  • Chris and Luis: Produce a more user-friendly import tool using the Python bindings (but still running Sonic Annotator behind the scenes)

Note: there's a project grant submission on indexing and PostgreSQL integration.