"Polyphonic Singing Datasets for MIR Research"

Helene Cuesta and Sebastian Rosenzweig

Abstract:

We showcase two recently published datasets of polyphonic a
cappella vocal music. Both datasets are publicly available and support
research on music information retrieval (MIR) tasks such as (multiple)
F0 estimation or source separation in vocal music. First, we
present the Erkomaishvili dataset, a curated corpus of traditional
Georgian vocal music. The corpus is based on historic tape recordings of
three-voice Georgian songs performed by the former master chanter Artem
Erkomaishvili. Second, we present Dagstuhl ChoirSet, a multitrack
dataset of "Western" a cappella choral music. The dataset includes
recordings of an amateur vocal ensemble performing two choir pieces in
full choir and quartet settings. The audio data was recorded during an
MIR seminar at Schloss Dagstuhl using different close-up microphones to
capture the individual singers' voices.