Theme and variation encodings with roman numerals (TaVERn): A new data set for symbolic music analysis. Devaney, J., Arthur, C., Condit-Schultz, N., & Nisula, K. In Proceedings of the 16th International Society for Music Information Retrieval Conference, ISMIR 2015, pages 728–734, Málaga, Spain, 2015. International Society for Music Information Retrieval.
The Theme And Variation Encodings with Roman Numerals (TAVERN) dataset consists of 27 complete sets of theme and variations for piano composed between 1765 and 1810 by Mozart and Beethoven. In these theme and variation sets, comparable harmonic structures are realized in different ways. This facilitates an evaluation of the effectiveness of automatic analysis algorithms in generalizing across different musical textures. The pieces are encoded in standard **kern format, with analyses jointly encoded using an extension to **kern. The harmonic content of the music was analyzed with both Roman numerals and function labels in duplicate by two different expert analyzers. The pieces are divided into musical phrases, allowing for multiple levels of automatic analysis, including chord labeling and phrase parsing. This paper describes the content of the dataset in detail, including the types of chords represented, and discusses the ways in which the analyzers sometimes disagreed on the lower-level harmonic content (the Roman numerals) while converging on similar high-level structures (the function of the chords within the phrase).
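
As a rough illustration of what "jointly encoded" can mean in practice, the sketch below reads a Humdrum-style file in which a **kern spine of notes sits beside an analysis spine. It relies only on general Humdrum conventions (tab-separated spines, `!` comments, `*` interpretations, `=` barlines, `.` null tokens). The spine name `**harm`, the sample excerpt, and the helper names `read_spines` and `chord_labels` are assumptions made for illustration; the actual spine names and layout of the dataset's extension may differ.

```python
# Hypothetical excerpt written for this sketch; it is NOT taken from the dataset.
SAMPLE = "\n".join([
    "!! illustrative excerpt only",
    "**kern\t**harm",   # spine names: '**harm' is an assumed analysis spine
    "*M4/4\t*M4/4",
    "=1\t=1",
    "4c\tI",
    "4e\t.",
    "4g\tV",
    "4cc\tI",
    "*-\t*-",
])


def read_spines(text):
    """Return a list of (spine_name, tokens) pairs from Humdrum-style text.

    Simplification: spine splits and merges are not handled, so every data
    line is assumed to have the same number of columns as the header.
    """
    names = None
    columns = []
    for line in text.splitlines():
        if not line or line.startswith("!"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if fields[0].startswith("**"):
            names = fields  # exclusive interpretations name the spines
            columns = [[] for _ in fields]
            continue
        if names is None or fields[0].startswith(("*", "=")):
            continue  # skip other interpretations and barlines
        for column, token in zip(columns, fields):
            column.append(token)
    return list(zip(names, columns)) if names else []


def chord_labels(spines, spine_name="**harm"):
    """Collect the non-null tokens of the first spine with the given name."""
    for name, tokens in spines:
        if name == spine_name:
            return [t for t in tokens if t != "."]
    return []


spines = read_spines(SAMPLE)
labels = chord_labels(spines)
```

Returning pairs instead of a dictionary keeps files with several spines of the same name (for example, two **kern staves) from overwriting one another.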
