Not All Roads Lead to Rome: Pitch Representation and Model Architecture for Automatic Harmonic Analysis. Micchi, G., Gotham, M., & Giraud, M. Transactions of the International Society for Music Information Retrieval, 3(1):42–54, 2020.
Automatic harmonic analysis has been an enduring focus of the MIR community, and has enjoyed a particularly vigorous revival of interest in the machine-learning age. We focus here on the specific case of Roman numeral analysis which, by virtue of requiring key/functional information in addition to chords, may be viewed as an acutely challenging use case. We report on three main developments. First, we provide a new meta-corpus bringing together all existing Roman numeral analysis datasets; this offers greater scale and diversity, not only of the music represented, but also of human analytical viewpoints. Second, we examine best practices in the encoding of pitch, time, and harmony for machine learning tasks. The main contribution here is the introduction of full pitch spelling to such a system, an absolute must for the comprehensive study of musical harmony. Third, we devised and tested several neural network architectures and compared their relative accuracy. In the best-performing of these models, convolutional layers gather the local information needed to analyse the chord at a given moment while a recurrent part learns longer-range harmonic progressions. Altogether, our best representation and architecture produce a small but significant improvement on overall accuracy while simultaneously integrating full pitch spelling. This enables the system to retain important information from the musical sources and provide more meaningful predictions for any new input.
@Article{          micchi.ea2020-not,
    author       = {Micchi, Gianluca and Gotham, Mark and Giraud, Mathieu},
    year         = {2020},
    title        = {Not All Roads Lead to Rome: Pitch Representation and
                   Model Architecture for Automatic Harmonic Analysis},
    abstract     = {Automatic harmonic analysis has been an enduring focus of
                   the MIR community, and has enjoyed a particularly vigorous
                   revival of interest in the machine-learning age. We focus
                   here on the specific case of Roman numeral analysis which,
                   by virtue of requiring key/functional information in
                   addition to chords, may be viewed as an acutely
                   challenging use case. We report on three main
                   developments. First, we provide a new meta-corpus bringing
                   together all existing Roman numeral analysis datasets;
                   this offers greater scale and diversity, not only of the
                   music represented, but also of human analytical
                   viewpoints. Second, we examine best practices in the
                   encoding of pitch, time, and harmony for machine learning
                   tasks. The main contribution here is the introduction of
                   full pitch spelling to such a system, an absolute must for
                   the comprehensive study of musical harmony. Third, we
                   devised and tested several neural network architectures
                   and compared their relative accuracy. In the
                   best-performing of these models, convolutional layers
                   gather the local information needed to analyse the chord
                   at a given moment while a recurrent part learns
                   longer-range harmonic progressions. Altogether, our best
                   representation and architecture produce a small but
                   significant improvement on overall accuracy while
                   simultaneously integrating full pitch spelling. This
                   enables the system to retain important information from
                   the musical sources and provide more meaningful
                   predictions for any new input.},
    doi          = {10.5334/tismir.45},
    journal      = {Transactions of the International Society for Music
                   Information Retrieval},
    keywords     = {computational musicology,corpus,functional
                   harmony,machine learning,pitch encoding,roman numeral
                   analysis,tonal harmony},
    mendeley-tags= {computational musicology},
    number       = {1},
    pages        = {42--54},
    volume       = {3}
}
