Computational Methods for Tonality-Based Style Analysis of Classical Music Audio Recordings. Weiß, C. Ph.D. Thesis, Technische Universität Ilmenau, 2016.
With the tremendously growing impact of digital technology, the ways of accessing music crucially changed. Nowadays, streaming services, download platforms, and private archives provide a large amount of music recordings to listeners. As tools for organizing and browsing such collections, automatic methods have become important. In the area of Music Information Retrieval, researchers are developing algorithms for analyzing and comparing music data with respect to musical characteristics. One typical application scenario is the classification of music recordings according to categories such as musical genres. In this thesis, we approach such classification problems with the goal of discriminating subgenres within Western classical music. In particular, we focus on typical categories such as historical periods or individual composers. From a musicological point of view, this classification problem relates to the question of musical style, which constitutes a rather ill-defined and abstract concept. Usually, musicologists analyze musical scores in a manual fashion in order to acquire knowledge about style and its determining factors. This thesis contributes with computational methods for realizing such analyses on comprehensive corpora of audio recordings. Though it is hard to extract explicit information such as note events from audio data, the computational analysis of audio recordings might bear great potential for musicological research. One reason for this is the limited availability of symbolic scores in high quality. The style analysis experiments presented in this thesis focus on the fields of harmony and tonality. In the first step, we use signal processing techniques for computing chroma representations of the audio data. These semantic “mid-level” representations capture the pitch class content of an audio recording in a robust way and, thus, constitute a suitable starting point for subsequent processing steps.
From such chroma representations, we derive measures for quantitatively describing stylistic properties of the music. Since chroma features suppress timbral characteristics to a certain extent, we hope to achieve invariance to timbre and instrumentation for our analysis methods. Inspired by the characteristics of the chroma representations, we model in this thesis specific concepts from music theory and propose algorithms to measure the occurrence of certain tonal structures in audio recordings. One of the proposed methods aims at estimating the global key of a piece by considering the particular role of the final chord. Another contribution of this thesis is an automatic method to visualize modulations regarding diatonic scales as well as scale types over the course of a piece. Furthermore, we propose novel techniques for estimating the presence of specific interval and chord types and for measuring more abstract notions such as tonal complexity. In first experiments, we show the features' behavior for individual pieces and discuss their musical meaning. On the basis of these novel types of audio features, we perform comprehensive experiments for analyzing and classifying audio recordings regarding musical style. For this purpose, we apply methods from the field of machine learning. Using unsupervised clustering methods, we investigate the similarity of musical works across composers and composition years. Even though the underlying feature representations may be imprecise and error-prone in some cases, we can observe interesting tendencies that may exhibit some musical meaning when analyzing large databases. For example, we observe an increase of tonal complexity during the 19th and 20th centuries on the basis of our features. As an essential contribution of this dissertation, we perform automatic classification experiments according to historical periods (“eras”) and composers.
We compile two datasets, on which we test common classifiers using both our tonal features and standardized audio features. Despite the vagueness of the task and the complexity of the data, we obtain good results for the classification with respect to historical periods. This indicates that the tonal features proposed in this thesis seem to robustly capture some stylistic properties. In contrast, using standardized timbral features for classification often leads to overfitting to the training data resulting in worse performance. Comparing different types of tonal features revealed that features relating to interval types, tonal complexity, and chord progressions are useful for classifying audio recordings with respect to musical style. This seems to validate the hypothesis that tonal characteristics can be discriminative for style analysis and that we can measure such characteristics directly from audio recordings. In summary, the interplay between musicology and audio signal processing can be very promising. When applied to a specific example, we have to be careful with the results of computational methods, which, of course, cannot compete with the experienced judgement of a musicologist. For analyzing comprehensive corpora, however, computer-assisted techniques provide interesting opportunities to recognize fundamental trends and to verify hypotheses.
