Learning, Probability and Logic: Toward a Unified Approach for Content-Based Music Information Retrieval. Crayencour, H. & Cella, C. Frontiers in Digital Humanities, 6(April):1–25, 2019.
Within the last fifteen years, the field of Music Information Retrieval (MIR) has made tremendous progress in the development of algorithms for organizing and analyzing the large, varied, and ever-increasing amount of music and music-related data available digitally. However, the development of content-based methods to enable or improve multimedia retrieval remains a central challenge. In this perspective paper, we critically look at the problem of automatic chord estimation from audio recordings as a case study of content-based algorithms, and point out several bottlenecks in current approaches: expressiveness and flexibility are obtained at the expense of robustness and vice versa; available multimodal sources of information are rarely exploited; modeling multi-faceted and strongly interrelated musical information is limited with current architectures; models are typically restricted to short-term analysis that does not account for the hierarchical temporal structure of musical signals. Dealing with music data requires the ability to handle both uncertainty and complex relational structure at multiple levels of representation. Traditional approaches have generally treated these two aspects separately, with probability and learning being the standard way to represent uncertainty in knowledge and logical representation being the standard way to represent knowledge and complex relational information. We argue that the identified hurdles of current approaches could be overcome by recent developments in the area of Statistical Relational Artificial Intelligence (StarAI), which unifies probability, logic and (deep) learning. We show that existing approaches used in MIR find powerful extensions and unifications in StarAI, and we explain why we think it is time to consider the new perspectives offered by this promising research field.
@Article{          crayencour.ea2019-learning,
    author       = {Crayencour, Helene-Camille and Cella, Carmine-Emanuele},
    year         = {2019},
    title        = {Learning, Probability and Logic: Toward a Unified
                   Approach for Content-Based Music Information Retrieval},
    abstract     = {Within the last fifteen years, the field of Music
                   Information Retrieval (MIR) has made tremendous progress
                   in the development of algorithms for organizing and
                   analyzing the ever-increasing large and varied amount of
                   music and music-related data available digitally. However,
                   the development of content-based methods to enable or
                   improve multimedia retrieval still remains a central
                   challenge. In this perspective paper, we critically look
                   at the problem of automatic chord estimation from audio
                   recordings as a case study of content-based algorithms,
                   and point out several bottlenecks in current approaches:
                   expressiveness and flexibility are obtained at the expense
                   of robustness and vice versa; available multimodal sources
                   of information are little exploited; modeling
                   multi-faceted and strongly interrelated musical
                   information is limited with current architectures; models
                   are typically restricted to short-term analysis that does
                   not account for the hierarchical temporal structure of
                   musical signals. Dealing with music data requires the
                   ability to handle both uncertainty and complex relational
                   structure at multiple levels of representation.
                   Traditional approaches have generally treated these two
                   aspects separately, with probability and learning being
                   the standard way to represent uncertainty in knowledge and
                   logical representation being the standard way to represent
                   knowledge and complex relational information. We advocate
                   that the identified hurdles of current approaches could be
                   overcome by recent developments in the area of Statistical
                   Relational Artificial Intelligence (StarAI) that unifies
                   probability, logic and (deep) learning. We show that
                   existing approaches used in MIR find powerful extensions
                   and unifications in StarAI, and we explain why we think it
                   is time to consider the new perspectives offered by this
                   promising research field.},
    doi          = {10.3389/fdigh.2019.00006},
    issn         = {2297-2668},
    journal      = {Frontiers in Digital Humanities},
    keywords     = {audio, chord recognition, content-based, music
                   information retrieval (MIR), statistical relational
                   artificial intelligence},
    mendeley-tags= {music information retrieval},
    month        = apr,
    pages        = {1--25},
    volume       = {6}
}
