The automatic detection of musical structure in audio recordings is one of the most challenging problems in the field of music information retrieval, since even human experts tend to disagree on the structural decomposition of a piece of music. Musical structures are best retained by listeners when they form hierarchical patterns, with consequent implications for the appreciation of music and its performance. Human perception of musical structure is thought to depend on the generation of hierarchies, which is inherently related to the actual organisation of sounds in music.

In recent years, we have witnessed the creation of large digital music collections, accessible, for example, via streaming services. Efficient retrieval from such collections, which goes beyond simple text searches, requires automated music analysis methods. Creating such methods is a central part of the research area Music Information Retrieval (MIR). In this thesis, we propose, explore, and analyze novel data-driven approaches for the two MIR analysis tasks of tempo and key estimation for music recordings. Tempo estimation is often defined as determining the number of times a person would "tap" per time interval when listening to music. Key estimation labels a music recording with a chord name describing its tonal center, e.g., C major. Both tasks are well established in MIR research. To improve tempo estimation, we focus mainly on shortcomings of existing approaches, particularly estimates on the wrong metrical level, known as octave errors. We first propose novel methods using digital signal processing and traditional feature engineering. We then re-formulate the signal-processing pipeline as a deep computational graph with trainable weights. This allows us to take a purely data-driven approach using supervised machine learning (ML) with convolutional neural networks (CNNs). We find that the same kinds of networks can also be used for key estimation by changing the orientation of directional filters. To improve our understanding of these systems, we systematically explore network architectures for both global and local estimation, with varying depths and filter shapes, as well as different ways of splitting datasets for training, validation, and testing. In particular, we investigate the effects of learning on different splits of cross-version datasets, i.e., datasets that contain multiple recordings of the same pieces. For training and evaluation, the proposed data-driven approaches rely on curated datasets covering certain key and tempo ranges as well as genres. Datasets are therefore another focus of this work. In addition to creating or deriving new datasets for both tasks, we evaluate the quality and suitability of popular tempo datasets and metrics, and conclude that there is ample room for improvement. To promote better, transparent evaluation, we propose new metrics and establish a large, open, public repository containing evaluation code, reference annotations, and estimates.

In this paper, we present the SyncPlayer system for multimodal presentation of high-quality audio and associated music-related data. Using the SyncPlayer client interface, a user may play back an audio recording that is locally available on his or her computer. The recording is then identified by the SyncPlayer server, a process performed entirely content-based. Subsequently, the server delivers music-related data such as scores or lyrics to the client, which are then displayed synchronously with audio playback using a multimodal visualization plug-in. In addition to visualization, the system provides functionality for content-based music retrieval and semi-manual content annotation. To the best of our knowledge, our system is moreover the first to systematically exploit automatically generated synchronization data for content-based symbolic browsing in high-quality audio recordings. SyncPlayer has already proved to be a valuable tool for evaluating algorithms in MIR research on a larger scale. We describe the technical background of the SyncPlayer framework in detail and give an overview of the underlying MIR techniques of audio matching, music synchronization, and text-based retrieval that are incorporated in the current version of the system.
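The octave errors mentioned in the thesis abstract can be made concrete with the two accuracy measures commonly used in tempo-estimation evaluation: Accuracy 1 accepts an estimate within a tolerance of the reference tempo, while Accuracy 2 additionally forgives estimates off by the metrical factors 2, 3, 1/2, and 1/3. The sketch below uses the widely adopted ±4% tolerance; the function names are our own, not from the thesis.

```python
def acc1(estimate_bpm, reference_bpm, tol=0.04):
    """Accuracy 1: estimate within +-4% of the reference tempo."""
    return abs(estimate_bpm - reference_bpm) <= tol * reference_bpm

def acc2(estimate_bpm, reference_bpm, tol=0.04):
    """Accuracy 2: like Acc1, but octave errors (factors 2, 3, 1/2, 1/3)
    are counted as correct."""
    return any(acc1(estimate_bpm, factor * reference_bpm, tol)
               for factor in (1.0, 2.0, 3.0, 1 / 2, 1 / 3))

# A typical octave error: tapping at half the annotated tempo.
print(acc1(60.0, 120.0))  # False: wrong metrical level
print(acc2(60.0, 120.0))  # True: off by exactly one tempo octave
```

The gap between a system's Acc1 and Acc2 scores is thus a direct measure of how often it lands on the wrong metrical level.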
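The idea of re-orienting directional filters, long along the time axis for tempo-like periodicities versus long along the frequency axis for pitch-related structure, can be illustrated with a toy 2-D correlation over a spectrogram. The shapes and names below are illustrative assumptions, not the architecture from the thesis.

```python
import numpy as np

def valid_corr2d(x, k):
    """Minimal 2-D 'valid' cross-correlation (no padding, no kernel flip)."""
    kh, kw = k.shape
    out = np.empty((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

# Toy spectrogram: rows = frequency bins, columns = time frames.
spec = np.random.default_rng(0).random((48, 100))

# Tempo-oriented filter: long along the TIME axis (responds to periodicities).
temporal_filter = np.ones((1, 9)) / 9

# Key-oriented filter: the same filter rotated 90 degrees, long along the
# FREQUENCY axis (responds to pitch and harmonic structure).
spectral_filter = temporal_filter.T

print(valid_corr2d(spec, temporal_filter).shape)  # (48, 92)
print(valid_corr2d(spec, spectral_filter).shape)  # (40, 100)
```

Swapping the filter orientation changes which axis of the time-frequency representation the network summarizes, which is the intuition behind reusing the same kind of network for both tasks.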