Los Alamos National Laboratory

Cite Details

Andrea Cogliati, Zhiyao Duan and Brendt Wohlberg, "Piano Transcription with Convolutional Sparse Lateral Inhibition", Signal Processing Letters, vol. 24, no. 4, pp. 392–396, Apr 2017, doi:10.1109/LSP.2017.2666183


This paper extends our prior work on context-dependent piano transcription to estimate the length of the notes in addition to their pitch and onset. This approach employs convolutional sparse coding along with lateral inhibition constraints to approximate a musical signal as the sum of piano note waveforms (dictionary elements) convolved with their temporal activations. The waveforms are pre-recorded for the specific piano to be transcribed, in the specific environment. A dictionary containing multiple waveforms per pitch is generated by truncating a long waveform for each pitch to different lengths. During transcription, the dictionary elements are fixed, and their temporal activations are estimated and post-processed to obtain pitch, onset and note length estimates. A sparsity penalty promotes globally sparse activations of the dictionary elements, and a lateral inhibition term penalizes concurrent activations of different waveforms corresponding to the same pitch within a temporal neighborhood, which is what enables note length estimation. Experiments on the MAPS dataset show that the proposed approach significantly outperforms, in transcription accuracy, a state-of-the-art music transcription method trained in the same context-dependent setting.
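The signal model described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the toy decaying sinusoid standing in for a recorded note, and the exact form of the lateral inhibition penalty are all assumptions made for illustration. It shows only the forward model (dictionary elements convolved with activations) and one plausible way to score same-pitch co-activations; it does not solve the sparse coding problem.

```python
import numpy as np


def build_truncated_dictionary(waveform, lengths):
    """Return one dictionary element per requested length (in samples),
    each obtained by truncating the same long note waveform."""
    return [waveform[:n].copy() for n in lengths]


def synthesize(dictionary, activations, signal_len):
    """Forward model: sum of dictionary elements convolved with their
    temporal activations, cropped to signal_len samples."""
    out = np.zeros(signal_len)
    for d, x in zip(dictionary, activations):
        out += np.convolve(x, d)[:signal_len]
    return out


def lateral_inhibition_penalty(activations, radius):
    """Assumed penalty form: for every ordered pair of different elements
    of the same pitch, sum |x_i[t]| * |x_j[s]| over all |t - s| <= radius."""
    window = np.ones(2 * radius + 1)
    mags = [np.abs(x) for x in activations]
    penalty = 0.0
    for i, xi in enumerate(mags):
        for j, xj in enumerate(mags):
            if i != j:
                neighborhood = np.convolve(xj, window, mode="same")
                penalty += float(np.dot(xi, neighborhood))
    return penalty


# Toy stand-in for one pre-recorded piano note: a decaying sinusoid.
t = np.arange(400)
note = np.exp(-t / 150.0) * np.sin(2 * np.pi * 0.05 * t)

# Three elements for the same pitch, differing only in length.
dictionary = build_truncated_dictionary(note, [100, 200, 400])

# A single note of medium length starting at sample 50.
signal_len = 1000
acts = [np.zeros(signal_len) for _ in dictionary]
acts[1][50] = 1.0
mix = synthesize(dictionary, acts, signal_len)

# Activating a second length of the same pitch nearby is penalized.
clash = [a.copy() for a in acts]
clash[2][60] = 1.0
print(lateral_inhibition_penalty(acts, radius=20))
print(lateral_inhibition_penalty(clash, radius=20))
```

In this sketch the index of the active dictionary element encodes the note length, which is why a term discouraging several lengths of one pitch from firing close together helps select a single length per note.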

BibTeX Entry

@article{cogliati-2017-piano,
  author = {Andrea Cogliati and Zhiyao Duan and Brendt Wohlberg},
  title = {Piano Transcription with Convolutional Sparse Lateral Inhibition},
  year = {2017},
  month = Apr,
  urlpdf = {http://math.lanl.gov/~brendt/Publications/Docs/cogliati-2017-piano.pdf},
  journal = {Signal Processing Letters},
  volume = {24},
  number = {4},
  doi = {10.1109/LSP.2017.2666183},
  pages = {392--396}
}