HarmoF0: logarithmic scale dilated convolution for pitch estimation

HarmoF0: logarithmic scale dilated convolution for pitch estimation. Wei, W., Li, P., Yu, Y., & Li, W. 2022.


Sounds, especially music, contain various harmonic components scattered in the frequency dimension. It is difficult for normal convolutional neural networks to observe these overtones. This paper introduces a multiple rates dilated causal convolution (MRDC-Conv) method to capture the harmonic structure in logarithmic scale spectrograms efficiently. The harmonic is helpful for pitch estimation, which is important for many sound processing applications. We propose HarmoF0, a fully convolutional network, to evaluate the MRDC-Conv and other dilated convolutions in pitch estimation. The results show that this model outperforms the DeepF0, yields state-of-the-art performance in three datasets, and simultaneously reduces more than 90% parameters. We also find that it has stronger noise resistance and fewer octave errors.
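A small calculation may help show why a logarithmic frequency axis pairs naturally with dilated convolution. On such an axis with Q bins per octave, the k-th harmonic of any fundamental sits about Q·log2(k) bins above it, so the spacing between harmonics does not depend on the pitch itself. The sketch below only computes those offsets; it is not the authors' implementation, and the function name `harmonic_bin_offsets` and the value of 48 bins per octave are assumptions chosen for illustration.

```python
import math


def harmonic_bin_offsets(bins_per_octave: int, num_harmonics: int) -> list:
    """Return the bin offset of harmonics 1..num_harmonics above the
    fundamental on a log-frequency axis.

    The k-th harmonic has frequency k * f0, which is log2(k) octaves
    above f0, i.e. bins_per_octave * log2(k) bins, rounded to the
    nearest whole bin.
    """
    return [
        round(bins_per_octave * math.log2(k))
        for k in range(1, num_harmonics + 1)
    ]


# Illustrative resolution (an assumption, not taken from the abstract).
offsets = harmonic_bin_offsets(bins_per_octave=48, num_harmonics=4)
print(offsets)  # [0, 48, 76, 96]
```

Because these offsets are the same for every fundamental, a convolution along the frequency axis whose taps are spaced by such offsets could, in principle, line up with the overtones of any pitch.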

@article{wei_harmof0_2022,
	title = {{HarmoF0}: logarithmic scale dilated convolution for pitch estimation},
	shorttitle = {{HarmoF0}},
	url = {https://www.semanticscholar.org/paper/HarmoF0%3A-Logarithmic-Scale-Dilated-Convolution-For-Wei-Li/915b1a0554d02fe9a79ae8956f1382395e315336},
	doi = {10.48550/arXiv.2205.01019},
	abstract = {Sounds, especially music, contain various harmonic components scattered in the frequency dimension. It is difficult for normal convolutional neural networks to observe these overtones. This paper introduces a multiple rates dilated causal convolution (MRDC-Conv) method to capture the harmonic structure in logarithmic scale spectrograms efficiently. The harmonic is helpful for pitch estimation, which is important for many sound processing applications. We propose HarmoF0, a fully convolutional network, to evaluate the MRDC-Conv and other dilated convolutions in pitch estimation. The results show that this model outperforms the DeepF0, yields state-of-the-art performance in three datasets, and simultaneously reduces more than 90\% parameters. We also find that it has stronger noise resistance and fewer octave errors.},
	language = {en},
	urldate = {2022-05-09},
	journal = {undefined},
	author = {Wei, Weixing and Li, P. and Yu, Yi and Li, Wei},
	year = {2022},
	keywords = {\#nosource},
}
