Real-time vibration control of an electrolarynx based on statistical F0 contour prediction. Tanaka, K., Toda, T., Neubig, G., & Nakamura, S. In 2016 24th European Signal Processing Conference (EUSIPCO), pages 1333-1337, Aug, 2016.
An electrolarynx is a speaking-aid device that artificially generates excitation sounds to help laryngectomees produce electrolaryngeal (EL) speech. Although EL speech is quite intelligible, its naturalness suffers significantly from the unnatural fundamental frequency (F0) patterns of the mechanical excitation sounds. To make it possible to produce more natural-sounding EL speech, we have proposed a method that automatically controls the F0 patterns of the excitation sounds generated by the electrolarynx using statistical F0 prediction, which predicts F0 patterns from the produced EL speech in real time. In our previous work, we developed a prototype system by implementing the proposed real-time prediction method in an actual, physical electrolarynx. Using that prototype, we found that the improvements in the naturalness of EL speech it yields tend to be smaller than those yielded by batch-type prediction. In this paper, we examine the negative impact of real-time prediction latency on F0 prediction accuracy, and to alleviate it we propose two methods: 1) modeling of segmented continuous F0 (CF0) patterns and 2) prediction of forthcoming F0 values. The experimental results demonstrate that 1) the conventional real-time prediction method needs a large delay to predict CF0 patterns and 2) the proposed methods have positive impacts on real-time prediction.
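To illustrate why predicting a continuous F0 (CF0) contour in real time can require a delay, the sketch below builds a CF0 contour by linearly interpolating over unvoiced frames and reports how many future frames that interpolation needs. It is a minimal illustration under stated assumptions, not the statistical prediction method of the paper: the function names `continuous_f0` and `required_lookahead`, the convention that unvoiced frames are marked with 0, and the choice of linear interpolation in Hz are all hypothetical.

```python
def continuous_f0(f0, unvoiced=0.0):
    """Fill unvoiced frames by linear interpolation between voiced neighbours.

    Assumption: unvoiced frames are marked with the value `unvoiced`.
    Leading and trailing unvoiced frames take the nearest voiced value.
    """
    voiced = [i for i, v in enumerate(f0) if v != unvoiced]
    if not voiced:
        return list(f0)  # nothing to interpolate from

    out = list(f0)
    # Hold the first and last voiced values over the utterance edges.
    for i in range(voiced[0]):
        out[i] = f0[voiced[0]]
    for i in range(voiced[-1] + 1, len(f0)):
        out[i] = f0[voiced[-1]]

    # Interpolate linearly inside each gap between consecutive voiced frames.
    for a, b in zip(voiced, voiced[1:]):
        for i in range(a + 1, b):
            w = (i - a) / (b - a)
            out[i] = (1 - w) * f0[a] + w * f0[b]
    return out


def required_lookahead(f0, unvoiced=0.0):
    """Largest number of future frames needed to interpolate any interior gap."""
    voiced = [i for i, v in enumerate(f0) if v != unvoiced]
    # The first frame of a gap (a + 1) must wait for frame b, i.e. b - a - 1 frames.
    return max((b - a - 1 for a, b in zip(voiced, voiced[1:])), default=0)


if __name__ == "__main__":
    f0 = [0, 100, 0, 0, 130, 0]  # Hz per frame, 0 = unvoiced (toy values)
    print(continuous_f0(f0))
    print(required_lookahead(f0))
```

In this toy setting the interpolated value at a frame depends on a voiced frame that has not yet been observed, so a causal system must either wait for it or estimate it. That is the kind of latency trade-off the abstract refers to; the paper's segmented CF0 modeling and forthcoming-F0 prediction are its own ways of addressing it and are not reproduced here.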
